2009/2/24 Dave Kempe:
> Hi,
> I need to checksum a lot of data recursively, and store the checksums in a
> database. I can do most of it via a shell script, but was wondering if anyone
> could recommend the fastest checksumming tool.
> I know about md5sum, sha1sum, cfv (not recursive enough). I want to be able
> to produce a checksum of many files (2.1TB worth) for verification against
> other copies of the files in various locations. I need the fastest available
> algorithm, not necessarily the most secure etc.
> Any suggestions?

I'm no expert, but I think md4 is considered very weak but also faster than other hash algorithms. It is therefore used where security is less of a concern (e.g. to checksum data which is already signed by stronger algorithms). According to its Wikipedia article, rsync uses it.

openssl comes with md4, so you can run, for instance, "openssl md4 /etc/passwd". Try comparing the relative performance by replacing "md4" with "md5".

On my system, I ran it multiple times on an 892 MB file. Once the file was fully cached in memory, md4 consistently took 2.06 seconds elapsed time, while md5 settled at 3.2 seconds. That is roughly 35% less time than md5 (about 1.55 times as fast).

Maybe there are faster algorithms around.

--Amos
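
The comparison described above can also be scripted. The following is only a minimal sketch, assuming Python 3 and its standard hashlib module. The helper names time_hash and checksum_tree are hypothetical and not part of any existing tool. md4 is only usable when the underlying OpenSSL build still provides it, so the script reports it as unavailable rather than failing. The timings will differ from the figures quoted above, since they depend on the machine and on hashing an in-memory buffer instead of a cached file.

```python
import hashlib
import os
import time


def time_hash(name, data, repeats=3):
    """Return the best elapsed time in seconds for hashing data with the named algorithm."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, data).hexdigest()
        best = min(best, time.perf_counter() - start)
    return best


def checksum_tree(root, name="md5", chunk_size=1 << 20):
    """Yield (path relative to root, hex digest) for every regular file under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and special files
            digest = hashlib.new(name)
            with open(path, "rb") as handle:
                # Read in chunks so large files never have to fit in memory.
                for chunk in iter(lambda: handle.read(chunk_size), b""):
                    digest.update(chunk)
            yield os.path.relpath(path, root), digest.hexdigest()


if __name__ == "__main__":
    # Assumption: a 16 MiB random buffer is enough to show relative speed.
    data = os.urandom(16 * 1024 * 1024)
    for name in ("md4", "md5", "sha1"):
        try:
            elapsed = time_hash(name, data)
        except ValueError:
            # hashlib raises ValueError for algorithms this build lacks.
            print(f"{name}: not available in this build")
            continue
        print(f"{name}: {elapsed:.3f} s")
```

To produce the recursive listing for one copy of the data, iterate over checksum_tree("/path/to/copy") and store each (path, digest) pair in the database. Running the same walk on another copy gives rows that can be compared path by path. On 2.1TB the disks are likely to be the limiting factor rather than the hash, so it is worth measuring on real files before settling on an algorithm.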
