On Thu, 30 Dec 2004 18:02:07 -0500, Aaron Sherman <[EMAIL PROTECTED]> wrote:
> On Wed, 2004-12-29 at 18:10, Ben Tilly wrote:
>
> > Under normal circumstances, to get non-minuscule odds of having
> > a collision somewhere between MD5 keys, you'd need about 2**64
> > keys. If you have less than, say, a billion keys then you can ignore
> > that possibility for all practical intents and purposes.
>
> I understand risk assessment and the idea that nothing is 100% safe, but
> when you have a situation where you KNOW from day one that some keys
> will collide, and your data will be corrupted, you don't build that into
> your system if you have an easy out.
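For reference, the 2**64 and billion-key figures quoted above can be
sanity-checked with the usual birthday approximation. This is only a
rough sketch: it assumes MD5 output behaves like a uniformly random
128-bit value on non-adversarial input (the "normal circumstances"
above), and the function name collision_probability is made up for
illustration.

```python
import math

def collision_probability(num_keys: int, hash_bits: int = 128) -> float:
    """Approximate chance of at least one collision among num_keys
    hash values drawn uniformly at random from 2**hash_bits values.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2.0 ** hash_bits
    pairs = num_keys * (num_keys - 1) / 2.0
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-pairs / space)

if __name__ == "__main__":
    print(collision_probability(10**9))   # a billion keys
    print(collision_probability(2**64))   # the 2**64 figure
```

Under that assumption a billion keys gives odds on the order of 1e-21,
while 2**64 keys gives roughly 0.39.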
Then I recommend that you never use rsync. As for me, I'm sometimes
willing to accept the possibility of algorithm failures whose odds are
lower than the odds of my program going wrong because of cosmic
radiation.

> This is hashing 101. You hash, you bucket based on the hashes, and then
> you store a list at each bucket with key and value tuple for a linear
> search. There are other ways to do it, but this is the classic.

Yes, I'm familiar with this, and outlined it in a previous email in this
thread.

> Of course, Perl does this for you. That extra time that I measured is
> almost certainly the time spent comparing the two strings, which your
> tie interface will also have to do because of collisions.

Want to bet whether Perl spends more time in computing hash values or
comparing strings?

Cheers,
Ben
_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm
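For anyone following the thread, the "hashing 101" scheme quoted above
(hash, pick a bucket, then search a list of key/value tuples linearly)
can be sketched as below. This is a toy illustration, not how Perl
implements its hashes: the class name ChainedHash, the fixed bucket
count, and the use of Python's built-in hash() are all assumptions made
for the example.

```python
class ChainedHash:
    """Toy hash table using separate chaining (hypothetical example)."""

    def __init__(self, num_buckets=8):
        # One list per bucket; each list holds (key, value) tuples.
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # Hash the key and reduce it to a bucket index.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        # Linear search: colliding keys share a bucket, so the full
        # key must be compared, not just its hash.
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

if __name__ == "__main__":
    table = ChainedHash(num_buckets=1)  # one bucket forces collisions
    table.put("a", 1)
    table.put("b", 2)
    print(table.get("a"), table.get("b"))
```

With a single bucket every key collides, yet lookups still return the
right value because the stored key is compared on each probe.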

