Re: [Boston.pm] Max hash key length

2005-01-03 Thread Aaron Sherman
On Thu, 2004-12-30 at 20:45, Ben Tilly wrote: On Thu, 30 Dec 2004 18:02:07 -0500, Aaron Sherman [EMAIL PROTECTED] wrote: I understand risk assessment and the idea that nothing is 100% safe, but when you have a situation where you KNOW from day one that some keys will collide, and your data

Re: [Boston.pm] Max hash key length

2005-01-03 Thread Bob Rogers
From: Aaron Sherman [EMAIL PROTECTED] Date: Mon, 03 Jan 2005 10:21:32 -0500 . . . And yes, while I usually just trust to the law of probability (which is a very strange feature of our universe, if you stop to think about it), You really think so? It seems to me that well-behaved

RE: [Boston.pm] Max hash key length

2004-12-30 Thread Palit, Nilanjan
-----Original Message----- From: Uri Guttman [mailto:[EMAIL PROTECTED] $ so specify your data and its use better. how many are there in each list $ to be hashed? The application that I'm looking into this for, has 1000 chars per string. [Though in a separate offline discussion w/ Greg London,

Re: [Boston.pm] Max hash key length

2004-12-30 Thread Aaron Sherman
On Wed, 2004-12-29 at 18:10, Ben Tilly wrote: Under normal circumstances, to get non-minuscule odds of having a collision somewhere between MD5 keys, you'd need about 2**64 keys. If you have less than, say, a billion keys then you can ignore that possibility for all practical intents and
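
A rough birthday-bound estimate behind the 2**64 figure quoted above. This is only a sketch: it assumes MD5 digests behave like uniformly random 128-bit values, an idealization that says nothing about deliberately constructed collisions. Here n is the number of distinct keys and p(n) the probability that at least two of them share a digest.

```latex
p(n) \;\approx\; 1 - \exp\!\left(-\frac{n(n-1)}{2 \cdot 2^{128}}\right)
     \;\approx\; \frac{n^{2}}{2^{129}} \qquad (n \ll 2^{64})

n = 10^{9}:\quad p \approx \frac{10^{18}}{2^{129}} \approx 1.5 \times 10^{-21}

n = 2^{64}:\quad p \approx 1 - e^{-1/2} \approx 0.39
```

Under that assumption, a billion keys gives odds far below anything worth planning for, while collisions only become likely once the key count approaches 2**64.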

Re: [Boston.pm] Max hash key length

2004-12-30 Thread Ben Tilly
On Thu, 30 Dec 2004 18:02:07 -0500, Aaron Sherman [EMAIL PROTECTED] wrote: On Wed, 2004-12-29 at 18:10, Ben Tilly wrote: Under normal circumstances, to get non-minuscule odds of having a collision somewhere between MD5 keys, you'd need about 2**64 keys. If you have less than, say, a

Re: [Boston.pm] Max hash key length

2004-12-29 Thread Aaron Sherman
On Tue, 2004-12-28 at 13:46, Ian Langworth wrote: On 28.Dec.2004 01:14AM -0500, Tom Metro wrote: If you are concerned about the performance impact of long keys, and your application fits a write-once, read-many model, then you could always hash the hash keys. Say generate an MD5 digest

Re: [Boston.pm] Max hash key length

2004-12-29 Thread Ben Tilly
On Wed, 29 Dec 2004 10:49:19 -0500, Aaron Sherman [EMAIL PROTECTED] wrote: On Tue, 2004-12-28 at 13:46, Ian Langworth wrote: On 28.Dec.2004 01:14AM -0500, Tom Metro wrote: If you are concerned about the performance impact of long keys, and your application fits a write-once, read-many

RE: [Boston.pm] Max hash key length

2004-12-29 Thread Palit, Nilanjan
Folks, Thanks for the good ideas and the performance discussion. I'll try out the different suggestions. Now, regarding Tom Metro's original suggestion for using an MD5 Digest: I read that the original MD5 algorithm has known issues with collisions. Any experiences with how well Digest::MD5 does

Re: [Boston.pm] Max hash key length

2004-12-29 Thread John Saylor
hi ( 04.12.29 13:13 -0800 ) Palit, Nilanjan: Now, regarding Tom Metro's original suggestion for using an MD5 Digest: I read that the original MD5 algorithm has known issues with collisions. i think it's more that there is a way to produce a collision while altering the file being hashed. this

Re: [Boston.pm] Max hash key length

2004-12-29 Thread Ben Tilly
On Wed, 29 Dec 2004 13:13:22 -0800, Palit, Nilanjan [EMAIL PROTECTED] wrote: Folks, Thanks for the good ideas and the performance discussion. I'll try out the different suggestions. Now, regarding Tom Metro's original suggestion for using an MD5 Digest: I read that the original MD5 algorithm

Re: [Boston.pm] Max hash key length

2004-12-29 Thread Tom Metro
Palit, Nilanjan wrote: I read that the original MD5 algorithm has known issues with collisions. Do I need to test for collisions myself...or is it pretty well tested (or proved?) to stand up to an intensive application? A bit of googling turned up this:

Re: [Boston.pm] Max hash key length

2004-12-29 Thread Tom Metro
Ben Tilly wrote: That said, the suggestion of using MD5 keys is a non-starter for eliminating the performance issue. Calculating an MD5 hash of a string of length n is O(n). The qualifier I added to my suggestion of using MD5 was that the application be of a write-once, read-many nature, with

Re: [Boston.pm] Max hash key length

2004-12-29 Thread Ben Tilly
On Wed, 29 Dec 2004 18:54:41 -0500, Tom Metro [EMAIL PROTECTED] wrote: Ben Tilly wrote: That said, the suggestion of using MD5 keys is a non-starter for eliminating the performance issue. Calculating an MD5 hash of a string of length n is O(n). The qualifier I added to my suggestion of

Re: [Boston.pm] Max hash key length

2004-12-28 Thread Ian Langworth
On 28.Dec.2004 01:14AM -0500, Tom Metro wrote: If you are concerned about the performance impact of long keys, and your application fits a write-once, read-many model, then you could always hash the hash keys. Say generate an MD5 digest of the key string, and then use the digest as the hash

[Boston.pm] Max hash key length

2004-12-27 Thread Palit, Nilanjan
I wanted to know if there are any limitations to the max key length used for hashes in Perl. Also, what are the performance implications, if any, of using long keys? I have an application that needs key lengths in the range of ~1000, but with relatively limited numbers of keys (few to low tens of

Re: [Boston.pm] Max hash key length

2004-12-27 Thread Ben Tilly
On Mon, 27 Dec 2004 16:36:38 -0800, Palit, Nilanjan [EMAIL PROTECTED] wrote: I wanted to know if there are any limitations to the max key length used for hashes in Perl. Also, what are the performance implications, if any, of using long keys? I have an application that needs key lengths in the

Re: [Boston.pm] Max hash key length

2004-12-27 Thread Tom Metro
Palit, Nilanjan wrote: I have an application that needs key lengths in the range of ~1000, but with relatively limited numbers of keys (few to low tens of thousands). If you are concerned about the performance impact of long keys, and your application fits a write-once, read-many model, then you
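
For readers skimming this thread, the "hash the hash keys" idea can be sketched roughly as follows. This is an illustration in Python rather than anyone's actual code from the list; `digest_key`, `table`, and the sample strings are hypothetical names, and in Perl the analogous call would be `md5_hex` from Digest::MD5. The sketch assumes the write-once, read-many case, where each digest is computed once and then reused, since computing the digest is itself O(n) in the key length.

```python
import hashlib

def digest_key(long_key: str) -> str:
    """Return the 32-character hex MD5 digest of a (possibly very long) key."""
    return hashlib.md5(long_key.encode("utf-8")).hexdigest()

# Two hypothetical ~1000-character keys that differ only in the last character.
key_a = "A" * 999 + "x"
key_b = "A" * 999 + "y"

# Compute each digest once, up front (the "write-once" step).
short_a = digest_key(key_a)
short_b = digest_key(key_b)

# The table is keyed by the fixed-length digests, not the long strings.
table = {short_a: "value for key_a", short_b: "value for key_b"}

# Later lookups reuse the stored digest (the "read-many" step).
print(len(short_a), table[short_a])
```

The trade-off raised in the thread still applies: if the digest has to be recomputed from the long string on every lookup, nothing is saved, and the table no longer holds the original keys unless they are stored alongside the values.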