question: why isn't a byte of a hash more uniform? how could I improve my code to cure that?

2009-08-07 Thread László Sándor
Hi all, I am a Python novice, and right now I would be happy to simply get my job done with it, but I could appreciate some thoughts on the issue below. I need to assign one of four numbers to names in a list. The assignment should be pseudo-random: no pattern whatsoever, but deterministic,

Re: question: why isn't a byte of a hash more uniform? how could I improve my code to cure that?

2009-08-07 Thread Tim Chase
After I have written a short Python script that hashes my textfile line by line and collects the numbers next to the original, I checked what I got. Instead of getting around 25% in each treatment, the range is 17.8%-31.3%. That sounds suspiciously like 25% with a +/- 7% fluctuation one might

Re: question: why isn't a byte of a hash more uniform? how could I improve my code to cure that?

2009-08-07 Thread László Sándor
Thank you, Tim. My comments are below. On 2009-08-07 13:19:47 -0400, Tim Chase python.l...@tim.thechases.com said: After I have written a short Python script that hashes my textfile line by line and collects the numbers next to the original, I checked what I got. Instead of getting around 25%

Re: question: why isn't a byte of a hash more uniform? how could I improve my code to cure that?

2009-08-07 Thread Ethan Furman
László Sándor wrote: Thank you, Tim. My comments are below. On 2009-08-07 13:19:47 -0400, Tim Chase python.l...@tim.thechases.com said: After I have written a short Python script that hashes my textfile line by line and collects the numbers next to the original, I checked what I got.

Re: question: why isn't a byte of a hash more uniform? how could I improve my code to cure that?

2009-08-07 Thread Dave Angel
László Sándor wrote: Hi all, I am a Python novice, and right now I would be happy to simply get my job done with it, but I could appreciate some thoughts on the issue below. I need to assign one of four numbers to names in a list. The assignment should be pseudo-random: no

Re: question: why isn't a byte of a hash more uniform? how could I improve my code to cure that?

2009-08-07 Thread Peter Otten
Dave Angel wrote: [clever analysis snipped] I'd use digest() instead of hexdigest(), and of course reduce the subscript to 63 or less. OP: You could also try hash(line) % 4. While AFAIK it doesn't make promises about randomness, it might still be good enough in practice. Peter
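A minimal sketch of the digest() suggestion, for illustration only. The original script is not shown in this thread, so the function name treatment_from_digest, the choice of md5, the byte index 0, and the sample name are all assumptions. The point is that digest() yields raw bytes (0-255), whereas each character of hexdigest() carries only 4 bits.

```python
import hashlib

def treatment_from_digest(line, groups=4):
    """Map a line of text to a number in range(groups), deterministically."""
    # digest() returns the raw hash bytes (values 0-255);
    # hexdigest() returns text in which each character encodes only 4 bits.
    # strip() drops the trailing newline so "name\n" and "name" agree.
    raw = hashlib.md5(line.strip().encode("utf-8")).digest()
    # With groups=4, 256 is a multiple of 4, so one byte modulo 4
    # introduces no modulo bias. Other group counts may not divide evenly.
    return raw[0] % groups

# Hypothetical sample name; the same input always gives the same group.
print(treatment_from_digest("Alice Example"))
```

As for hash(line) % 4: in Python 3 the built-in hash() of a string is randomized per process unless PYTHONHASHSEED is fixed, so it would not give repeatable assignments across runs there. A hashlib digest does not have that problem.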

Re: question: why isn't a byte of a hash more uniform? how could I improve my code to cure that?

2009-08-07 Thread Paul Rubin
László Sándor sand...@gmail.com writes: OK, I understand. Could anyone suggest a better way to do this, then? (Recap: random-looking, close-to uniform assignment of one number out of four possibilities to strings.) Use a cryptographic hash function like md5 (deprecated for security purposes
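A rough sketch of the cryptographic-hash approach, assuming md5 and reducing the whole digest modulo 4. The rest of the message is cut off above, so this is not necessarily the exact recipe Paul had in mind. The names assign_group and the generated sample list are made up for illustration. The tally at the end is just a quick way to eyeball how even the split is.

```python
import hashlib
from collections import Counter

def assign_group(name, groups=4):
    """Deterministically assign a name to one of `groups` buckets."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    # Interpret all 16 digest bytes as one big integer, then reduce.
    return int.from_bytes(digest, "big") % groups

# Made-up sample data standing in for the real list of names.
names = ["name%d" % i for i in range(1000)]
counts = Counter(assign_group(n) for n in names)

for group in range(4):
    print(group, counts[group])
```

With a sample this size some spread around 250 per group is expected. A much smaller list will look less even without anything being wrong with the hash.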