It doesn't look very safe to me.

If you don't want to expose your ids and would rather create fake ones,
seed the generator with the id plus a secret salt:

random.seed(id + SALT)

string_id = "%x" % random.randint(0, 0xffffffff)

Now you have an id of at most 8 chars, with very little risk of
collision, thanks to the Mersenne Twister, a pretty good pseudo-random
number generator (http://docs.python.org/lib/module-random.html)
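
Wrapped up as a function, the idea could look like the sketch below. The
name fake_id and the SALT value are made up for illustration, and I use a
private random.Random instance rather than random.seed() so the global
generator state is left alone. "%08x" pads with zeros so the result is
always exactly 8 chars.

```python
import random

SALT = 12345  # hypothetical value; use your own secret integer


def fake_id(real_id, salt=SALT):
    """Derive a repeatable 8-hex-digit string from an integer id."""
    # A private generator seeded with id + salt: same input, same output.
    rng = random.Random(real_id + salt)
    # 32-bit value, zero-padded to 8 hex digits.
    return "%08x" % rng.randint(0, 0xffffffff)
```

The same real id always gives the same string, so nothing extra has to
be stored to regenerate it, but you still cannot go back from the string
to the id without a lookup.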

Hex encoding also wastes a lot of space, because its alphabet is very small:

0123456789abcdef

You can use a bigger one like base32 or base64 (see Wikipedia, and
replace "/" with something else like "_"), or your own alphabet:

>>> import encode  # a module I've written; the core is pretty trivial:
>>> # while value > 0: value, rest = divmod(value, len(alphabet)); ...
>>> import random
>>> random.seed(1)
>>> x = random.randint(0, 0xffffffff)
>>> x
577090034
>>> "%x" % x
'2265b1f2'
>>> encode.encode(x, encode.BASE32)
'h6bcfi'
>>> encode.encode(x, encode.BASE64)
'ypr7O'

Maximum length for a 32-bit value:

hex, base16 -> 8 chars
base32 -> 7 chars
base64 -> 6 chars
...
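
For reference, a minimal sketch of what such an encode function can look
like. The alphabets below are assumptions chosen for illustration, not
necessarily the ones in my module, so the output will not match the
session above character for character.

```python
import string

# Hypothetical alphabets (URL-safe: "-" and "_" instead of "+" and "/").
BASE32 = string.ascii_lowercase + "234567"
BASE64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def encode(value, alphabet):
    """Write a non-negative integer using the given alphabet as digits."""
    if value == 0:
        return alphabet[0]
    digits = []
    while value > 0:
        # Peel off the least significant digit on each pass.
        value, rest = divmod(value, len(alphabet))
        digits.append(alphabet[rest])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(digits))
```

The bigger the alphabet, the fewer chars you need: the largest 32-bit
value takes 7 chars with a 32-symbol alphabet and 6 with a 64-symbol one.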

I would say: avoid using MD5 as a random string generator, because it
wasn't designed for that purpose.

Does anyone see any flaws with this, apart from the max being 2**32
items (and it's always bad to have a ceiling)?
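
To put a rough number on the collision risk inside that 2**32 space,
here is a back-of-the-envelope sketch using the usual birthday
approximation. It assumes the generated values behave like independent
uniform draws, which is only an assumption for seeded output; the
function name is made up.

```python
import math


def collision_probability(n_items, space=2**32):
    """Approximate chance that at least two of n_items share a value."""
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * space))
    return 1.0 - math.exp(-n_items * (n_items - 1) / (2.0 * space))
```

Under that assumption the chance of at least one clash reaches about one
in two at around 77,000 items, so a unique constraint in the db is still
worth having.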

Cheers,

-- Yoan

On Sun, Jul 6, 2008 at 12:23 AM, Jonathan Vanasco <[EMAIL PROTECTED]> wrote:
>
> On Jul 5, 4:06 pm, jerry <[EMAIL PROTECTED]> wrote:
>> However, I wonder how an md5 string can be squeezed into a 10, or even
>> 6-character field with no concern of (future) collision -- or am I mis-
>> understanding your db schema?
>
> You're misunderstanding the concept.
>
> 1. md5(random+time) to get a random string with good dispersion.
> 2. Then truncate to 6/8/10/however many chars I need
> 3. check for collision, if found goto 1
> 4. insert into db ( in unique field )
>
> there's no need for reverse mapping or to use the md5 as a checksum.
> it's just a way to find random numbers.
>
> the strings are just unique ids, they don't have to map backwards and
> forwards.  md5 is just a very efficient way to create a reasonably
> unique id, and then you can do a quick check for collision.
>
> in terms of large-scale sites, i believe bebo and imeem both do
> this with 6char ids
