It doesn't look very safe to me.

If you don't want to expose your id and want to create fake ids instead:

    random.seed(id + SALT)
    string_id = "%x" % random.randint(0, 0xffffffff)

and now you have an id of up to 8 chars, with very little risk of collision,
thanks to the Mersenne Twister, a pretty good pseudo-random number generator
(http://docs.python.org/lib/module-random.html).

Then, hex encoding wastes a lot of space, because it uses a very small
alphabet: 0123456789abcdef. You can use a bigger one like base32 or base64
(see Wikipedia, and replace "/" with something else like "_"), or your own
alphabet:

    >>> import encode  # function I've written, pretty trivial:
    >>> # while value > 0: value, rest = divmod(value, len(alphabet)); ...
    >>> import random
    >>> random.seed(1)
    >>> x = random.randint(0, 0xffffffff)
    >>> x
    577090034
    >>> "%x" % x
    '2265b1f2'
    >>> encode.encode(x, encode.BASE32)
    'h6bcfi'
    >>> encode.encode(x, encode.BASE64)
    'ypr7O'

Maximum id length for a 32-bit value:

    hex, base16 -> 8
    base32      -> 7
    base64      -> 6
    ...

I would say: avoid using MD5 as a random string generator, because it wasn't
created for that purpose.

Does anyone see any flaws with this, apart from the max being 2**32 items
(and it's always bad to have a ceiling)?

Cheers,

--
Yoan

On Sun, Jul 6, 2008 at 12:23 AM, Jonathan Vanasco <[EMAIL PROTECTED]> wrote:
>
> On Jul 5, 4:06 pm, jerry <[EMAIL PROTECTED]> wrote:
>> However, I wonder how an md5 string can be squeezed into a 10, or even
>> 6-character field with no concern of (future) collision -- or am I
>> misunderstanding your db schema?
>
> You're misunderstanding the concept.
>
> 1. md5(random+time) to get a random string with good dispersion.
> 2. Then truncate to 6/8/10/however many chars I need
> 3. check for collision, if found goto 1
> 4. insert into db ( in unique field )
>
> there's no need to reverse mapping or use the md5 as a checksum. its
> just a way to find random numbers.
>
> the strings are just unique ids, they don't have to map backwards and
> forwards. md5 is just a very efficient way to create a reasonably
> unique id, and then you can do a quick check for collision.
>
> in terms of large-scale sites, i believe both bebo and imeem both do
> this with 6char ids
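
The four quoted steps (md5 of random+time, truncate, check for collision, insert) can be sketched roughly as below. This is only an illustration: `new_short_id` and the `existing_ids` set are hypothetical names, with the set standing in for the unique db column, and `os.urandom` is just one possible source for the "random" part.

```python
import hashlib
import os
import time


def new_short_id(existing_ids, length=6):
    """Return a `length`-char hex id that is not yet in `existing_ids`.

    `existing_ids` is a hypothetical stand-in (a plain set) for the
    unique field in the database.
    """
    while True:
        # 1. md5(random + time) to get a well-dispersed hex string
        seed = os.urandom(16) + repr(time.time()).encode("ascii")
        digest = hashlib.md5(seed).hexdigest()
        # 2. truncate to however many chars are needed
        candidate = digest[:length]
        # 3. check for collision; if found, go back to step 1
        if candidate in existing_ids:
            continue
        # 4. "insert into db" (here: remember the id in the set)
        existing_ids.add(candidate)
        return candidate
```

Against a real database, the check in step 3 and the insert in step 4 are two separate operations, so it is probably safer to rely on the unique constraint itself and retry when the insert is rejected.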
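
For comparison, here is a minimal sketch of the seeded-generator idea from the top of this message: seed a generator with id + SALT, draw a 32-bit number, and write it in a larger alphabet. The `encode` module itself is not shown in the message, so the alphabets, the `SALT` value, and the `fake_id` helper below are assumptions made for illustration only.

```python
import random
import string

# Hypothetical alphabets; the contents of the original BASE32/BASE64 are not shown.
BASE16 = "0123456789abcdef"
BASE32 = string.ascii_lowercase + "234567"
BASE64_URL = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"

SALT = 0x5EED  # placeholder value; a real salt would be kept private


def encode(value, alphabet):
    """Write a non-negative integer in base len(alphabet)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return alphabet[0]
    digits = []
    while value > 0:
        value, rest = divmod(value, len(alphabet))
        digits.append(alphabet[rest])
    # divmod yields the least significant digit first, so reverse at the end
    return "".join(reversed(digits))


def fake_id(real_id, alphabet=BASE64_URL):
    """Derive a repeatable 32-bit pseudo-random id from an integer id."""
    # A private Random instance leaves the global generator's state alone.
    rng = random.Random(real_id + SALT)
    return encode(rng.randint(0, 0xFFFFFFFF), alphabet)
```

The same `real_id` always gives the same string within one Python version, but the numbers drawn for a given seed differ between Python 2 and Python 3, so the session output above need not reproduce. Two different ids can also draw the same 32-bit value, so a uniqueness check on the stored string still seems necessary.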
