> > I still cannot fathom why anyone would assign random numbers or (even
> > more useless) long random blobs to use as pseudo-keys.  It just boggles
> > the mind.
 
> I take it you’re not a cryptographer :) All modern ciphers do this. For
> example, an RSA key pair is simply a pair of large random numbers (both
> prime) that meet certain criteria. Or if you use a more modern cipher like
> Curve25519, the private key is quite literally just any 256 bits of random
> data. You generate a key-pair by reading 32 bytes from /dev/random into
> the private key, and then performing a transformation on that to get the
> public key.

I know and understand the uses of random numbers, encryption, and digests when
used for the purpose for which they were invented.  What I do not understand is
why one would use a UUID (a randomly generated bunch of bytes) as a key in a
database.  It is long, every use must be checked for collisions, and it is
inherently far less efficient than the simple integer sequence it is replacing.
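
To put a rough number on "far less efficient", here is a minimal sketch using
Python's sqlite3 module.  It assumes two throwaway in-memory databases, one
keyed by INTEGER PRIMARY KEY and one keyed by a 16-byte random UUID blob, and
compares their sizes via PRAGMA page_count.  The table layout, the row count
and the helper names are made up for illustration; it measures storage only,
not lookup or insert speed.

```python
import sqlite3
import uuid


def db_size_bytes(conn):
    """Return the database size as page_count * page_size."""
    pages = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return pages * page_size


def build_integer_keyed(n):
    """Hypothetical table keyed by the integer sequence SQLite assigns itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT INTO t (payload) VALUES (?)", (("x",) for _ in range(n)))
    conn.commit()
    return conn


def build_uuid_keyed(n):
    """Hypothetical table keyed by a 16-byte random UUID stored as a BLOB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id BLOB PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO t (id, payload) VALUES (?, ?)",
        ((uuid.uuid4().bytes, "x") for _ in range(n)),
    )
    conn.commit()
    return conn


ROWS = 10_000  # arbitrary row count chosen for the comparison
int_bytes = db_size_bytes(build_integer_keyed(ROWS))
uuid_bytes = db_size_bytes(build_uuid_keyed(ROWS))
print(f"integer key: {int_bytes} bytes, UUID key: {uuid_bytes} bytes")
```

The UUID table pays twice: once for the 16-byte key stored in every row, and
again for the automatic index SQLite builds to enforce the PRIMARY KEY, which
is also where each insert gets checked for a collision.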

Of course, it is just a fad (like big huge wastes of whitespace and unreadable 
low-contrast ittybitty fonts in current web-page bootifications) adopted by 
those unable to comprehend the consequences of their decisions (and if they 
haven't had any yet, they are very lucky indeed).

> Obviously collisions are possible with long random numbers or digests, but
> secure systems are designed such that random collisions are vanishingly
> unlikely to occur for {insert large power of ten here} years, which makes
> the probability effectively zero.

No, you are incorrect.  A "good hash function" will evenly spread its
collisions over its digest space.  If you feed all possible 512-bit blocks into
a 512-bit hash to obtain the output digests, then when you feed in one more
(513-bit) input, you must get a collision.  If you feed in another 513-bit
input you will get a different collision.  The "collision" digest will not be
predictable (that is, it will not "just always be the same as the first 512-bit
block's input digest with bit 438 flipped").  The useful property is that it is
infeasible (very complex and taking a long time) to generate an input (chosen
text) which results in a specific digest -- the fact that it can and must have
a 100% probability of collision when the input space is larger than the output
space is irrelevant.
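
The counting argument is easy to see at toy scale.  What follows is a minimal
sketch in Python, assuming a deliberately tiny 8-bit "digest" made by keeping
only the first byte of SHA-256 (tiny_digest and first_collision are made-up
names for illustration, not anything from a real library).  With only 256
possible outputs, 257 distinct inputs cannot all get different digests.

```python
import hashlib


def tiny_digest(data: bytes) -> int:
    """First byte of SHA-256: a toy 8-bit digest with only 256 outputs."""
    return hashlib.sha256(data).digest()[0]


def first_collision(num_inputs: int):
    """Hash the inputs 0..num_inputs-1; return the first colliding pair, or None."""
    seen = {}  # digest -> input that first produced it
    for i in range(num_inputs):
        d = tiny_digest(i.to_bytes(4, "big"))
        if d in seen:
            return seen[d], i, d
        seen[d] = i
    return None


# 257 distinct inputs into 256 possible digests: a collision cannot be avoided.
result = first_collision(257)
print(result)
```

Which two inputs collide, and on which digest value, is not something you can
read off the inputs beforehand; only the existence of the collision is certain.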

The problem is an inability to properly determine and assess risk.  When using
a sequence the probability of a collision is 0.  Using a randomly generated
number (passing a bunch of random data through a digest function) has a
probability of collision of 100%.  Only if you have (for example) a
sequence-assigned "systemid" which is used as part of the input to the digest
function, and use the generated recordid sequence number as input to the digest
along with the random data, does the probability of collision reduce from 100%
to some small number greater than 0%.  Using the systemid sequence and the
recordid sequence directly, however, has a 0% probability of collision, so any
rational person would use that directly and forgo entirely the uncertainty and
bugs that using "UUID" type crappola will cause.
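
Using the two sequences directly might look something like the sketch below,
again with Python's sqlite3 module.  The table name, the columns and both
helper functions are hypothetical, and it assumes each systemid has a single
writer, so that "highest recordid so far, plus one" is a safe way to get the
next number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE records (
        systemid INTEGER NOT NULL,   -- assigned once per system from a sequence
        recordid INTEGER NOT NULL,   -- per-system sequence number
        payload  TEXT,
        PRIMARY KEY (systemid, recordid)
    ) WITHOUT ROWID
    """
)


def next_recordid(conn, systemid):
    """Next sequence number for one system (assumes one writer per systemid)."""
    row = conn.execute(
        "SELECT COALESCE(MAX(recordid), 0) + 1 FROM records WHERE systemid = ?",
        (systemid,),
    ).fetchone()
    return row[0]


def insert_record(conn, systemid, payload):
    """Insert under the next (systemid, recordid) pair and return that pair."""
    recordid = next_recordid(conn, systemid)
    conn.execute(
        "INSERT INTO records (systemid, recordid, payload) VALUES (?, ?, ?)",
        (systemid, recordid, payload),
    )
    return systemid, recordid


# Two systems each number their own records from 1; the pairs never clash.
keys = [insert_record(conn, s, f"row from system {s}") for s in (1, 2, 1, 2, 1)]
print(keys)
```

The key is two small integers, it is unique by construction rather than by
luck, and rows from different systems can be merged into one table as-is.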

Unfortunately there is a massive shortage of rational life on this planet.

 
> —Jens
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


