On Mon, Jul 15, 2024, at 23:29, Tim Düsterhus wrote:
> Hi
>
> On 7/15/24 16:12, Rob Landers wrote:
> > This always gets me. "safer" doesn't have a consistent meaning. For
>
> Yes it does. SHA-256 is safer than MD5. And on modern CPUs with sha_ni
> extensions, it's also faster. The following is on an Intel i7-1365U:
>
> > $ openssl speed md5 sha1 sha256 sha512
> > *snip*
> > version: 3.0.10
> > built on: Wed Feb 21 10:45:39 2024 UTC
> > options: bn(64,64)
> > compiler: *snip*
> > CPUINFO: OPENSSL_ia32cap=0x7ffaf3ffffebffff:0x98c027bc239c27eb
> > The 'numbers' are in 1000s of bytes per second processed.
> > type         16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
> > md5        114683.10k   286174.51k   550288.90k   715171.50k   783611.22k   788556.46k
> > sha1       138578.57k   440607.38k  1082163.29k  1674088.45k  2017296.38k  2047377.41k
> > sha256     150670.11k   460483.71k  1054829.57k  1553830.57k  1807897.94k  1823981.57k
> > sha512      41246.76k   181566.07k   341457.66k   645468.50k   781042.81k   804296.02k
> > ----
>
> > example, if you were to want to create a "content addressable
> > address" using a hash and it needs to fit inside a 128 bit number
> > (such as a GUID), you may be tempted to take SHA-X and just truncate
> > it. However, this biases the resulting numbers, which this bias may
>
> This is false. For a hash algorithm to be considered cryptographically
> secure (which I consider to be a reasonable definition of "safe"), it -
> among other properties - needs to have the "avalanche effect" property,
> which means that any change in the input is going to affect each output
> bit with 50% probability.

From a practical perspective, across hundreds of millions of hashes of
unique IDs, I can say that there is a practical and detectable bias when
truncating SHA-256 hashes, enough that we were having to throw out A/B
test results. I’m not going to write a paper on it, and I’m not going to
bother arguing the point that no hash function is perfect, but I will
point out that “theory” and “reality” don’t always agree.

>
> This means that for a cryptographic hash algorithm - such as the SHA-2
> family - the resulting hash is indistinguishable from uniformly selected
> random bits. And this property also holds after truncation - you just
> have fewer bits of course.
>
> See also: https://security.stackexchange.com/a/34797/21705
>
> > be considered unsafe (such as using it in an A/B testing tool). Just
> > because you have a short hash, doesn't make it "unsafe" as longer
> > hashes can also be considered "unsafe." What people usually mean by
> > this is in the context of encryption, and in those cases it is
> > unsafe, but in the context of non-encryption, usage of truncated
> > larger hashes is just as unsafe.
> >
>
> I'm afraid I don't understand what you are attempting to say here.
>
> Best regards
> Tim Düsterhus

— Rob
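
For anyone who wants to check the truncation question empirically, here is a minimal sketch rather than a definitive test. It hashes a range of unique IDs with SHA-256, keeps the first 128 bits, assigns each value to a bucket, and computes a chi-square statistic against a uniform expectation. The `user-{i}` ID format, the bucket count, and the helper names `truncated_bucket_counts` and `chi_square` are assumptions made for illustration; none of them come from the thread.

```python
import hashlib


def truncated_bucket_counts(n_ids: int, n_buckets: int, n_bytes: int = 16) -> list[int]:
    """Count how many truncated SHA-256 hashes land in each bucket.

    IDs are synthetic ("user-0", "user-1", ...), which is an assumption;
    real-world IDs may be generated or encoded differently.
    """
    counts = [0] * n_buckets
    for i in range(n_ids):
        digest = hashlib.sha256(f"user-{i}".encode("utf-8")).digest()
        # Keep only the first n_bytes (16 bytes = 128 bits) of the digest.
        truncated = int.from_bytes(digest[:n_bytes], "big")
        counts[truncated % n_buckets] += 1
    return counts


def chi_square(counts: list[int]) -> float:
    """Pearson chi-square statistic against a uniform distribution."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    counts = truncated_bucket_counts(200_000, 16)
    print("bucket counts:", counts)
    print("chi-square (15 degrees of freedom):", round(chi_square(counts), 2))
```

With 16 buckets there are 15 degrees of freedom, for which the 95% critical value is roughly 25. A statistic far above that on real IDs would support the bias report; a value in the usual range would point to some other part of the pipeline.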