On Wed, Apr 17, 2013 at 05:47:55PM +0300, Ants Aasma wrote:
> The SSE4.1 implementation of this would be as fast as the last patch,
> the generic version will be faster, and we avoid the linearity issue.
> By using different offsets for each of the partial hashes we don't
> directly suffer from commutativity of the final xor folding. By using
> the xor-then-multiply variant the last values hashed have their bits
> mixed before folding together.
>
> Speaking against this option is the fact that we will need to do CPU
> detection at startup to make it fast on the x86 CPUs that support
> SSE4.1, and the fact that AMD CPUs before 2011 will run it an order of
> magnitude slower (but still faster than the best CRC).
>
> Any opinions on whether it would be a reasonable tradeoff to have a
> better checksum with great performance on the latest x86 CPUs and good
> performance on other architectures, at the expense of having only OK
> performance on older AMD CPUs?
>
> Also, any good suggestions on where we should do CPU detection if we
> go this route?
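For readers following along, here is a minimal sketch of the scheme the quoted text describes: several partial hashes run in parallel lanes, each starting from a different offset, each updated xor-then-multiply, and finally folded together with xor. This is not the patch under discussion. The lane count, the offset constants, and the names sketch_checksum, lane_offsets and same_offsets are assumptions made up for illustration; only the multiplier is the standard 32-bit FNV prime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_LANES   4            /* illustrative lane count, one partial hash each */
#define FNV_PRIME 16777619u    /* standard 32-bit FNV prime */

/* Hypothetical per-lane starting offsets: arbitrary distinct constants. */
static const uint32_t lane_offsets[N_LANES] = {
    0x811C9DC5u, 0x3C6EF372u, 0xA54FF53Au, 0x510E527Fu
};

/* The same offset in every lane, only to show why distinct offsets matter. */
static const uint32_t same_offsets[N_LANES] = {
    0x811C9DC5u, 0x811C9DC5u, 0x811C9DC5u, 0x811C9DC5u
};

/*
 * Hash nwords 32-bit words (nwords must be a multiple of N_LANES).
 * Word i goes to lane i % N_LANES; each lane is updated xor-then-multiply,
 * so the last word fed to a lane is still mixed by one multiplication
 * before the lanes are folded together.
 */
uint32_t
sketch_checksum(const uint32_t *words, size_t nwords, const uint32_t *offsets)
{
    uint32_t lanes[N_LANES];
    uint32_t result = 0;
    size_t   i, j;

    assert(nwords % N_LANES == 0);

    for (j = 0; j < N_LANES; j++)
        lanes[j] = offsets[j];

    for (i = 0; i < nwords; i += N_LANES)
        for (j = 0; j < N_LANES; j++)
            lanes[j] = (lanes[j] ^ words[i + j]) * FNV_PRIME;

    /* Final xor fold of the partial hashes. */
    for (j = 0; j < N_LANES; j++)
        result ^= lanes[j];

    return result;
}
```

With same_offsets, swapping two words that sit in the same row of lanes (for example {1, 2, 3, 4} versus {2, 1, 3, 4}) yields the same result, because the xor fold is commutative. With the distinct lane_offsets, those two inputs hash differently. The inner loop over lanes is the part a SIMD implementation would presumably run as a single vector operation.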

As much as I love the idea of improving the algorithm, it is disturbing
we are discussing this so close to beta, with an algorithm that is under
analysis, with no (runtime) CPU detection, and in something that is
going to be embedded into our data page format. I can't even think of
another case where we do run-time CPU detection.

I am wondering if we need to tell users that pg_upgrade will not be
possible if you enable page-level checksums, so we are not trapped with
something we want to improve in 9.4.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +