On 4/17/13 6:32 PM, Tom Lane wrote:
> The more I read of this thread, the more unhappy I get.  It appears that
> the entire design process is being driven by micro-optimization for CPUs
> being built by Intel in 2013.

And that's not going to get anyone past review, since all the testing I've been doing for the last two weeks is about how fast an AMD Opteron 6234 with OS cache >> shared_buffers can run this. The main thing I'm still worried about is what happens on a fast machine that can move memory around very quickly, running an in-memory workload, but hamstrung by the checksum computation--and that machine isn't a 2013 Intel one.

The question I started with here was answered in some depth and then skipped past. I'd like to pull attention back to it, since I thought some good answers from Ants went by. Is there a simple way to optimize the committed CRC computation (or a similar one with the same error detection properties) based on either:

a) Knowing that the input will be an 8K page, rather than the existing use case of an arbitrarily sized WAL section. (A rough sketch of what that could enable follows this list.)

b) Straightforward code rearrangement or optimization flags.
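
To make (a) concrete, here is a minimal sketch, not the committed code, of a table-driven CRC where the block length is the compile-time constant BLCKSZ (8192 assumed here) instead of a run-time parameter. That alone lets the compiler unroll or pipeline the inner loop in ways it can't for arbitrarily sized WAL input; the reflected CRC-32C polynomial below is only an example choice:

#include <stdint.h>
#include <stddef.h>

#define BLCKSZ 8192                     /* assumed page size */

static uint32_t crc_table[256];

/* Build the byte-at-a-time lookup table for the reflected polynomial. */
static void
crc_init_table(void)
{
    uint32_t    i;
    int         k;

    for (i = 0; i < 256; i++)
    {
        uint32_t    c = i;

        for (k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0x82F63B78 : (c >> 1);
        crc_table[i] = c;
    }
}

/* CRC of exactly one page: the loop bound is a constant, not an argument. */
static uint32_t
crc_page(const unsigned char *page)
{
    uint32_t    crc = 0xFFFFFFFF;
    size_t      i;

    for (i = 0; i < BLCKSZ; i++)
        crc = crc_table[(crc ^ page[i]) & 0xFF] ^ (crc >> 8);

    return crc ^ 0xFFFFFFFF;
}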

That was all I thought was still feasible to consider changing for 9.3 a few weeks ago. And the possible scope has only been shrinking since then.

> And I reiterate that there is theory out there about the error detection
> capabilities of CRCs.  I'm not seeing any theory here, which leaves me
> with very little confidence that we know what we're doing.

Let me see if I can summarize where the messages flying by have ended up, since you'd like to close this topic for now:

-The original checksum feature used Fletcher checksums. Its main problems, to quote Wikipedia, include that it "cannot distinguish between blocks of all 0 bits and blocks of all 1 bits". (A small Fletcher sketch at the end of this summary demonstrates exactly that case.)

-The committed checksum feature uses a truncated CRC-32. That has known good error detection properties, but it is expensive to compute. There's reason to believe that particular computation will become cheaper on future platforms, but taking full advantage of that will require adding CPU-specific code to the database. (A sketch of what such CPU-specific code might look like also follows this summary.)

-The latest idea is using the Fowler–Noll–Vo hash function: https://en.wikipedia.org/wiki/Fowler_Noll_Vo_hash There's 20 years of research around when that is good or bad. The exact properties depend on magic "FNV primes": http://isthe.com/chongo/tech/comp/fnv/#fnv-prime that can vary based on both your target block size and how many bytes you'll process at a time. For PostgreSQL checksums, one of the common problems--getting an even distribution of the hashed values--isn't important the way it is for other types of hashes. Ants and Florian have now dug into exactly how that and specific CPU optimization concerns impact the best approach for 8K database pages. This is very clearly a 9.4 project that is just getting started. (A textbook FNV-1a reference sketch is also included below.)
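
To illustrate the Fletcher point with something runnable: this is textbook Fletcher-16 with the sums taken modulo 255, not the variant from the original patch. Since 0xFF is congruent to 0 modulo 255, a page of all one bits checksums exactly like a page of all zero bits:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static uint16_t
fletcher16(const unsigned char *data, size_t len)
{
    uint16_t    sum1 = 0;
    uint16_t    sum2 = 0;
    size_t      i;

    for (i = 0; i < len; i++)
    {
        sum1 = (sum1 + data[i]) % 255;
        sum2 = (sum2 + sum1) % 255;
    }
    return (uint16_t) ((sum2 << 8) | sum1);
}

int
main(void)
{
    unsigned char zeros[8192];
    unsigned char ones[8192];

    memset(zeros, 0x00, sizeof(zeros));
    memset(ones, 0xFF, sizeof(ones));

    /* Both lines print 0: the two degenerate pages are indistinguishable. */
    printf("all zeros: %u\n", fletcher16(zeros, sizeof(zeros)));
    printf("all ones:  %u\n", fletcher16(ones, sizeof(ones)));
    return 0;
}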
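
As for what "CPU-specific code" for the CRC case might look like, the sketch below is one possibility rather than a proposal: on x86 parts with SSE4.2, the crc32 instruction computes CRC-32C in hardware through the _mm_crc32_u64 intrinsic (compile with -msse4.2). Any build without it would need a portable fallback using the same polynomial so checksums match across platforms:

#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BLCKSZ 8192                     /* assumed page size */

#if defined(__SSE4_2__)
#include <nmmintrin.h>                  /* _mm_crc32_u64 */

/* CRC-32C of one 8K page, fed to the hardware eight bytes at a time. */
static uint32_t
crc32c_page_hw(const unsigned char *page)
{
    uint64_t    crc = 0xFFFFFFFF;
    size_t      i;

    for (i = 0; i < BLCKSZ; i += sizeof(uint64_t))
    {
        uint64_t    chunk;

        memcpy(&chunk, page + i, sizeof(chunk));
        crc = _mm_crc32_u64(crc, chunk);
    }
    return (uint32_t) crc ^ 0xFFFFFFFF;
}
#endif   /* else: fall back to a portable table-driven CRC-32C */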
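
And for reference on the FNV side, textbook 32-bit FNV-1a over a page looks like this. It uses the standard offset basis and prime, and it is deliberately the plain byte-at-a-time form rather than whatever block-size-tuned variant gets worked out for 9.4:

#include <stdint.h>
#include <stddef.h>

#define BLCKSZ 8192                     /* assumed page size */

#define FNV32_OFFSET_BASIS 2166136261u  /* standard 32-bit FNV offset basis */
#define FNV32_PRIME        16777619u    /* standard 32-bit FNV prime */

/* Reference FNV-1a: xor each byte in, then multiply by the FNV prime. */
static uint32_t
fnv1a_page(const unsigned char *page)
{
    uint32_t    hash = FNV32_OFFSET_BASIS;
    size_t      i;

    for (i = 0; i < BLCKSZ; i++)
    {
        hash ^= page[i];
        hash *= FNV32_PRIME;
    }
    return hash;
}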

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

