On 4/17/13 8:56 PM, Ants Aasma wrote:
> Nothing from the two points, but the CRC calculation algorithm can be
> switched out for slice-by-4 or slice-by-8 variant. Speed up was around
> factor of 4 if I remember correctly... I can provide you
> with a patch of the generic version of any of the discussed algorithms
> within an hour, leaving plenty of time in beta or in 9.4 to
> accommodate the optimized versions.

Can you nail down a solid, potentially committable slice-by-4 or slice-by-8 patch then? You dropped into details like per-byte overhead to reach this conclusion, which was fine for letting the methods battle each other. Maybe I missed it, but I don't remember an obvious, full patch for this implementation ever coming back out of that. With the schedule pressure, this needs to return to more database-level tests. Your concerns about the committed feature being much slower than the original Fletcher one are troubling, and we might as well do that showdown again now with the best of the CRC implementations you've found.
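
To make sure we're comparing the same thing when that patch shows up, here's the rough shape of the slice-by-4 inner loop as I understand it. This is only a sketch with a hypothetical crc_tab[4][256] set of precomputed tables (table generation and any PostgreSQL integration omitted), not a stand-in for your patch:

#include <stdint.h>
#include <stddef.h>

/* Hypothetical precomputed tables for a reflected 32-bit polynomial;
 * crc_tab[0] is the usual byte-at-a-time table. */
extern const uint32_t crc_tab[4][256];

uint32_t
crc32_slice4(uint32_t crc, const unsigned char *p, size_t len)
{
    crc = ~crc;
    while (len >= 4)
    {
        /* fold four input bytes into the running CRC in one step */
        crc ^= (uint32_t) p[0] |
               ((uint32_t) p[1] << 8) |
               ((uint32_t) p[2] << 16) |
               ((uint32_t) p[3] << 24);
        crc = crc_tab[3][crc & 0xFF] ^
              crc_tab[2][(crc >> 8) & 0xFF] ^
              crc_tab[1][(crc >> 16) & 0xFF] ^
              crc_tab[0][crc >> 24];
        p += 4;
        len -= 4;
    }
    /* leftover bytes fall back to the classic one-table loop */
    while (len-- > 0)
        crc = crc_tab[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return ~crc;
}

Each iteration consumes four input bytes instead of one, in exchange for 4KB of lookup tables, which is roughly where the factor-of-4 number comes from.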

> Actually the state is that with the [CRC] polynomial used there is
> currently close to zero hope of CPUs optimizing for us.

Ah, I didn't catch that before. It sounds like the alternate slicing implementation should also use a different polynomial then, which seems reasonable. It doesn't even have to be exactly the same CRC function that the WAL uses. A CRC that's modified for better performance or future potential is fine; there's just a lot of resistance to using something other than a CRC right now.
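
For anyone else following along, the reason the polynomial choice matters so much: SSE4.2 hard-wires the Castagnoli polynomial (CRC-32C) into a dedicated instruction, so picking that polynomial is what leaves the door open to something like the following later. This is only a rough sketch using the compiler intrinsic; it assumes a 64-bit x86 build with -msse4.2, only handles lengths that are a multiple of 8, and the crc32c_hw name is mine, not anything committed:

#include <stdint.h>
#include <stddef.h>
#include <nmmintrin.h>          /* SSE4.2 CRC32 intrinsics */

uint32_t
crc32c_hw(uint32_t crc, const void *data, size_t len)
{
    const uint64_t *p = data;
    uint64_t    c = crc;

    while (len >= 8)
    {
        /* one instruction folds eight bytes into the running CRC-32C */
        c = _mm_crc32_u64(c, *p++);
        len -= 8;
    }
    return (uint32_t) c;
}

Runtime detection and a portable fallback would still be needed, which is why switching the polynomial now and adding the optimized version later can be separate steps.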

> I'm not sure about the 9.4 part: if we ship with the builtin CRC as
> committed, there is a 100% chance that we will want to switch out the
> algorithm in 9.4, and there will be quite a large subset of users that
> will find the performance unusable.

Now I have to switch out my reviewer hat for my 3 bit fortune telling one. (It uses a Magic 8 Ball.) This entire approach is squeezing what people would prefer to be a 32 bit CRC into a spare 16 bits, as a useful step toward a long term goal. I have four major branches of possible futures here that I've thought about:

1) Database checksums with 16 bits are good enough, but they have to be much faster to satisfy users. It may take a different checksum implementation altogether to make that possible, and distinguishing between the old and new implementations requires borrowing even more metadata bits from somewhere. (This seems to be the future you're worried about.)

2) Database checksums work out well, but they have to be 32 bits to satisfy users and/or error detection needs. Work on pg_upgrade and expanding the page headers will be needed. Optimization of the CRC now has a full 32 bit target.

3) The demand for database checksums is made obsolete by either mainstream filesystem checksumming, performance issues, or just general market whim. The 16 bit checksum PostgreSQL implements becomes a vestigial feature, and whenever it gets in the way of making changes, someone proposes eliminating it. (I call this one the "rules" future.)

4) 16 bit checksums turn out to be such a problem in the field that everyone regrets the whole thing, and discussions turn immediately toward how to eliminate that risk.

It's fair that you're very concerned about (1), but I wouldn't give it 100% odds of happening either. The users whose demand motivated me to work on this will be happy with any of (1) through (3), and in two of those futures optimizing the 16 bit checksums now turns out to be premature.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


