On Apr 18, 2013, at 00:32, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Ants Aasma <a...@cybertec.at> writes:
>> I was thinking about something similar too. The big issue here is that
>> the parallel checksums already hide each other's latencies, effectively
>> executing one each of movdqu/pmullw/paddw per cycle; that's why
>> N_SUMS adds up to 128 bytes, not 16 bytes.
>
> The more I read of this thread, the more unhappy I get. It appears that
> the entire design process is being driven by micro-optimization for CPUs
> being built by Intel in 2013. That ought to be, at best, a fifth-order
> consideration, with full recognition that it'll be obsolete in two years,
> and is already irrelevant to anyone not running one of those CPUs.
Micro-optimization for particular CPUs, yes; general performance
considerations, no. For example, 2^n is probably one of the worst moduli
you can pick for a hash function - any prime would work much better. But
doing the computations modulo 2^16 or 2^32 carries zero performance
overhead, whereas picking another modulus requires some renormalization
after every operation (see the first sketch at the end of this mail).
That, however, is *not* a given - it stems from the fact that nearly all
CPUs in existence operate on binary integers. This fact must thus enter
into the design phase very early, and it makes 2^16 or 2^32 a sensible
choice of modulus *despite* its shortcomings, simply because it allows
for fast implementations.

> I would like to ban all discussion of assembly-language optimizations
> until after 9.3 is out, so that we can concentrate on what actually
> matters. Which IMO is mostly the error detection rate and the probable
> nature of false successes. I'm glad to see that you're paying at least
> some attention to that, but the priorities in this discussion are
> completely backwards.

I'd say lots of attention is paid to that, but there's *also* attention
paid to speed. Which is good, because ideally we want to end up with a
checksum that has both good error-detection properties *and* good
performance. If performance were of no concern to us, there'd be little
reason not to use CRC…

> And I reiterate that there is theory out there about the error detection
> capabilities of CRCs. I'm not seeing any theory here, which leaves me
> with very little confidence that we know what we're doing.

If you've got any pointers to literature on the error-detection
capabilities of CPU-friendly checksum functions, please share. I am aware
of the vast literature on CRCs, and of some other algebraic approaches as
well, but none of those even come close to the speed of FNV+shift (unless
there's a special CRC instruction, that is; the second sketch below shows
the construction I mean). And there's also a ton of literature on
cryptographic hashing, but those functions are optimized for a completely
different use case…

best regards,
Florian Pflug
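
PS: Since the renormalization point above is easy to miss, here is a
minimal sketch (my own illustration, with an arbitrary multiplier - not
code from the patch) of why reducing mod 2^32 is free while a prime
modulus costs extra work on every step:

    #include <stdint.h>
    #include <stddef.h>

    /* Power-of-two modulus: the "mod 2^32" is implicit in uint32_t
     * overflow, so each step is just a multiply and an add. */
    static uint32_t
    hash_mod_pow2(const uint8_t *buf, size_t len)
    {
        uint32_t h = 0;

        for (size_t i = 0; i < len; i++)
            h = h * 31u + buf[i];       /* wraps mod 2^32 for free */
        return h;
    }

    /* Prime modulus (here 2^32 - 5): each step needs a widening
     * multiply plus an explicit reduction to stay within the field. */
    static uint32_t
    hash_mod_prime(const uint8_t *buf, size_t len)
    {
        const uint64_t p = 4294967291u; /* 2^32 - 5, prime */
        uint64_t h = 0;

        for (size_t i = 0; i < len; i++)
            h = (h * 31u + buf[i]) % p; /* renormalize every operation */
        return (uint32_t) h;
    }

With the power-of-two modulus the compiler emits a multiply and an add
per byte; with the prime it additionally needs a 64-bit division (or the
multiply-by-reciprocal sequence compilers substitute for one) per byte.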
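And for readers who haven't followed the whole thread, here's roughly
what the FNV+shift construction under discussion looks like - a sketch
only; the shift amount and the final folding step are illustrative
guesses on my part, and I've used 32-bit lanes with N_SUMS chosen to
match the 128-bytes-per-iteration figure Ants mentioned (the same
structure works with 16-bit lanes mod 2^16, which is presumably what the
pmullw-based SIMD version exploits):

    #include <stdint.h>
    #include <stddef.h>

    #define N_SUMS    32        /* 32 lanes x 4 bytes = 128 bytes/iter */
    #define FNV_PRIME 16777619u /* the usual 32-bit FNV prime */

    /*
     * Each lane is an independent FNV-style multiply mod 2^32, followed
     * by a shift-xor that feeds the high-order bits (which a plain
     * multiply never propagates downward) back into the low-order bits.
     * nwords must be a multiple of N_SUMS in this sketch.
     */
    static uint32_t
    fnv_shift_checksum(const uint32_t *data, size_t nwords)
    {
        uint32_t sums[N_SUMS] = {0};
        uint32_t result = 0;
        size_t   i, j;

        for (i = 0; i < nwords; i += N_SUMS)
        {
            /* N_SUMS independent data streams: no lane depends on
             * another, so a vectorizing compiler can overlap the
             * multiply latencies - the movdqu/pmull/padd pattern
             * quoted above. */
            for (j = 0; j < N_SUMS; j++)
            {
                uint32_t tmp = sums[j] ^ data[i + j];

                sums[j] = (tmp * FNV_PRIME) ^ (tmp >> 17);
            }
        }

        /* Fold the lanes into a single result (illustrative). */
        for (j = 0; j < N_SUMS; j++)
            result ^= sums[j];
        return result;
    }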