On Sun, Aug 31, 2025 at 08:43:20PM +0700, John Naylor wrote: > 1) It's trivial to reduce a 64-bit universal hash to a 32 (or > whatever)-bit universal hash by composing it with another universal > hash function -- a convenient example is Dietzfelbinger's > multiplicative hash (see section 2.3 in [1]), which is just a multiply > and a shift. > > 2) An XOR-based longitudinal redundancy check (LRC) [2] also detects > errors of any odd number of bitflips, and is cheap enough to compute > along with the hashing loop such that it's likely free on superscalar > CPUs. > > For WAL we could use e.g. a 31-bit hash plus an LRC reduced to a > single parity bit, or keep 32 bits of hash and steal a WAL header > padding byte to store a 1-byte LRC (which would additionally detect > most 2-bit errors and some burst errors (see section 3.1 in [3]) if we > wanted to keep some weaker CRC-type guarantees). > > 128-bit arithmetic has easier portability than CRC instructions (if we > avoid designs that require carryless multiplication),
Hmm. Okay. If this is more efficient and more reliable, why not. There's no free cake in this area so my understanding is that it is usually a trade-off between efficiency and reliability, and that we cannot have both of them. Of course, this counts for the case of more efficiency with the same reliability across two different methods, and vice-versa the efficiency/reliability part. > and it'd be nice > to get consistent performance across platforms, possibly with better > error detection for sector-sized errors. Assuming a suitable function > (and licence) can be found. With the amount of platforms that Postgres needs to support, this feels like potentially like a lot of work in terms of benchmarking. I am wondering about some actual numbers, in terms of micro-benchmarking the computations, of course. If this provides for example a very high performance improvement on Linux and the most popular BSDs (say as a matter of 2x or 3x), that may be worth considering to me anyway, even if for example WIN32 shows no performance difference for reasons I am not aware of, which would likely come down to the instructions we have through the MSVC compilers. Saying that, the argument of reducing 64-bit hash to 32-bit sounds kind of appealing, still I would worry about the cost of the extra computation. -- Michael
signature.asc
Description: PGP signature