On Apr17, 2013, at 23:44 , Ants Aasma <a...@cybertec.at> wrote:
> Performance results:
> Mul-add checksums: 12.9 bytes/s
> FNV-1a checksums: 13.5 bytes/s
> FNV-1a + srl-1: 7.4 bytes/s
> Detection rates:
> False positive rates:
>                 Add-mul       FNV-1a     FNV-1a + srl-1
> Single bit flip: 1:inf         1:129590   1:64795
> Double bit flip: 1:148         1:511      1:53083
> Triple bit flip: 1:673         1:5060     1:61511
>  Quad bit flip: 1:1872        1:19349    1:68320
> Write 0x00 byte: 1:774538137   1:118776   1:68952
> Write 0xFF byte: 1:165399500   1:137489   1:68958
>  Partial write: 1:59949       1:71939    1:89923
>  Write garbage: 1:64866       1:64980    1:67732
> Write run of 00: 1:57077       1:61140    1:59723
> Write run of FF: 1:63085       1:59609    1:62977
> Test descriptions:
> N bit flip: picks N random non-overlapping bits and flips their value.
> Write X byte: overwrites a single byte with X.
> Partial write: picks a random cut point, overwrites everything from
> there to end with 0x00.
> Write garbage/run of X: picks two random cut points and fills
> everything in between with random values/X bytes.

Cool, thanks for testing that! The results for FNV-1a + srl-1 look
promising, I think. Its failure rate is consistently about 1:2^16,
which is the value you'd expect. That gives me some confidence that
the additional shift as working as expected.

BTW, which prime are you using for FNV-1a and FNV-1a+srl1?

> So adding in the shifted value nearly cuts the performance in half. I
> think that by playing with the instruction order I might coax the CPU
> scheduler to schedule the instructions better, but even in best case
> it will be somewhat slower. The point to keep in mind that even this
> slower speed is still faster than hardware accelerated CRC32, so all
> in all the hit might not be so bad.

Yeah. ~7 bytes/cycle still translates to over 10GB/s on typical CPU,
so that's still plenty fast I'd say...

> The effect on false positive rates
> for double bit errors is particularly impressive. I'm now running a
> testrun that shift right by 13 to see how that works out, intuitively
> it should help dispersing the bits a lot faster.

Maybe, but it also makes *only* bits 14 and 15 actually affects bits
below them, because all other's are shifted out. If you choose the
right prime it may still work, you'd have to pick one which with
enough lower bits set so that every bits affects bit 14 or 15 at some

All in all a small shift seems better to me - if 1 for some reason
isn't a good choice, I'd expect 3 or so to be a suitable
replacement, but nothing much larger…

I should have some time tomorrow to spent on this, and will try 
to validate our FNV-1a modification, and see if I find a way to judge
whether 1 is a good shift.

>>> I wonder if we use 32bit FNV-1a's (the h = (h^v)*p variant) with
>>> different offset-basis values, would it be enough to just XOR fold the
>>> resulting values together. The algorithm looking like this:
>> Hm, this will make the algorithm less resilient to some particular
>> input permutations (e.g. those which swap the 64*i-th and the (64+1)-ith
>> words), but those seem very unlikely to occur randomly. But if we're
>> worried about that, we could use your linear combination method for
>> the aggregation phase.
> I don't think it significantly reduces resilience to permutations
> thanks to using different basis offsets and multiply not distributing
> over xor.

Oh, yeah, I though you were still using 0 as base offset. If you don't,
the objection is moot.

best regards,
Florian Pflug

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to