On Dec 24, 2010, at 11:23 , Nicolas Barbier wrote:
> 2010/12/24 Florian Pflug <f...@phlo.org>:
>
>> On Dec 23, 2010, at 20:39 , Tomas Vondra wrote:
>>
>>> I guess we could use the highest possible value (equal to the number
>>> of tuples) - according to wiki you need about 10 bits per element
>>> with 1% error, i.e. about 10MB of memory for each million of
>>> elements.
>>
>> Drat. I had expected these numbers to come out quite a bit lower than
>> that, at least for a higher error target. But even with a 10% false
>> positive rate, it's still 4.5MB per 1e6 elements. Still too much to
>> assume the filter will always fit into memory, I fear :-(
>
> I have the impression that both of you are forgetting that there are 8
> bits in a byte. 10 bits per element = 1.25MB per million elements.
Uh, of course. So in the real universe, the numbers are:

~1.2MB per 1e6 elements for a false positive rate of 1%
~0.6MB per 1e6 elements for a false positive rate of 10%

Hm. So for a table with a billion distinct elements, we'd need roughly 0.6 gigabytes per column for the filter. A tuple with two int columns takes at least 24 + 2*4 = 32 bytes to store, I think, making such a table at least 32GB in size. The filter size would thus be about 1/50 of the table size in the worst case.

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
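As a sanity check on the figures in the thread, the textbook Bloom filter sizing formula m/n = -ln(p) / (ln 2)^2 bits per element can be evaluated with a short Python sketch (the function name here is mine, not from the thread):

```python
import math

def bloom_bits_per_element(p):
    """Bits per element for an optimally-sized Bloom filter with
    false-positive rate p: m/n = -ln(p) / (ln 2)^2."""
    return -math.log(p) / (math.log(2) ** 2)

for p in (0.01, 0.10):
    bits = bloom_bits_per_element(p)
    # bytes per element equals MB per million elements
    mb = bits / 8
    print(f"p = {p:.0%}: {bits:.2f} bits/element, ~{mb:.2f}MB per 1e6 elements")
# p = 1%: 9.59 bits/element, ~1.20MB per 1e6 elements
# p = 10%: 4.79 bits/element, ~0.60MB per 1e6 elements
```

This agrees with the ~1.2MB per 1e6 elements figure at 1% error, and gives ~0.6MB per 1e6 elements at 10%.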