So for example if your bloom filter is 4 bits per column your error rate for a single column where clause will be 1/2^(4/1.44) or a little worse than returning 1 record in 7. If you test two or three columns then it'll be about 1 in 49 or 1 in 343.
Hmm, I don't understand your calculations. In your example (4 bits per column, index has only one column) each row is hashed by 4 bits in 80-bit vector (signature), so, probability of collision is (1/80)^4
Also, as it stands now it's an unordered list. I assume you have no plans to implement vacuum for this? Perhaps it would be better to store a btree of signatures ordered by htid. That would let you easily vacuum dead entries. Though it would necessitate random seeks to do the full scan of the index unless you can solve the full index scan concurrent with page splits problem.
Vacuum is already implemented by producing a full scan of index like all other indexes.
-- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers