So for example if your bloom filter is 4 bits per column your error
rate for a single column where clause will be 1/2^(4/1.44) or a little
worse than returning 1 record in 7. If you test two or three columns
then it'll be about 1 in 49 or 1 in 343.

Hmm, I don't understand your calculations. In your example (4 bits per column, index has only one column) each row is hashed by 4 bits in 80-bit vector (signature), so, probability of collision is (1/80)^4


Also, as it stands now it's an unordered list. I assume you have no
plans to implement vacuum for this? Perhaps it would be better to
store a btree of signatures ordered by htid. That would let you easily
vacuum dead entries. Though it would necessitate random seeks to do
the full scan of the index unless you can solve the full index scan
concurrent with page splits problem.

Vacuum is already implemented by producing a full scan of index like all other indexes.
--
Teodor Sigaev                                   E-mail: teo...@sigaev.ru
                                                   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to