On Dec 20, 2010, at 7:01 PM, Anders F Björklund wrote:
> Jeff Johnson wrote:
>
> Should make it into a generic library eventually, once this prototyping
> is done... Amazing how many silly bitarrays and digests are out there,
> like using scripted byte arrays and for instance MD5, for Bloom filters.
> It'll be interesting to see how the performance does against SQLite...
>
Yes, there's a lot of foolishness around, particularly in choosing
MD5 or SHA1 as a Bloom filter hash; that's just ... naive (to be polite).
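
To make the point about hash choice concrete, here is a minimal sketch of a Bloom filter built on a cheap non-cryptographic hash. It is not the rpm code: FNV-1a, the double-hashing trick for deriving the k probe positions, and the names `fnv1a_64` and `BloomFilter` are all assumptions picked for illustration.

```python
# Sketch only: FNV-1a and the class layout are illustrative assumptions,
# not what rpm actually uses.
FNV_OFFSET = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes, seed: int = FNV_OFFSET) -> int:
    """64-bit FNV-1a; a cheap, non-cryptographic hash."""
    h = seed
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & MASK64
    return h


class BloomFilter:
    def __init__(self, nbits: int, nhashes: int):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = bytearray((nbits + 7) // 8)  # packed bit array

    def _indexes(self, key: bytes):
        # Double hashing: derive all probe positions from two base hashes.
        h1 = fnv1a_64(key)
        h2 = fnv1a_64(key, seed=h1) | 1  # force odd so the stride is never 0
        for i in range(self.nhashes):
            yield (h1 + i * h2) % self.nbits

    def add(self, key: bytes) -> None:
        for ix in self._indexes(key):
            self.bits[ix >> 3] |= 1 << (ix & 7)

    def __contains__(self, key: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[ix >> 3] & (1 << (ix & 7))
                   for ix in self._indexes(key))


bf = BloomFilter(nbits=1 << 16, nhashes=4)
bf.add(b"/usr/bin/rpm")
print(b"/usr/bin/rpm" in bf)  # True: a Bloom filter has no false negatives
```

A filter only needs its hash to spread keys evenly; the collision resistance of MD5 or SHA1 buys nothing here and costs cycles on every probe.
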
Are you just creating one huge store, or do you also have per-something
identifiers attached to your file lists, with the file lists broken down
into smaller pieces?
I'm looking to see if I can figure out how to encode assertion inequalities like
    Requires: kernel >= 2.6.32
into Bloom filters. There's likely a means to recode inequalities
into Bloom filters, but it will take some thought. Oddly geohashing
from Gustavo might be a means to capture "closest" in a Bloom filter
(but I'm just muddling, no idea how to encode inequalities into
a Bloom filter usefully yet).
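
One possible direction, sketched here purely as an illustration and not as anything worked out in this thread: cover the allowed range with dyadic (power-of-two aligned) blocks, insert each block's prefix into the filter, and test a candidate version by probing its prefix at every level. Everything below is an assumption: the 10-bits-per-field `encode_version` (real RPM version comparison is much richer), the helper names, and the small `TinyBloom` built on `zlib.crc32` and `zlib.adler32`.

```python
import zlib

BITS = 30  # assumed: three 10-bit fields (major, minor, patch)


def encode_version(major, minor, patch):
    # Hypothetical encoding; it ignores epochs, releases and alpha segments.
    assert all(0 <= x < 1024 for x in (major, minor, patch))
    return (major << 20) | (minor << 10) | patch


def dyadic_cover(lo, hi, bits=BITS):
    """Yield (level, prefix) blocks that exactly partition [lo, hi].

    Block (level, prefix) covers prefix << level .. ((prefix + 1) << level) - 1.
    """
    while lo <= hi:
        level = 0
        # Grow the block while it stays aligned and inside the range.
        while (level < bits and lo % (1 << (level + 1)) == 0
               and lo + (1 << (level + 1)) - 1 <= hi):
            level += 1
        yield level, lo >> level
        lo += 1 << level


class TinyBloom:
    def __init__(self, nbits=1 << 16, nhashes=3):
        self.nbits, self.nhashes = nbits, nhashes
        self.bits = bytearray((nbits + 7) // 8)

    def _indexes(self, key):
        h1, h2 = zlib.crc32(key), zlib.adler32(key) | 1
        return [(h1 + i * h2) % self.nbits for i in range(self.nhashes)]

    def add(self, key):
        for ix in self._indexes(key):
            self.bits[ix >> 3] |= 1 << (ix & 7)

    def __contains__(self, key):
        return all(self.bits[ix >> 3] & (1 << (ix & 7))
                   for ix in self._indexes(key))


def _key(level, prefix):
    return f"{level}:{prefix}".encode()


def add_lower_bound(bf, bound):
    """Record 'version >= bound' as at most 2 * BITS prefix keys."""
    for level, prefix in dyadic_cover(bound, (1 << BITS) - 1):
        bf.add(_key(level, prefix))


def maybe_satisfies(bf, version):
    """True if version may satisfy the bound (false positives are possible)."""
    return any(_key(level, version >> level) in bf
               for level in range(BITS + 1))


bf = TinyBloom()
add_lower_bound(bf, encode_version(2, 6, 32))
print(maybe_satisfies(bf, encode_version(2, 6, 35)))  # True: no false negatives
```

A real key would also need to carry the dependency name (e.g. "kernel"), and a version below the bound can still come back True when the filter produces a false positive.
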
73 de Jeff
______________________________________________________________________
RPM Package Manager http://rpm5.org
Developer Communication List [email protected]