On 5/12/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
The one reason I can think of why having both types of DocSet could be
advantageous is if the memory footprint of an OpenBitSet is significantly
larger than a regular BitSet

They should be identical in that regard... 64 bits packed into a long.
The speed gains for cardinality() are due to a superior algorithm that
counts 8 words at a time.
Additional speed gains for cardinality(intersection(a,b)) are due to
not having to construct a new BitSet just to count it.
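
As a rough illustration of the second point (a sketch only, not the actual
OpenBitSet source): the class and method names below are hypothetical, the
bit sets are assumed to be plain long[] arrays, and Long.bitCount() on one
word at a time stands in for the faster 8-words-at-a-time counting
mentioned above. The intersection is counted word by word, so no
intermediate set is ever allocated.

```java
// Hypothetical sketch; names and layout are assumptions, not OpenBitSet code.
final class IntersectionCountSketch {

    // Counts the bits set in both word arrays without building a result set.
    static long intersectionCount(long[] a, long[] b) {
        // Words beyond the shorter array cannot contribute to an intersection.
        int n = Math.min(a.length, b.length);
        long count = 0;
        for (int i = 0; i < n; i++) {
            // AND the two words and count the surviving bits in place.
            count += Long.bitCount(a[i] & b[i]);
        }
        return count;
    }

    public static void main(String[] args) {
        long[] a = {0b1011L, 0L, 1L << 63};
        long[] b = {0b0110L, -1L, 1L << 63};
        System.out.println(intersectionCount(a, b));
    }
}
```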

Other speed gains are more basic... careful coding with an eye toward
maximum performance, and tradeoffs toward performance... things like:
 - counting down and comparing to zero instead of using a for loop
 - if one passes a negative index to get(), I use a signed shift to
calculate the word for the bit, thus keeping the word index negative
and forcing an index-out-of-bounds exception w/o the need for an
explicit check.
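
The two tricks above can be sketched roughly as follows. This is a
minimal illustration under assumptions, not the real OpenBitSet code: the
class name, the int-based index, and the use of Long.bitCount() are all
hypothetical choices made for the example.

```java
// Hypothetical sketch; not the actual OpenBitSet implementation.
final class SignedShiftSketch {
    private final long[] bits;

    SignedShiftSketch(int numWords) {
        bits = new long[numWords];
    }

    void set(int index) {
        // For long shifts, Java uses only the low 6 bits of the shift distance.
        bits[index >> 6] |= 1L << index;
    }

    boolean get(int index) {
        // Signed (arithmetic) shift: a negative index yields a negative word
        // index, so the array access itself throws and no explicit check is needed.
        int word = index >> 6;
        return (bits[word] & (1L << index)) != 0;
    }

    long cardinality() {
        long count = 0;
        // Count down and compare against zero rather than against bits.length.
        for (int i = bits.length; --i >= 0; ) {
            count += Long.bitCount(bits[i]);
        }
        return count;
    }
}
```

With this layout, get(-1) computes a word index of -1 and fails with an
ArrayIndexOutOfBoundsException from the array access alone.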

Keeping both would complicate code that tries to find the most
efficient way to take intersections, etc. And an
intersection(BitSet,OpenBitSet) would be much slower than either
intersection(BitSet,BitSet) or intersection(OpenBitSet,OpenBitSet).

-Yonik
