On 5/12/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> The one reason I can think of why having both types of DocSet could be advantageous is if the memory footprint of an OpenBitSet is significantly larger than a regular BitSet
They should be identical in that regard... 64 bits packed into a long.

The speed gains for cardinality() are due to a superior algorithm that counts 8 words at a time. Additional speed gains for cardinality(intersection(a,b)) are due to not having to construct a new BitSet just to count it.

Other speed gains are more basic... careful coding with an eye toward maximum performance, and tradeoffs toward performance... things like:
- counting down and comparing to zero instead of using a for loop
- if one passes a negative index to get(), I use a signed shift to calculate the word for the bit, thus keeping the word index negative and forcing an index-out-of-bounds exception w/o the need for an explicit check.

Keeping both would complicate code that tries to find the most efficient way to take intersections, etc. And an intersection(BitSet,OpenBitSet) would be much slower than either intersection(BitSet,BitSet) or intersection(OpenBitSet,OpenBitSet).

-Yonik
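
For illustration, here is a rough sketch of two of the points above: counting an intersection without building an intermediate set, and the signed-shift trick in get(). It is not the actual OpenBitSet source. The class name BitSketch and both method signatures are hypothetical, and Long.bitCount stands in for the 8-words-at-a-time counting algorithm, which is not reproduced here.

```java
// Hypothetical sketch; names and signatures are assumptions, not OpenBitSet's API.
final class BitSketch {

    // Counts the bits set in both a and b, word by word, without
    // allocating a new bit set to hold the intersection.
    // Long.bitCount is a stand-in for the faster multi-word algorithm.
    static long intersectionCount(long[] a, long[] b) {
        long count = 0;
        int i = Math.min(a.length, b.length);
        // Count down and compare to zero.
        while (--i >= 0) {
            count += Long.bitCount(a[i] & b[i]);
        }
        return count;
    }

    // get() with no explicit range check: the signed shift keeps a
    // negative index negative (-1 >> 6 == -1), so the array access
    // itself throws ArrayIndexOutOfBoundsException.
    static boolean get(long[] bits, int index) {
        int word = index >> 6;      // signed shift, 64 bits per word
        long mask = 1L << index;    // a long shift uses only the low 6 bits
        return (bits[word] & mask) != 0;
    }
}
```

With a = {0b1011L, 1L} and b = {0b0110L, 1L, 0xFFL}, intersectionCount(a, b) only walks the two words the arrays have in common, and get(a, -1) fails on the array access rather than on a separate bounds test.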