: Took 421ms to get 5000 bitset intersection counts
: Took 1465ms to get 5000 openbitset intersection counts
:
: ...and I'm wondering what I've done wrong. The results are consistent
: across different jvms and different hardware setups. I'm using the 7/22
: nightly of Solr. See my test code below.

Yonik may have more insight into this, but I believe what you are seeing is
because you only have one bit "set" in each "bit set", and that's not a
situation OpenBitSet has been optimized for -- in Solr, if you had a set of
documents that small, it would most likely be modeled with a HashDocSet
instead of a BitDocSet.

The fact that all of your intersections are of two sets that are "equal"
(bit #5000 is set in every set) may also be triggering some short-circuiting
optimization in BitSet.and (but that's purely hypothetical speculation).

Try running your test with a larger number of more randomly distributed set
bits and see how the results change.

If you are interested, you can find Yonik's original performance testing
utility here...

http://svn.apache.org/viewvc/incubator/solr/trunk/src/test/org/apache/solr/util/BitSetPerf.java?view=log

...searching all Lucene mailing lists (not just Solr lists) for discussions
about OpenBitSets will also point out some other tests people tried, as
well as some hypothetical tests (or different distributions of set bits)
that were discussed but never executed (as far as I know) ...

http://www.nabble.com/forum/Search.jtp?forum=44&local=y&query=OpenBitSet

: FYI, my interest in Solr/Lucene stems from a need to create a faceted
: browse experience with 10s of thousands of facets derived from millions of
: documents.

DocSets will be your friend ... the fact that Solr will choose between
HashDocSets and BitDocSets depending on how many set docs there are in a
particular set is your friend's really cool roommate, and the filterCache
will be your friend's really sweet apartment -- both of which will make your
friend even more fun to hang out with.


-Hoss
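
For reference, here is a minimal sketch of the kind of test suggested above:
random bit sets at several densities, instead of a single shared bit. It only
uses java.util.BitSet so that it runs without Solr on the classpath. The
class name, set size, densities and number of sets are all assumptions made
for illustration and are not taken from the original test code or from
BitSetPerf.java. To compare against OpenBitSet you would build the same
random sets with that class and time its intersection count alongside this
one.

```java
import java.util.BitSet;
import java.util.Random;

// Hypothetical harness: names and sizes are illustrative, not from Solr.
class BitSetDensitySketch {

    // Build a bit set of `size` bits, each set independently with
    // probability `density` (0.0 = empty, 1.0 = all bits set).
    static BitSet randomBitSet(int size, double density, Random rnd) {
        BitSet bits = new BitSet(size);
        for (int i = 0; i < size; i++) {
            if (rnd.nextDouble() < density) {
                bits.set(i);
            }
        }
        return bits;
    }

    // Intersection count with java.util.BitSet: copy, and(), cardinality().
    // The copy keeps the inputs unchanged between runs.
    static int intersectionCount(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone();
        copy.and(b);
        return copy.cardinality();
    }

    public static void main(String[] args) {
        int size = 1_000_000;                          // assumed number of docs
        int numSets = 20;                              // assumed sets per density
        double[] densities = {0.0001, 0.01, 0.1, 0.5}; // assumed densities
        Random rnd = new Random(42);                   // fixed seed, repeatable

        for (double density : densities) {
            BitSet[] sets = new BitSet[numSets];
            for (int i = 0; i < numSets; i++) {
                sets[i] = randomBitSet(size, density, rnd);
            }

            long total = 0;
            int intersections = 0;
            long start = System.nanoTime();
            // Intersect every distinct pair so the sets are never "equal".
            for (int i = 0; i < numSets; i++) {
                for (int j = i + 1; j < numSets; j++) {
                    total += intersectionCount(sets[i], sets[j]);
                    intersections++;
                }
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;

            System.out.println("density=" + density
                    + " intersections=" + intersections
                    + " totalCommonBits=" + total
                    + " took " + elapsedMs + "ms");
        }
    }
}
```

Timings from a single pass like this are rough at best (no JIT warm-up, and
the clone is included in the measured time), so treat the numbers as a way to
see how the trend changes with density rather than as a benchmark.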