: Took 421ms to get 5000 bitset intersection counts
: Took 1465ms to get 5000 openbitset intersection counts
:
: ...and I'm wondering what I've done wrong.  The results are consistent
: across different JVMs and different hardware setups.  I'm using the 7/22
: nightly of Solr.  See my test code below.

Yonik may have more insight into this, but I believe what you are seeing
is because you only have one bit "set" in each "bit set", and that's not a
situation OpenBitSet has been optimized for -- in Solr, if you had a set
of documents that small, it would most likely be modeled with a HashDocSet
instead of a BitDocSet.

The fact that all of your intersections are of two sets that are "equal"
(bit #5000 is set in every set) may also be triggering some short-circuiting
optimization in BitSet.and (but that's pure speculation).


Try running your test with a larger number of more randomly distributed
set bits and see how the results change.  If you are interested, you can
find Yonik's original performance testing utility here...
http://svn.apache.org/viewvc/incubator/solr/trunk/src/test/org/apache/solr/util/BitSetPerf.java?view=log
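
As a rough sketch of the kind of test I mean (this is not Yonik's utility, and
the sizes below are made-up numbers you should tune to your own index), here is
a version that uses only java.util.BitSet so it runs without any Solr jars. To
compare against OpenBitSet you would write the equivalent loop with that class
swapped in. Note that the timing includes the cost of cloning one set per
intersection, since BitSet.and modifies its receiver.

```java
import java.util.BitSet;
import java.util.Random;

public class BitSetIntersectionSketch {

    /** Builds a BitSet with exactly setBits randomly chosen bits set (requires setBits <= size). */
    public static BitSet randomBitSet(int size, int setBits, Random rnd) {
        BitSet bits = new BitSet(size);
        int count = 0;
        while (count < setBits) {
            int i = rnd.nextInt(size);
            if (!bits.get(i)) {   // only count bits that were not already set
                bits.set(i);
                count++;
            }
        }
        return bits;
    }

    /** Counts the bits set in both a and b without modifying either argument. */
    public static int intersectionCount(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone();   // and() mutates, so work on a copy
        copy.and(b);
        return copy.cardinality();
    }

    public static void main(String[] args) {
        int size = 1000000;     // hypothetical number of docs (bits per set)
        int setBits = 100000;   // hypothetical: about 10% of the bits set
        int numSets = 50;       // hypothetical number of distinct sets
        int iterations = 5000;  // same count as the original test
        Random rnd = new Random(42);

        BitSet[] sets = new BitSet[numSets];
        for (int i = 0; i < numSets; i++) {
            sets[i] = randomBitSet(size, setBits, rnd);
        }

        long total = 0;  // accumulate results so the JIT cannot discard the work
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            BitSet a = sets[rnd.nextInt(numSets)];
            BitSet b = sets[rnd.nextInt(numSets)];
            total += intersectionCount(a, b);
        }
        long elapsed = System.currentTimeMillis() - start;

        System.out.println("Took " + elapsed + "ms to get " + iterations
                + " bitset intersection counts (sum=" + total + ")");
    }
}
```

Varying setBits from a handful up to a large fraction of size should show how
much the density of the sets changes the picture.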

...searching all Lucene mailing lists (not just the Solr lists) for
discussions about OpenBitSet will also turn up some other tests
people tried, as well as some hypothetical tests (or different
distributions of set bits) that were discussed but never
executed (as far as I know) ...

http://www.nabble.com/forum/Search.jtp?forum=44&local=y&query=OpenBitSet

: FYI, my interest in Solr/Lucene stems from a need to create a facetted
: browse experience with 10s of thousands of facets derived from millions of
: documents.

DocSets will be your friend ... the fact that Solr will choose between
HashDocSets and BitDocSets depending on how many set docs there are in a
particular set is your friend's really cool roommate, and the filterCache
will be your friend's really sweet apartment -- both of which will make
your friend even more fun to hang out with.
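
To make the small-set vs. big-set idea concrete, here is a minimal sketch of
why it pays off. This is not Solr's DocSet API: the class name, the method
names, and the threshold are all hypothetical, picked only for illustration.
The point is that intersecting a small hashed set with a big bit set costs
work proportional to the small set, not to the total number of documents.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

public class DocSetChoiceSketch {

    // Hypothetical cutoff for illustration only; not Solr's actual value.
    static final int SMALL_SET_THRESHOLD = 3000;

    /** In this sketch, sets at or below the threshold are kept as hashed doc ids. */
    public static boolean useHashRepresentation(int numDocs) {
        return numDocs <= SMALL_SET_THRESHOLD;
    }

    /**
     * Counts docs present in both sets by probing the bit set once per doc id
     * in the small set, so the cost grows with small.size() only.
     */
    public static int intersectionCount(Set<Integer> small, BitSet big) {
        int count = 0;
        for (int doc : small) {
            if (big.get(doc)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A tiny facet (3 docs) intersected with a large result set.
        Set<Integer> smallFacet = new HashSet<Integer>(Arrays.asList(3, 7, 5000));
        BitSet bigResult = new BitSet(1000000);
        bigResult.set(7);
        bigResult.set(5000);
        bigResult.set(999999);

        System.out.println("hashed? " + useHashRepresentation(smallFacet.size()));
        System.out.println("intersection count: " + intersectionCount(smallFacet, bigResult));
    }
}
```

With 10s of thousands of facets, most of which probably match relatively few
documents, that difference is likely to matter a lot.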



-Hoss
