Ok, I've restructured the tests and am now seeing performance differences
very close to the claims in the javadocs.  Thanks much, Yonik and Hoss.

Took 953ms to get 5000 bitset intersection counts
Took 516ms to get 5000 openbitset intersection counts

New code...

   public void testMultipleOpenBitSetAndsSpeed() {


       int i = 0;
       try {
           //create 5000 openbitsets, put them all in a single map
           Map<String,OpenBitSet> bsMap = new HashMap<String,OpenBitSet>();
           OpenBitSet bs;
           for ( ; i < 5000; i++ ) {
               bs = new OpenBitSet( 500000 );
               for ( int x = 0; x < 500000; x = x + 2 ) {
                   bs.fastSet( x );
               }
               bsMap.put( String.valueOf(i), bs );
           }

           //our comparsion openbitset
           OpenBitSet bs2 = new OpenBitSet( 500000 );
           for ( int x = 1; x < 500000; x = x + 5 ) {
               bs2.fastSet( x );
           }

//            ensure jvm warmup
           Set<String> set = bsMap.keySet();
           for ( String key : set ) {
               OpenBitSet.intersectionCount( bs2, bsMap.get( key ) );
           }

           // meat of the test
           long start = System.currentTimeMillis();
           for ( String key : set ) {
               OpenBitSet.intersectionCount( bs2, bsMap.get( key ) );
           }
           long finish = System.currentTimeMillis();

           System.out.println( "Took " + (finish-start) + "ms to get 5000
openbitset intersection counts" );

       } catch ( Throwable t ) {
           t.printStackTrace( System.out );
           System.out.println( i );
       }
   }

   public void testMultipleBitSetAndsSpeed() {


       int i = 0;
       try {
           //create 5000 bitsets, put them all in a single map
           Map<String,BitSet> bsMap = new HashMap<String,BitSet>();
           BitSet bs;
           for ( ; i < 5000; i++ ) {
               bs = new BitSet( 500000 );
               for ( int x = 0; x < 500000; x = x + 2 ) {
                   bs.set( x );
               }
               bsMap.put( String.valueOf(i), bs );
           }

           BitSet bs2 = new BitSet( 500000 );
           for ( int x = 1; x < 500000; x = x + 5 ) {
               bs2.set( x );
           }

           //ensure jvm warmup
           Set<String> set = bsMap.keySet();
           for ( String key : set ) {
               BitSet copy = (BitSet)bs2.clone();
               copy.and( bsMap.get( key ) );
           }

           BitSet copy;

//            meat of the test
           int count = 0;
           long start = System.currentTimeMillis();
           for ( String key : set ) {
               copy = (BitSet)bs2.clone();
               copy.and( bsMap.get( key ) );
               copy.cardinality();
           }
           long finish = System.currentTimeMillis();

           System.out.println( "Took " + (finish-start) + "ms to get 5000
bitset intersection counts" );

       } catch ( Throwable t ) {
           t.printStackTrace( System.out );
           System.out.println( i );
       }

   }

On 7/22/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:

You are essentially testing sets of size 5000 against sets of size
500,000.
BitSet keeps track of the largest bit you set (which is 5000) and
doesn't actually calculate the intersection or the populationCount
beyond that.  OpenBitSet does not (it tries to do the minimum
necessary and make everything as fast as possible.)

So if you changed the single bit you set from 5000 to 499,999, you
should see very different results.

In any case, if one is only going to set a single bit, there are much
faster ways of calculating intersection sizes ;-)

-Yonik


On 7/22/06, Cass Costello <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> I'm newish to both Lucene and Solr, but I'm loving learning both.  I was
> intrigued by the comments in the OpenBitSet javadocs...
>
> <snip>
> OpenBitSet is faster than java.util.BitSet in most operations and *much*
> faster at calculating cardinality of sets and results of set operations.
> </snip>
>
> ...so I decided to see for myself and concocted a couple tests, which
> consistently result in...
>
> Took 421ms to get 5000 bitset intersection counts
> Took 1465ms to get 5000 openbitset intersection counts
>
> ...and I'm wondering what I've done wrong.  The results are consistent
> across differenct jvms and different hardware setups.  I'm using the
7/22
> nightly of Solr.  See my test code below.
>
> FYI, my interest in Solr/Lucene stems from a need to create a facetted
> browse experience with 10s of thousands of facets derived from millions
of
> documents.
>
> Thanks for your time,
> Cass Costello
>
>
> package playground.cass;
>
> import java.util.BitSet;
> import java.util.HashMap;
> import java.util.Map;
> import java.util.Set;
> import junit.framework.TestCase;
> import org.apache.solr.util.OpenBitSet;
>
> public class TestsBitSetOperations extends TestCase {
>     public TestsBitSetOperations(String arg0) {
>         super(arg0);
>     }
>
>     protected void setUp() throws Exception {
>         super.setUp();
>     }
>
>     protected void tearDown() throws Exception {
>         super.tearDown();
>     }
>
>     public void testMultipleOpenBitSetAndsSpeed() {
>
>
>         int i = 0;
>         try {
>             //create 5000 openbitsets, put them all in a single map
>             Map<String,OpenBitSet> bsMap = new
HashMap<String,OpenBitSet>();
>             OpenBitSet bs;
>             for ( ; i < 5000; i++ ) {
>                 bs = new OpenBitSet( 500000 );
>                 bs.fastSet( 5000 );
>                 bsMap.put( String.valueOf(i), bs );
>             }
>
>             //our comparsion openbitset
>             OpenBitSet bs2 = new OpenBitSet( 500000 );
>             bs2.fastSet( 5000 );
>
> //            ensure jvm warmup
>             Set<String> set = bsMap.keySet();
>             for ( String key : set ) {
>                 OpenBitSet.intersectionCount( bs2, bsMap.get( key ) );
>             }
>
>             // meat of the test
>             long start = System.currentTimeMillis();
>             for ( String key : set ) {
>                 long count = OpenBitSet.intersectionCount( bs2,
bsMap.get(
> key ) );
>             }
>             long finish = System.currentTimeMillis();
>
>             System.out.println( "Took " + (finish-start) + "ms to get
5000
> openbitset intersection counts" );
>
>         } catch ( Throwable t ) {
>             t.printStackTrace( System.out );
>             System.out.println( i );
>         }
>     }
>
>     public void testMultipleBitSetAndsSpeed() {
>
>
>         int i = 0;
>         try {
>             //create 5000 bitsets, put them all in a single map
>             Map<String,BitSet> bsMap = new HashMap<String,BitSet>();
>             BitSet bs;
>             for ( ; i < 5000; i++ ) {
>                 bs = new BitSet( 500000 );
>                 bs.set( 5000 );
>                 bsMap.put( String.valueOf(i), bs );
>             }
>
>             BitSet bs2 = new BitSet( 500000 );
>             bs2.set( 5000 );
>
>             //ensure jvm warmup
>             Set<String> set = bsMap.keySet();
>             for ( String key : set ) {
>                 BitSet copy = (BitSet)bs2.clone();
>                 copy.and( bsMap.get( key ) );
>             }
>
>             BitSet copy;
>
> //            meat of the test
>             long start = System.currentTimeMillis();
>             for ( String key : set ) {
>                 copy = (BitSet)bs2.clone();
>                 copy.and( bsMap.get( key ) );
>                 long count = copy.cardinality();
>             }
>             long finish = System.currentTimeMillis();
>
>             System.out.println( "Took " + (finish-start) + "ms to get
5000
> bitset intersection counts" );
>
>         } catch ( Throwable t ) {
>             t.printStackTrace( System.out );
>             System.out.println( i );
>         }
>
>     }
>
>
> }

Reply via email to