Re: Performance improvements - BitArrays

Andrei Alecu Tue, 16 Dec 2008 16:50:45 -0800

Michael,

Unsafe code is not strictly required; it's just an extra squeeze of performance juice I use for myself. If you need to stay away from unsafe code, that's fine, but using pointers instead of accessing the array in a managed way gives a pretty nice performance boost in tight loops.

You can look at the assembler code that the JIT generates for an array lookup versus accessing the same memory location with a pointer; you'll see that the pointer way is a bit more efficient.

But, like I said, all BitArray needs is a more efficient next-set-bit implementation and access to the underlying memory store it uses (in .NET BitArray's case, an array of ints).
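
To make the "next set bit" idea concrete, here is a rough sketch in Java (the language of the original Lucene/Solr OpenBitSet) that works directly on an int array. The class name IntBits and the method nextSetBit are made up for illustration, and the layout of 32 bits per int with the lowest bit first is an assumption about the backing store, not something taken from the BitArray source. It skips whole zero words instead of testing bit by bit.

```java
// Hypothetical helper, not part of .NET or Lucene.
final class IntBits {
    // Returns the index of the first set bit at or after 'from', or -1 if none.
    // Assumes bit i lives in words[i / 32] at position (i % 32).
    static int nextSetBit(int[] words, int from) {
        if (from < 0) from = 0;
        int w = from >>> 5;                        // index of the word holding 'from'
        if (w >= words.length) return -1;
        int word = words[w] & (-1 << (from & 31)); // drop bits below 'from'
        while (true) {
            if (word != 0) {
                // Lowest set bit in this word gives the answer.
                return (w << 5) + Integer.numberOfTrailingZeros(word);
            }
            if (++w == words.length) return -1;    // ran off the end
            word = words[w];                       // a zero word costs one compare
        }
    }
}
```

With a sparse filter most words are zero, so the loop moves 32 bits per iteration, which is where the gain over a per-bit Get loop would come from.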


Andrei

Michael Garski wrote:

In 2.3, the document id is checked in the filter after it is scored and
before it is passed to the hit collector, which can result in a poorly
performing search executed with a common term and a sparsely populated
filter.  I created my own filter implementation based off of the
DocSet/OpenBitSet classes that are in Solr, where the implementation of
getting the next set bit is very efficient, and does not use unsafe
code.  With my own filter implementation I was also able to work around
the memory leak issue with the cached BitArrays that Digy has noted
earlier.

The filter implementation in Lucene 2.4 is overhauled to allow you to create
your own filter implementation, defaulting to the OpenBitSet.
Additionally, I believe the filter is enumerated along with the
termdocs, leading to faster searches with sparsely populated filters.
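
As a rough illustration of why enumerating the filter alongside the term docs helps, the sketch below walks a sorted array of doc ids and a java.util.BitSet together, letting each side skip ahead to the other's position. This is only a sketch of the general leapfrog idea, not the actual Lucene 2.4 code; FilteredScan, leapfrog and lowerBound are hypothetical names, and the binary search stands in for whatever skipping the term docs really support.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration, not Lucene code.
final class FilteredScan {
    // Returns doc ids present in both the sorted postings and the filter.
    static List<Integer> leapfrog(int[] postings, BitSet filter) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        int doc = filter.nextSetBit(0);          // -1 when the filter is empty
        while (doc >= 0 && i < postings.length) {
            if (postings[i] < doc) {
                // Postings are behind: jump to the first id >= doc.
                i = lowerBound(postings, i + 1, doc);
            } else if (postings[i] > doc) {
                // Filter is behind: jump to the first set bit >= current posting.
                doc = filter.nextSetBit(postings[i]);
            } else {
                hits.add(doc);                   // both sides agree
                i++;
                doc = filter.nextSetBit(doc + 1);
            }
        }
        return hits;
    }

    // First index in a[lo..] whose value is >= target (a is sorted ascending).
    static int lowerBound(int[] a, int lo, int target) {
        int hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < target) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

With a common term and a sparse filter, most postings are skipped without being scored at all, whereas the 2.3 approach described above scores each one before checking its bit.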


Michael
