As suspected in my first email on this topic, the test was wrong: garbage collection was kicking in at the wrong times, so the results were confounded. Ah, the good old predictable C times are long gone.
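For anyone repeating the test, here is a sketch of the kind of warm-up/GC dance that keeps collections out of the timed section (the helper name is made up and this is not the actual PerfTest code, just the general idea):

    static long timeRun(Runnable bench) throws InterruptedException {
        // warm up: let the JIT compile the hot loop before anything is measured
        for (int i = 0; i < 3; i++) {
            bench.run();
        }
        // request a collection and let it settle, so a GC pause does not land
        // in the middle of the measured run
        System.gc();
        Thread.sleep(100);

        long start = System.nanoTime();
        bench.run();
        return System.nanoTime() - start;
    }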
Anyhow, apologies for this noise (particularly to Yonik). The real speed relationships are attached now and everything looks as it should: BitSetIterator is slower than OpenBitSet in the skipTo() scenario...

----- Original Message ----
From: eks dev <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Sent: Thursday, 7 September, 2006 10:03:45 AM
Subject: Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

The same effect showed up on an Athlon CPU... The only difference between your PerfTest and what I have been doing is that your loop is as tight as it gets. *Maybe* worth investigating a bit further, as my tests look a bit closer to real usage scenarios. I will try to identify the cases where OpenBitSet nextSetBit starts being slower... Looks a lot like what you said, ntz() vs Long.numberOfTrailingZeros(). The second one is (I guess) a native call, so the optimizations in ntz() go in and out of some cache (too deep for me anyhow, just shooting in the dark).

Good news: in spite of your doubts, BitSetIterator rocks in the skipTo scenario! VInt does as well, even in medium-density cases (the memory break-even point is now the only condition for its usage).

On another note, the key to really efficient matchers will be a good SmartMatcherFactory that picks the best representation for a given density/"sortedness". The cases I've been able to identify so far:

- Very low density: IntList
- Low density: VIntSortedList
- Dense: OpenBitSet/BitSetIterator or such
- Sorted: imagine a case where you have the opportunity to sort your index on a category field (quite often possible, I guess, since it does not require absolute "sortedness"; it is enough to sort periodically without caring about smaller updates). There, one simple interval list can do the magic in just a few bytes of memory, even in high-density cases.

Finding a good strategy for picking the optimal Matcher representation is not all that hard, but deciding when to do it will be a bit trickier. The warm-up case is easy, but the dynamic LRU case gets complicated: you first need to collect bits efficiently in some hits collector and then transform them into the optimal representation (SmartBitSet) quickly, otherwise the benefit of LRU caching is lost on the time spent updating the cache... More ideas on this?

----- Original Message ----
From: Yonik Seeley <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Sent: Wednesday, 6 September, 2006 11:54:14 PM
Subject: Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

On 9/6/06, eks dev <[EMAIL PROTECTED]> wrote:
> still, DenseBitsMatcher (BitSetIterator wrapped in a Matcher) works faster than
> anything else for this case:
>
> int skip(Matcher m) throws IOException {
>     int doc = -1, ret = 0;
>     while (m.skipTo(doc + 1)) {
>         doc = m.doc();
>         ret += doc;
>     }
>     return ret;
> }

Huh... that doesn't make much sense. Maybe it's your Pentium-M with its shorter pipeline and smaller penalty for a pipeline flush on a branch-prediction miss. You could try substituting BitUtil.ntz2 for ntz in OpenBitSet.nextSetBit... it uses a full binary search down to the byte level and then an array lookup. The nice thing about being *open* is that you can write more than one type of iterator if you really want to :-)

-Yonik
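P.S. The binary-search-down-to-a-byte-plus-table-lookup approach Yonik describes would look roughly like the sketch below. This is my own reconstruction from his one-line description, not the actual BitUtil.ntz2 code, so take the details with a grain of salt:

    private static final byte[] NTZ_TABLE = new byte[256];
    static {
        NTZ_TABLE[0] = 8;                        // convention for an all-zero byte
        for (int i = 1; i < 256; i++) {
            int n = 0;
            while (((i >> n) & 1) == 0) n++;     // trailing zero bits of this byte value
            NTZ_TABLE[i] = (byte) n;
        }
    }

    /** Number of trailing zeros of x; returns 64 for x == 0. */
    static int ntz(long x) {
        if (x == 0) return 64;
        int n = 0;
        // binary search for the lowest non-zero byte
        if ((x & 0xFFFFFFFFL) == 0) { n += 32; x >>>= 32; }
        if ((x & 0xFFFFL) == 0)     { n += 16; x >>>= 16; }
        if ((x & 0xFFL) == 0)       { n += 8;  x >>>= 8;  }
        // finish with an array lookup on the remaining low byte
        return n + NTZ_TABLE[(int) (x & 0xFF)];
    }

Whether the table lookup at the end beats a couple more masked tests probably depends on the CPU, which would fit the Pentium-M vs. Athlon differences we are seeing.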