Re: speed of BooleanQueries on 2.9

eks dev Mon, 13 Jul 2009 04:29:20 -0700

Hi Mike, 

getMaxNumOfCandidates() in test was 200, Index is optimised and read-only


We found (due to an error in our warm-up code, funny) that only this Query runs 
slower on 2.9. 

A hint where to look could be that this Query cointains two, the most frequent 
tokens in two particular fields 
NAME:hans and ZIPS:berlin (index has ca 80Mio very short documents, 3Mio unique 
terms)

But all of this *could be just wrong measurement*, I just could not spend more 
time to get to the bottom of this. We moved forward as we got overall better 
average performance (sweet 10% in average) on much bigger real query log from 
our regression test.

Anyhow I just wanted to throw it out, maybe it triggers some synapses :) If 
false alarm, sorry. 





----- Original Message ----
> From: Michael McCandless <luc...@mikemccandless.com>
> To: java-user@lucene.apache.org
> Sent: Monday, 13 July, 2009 11:50:48
> Subject: Re: speed of BooleanQueries on 2.9
> 
> This is not expected; 2.9 has had a number of changes that ought to
> reduce CPU cost of searching.  If this holds up we definitely need to
> get to the root cause.
> 
> Did your test exclude the warmup query for both 2.4.1 & 2.9?  How many
> segments in the index?  What is the actual value of
> getMaxNumOfCandidates()?  If you simplify the query down (eg just do
> the NAME clause or the ZIPSS clause, alone) are those also 4X slower?
> 
> Mike
> 
> On Sun, Jul 12, 2009 at 12:53 PM, eks devwrote:
> >
> > Is it possible that the same BooleanQuery on 2.9 runs significantly slower 
> than on 2.4?
> >
> > we have some strange effects where the following query runs approx 4(ouch!) 
> times slower on 2.9, test done by 1000 times executing the same Query... But! 
> if 
> I run test from some real Query log with mixed Queries, I get almost the same 
> results (?!), even slightly faster on 2.9 !?
> >
> >
> > Query:
> > +((NAME:hans NAME:hahns^0.23232001 NAME:hams^0.27648002 NAME:hamz^0.25392 
> NAME:hanas^0.18722998 NAME:hanbs^0.18722998 NAME:hanfs^0.18722998 
> NAME:hangs^0.18722998 NAME:hanhs^0.24030754 NAME:hanis^0.18722998 
> NAME:hanjs^0.18722998 NAME:hanks^0.18722998 NAME:hanms^0.18722998 
> NAME:hanos^0.18722998 NAME:hanrs^0.18722998 NAME:hansb^0.20172001 
> NAME:hansd^0.20172001 NAME:hansf^0.20172001 NAME:hansg^0.20172001 
> NAME:hansi^0.20172001 NAME:hansj^0.20172001 NAME:hansk^0.20172001 
> NAME:hansl^0.20172001 NAME:hansn^0.20172001 NAME:hanso^0.20172001 
> NAME:hansp^0.20172001 NAME:hanst^0.20172001 NAME:hansu^0.20172001 
> NAME:hansw^0.20172001 NAME:hansy^0.20172001 NAME:hansz^0.20172001 
> NAME:hants^0.18722998 NAME:hanus^0.18722998 NAME:hanws^0.18722998 
> NAME:hehns^0.20172001 NAME:hens^0.2736075 NAME:hins^0.24843 NAME:hons^0.24843 
> NAME:huhns^0.1801875 NAME:huns^0.24843)^2.0)
> > +(((ZIPS:berlin ZIPS:barlin^0.28227 ZIPS:berien^0.25947002 
> ZIPS:berling^0.23232001 ZIPS:perlin^0.26133335))^1.2)
> >
> > The question is just to get some hints where I should look...
> >
> > Both fealds are without norms, omitTf(true) , RAMDirectory, using
> > TopDocs top = ixSearcher.search(q, null, getMaxNumOfCandidates());
> > and BooleanQuery.setAllowDocsOutOfOrder(true);
> >
> > maybe we made some mistakes on measuring, but we did simple timing here on 
> search() method... strange. I would bet it is something we did, but I cannot 
> see 
> where ...
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: speed of BooleanQueries on 2.9

Reply via email to