Hi Mike,

getMaxNumOfCandidates() in the test was 200; the index is optimised and read-only.

We found (due to an error in our warm-up code, funnily enough) that only this Query runs slower on 2.9. A hint on where to look could be that this Query contains the most frequent tokens of two particular fields, NAME:hans and ZIPS:berlin (the index has ca. 80 million very short documents and 3 million unique terms).

But all of this *could be just wrong measurement*; I just could not spend more time getting to the bottom of it. We moved forward because we got better overall performance (a sweet 10% on average) on a much bigger real query log from our regression test.

Anyhow, I just wanted to throw it out there, maybe it triggers some synapses :) If it is a false alarm, sorry.

----- Original Message ----
> From: Michael McCandless <luc...@mikemccandless.com>
> To: java-user@lucene.apache.org
> Sent: Monday, 13 July, 2009 11:50:48
> Subject: Re: speed of BooleanQueries on 2.9
>
> This is not expected; 2.9 has had a number of changes that ought to
> reduce CPU cost of searching. If this holds up we definitely need to
> get to the root cause.
>
> Did your test exclude the warmup query for both 2.4.1 & 2.9? How many
> segments in the index? What is the actual value of
> getMaxNumOfCandidates()? If you simplify the query down (eg just do
> the NAME clause or the ZIPS clause, alone) are those also 4X slower?
>
> Mike
>
> On Sun, Jul 12, 2009 at 12:53 PM, eks dev wrote:
> >
> > Is it possible that the same BooleanQuery on 2.9 runs significantly slower
> > than on 2.4?
> >
> > We have some strange effects where the following query runs approx. 4 (ouch!)
> > times slower on 2.9; the test executes the same Query 1000 times... But!
> > If I run the test from a real Query log with mixed Queries, I get almost the
> > same results (?!), even slightly faster on 2.9!?
> >
> > Query:
> > +((NAME:hans NAME:hahns^0.23232001 NAME:hams^0.27648002 NAME:hamz^0.25392
> > NAME:hanas^0.18722998 NAME:hanbs^0.18722998 NAME:hanfs^0.18722998
> > NAME:hangs^0.18722998 NAME:hanhs^0.24030754 NAME:hanis^0.18722998
> > NAME:hanjs^0.18722998 NAME:hanks^0.18722998 NAME:hanms^0.18722998
> > NAME:hanos^0.18722998 NAME:hanrs^0.18722998 NAME:hansb^0.20172001
> > NAME:hansd^0.20172001 NAME:hansf^0.20172001 NAME:hansg^0.20172001
> > NAME:hansi^0.20172001 NAME:hansj^0.20172001 NAME:hansk^0.20172001
> > NAME:hansl^0.20172001 NAME:hansn^0.20172001 NAME:hanso^0.20172001
> > NAME:hansp^0.20172001 NAME:hanst^0.20172001 NAME:hansu^0.20172001
> > NAME:hansw^0.20172001 NAME:hansy^0.20172001 NAME:hansz^0.20172001
> > NAME:hants^0.18722998 NAME:hanus^0.18722998 NAME:hanws^0.18722998
> > NAME:hehns^0.20172001 NAME:hens^0.2736075 NAME:hins^0.24843 NAME:hons^0.24843
> > NAME:huhns^0.1801875 NAME:huns^0.24843)^2.0)
> > +(((ZIPS:berlin ZIPS:barlin^0.28227 ZIPS:berien^0.25947002
> > ZIPS:berling^0.23232001 ZIPS:perlin^0.26133335))^1.2)
> >
> > The question is just to get some hints on where I should look...
> >
> > Both fields are without norms, omitTf(true), RAMDirectory, using
> > TopDocs top = ixSearcher.search(q, null, getMaxNumOfCandidates());
> > and BooleanQuery.setAllowDocsOutOfOrder(true);
> >
> > Maybe we made some mistakes in measuring, but we did simple timing here on
> > the search() method... strange. I would bet it is something we did, but I
> > cannot see where...
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org