I do not know exactly why, but with BooleanQuery.setAllowDocsOutOfOrder(true) I have the problem, and with setAllowDocsOutOfOrder(false) there are no problems whatsoever.
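In case anyone wants to repeat the comparison, here is a minimal sketch of how the two settings could be timed side by side with the warm-up runs excluded. QueryTiming and medianNanos are made-up names for illustration, and the workload is a placeholder loop, not a real Lucene search. In a real test the Runnable would wrap the search call, measured once with setAllowDocsOutOfOrder(true) and once with setAllowDocsOutOfOrder(false).

```java
import java.util.Arrays;

public class QueryTiming {

    /**
     * Runs the task `warmup` times untimed, then `runs` times timed, and
     * returns the median timed duration in nanoseconds (the upper middle
     * sample when `runs` is even).
     */
    static long medianNanos(Runnable task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.run(); // warm-up: executed but not measured
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption, for illustration only).
        // In the real test this would be the search call, e.g.
        //   ixSearcher.search(q, null, getMaxNumOfCandidates());
        Runnable placeholder = () -> {
            long sum = 0;
            for (int i = 0; i < 100_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };
        long median = medianNanos(placeholder, 100, 1000);
        System.out.println("median ns per run: " + median);
    }
}
```

The median is used instead of the mean so that a few very slow runs do not dominate the number. A query that never returns would still need a separate timeout.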
Not really a scientific method to find such a bug, but it does the job and makes me happy. Empirical rule: "deprecated methods are not to be taken as thoroughly tested, as they have a short life expectancy."

----- Original Message ----
> From: eks dev <eks...@yahoo.co.uk>
> To: java-user@lucene.apache.org
> Sent: Wednesday, 15 July, 2009 0:24:43
> Subject: Re: speed of BooleanQueries on 2.9
>
> Mike, we are definitely hitting something with this one!
>
> We had a report from our QA chaps that our servers got stuck (the limit is 180 seconds per request)... We are at 14 requests per second on average... It has nothing to do with gc(), as we can repeat it with a freshly restarted searcher.
>
> - It happens on less than 0.1% of queries, without much of a pattern, and is repeatable on our index...
> It is always a combination of two expanded tokens (we use minimumNumberShouldMatch)...
>
> (+(t1 [up to 40 expansions]) +(t2 [up to 40 expansions of t2]))
> All tokens have a boost set, and minNumShouldMatch is set to two.
>
> I cannot provide a self-contained test, nor the index (it contains sensitive data and is rather big, ~5G).
>
> I can repeat this test on t1 and t2 with 40 expansions each. Even if I take the most frequent tokens in the collection, it runs well under one second... but these two particular tokens with their "expansions" make it run forever...
>
> And yes, if I run t1 plus expansions only, it runs super fast; the same goes for t2.
>
> java 1.4U14, tried with 1.6U6, no changes...
>
> Will report if I dig something out.
>
> Partial stack trace while "stuck", CPU is at max:
>
> org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(Unknown Source)
> org.apache.lucene.search.BooleanScorer.score(Unknown Source)
> org.apache.lucene.search.BooleanScorer.score(Unknown Source)
> org.apache.lucene.search.IndexSearcher.search(Unknown Source)
> org.apache.lucene.search.IndexSearcher.search(Unknown Source)
> org.apache.lucene.search.Searcher.search(Unknown Source)
>
> ----- Original Message ----
> > From: eks dev
> > To: java-user@lucene.apache.org
> > Sent: Monday, 13 July, 2009 13:28:45
> > Subject: Re: speed of BooleanQueries on 2.9
> >
> > Hi Mike,
> >
> > getMaxNumOfCandidates() in the test was 200. The index is optimised and read-only.
> >
> > We found (due to an error in our warm-up code, funny) that only this Query runs slower on 2.9.
> >
> > A hint where to look could be that this Query contains the two most frequent tokens in two particular fields, NAME:hans and ZIPS:berlin (the index has ca. 80 million very short documents, 3 million unique terms).
> >
> > But all of this *could be just wrong measurement*; I just could not spend more time to get to the bottom of this. We moved forward as we got better overall average performance (a sweet 10% on average) on a much bigger real query log from our regression test.
> >
> > Anyhow, I just wanted to throw it out, maybe it triggers some synapses :) If it is a false alarm, sorry.
> >
> > ----- Original Message ----
> > > From: Michael McCandless
> > > To: java-user@lucene.apache.org
> > > Sent: Monday, 13 July, 2009 11:50:48
> > > Subject: Re: speed of BooleanQueries on 2.9
> > >
> > > This is not expected; 2.9 has had a number of changes that ought to reduce the CPU cost of searching. If this holds up we definitely need to get to the root cause.
> > >
> > > Did your test exclude the warmup query for both 2.4.1 & 2.9?
> > > How many segments are in the index? What is the actual value of getMaxNumOfCandidates()? If you simplify the query down (e.g. just do the NAME clause or the ZIPS clause, alone) are those also 4X slower?
> > >
> > > Mike
> > >
> > > On Sun, Jul 12, 2009 at 12:53 PM, eks dev wrote:
> > > >
> > > > Is it possible that the same BooleanQuery on 2.9 runs significantly slower than on 2.4?
> > > >
> > > > We have some strange effects where the following query runs approx. 4 (ouch!) times slower on 2.9; the test was done by executing the same Query 1000 times... But! If I run the test from some real Query log with mixed Queries, I get almost the same results (?!), even slightly faster on 2.9!?
> > > >
> > > > Query:
> > > > +((NAME:hans NAME:hahns^0.23232001 NAME:hams^0.27648002 NAME:hamz^0.25392
> > > > NAME:hanas^0.18722998 NAME:hanbs^0.18722998 NAME:hanfs^0.18722998
> > > > NAME:hangs^0.18722998 NAME:hanhs^0.24030754 NAME:hanis^0.18722998
> > > > NAME:hanjs^0.18722998 NAME:hanks^0.18722998 NAME:hanms^0.18722998
> > > > NAME:hanos^0.18722998 NAME:hanrs^0.18722998 NAME:hansb^0.20172001
> > > > NAME:hansd^0.20172001 NAME:hansf^0.20172001 NAME:hansg^0.20172001
> > > > NAME:hansi^0.20172001 NAME:hansj^0.20172001 NAME:hansk^0.20172001
> > > > NAME:hansl^0.20172001 NAME:hansn^0.20172001 NAME:hanso^0.20172001
> > > > NAME:hansp^0.20172001 NAME:hanst^0.20172001 NAME:hansu^0.20172001
> > > > NAME:hansw^0.20172001 NAME:hansy^0.20172001 NAME:hansz^0.20172001
> > > > NAME:hants^0.18722998 NAME:hanus^0.18722998 NAME:hanws^0.18722998
> > > > NAME:hehns^0.20172001 NAME:hens^0.2736075 NAME:hins^0.24843 NAME:hons^0.24843
> > > > NAME:huhns^0.1801875 NAME:huns^0.24843)^2.0)
> > > > +(((ZIPS:berlin ZIPS:barlin^0.28227 ZIPS:berien^0.25947002
> > > > ZIPS:berling^0.23232001 ZIPS:perlin^0.26133335))^1.2)
> > > >
> > > > The question is just to get some hints where I should look...
> > > >
> > > > Both fields are without norms, omitTf(true), RAMDirectory, using
> > > > TopDocs top = ixSearcher.search(q, null, getMaxNumOfCandidates());
> > > > and BooleanQuery.setAllowDocsOutOfOrder(true);
> > > >
> > > > Maybe we made some mistakes in measuring, but we did simple timing here on the search() method... Strange. I would bet it is something we did, but I cannot see where...
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > > For additional commands, e-mail: java-user-h...@lucene.apache.org