Hi Mike:

That's weird. Let me take a look at the patch. Need to brush up on Python though :)

Thanks

-John

On Tue, Oct 20, 2009 at 10:25 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> OK I posted a patch that folds the MultiPQ approach into
> contrib/benchmark, plus a simple Python wrapper to run old/new tests
> across different queries, sort, topN, etc.
>
> But I got different results... MultiPQ looks generally slower than
> SinglePQ. So I think we now need to reconcile what's different
> between our tests.
>
> Mike
>
> On Mon, Oct 19, 2009 at 9:28 PM, John Wang <john.w...@gmail.com> wrote:
> > Hi Michael:
> > Was wondering if you got a chance to take a look at this.
> > Since deprecated APIs are being removed in 3.0, I was wondering if/when
> > we would decide on keeping the ScoreDocComparator API, and thus whether
> > it would be kept for Lucene 3.0.
> > Thanks
> > -John
> >
> > On Fri, Oct 16, 2009 at 9:53 AM, Michael McCandless
> > <luc...@mikemccandless.com> wrote:
> >>
> >> Oh, no problem...
> >>
> >> Mike
> >>
> >> On Fri, Oct 16, 2009 at 12:33 PM, John Wang <john.w...@gmail.com> wrote:
> >> > Mike, just a clarification on my first perf report email.
> >> > In the first section, numHits is incorrectly labeled; it should be 20
> >> > instead of 50. Sorry about the possible confusion.
> >> > Thanks
> >> > -John
> >> >
> >> > On Fri, Oct 16, 2009 at 3:21 AM, Michael McCandless
> >> > <luc...@mikemccandless.com> wrote:
> >> >>
> >> >> Thanks John; I'll have a look.
> >> >>
> >> >> Mike
> >> >>
> >> >> On Fri, Oct 16, 2009 at 12:57 AM, John Wang <john.w...@gmail.com> wrote:
> >> >> > Hi Michael:
> >> >> > I added classes: ScoreDocComparatorQueue and OneSortNoScoreCollector
> >> >> > as a more general case. I think keeping the old api for
> >> >> > ScoreDocComparator and SortComparatorSource would work.
> >> >> > Please take a look.
> >> >> > Thanks
> >> >> > -John
> >> >> >
> >> >> > On Thu, Oct 15, 2009 at 6:52 PM, John Wang <john.w...@gmail.com> wrote:
> >> >> >>
> >> >> >> Hi Michael:
> >> >> >> It is open, http://code.google.com/p/lucene-book/source/checkout
> >> >> >> I think I sent the https url instead, sorry.
> >> >> >> The multi PQ sorting is fairly self-contained. I have 2 versions, 1
> >> >> >> for string and 1 for int; each is a Collector impl.
> >> >> >> I shouldn't say the Multi PQ is faster on int sort; it is within the
> >> >> >> error boundary. The diff is very very small; I would say they are
> >> >> >> roughly equal.
> >> >> >> If you think it is a good thing to go this way (if not for the perf,
> >> >> >> just for the simpler api), I'd be happy to work on a patch.
> >> >> >> Thanks
> >> >> >> -John
> >> >> >>
> >> >> >> On Thu, Oct 15, 2009 at 5:18 PM, Michael McCandless
> >> >> >> <luc...@mikemccandless.com> wrote:
> >> >> >>>
> >> >> >>> John, looks like this requires login -- any plans to open that up,
> >> >> >>> or, post the code on an issue?
> >> >> >>>
> >> >> >>> How self-contained is your Multi PQ sorting? EG is it a standalone
> >> >> >>> Collector impl that I can test?
> >> >> >>>
> >> >> >>> Mike
> >> >> >>>
> >> >> >>> On Thu, Oct 15, 2009 at 6:33 PM, John Wang <john.w...@gmail.com> wrote:
> >> >> >>> > BTW, we have a little sandbox for these experiments. All my test
> >> >> >>> > code is at the URL below. It is not very polished.
> >> >> >>> >
> >> >> >>> > https://lucene-book.googlecode.com/svn/trunk
> >> >> >>> >
> >> >> >>> > -John
> >> >> >>> >
> >> >> >>> > On Thu, Oct 15, 2009 at 3:29 PM, John Wang <john.w...@gmail.com> wrote:
> >> >> >>> >>
> >> >> >>> >> Numbers Mike requested for Int types:
> >> >> >>> >>
> >> >> >>> >> Only the time/cputime are posted; the others are all the same since
> >> >> >>> >> the algorithm is the same.
> >> >> >>> >>
> >> >> >>> >> Lucene 2.9:
> >> >> >>> >> numhits: 10
> >> >> >>> >> time: 14619495
> >> >> >>> >> cpu: 146126
> >> >> >>> >>
> >> >> >>> >> numhits: 20
> >> >> >>> >> time: 14550568
> >> >> >>> >> cpu: 163242
> >> >> >>> >>
> >> >> >>> >> numhits: 100
> >> >> >>> >> time: 16467647
> >> >> >>> >> cpu: 178379
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> my test:
> >> >> >>> >> numHits: 10
> >> >> >>> >> time: 14101094
> >> >> >>> >> cpu: 144715
> >> >> >>> >>
> >> >> >>> >> numHits: 20
> >> >> >>> >> time: 14804821
> >> >> >>> >> cpu: 151305
> >> >> >>> >>
> >> >> >>> >> numHits: 100
> >> >> >>> >> time: 15372157
> >> >> >>> >> cpu time: 158842
> >> >> >>> >>
> >> >> >>> >> Conclusions:
> >> >> >>> >> They are very similar; the differences are all within error
> >> >> >>> >> bounds, especially with lower PQ sizes, with the second sort alg
> >> >> >>> >> again slightly faster.
> >> >> >>> >>
> >> >> >>> >> Hope this helps.
> >> >> >>> >>
> >> >> >>> >> -John
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> On Thu, Oct 15, 2009 at 3:04 PM, Yonik Seeley
> >> >> >>> >> <yo...@lucidimagination.com> wrote:
> >> >> >>> >>>
> >> >> >>> >>> On Thu, Oct 15, 2009 at 5:33 PM, Michael McCandless
> >> >> >>> >>> <luc...@mikemccandless.com> wrote:
> >> >> >>> >>> > Though it'd be odd if the switch to searching by segment
> >> >> >>> >>> > really was most of the gains here.
> >> >> >>> >>>
> >> >> >>> >>> I had assumed that much of the improvement was due to ditching
> >> >> >>> >>> MultiTermEnum/MultiTermDocs.
> >> >> >>> >>> Note that LUCENE-1483 was before LUCENE-1596... but that only
> >> >> >>> >>> helps with queries that use a TermEnum (range, prefix, etc).
> >> >> >>> >>>
> >> >> >>> >>> -Yonik
> >> >> >>> >>> http://www.lucidimagination.com
> >> >> >>> >>>
> >> >> >>> >>> ---------------------------------------------------------------------
> >> >> >>> >>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> >> >> >>> >>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
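
For anyone reconciling the two sets of results, the int-sort numbers quoted above can be turned into relative differences with a few lines of Python. This is only a rough sketch: the figures are copied from the quoted report, the units of time/cpu are not stated in the thread (so only ratios are computed), and the names `lucene_29`, `my_test`, `relative_diff` and `compare` are made up for illustration.

```python
# Figures copied from the quoted int-sort report; units are not stated there.
# Keys are numHits, values are (time, cpu).
lucene_29 = {10: (14619495, 146126), 20: (14550568, 163242), 100: (16467647, 178379)}
my_test = {10: (14101094, 144715), 20: (14804821, 151305), 100: (15372157, 158842)}


def relative_diff(new, old):
    """Signed difference of new relative to old, in percent."""
    return 100.0 * (new - old) / old


def compare(baseline, candidate):
    """Return (numHits, time diff %, cpu diff %) rows, sorted by numHits."""
    rows = []
    for num_hits in sorted(baseline):
        base_time, base_cpu = baseline[num_hits]
        cand_time, cand_cpu = candidate[num_hits]
        rows.append((num_hits,
                     relative_diff(cand_time, base_time),
                     relative_diff(cand_cpu, base_cpu)))
    return rows


if __name__ == "__main__":
    # Negative values mean "my test" was lower than Lucene 2.9.
    for num_hits, time_pct, cpu_pct in compare(lucene_29, my_test):
        print(f"numHits={num_hits:>3}  time: {time_pct:+.2f}%  cpu: {cpu_pct:+.2f}%")
```

Whether differences of this size count as "within error bounds" depends on the run-to-run variance, which is not reported in the thread.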