Hi Mike: I have been playing with the patch, and I think I have some information that you might like.
Let me spend some time gathering more numbers and post an update in JIRA.
Thanks.

BTW, about the conversion on multi-valued fields, I am not sure I get it
(sorry for being ignorant): say bottom has ords 23, 45, 76, each
corresponding to a string. When moving to the next segment, you need to
give bottom ords that are comparable to the other docs in the new segment,
so you would need to find the new ords for the values behind 23, 45 and 76,
wouldn't you? To find them, assuming the values are s1, s2, s3, you would do
a binary search on the new val array and find the index for each of s1, s2,
s3. That is 3 binary searches per convert, and I am not sure how you can
short-circuit it.

Are you suggesting we compare the Comparables in compareBottom until some
doc beats it? That would hurt performance a lot though, no?

-John

On Wed, Oct 21, 2009 at 3:11 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> On Tue, Oct 20, 2009 at 11:55 AM, John Wang <john.w...@gmail.com> wrote:
>
> > the simpler api places less restriction on the type of custom
> > sorting that can be done.
>
> Just to verify: this is not a back-compat break, right?
>
> Because, in 2.4, such an interesting custom sort must've been
> operating at the top-level index reader level, which is easy to carry
> over to 2.9 (you just rebase the docIDs).
>
> But, of course in moving to 2.9, you would like to also switch your
> custom sort to be per-segment (for faster reopen/near real-time perf),
> but the new sort API makes this more difficult because it requires
> that you are able to compare hits across different segments during the
> search, not just at the end.
>
> But then I don't understand the difficulty of doing that: if we had a
> Collector with the MultiPQ approach, at the end during merge, you'd
> also have to compare results across segments, ie, upgrade your ords to
> their real values. The MultiPQ approach does this by calling
> sortValue (returns Comparable) in the end.
>
> Putting performance aside for now...
> when comparing bottom, you don't actually have to "truly invert"
> Comparable -> ord on segment transition. You could, instead, get the
> Comparable for each and compare, but then note the smallest ord for
> the current segment that has failed to compete, and short-circuit the
> compareBottom test by checking against that ord. That should enable
> carrying over the custom sort to the single PQ API without needing to
> invert ord->value.
>
> We'd obviously have to test performance...
>
> Or, we could commit the MultiPQ approach as another sorting collector?
> I know it's not great having two wildly different sort APIs, but both
> APIs seem to have their strengths in different cases.
>
> Mike
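
To make the two ideas in this exchange concrete, here is a minimal Java
sketch. It is not Lucene's FieldComparator API and not the code from the
patch: OrdRemapSketch, remapOrds and BottomChecker are hypothetical names,
values are assumed to be Strings sorted ascending, and the short-circuit
part assumes a single-valued field. The first method does one binary
search per bottom value on segment transition, as John describes; the
nested class compares by value and remembers the smallest ord that failed
to compete, as Mike suggests.

```java
import java.util.Arrays;

/** Illustrative sketch only; all names here are hypothetical. */
final class OrdRemapSketch {

    /**
     * Re-expresses the bottom entry's values as ords in the new segment.
     * Does one binary search per value against the segment's sorted terms.
     * If a value is absent, Arrays.binarySearch returns
     * -(insertionPoint) - 1, which is kept so callers can still tell
     * where the value would sort.
     */
    static int[] remapOrds(String[] bottomValues, String[] newSegmentTerms) {
        int[] newOrds = new int[bottomValues.length];
        for (int i = 0; i < bottomValues.length; i++) {
            newOrds[i] = Arrays.binarySearch(newSegmentTerms, bottomValues[i]);
        }
        return newOrds;
    }

    /**
     * Per-segment state for the short-circuit idea: compare by value, but
     * record the smallest ord that has failed to compete. Within one
     * segment ords follow value order, so any doc whose ord is at or above
     * that mark must fail too and needs no value comparison.
     */
    static final class BottomChecker {
        private final String[] segmentTerms; // sorted; array index is the ord
        private String bottomValue;
        private int minFailedOrd = Integer.MAX_VALUE;
        int valueComparisons = 0;            // counts the slow-path comparisons

        BottomChecker(String[] segmentTerms, String bottomValue) {
            this.segmentTerms = segmentTerms;
            this.bottomValue = bottomValue;
        }

        /** Returns true if the doc with this ord sorts strictly before bottom. */
        boolean competes(int docOrd) {
            if (docOrd >= minFailedOrd) {
                return false;                // short-circuit on the recorded ord
            }
            valueComparisons++;
            boolean wins = segmentTerms[docOrd].compareTo(bottomValue) < 0;
            if (!wins) {
                minFailedOrd = docOrd;       // docOrd is below the old mark here
            }
            return wins;
        }

        /**
         * Called when the queue's bottom changes. Assumes the new bottom is
         * never less competitive than the old one, so earlier failures
         * remain failures and minFailedOrd stays valid.
         */
        void setBottom(String newBottom) {
            this.bottomValue = newBottom;
        }
    }
}
```

With terms {"apple", "banana", "cherry", "date", "fig"}, remapOrds on
{"banana", "date", "fig"} yields {1, 3, 4}. A BottomChecker with bottom
"cherry" compares values once for ord 3, then rejects ord 4 without
touching the strings. A new BottomChecker would be created on each segment
transition, since the recorded ord is only meaningful within one segment.
Whether this beats the 3 binary searches would still have to be measured.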