Hey Michael: Would you mind rerunning the test you have with jdk1.5?
Also, if you would, change the comparator method to avoid brachning for int and string comparators, e.g. return index.order[i.doc] - index.order[j.doc]; Thanks -John On Thu, Oct 22, 2009 at 2:38 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Thu, Oct 22, 2009 at 2:17 AM, John Wang <john.w...@gmail.com> wrote: > > > I have been playing with the patch, and I think I have some > information > > that you might like. > > Let me spend sometime and gather some more numbers and update in > jira. > > Excellent! > > > say bottom has ords 23, 45, 76, each corresponding to a string. When > > moving to the next segment, you need to make bottom to have ords that can > be > > comparable to other docs in this new segment, so you would need to find > the > > new ords for the values in 23,45 and 76, don't you? To find it, assuming > the > > values are s1,s2,s3, you would do a bin. search on the new val array, and > > find index for s1,s2,s3. > > It's that inversion (from ord->Comparable in first seg, and > Comparable->ord in second seg) that I'm trying to avoid (w/ this new > proposal). > > > Which is 3 bin searches per convert, I am not sure > > how you can short circuit it. Are you suggesting we call Comparable on > > compareBottom until some doc beats it? > > I'm saying on seg transition you indeed get the Comparable for current > bottom, but, don't attempt to invert it. Instead, as seg 2 finds a > hit, you get that hit's Comparables and compare to bottom. If it > beats bottom, it goes into the queue. If it does not, you use the ord > (in seg 2's ord space) to "learn" a bottom in the ord space of seg 2. > > > That would hurt performance I lot though, no? > > Yeah I think likely it would, since we're talking about a binary > search on transition VS having to do possibly many > upgrade-to-Comparable and compare-Comparabls to slowly learn the > equivalent ord in the new segment. I was proposing it for cases where > inversion is very difficult. But realistically, since you must keep > around the ful ord -> Comparable for every segment anyway (in order to > merge in the end), inversion shouldn't ever actually be "difficult" -- > it'd just be a binary search on presumably in-RAM storage. > > Mike > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-dev-h...@lucene.apache.org > >