Hi Mike: Here are the results for numHits = 10:
Lucene 2.9: num string compares: 86 num conversions: 21 num inserts: 115 time: 15069705 cpu: 174294 my test sort: num string compares: 49 num conversions: 0 num inserts: 778 time: 14665375 cpu: 156442 This is how the test data is indexed, basically a random positive number assigned to each doc: int num = Math.abs(rand.nextInt()); String val = _formatter.get().format(num); Document doc = new Document(); Field f = new Field("id",String.valueOf(docNum),Store.YES,Index.NOT_ANALYZED_NO_NORMS); f.setOmitTf(true); Field f2 = new Field("num",val,Store.NO,Index.NOT_ANALYZED_NO_NORMS); f2.setOmitTf(true); String filterVal; if (docNum%2 == 0){ filterVal = "even"; } else{ filterVal = "odd"; } Field f3 = new Field("filter",filterVal,Store.NO,Index.NOT_ANALYZED_NO_NORMS); f3.setOmitTf(true); doc.add(f); doc.add(f2); doc.add(f3); Hope this helps about reverse mapping and conversion: basically the contract is convert a document's meta information that is associated with a segment to a new segment. The only comparable part of the data is the value. For fast sort, usually ordinal compare is used, which requires a reverse mapping. As for the "mistery" why the old all fieldcache implementation was slower I think was the lack of tuning in the collector as well as use of the PQ. Which means a larger constant on the "dominating" part. (I must say, the level of code-tuning in the 2.9 code is impressive!) -John On Thu, Oct 15, 2009 at 2:12 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > Nice results! Comments below... > > On Thu, Oct 15, 2009 at 3:58 PM, John Wang <john.w...@gmail.com> wrote: > > Hi guys: > > > > I did some Big O math a few times and reached the same conclusion > Jake > > had. > > > > I was not sure about the code tuning opportunities we could have done > > with the MergeAtTheEnd method as Yonik mentioned and the internal > behavior > > with PQ Mike suggested, so I went ahead and implemented the algorithm and > > compared against the current Lucene 2.9 implementation: > > > > set up: > > index size: 999992 docs > > segment count: 8 > > sorting on 1 field, no score, e.g. comparing against > > "OneComparatorNonScoringCollector" class. > > TermQuery used on field with only 2 values (even/odd), returning 499992 > > docs. > > What text field are you sorting on? (Ie where did it come from -- is > it a natural or synthetic source?). > > > I broke the tests into two parts, numHits = 20 and numHits = 50, for > each, I > > ran 100 iterations, took the time from 6 - 100 (to avoid field cache load > > time) and take the average. Did it 3 times and took that average. > > Can you try numHits=10 and 20? I'm curious how this difference scales > down. I believe most searches don't go beyond page 1. > > It'd be great to see sorting by other typed (simple numerics) fields, > and queries matching different # results, too. > > > Here are the numbers (times are measured in nanoseconds): > > > > numHits = 50: > > > > Lucene 2.9/OneComparatorNonScoringCollector: > > num string compares: 251 > > num conversions: 34 > > time: 15272049 > > cpu time: 161105 > > num inserts into PQ: 211 > > > > My test sort algorithm: > > num string compares: 51 > > num conversions: 0 > > time: 14943172 > > cpu time: 153722 > > num inserts into PQ: 1500 > > > > > > numHits = 100: > > > > > > Lucene 2.9/OneComparatorNonScoringCollector: > > num string compares: 2285 > > num conversions: 172 > > time: 17703866 > > cpu time: 187073 > > num inserts into PQ: 982 > > > > My test sort algorithm: > > num string compares: 210 > > num conversions: 0 > > time: 15823473 > > cpu time: 160737 > > num inserts into PQ: 6011 > > These are nice results. Single PQ does looks faster for this case. > > > Conclusion: > > > > My measurement is consistent with mine and Jake's Big O analysis, where > the > > dominating factor is the same between both algorithms. And in terms of > > timing/cpu, they are very similar with the second algorithm consistently > > perform a slightly better than Lucene 2.9. > > Mike is absolutely correct about the number of PQ insertions, but that > win > > seems to be shadowed by the cost of extra string compares and > conversions. > > If you look at the "growth" between numHits=20 and 100, the time gap > > increases, e.g. as the PQ gets larger (case where user navigates to a > higher > > page of the search result) the num string compares increase faster than > the > > increase of the size of PQ. > > There is an assumption that the conversion cost is low, which may not be > > true for custom sort extensions, and could easily nominate the entire > query > > time. (more analysis below) > > > > Given this analysis, seems like at the least the second algorithm is not > > worse in terms of performance, and I would argue it is actually better. > > I agree, so far! > > > With the current API which ties into this algorithm in Lucene 2.9 > > (FieldComparator), there is actually a subtle behavior change, such that, > to > > implement a custom sort, user would have to be able to do this > conversion, > > which involves a reverse mapping from value to ordinals. This is a change > on > > the contract for writing custom sort comparison, where before, > > ordinal->value mapping can be 1 to many, and now, with the requirement > for > > this reverse mapping, it is much more difficult and expensive to do. > > But a custom comparator need not do the reverse conversion when > implementing FieldComparator? Ie, it can implement the conversion > however it wants? Also one can always implement their own Collector > to do the sorting. > > > If you agree, I think it is worthwhile to open a jira ticket for this > while > > this FieldComparator API is still fresh. And I'd be happy to provide my > > implementation, test code and my index. > > +1 > > Mike > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-dev-h...@lucene.apache.org > >