I generally don't trust the compiler, if/when I have that freedom. If you can fix a hotspot in Lucene to avoid an extra method call, an extra add/multiply, etc., you should. Doing so ensures the cost can't be there. Not doing so means you rely on the JRE to be smart enough, and it very easily may not be (there are so many variables), and that also makes Lucene's performance more fragile/env-specific.
Why take that chance?

I also don't rely on benchmarks to validate this on a case-by-case basis; the cost for any single change (like this one) can easily be in the noise, yet these micro-costs do add up. Different rules apply "down low". It's like quantum physics!

I think, besides avoiding method calls, there are compelling reasons to consider a stronger decoupling of matching & scoring. A Query really ought to be two separable things -- matching (like Filter) and scoring.

E.g., DisjunctionMaxQuery has its own matching code that duplicates what BooleanQuery does if the query is all SHOULD clauses. Why duplicate this code? Why restrict the "max score of all subs = doc's score" to only SHOULD-only BooleanQueries? If we had full matching/scoring decoupling, we wouldn't have to.

Or, e.g., the BM25 patch (LUCENE-2091) had to create its own BM25BooleanQuery to do matching & scoring, which is silly -- if it's only changing how scoring works, it should be able to reuse the existing matching code in BooleanQuery.

That said, there are challenges; e.g., the higher performance BooleanScorer (which scores docs in "chunks" and is free to collect them out-of-order) would be challenging to fully decouple from scoring since it's not strictly "doc-at-once".

On the other part of the proposal (allowing .score() to take an arbitrary docID), that does sound like a can of worms.

MG4J's model (scorer receives the full "state" of the matcher and can peek in as necessary) sounds compelling...

Mike

On Wed, Jun 9, 2010 at 3:35 AM, Earwin Burrfoot <ear...@gmail.com> wrote:
> Lies, lies, lies :)
> I mean, Sun JIT is overrelied on. Especially in regards to inlining.
>
> But, there are some cases when you can trust it. I.e. if you call a
> virtual method and this exact call-site gets refs to different objects
> at runtime (meaning here - you wrap different Queries in your
> WrapperQuery) - you can definitely rely on a call not being inlined.
>
> So, I agree with John on his /rough/ overhead estimates, on the part
> that it exists, and it's detectable. I don't agree on allowing
> arbitrary doc scoring. People who really need this for some strange
> applications, can emulate this now - by advancing() scorer to needed
> doc, and calling score(). But for most people it's unnecessary, and as
> I said - will lead to scaaary code.
>
> If you really think that one or two method calls in a loop are
> negligible, I ask you to join my holy crusade and erase
> Scorer.score(Collector) set of methods :) they exist there for the
> sole purpose of cutting on a few method calls, and are really,
> really, really confusing.
>
>
> 2010/6/9 Shai Erera <ser...@gmail.com>:
>> I don't think the method call is an overhead John. You don't need to
>> reiterate it. The compiler does make optimizations and inlines such
>> code/calls if it can. More than that, the query processing involves so many
>> method calls, that I do think that's insignificant.
>
> Woohoo! Mexican standoff! :)
>
> --
> Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
> Phone: +7 (495) 683-567-4
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
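
PS: to make the matching/scoring split discussed above a bit more concrete, here is a rough sketch in plain Java. It is not Lucene code: every name in it (DecoupledScoringSketch, ScoreCombiner, Clause, search) is made up for illustration, clauses are just in-memory arrays, and things like coord factors and DisjunctionMaxQuery's tie-breaker are left out. The only point is that one doc-at-a-time union loop does the matching, and "sum of subs" vs "max of subs" becomes a pluggable piece.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are Lucene classes.
final class DecoupledScoringSketch {

    /** Scoring half: turns the sub-scores of the clauses that matched a doc into one score. */
    interface ScoreCombiner {
        float combine(float[] subScores, int count);
    }

    /** Sum of the matching clauses (the all-SHOULD BooleanQuery idea, simplified). */
    static final ScoreCombiner SUM = (subScores, count) -> {
        float total = 0f;
        for (int i = 0; i < count; i++) total += subScores[i];
        return total;
    };

    /** Max of the matching clauses (the DisjunctionMaxQuery idea, tie-breaker omitted). */
    static final ScoreCombiner MAX = (subScores, count) -> {
        float best = subScores[0];
        for (int i = 1; i < count; i++) best = Math.max(best, subScores[i]);
        return best;
    };

    /** One SHOULD clause: ascending doc IDs plus a parallel array of per-doc scores. */
    static final class Clause {
        final int[] docs;
        final float[] scores;
        Clause(int[] docs, float[] scores) { this.docs = docs; this.scores = scores; }
    }

    /** Matching half: a doc-at-a-time union over the clauses, shared by every combiner. */
    static Map<Integer, Float> search(List<Clause> clauses, ScoreCombiner combiner) {
        int[] cursor = new int[clauses.size()];
        float[] subScores = new float[clauses.size()];
        Map<Integer, Float> hits = new LinkedHashMap<>();
        while (true) {
            // Find the smallest doc ID that any clause is currently positioned on.
            int doc = Integer.MAX_VALUE;
            for (int i = 0; i < cursor.length; i++) {
                Clause c = clauses.get(i);
                if (cursor[i] < c.docs.length) doc = Math.min(doc, c.docs[cursor[i]]);
            }
            if (doc == Integer.MAX_VALUE) return hits; // every clause is exhausted
            // Gather sub-scores from the clauses sitting on that doc, and advance them.
            int count = 0;
            for (int i = 0; i < cursor.length; i++) {
                Clause c = clauses.get(i);
                if (cursor[i] < c.docs.length && c.docs[cursor[i]] == doc) {
                    subScores[count++] = c.scores[cursor[i]++];
                }
            }
            hits.put(doc, combiner.combine(subScores, count));
        }
    }

    public static void main(String[] args) {
        List<Clause> clauses = Arrays.asList(
            new Clause(new int[] {1, 4}, new float[] {1.0f, 2.0f}),
            new Clause(new int[] {4, 7}, new float[] {0.5f, 3.0f}));
        System.out.println(search(clauses, SUM));
        System.out.println(search(clauses, MAX));
    }
}
```

Both calls in main walk exactly the same matching loop; only the combiner differs. A doc-at-a-time loop like this is the easy case, though. A chunked, out-of-order scorer such as BooleanScorer would not fit this shape as written.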