Re: Make TermScorer non final

Grant Ingersoll Wed, 18 Mar 2009 11:05:48 -0700


On Mar 18, 2009, at 7:57 AM, Michael McCandless wrote:


Coming from the discussions in LUCENE-1522 (improving highlighter), I
think at some point we should merge Span*Query into their normal
counterparts, if possible.

Ie, there should be only one TermQuery that can do both what the
current TermQuery does, and also what SpanTermQuery does.  It's able
to enumerate the spans/payloads for a given document, and if you don't
request those, the performance should hopefully be equal to that of
the current TermQuery.

The highlighter would in fact request spans for a "normal" TermQuery,
on a single doc index at a time, in order to locate the hits.

Likewise for SpanOrQuery, SpanAndQuery.

I have no real sense of how much work this is, what problems would
ensue (eg possible difference in scoring, etc.), but from
highlighter's standpoint, ideally all queries need to be able to
enumerate the collection of positions that established the match.

Maybe they should all implement a common Interface that provides
highlighting info?  I don't know what it would be, but it seems easier
to do that than to merge them all, but I'm not sure.  Not that I
wouldn't want to see a simpler query system.

There are some cool things you can do w/ spans, but they still have
some fundamental flaws that make them annoying.  Namely, oftentimes one
of the reasons you want Spans is b/c you care about what is going on
around the match, i.e. co-occurrence data, yet it is still
annoying/difficult to get that information w/o pivoting around either
term vectors or re-analyzing the document.  With the new Attribute
stuff, however, it might be getting a little easier, as one could now
store offset information at the term level (which you can do w/
payloads, too) and then use that to index into the original String.
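
To make the idea a bit more concrete, here is a rough sketch of what such
a common interface might look like, and how stored offsets could be used
to index into the original String to get at what is around the match.
None of these names exist in Lucene: MatchOffset, HighlightInfoProvider,
ToyTermMatcher and ContextWindows are all made up for illustration, and
the toy matcher just scans the text with String.indexOf instead of
reading offsets from an index.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical: a character range in the original text that produced a match. */
final class MatchOffset {
    final int start; // inclusive
    final int end;   // exclusive

    MatchOffset(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

/** Hypothetical common interface; an assumption, not part of Lucene. */
interface HighlightInfoProvider {
    /** Enumerates the offsets in the given text that established the match. */
    List<MatchOffset> matchOffsets(String text);
}

/** Toy stand-in for a term query: finds case-sensitive occurrences of one term. */
final class ToyTermMatcher implements HighlightInfoProvider {
    private final String term;

    ToyTermMatcher(String term) {
        this.term = term;
    }

    @Override
    public List<MatchOffset> matchOffsets(String text) {
        List<MatchOffset> out = new ArrayList<>();
        if (term.isEmpty()) {
            return out; // nothing to match, and avoids looping forever
        }
        int from = 0;
        int idx;
        while ((idx = text.indexOf(term, from)) >= 0) {
            out.add(new MatchOffset(idx, idx + term.length()));
            from = idx + term.length();
        }
        return out;
    }
}

/** Uses offsets to slice the original String around each match. */
final class ContextWindows {
    /** Returns each match plus up to `window` characters on either side. */
    static List<String> around(String text, List<MatchOffset> offsets, int window) {
        List<String> out = new ArrayList<>();
        for (MatchOffset m : offsets) {
            int s = Math.max(0, m.start - window);
            int e = Math.min(text.length(), m.end + window);
            out.add(text.substring(s, e));
        }
        return out;
    }
}
```

A highlighter written against something like HighlightInfoProvider would
not need to know whether the offsets came from a span query, a payload, or
a term vector, which is the appeal over merging the query classes.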


