[jira] Updated: (LUCENE-1685) Make the Highlighter use SpanScorer by default

Mark Miller (JIRA) Sun, 02 Aug 2009 16:18:41 -0700

     [ 
https://issues.apache.org/jira/browse/LUCENE-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mark Miller updated LUCENE-1685:
--------------------------------

    Attachment: LUCENE-1685.patch

Another rev making things a little easier.

QueryScorer now takes a TokenStream rather than a CachingTokenFilter. If the 
query has any position-sensitive clauses, the TokenStream is wrapped in a 
CachingTokenFilter, unless it already is one.
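
To illustrate the "wrap only when needed" rule described above, here is a 
minimal sketch. It uses stand-in types (Tokens, CachingTokens) and a 
hypothetical helper (wrapIfNeeded) that are NOT Lucene's TokenStream or 
CachingTokenFilter and are not taken from the patch; they only model the 
decision.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class WrapIfNeededSketch {
    /** Stand-in for a one-shot token stream (assumption, not Lucene's TokenStream). */
    public interface Tokens {
        String next(); // returns null once the stream is exhausted
    }

    /** Stand-in for a caching wrapper: buffers all tokens on first use so they can be replayed. */
    public static final class CachingTokens implements Tokens {
        private final Tokens input;
        private final List<String> cache = new ArrayList<>();
        private boolean filled = false;
        private int pos = 0;

        public CachingTokens(Tokens input) { this.input = input; }

        @Override public String next() {
            if (!filled) {
                // Drain the underlying stream exactly once.
                for (String t = input.next(); t != null; t = input.next()) cache.add(t);
                filled = true;
            }
            return pos < cache.size() ? cache.get(pos++) : null;
        }

        /** Rewind so the cached tokens can be consumed again. */
        public void reset() { pos = 0; }
    }

    /** Hypothetical helper: wrap only for position-sensitive queries, and never wrap twice. */
    public static Tokens wrapIfNeeded(Tokens stream, boolean positionSensitive) {
        if (!positionSensitive || stream instanceof CachingTokens) return stream;
        return new CachingTokens(stream);
    }

    /** Convenience factory for a one-shot stream over fixed words. */
    public static Tokens of(String... words) {
        Iterator<String> it = Arrays.asList(words).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }
}
```

The point of the check is that a caller who already passes a caching stream 
does not pay for a second layer of buffering.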

This also removes the need to call setTokenStream after constructing a 
QueryScorer and between calls to getBestFragment. Instead, the new 
init(TokenStream) method, which the Highlighter already calls, is used, so the 
user no longer has to make that call.

init(TokenStream) can now return a new TokenStream for the Highlighter to 
continue using (i.e. the QueryScorer may return a CachingTokenFilter if there 
is a position-sensitive clause in the query) or null to keep using the same 
TokenStream.
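
The caller side of that contract can be sketched as follows. The Tokens and 
Scorer interfaces and the streamToUse helper are hypothetical stand-ins, not 
the actual Highlighter code from the patch; the sketch only shows the 
"replacement or null" handling.

```java
final class InitContractSketch {
    /** Stand-in for a token stream (assumption, not Lucene's TokenStream). */
    public interface Tokens {
        String next();
    }

    /** Stand-in scorer: init may hand back a replacement stream, or null for "keep yours". */
    public interface Scorer {
        Tokens init(Tokens stream);
    }

    /** Hypothetical caller logic: adopt the replacement if one is returned, else keep the original. */
    public static Tokens streamToUse(Scorer scorer, Tokens original) {
        Tokens replacement = scorer.init(original);
        return replacement != null ? replacement : original;
    }
}
```

Returning null for the common case means a scorer that does not need to wrap 
the stream has nothing extra to do.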

You can now use the SpanScorer (now QueryScorer) the same way you could use 
the old QueryScorer impl:

    QueryScorer scorer = new QueryScorer(query, FIELD_NAME);
    // 'this' is passed as the Formatter argument
    Highlighter highlighter = new Highlighter(this, scorer);
    highlighter.setTextFragmenter(new SimpleFragmenter(40));

    for (int i = 0; i < hits.length(); i++) {
      String text = hits.doc(i).get(FIELD_NAME);
      TokenStream tokenStream = analyzer.tokenStream(FIELD_NAME, new StringReader(text));

      // no setTokenStream call is needed before highlighting
      String result = highlighter.getBestFragments(tokenStream, text,
          maxNumFragmentsRequired, "...");
      System.out.println("\t" + result);
    }

> Make the Highlighter use SpanScorer by default
> ----------------------------------------------
>
>                 Key: LUCENE-1685
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1685
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1685.patch, LUCENE-1685.patch
>
>
> I've always thought this made sense, but frankly, it took me a year to get 
> the SpanScorer included with Lucene at all, so I was pretty much ready to 
> move on after it got in, rather than push for it as a default.
> I think it makes sense as the default in Solr as well, and I mentioned that 
> back when it was put in, but alas, it's an option there as well.
> The Highlighter package has no back compat req, but custom has been 
> conservative - one reason I haven't pushed for this change before. Might be 
> best to actually make the switch in 3? I could go either way - as is, I know 
> a bunch of people use it, but I'm betting it's the large minority. It has 
> never been listed in a changes entry and it's not in LIA 1, so you pretty much 
> have to stumble upon it, and figure out what it's for.
> I'll point out again that it's just as fast as the standard scorer for any 
> clause of a query that is not position-sensitive. Position-sensitive query 
> clauses will obviously be somewhat slower to highlight, but that is because 
> they will be highlighted correctly rather than ignoring position.

