[ 
https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607415#action_12607415
 ] 

Tavi Nathanson commented on LUCENE-794:
---------------------------------------

Hi,

I'm new to Lucene and the highlighter, so I apologize if my question is 
obvious. I'm trying to enable phrase highlighting in my Lucene setup, so I 
applied this patch to 2.3.2. I'm confused, though, about how SpanScorer and 
QueryScorer differ in structure. Why does SpanScorer require the stream of 
source text tokens, i.e. SpanScorer(Query query, String field, 
CachingTokenFilter cachingTokenFilter), while QueryScorer does not, i.e. 
QueryScorer(Query query, String fieldName)?

Intuitively, if QueryScorer scores based on the number of unique query terms 
found in the document, wouldn't it also need the stream of source text tokens 
for that calculation? I'm wondering (a) why the token stream is not necessary 
in QueryScorer, and (b) what makes it necessary in SpanScorer. I'm having some 
trouble understanding the code and would appreciate any guidance :).

Thanks!

Tavi

> Extend contrib Highlighter to properly support PhraseQuery, SpanQuery,  
> ConstantScoreRangeQuery
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-794
>                 URL: https://issues.apache.org/jira/browse/LUCENE-794
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Other
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.3.2
>
>         Attachments: MultiPhraseQueryExtraction.patch, 
> SpanHighlighter-01-26-2008.patch, SpanHighlighter-01-28-2008.patch, 
> SpanHighlighter-02-10-2008.patch, SpanHighlighter-RemovSysOut.patch, 
> spanhighlighter.patch, spanhighlighter10.patch, spanhighlighter11.patch, 
> spanhighlighter12.patch, spanhighlighter2.patch, spanhighlighter3.patch, 
> spanhighlighter5.patch, spanhighlighter6.patch, spanhighlighter7.patch, 
> spanhighlighter8.patch, spanhighlighter9.patch, 
> spanhighlighter_24_January_2008.patch, spanhighlighter_patch_4.zip
>
>
> This patch adds a new Scorer class (SpanQueryScorer) to the Highlighter 
> package that scores just like QueryScorer, but scores a 0 for Terms that did 
> not cause the Query hit. This gives 'actual' hit highlighting for the range 
> of SpanQuerys, PhraseQuery, and  ConstantScoreRangeQuery. New Query types are 
> easy to add. There is also a new Fragmenter that attempts to fragment without 
> breaking up Spans.
> See http://issues.apache.org/jira/browse/LUCENE-403 for some background.
> There is a dependency on MemoryIndex.
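
The "scores a 0 for Terms that did not cause the Query hit" behaviour in the 
description above can be illustrated with a toy sketch. This is not code from 
the patch and it does not use the Lucene API: the class and method names 
(PhraseScoringSketch, termScores, phraseScores) are made up for illustration, 
and a phrase is modelled simply as an exact run of adjacent tokens.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy illustration only; none of these names come from the patch or from Lucene. */
public class PhraseScoringSketch {

    /** Term-based scoring: a token scores 1 whenever it is one of the query terms. */
    static float[] termScores(List<String> tokens, List<String> phrase) {
        Set<String> terms = new HashSet<>(phrase);
        float[] scores = new float[tokens.size()];
        for (int i = 0; i < tokens.size(); i++) {
            scores[i] = terms.contains(tokens.get(i)) ? 1f : 0f;
        }
        return scores;
    }

    /** Position-aware scoring: a token scores 1 only if it lies inside an exact phrase match. */
    static float[] phraseScores(List<String> tokens, List<String> phrase) {
        float[] scores = new float[tokens.size()];
        if (phrase.isEmpty()) {
            return scores;
        }
        // Slide a window over the text and mark every position covered by a full match.
        for (int start = 0; start + phrase.size() <= tokens.size(); start++) {
            if (tokens.subList(start, start + phrase.size()).equals(phrase)) {
                Arrays.fill(scores, start, start + phrase.size(), 1f);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("quick", "fox", "saw", "the", "quick", "brown", "fox");
        List<String> phrase = Arrays.asList("quick", "brown", "fox");
        // Term-based: the stray "quick" and "fox" at the start are scored too.
        System.out.println(Arrays.toString(termScores(tokens, phrase)));
        // Position-aware: only the last three tokens, which form the phrase, are scored.
        System.out.println(Arrays.toString(phraseScores(tokens, phrase)));
    }
}
```

In this sketch the term-based method could score tokens one at a time as they 
arrive, whereas the position-aware method needs the whole token list before it 
can decide which occurrences belong to a match.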
