[ 
https://issues.apache.org/jira/browse/LUCENE-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597151#comment-13597151
 ] 

Robert Muir commented on LUCENE-4816:
-------------------------------------

I don't think it should be returning any HTML tags here. This highlighter
breaks the document into sentences. Each sentence is scored and the top-N
matching sentences are returned.
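
To make the sentence-based approach concrete, here is a minimal sketch of
"split into sentences, score each one, keep the top N". It is not the
PostingsHighlighter code: the class and method names are hypothetical, and
the score is assumed to be a plain count of query-term occurrences rather
than the highlighter's real scoring.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of Lucene.
final class SentenceScoringSketch {

    /** A sentence span [start, end) in the original text plus its score. */
    static final class Span {
        final int start;
        final int end;
        final int score;

        Span(int start, int end, int score) {
            this.start = start;
            this.end = end;
            this.score = score;
        }
    }

    /** Counts case-insensitive occurrences of term inside text[start, end). */
    static int countMatches(String text, int start, int end, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        if (needle.isEmpty()) {
            return 0;
        }
        String haystack = text.substring(start, end).toLowerCase(Locale.ROOT);
        int count = 0;
        int from = 0;
        while ((from = haystack.indexOf(needle, from)) >= 0) {
            count++;
            from += needle.length();
        }
        return count;
    }

    /** Returns up to n matching sentences, best-scored ones kept, in document order. */
    static List<Span> topSentences(String text, String term, int n) {
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.ROOT);
        sentences.setText(text);

        List<Span> matching = new ArrayList<>();
        int start = sentences.first();
        for (int end = sentences.next(); end != BreakIterator.DONE;
                start = end, end = sentences.next()) {
            int score = countMatches(text, start, end, term);
            if (score > 0) {
                matching.add(new Span(start, end, score)); // sentences without a match are dropped
            }
        }

        // Highest score first; the sort is stable, so ties keep document order.
        matching.sort(Comparator.comparingInt((Span s) -> s.score).reversed());
        List<Span> top = new ArrayList<>(matching.subList(0, Math.min(n, matching.size())));

        // Present the selected sentences in the order they appear in the document.
        top.sort(Comparator.comparingInt((Span s) -> s.start));
        return top;
    }
}
```

Note that only the selected spans are returned. Text outside them, such as
trailing markup at the end of a file, is never part of the result.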

It doesn't know about or deal with HTML tags, nor does it return documents.

The patch here would return the whole rest of the document after the
highlighted portion. I don't think we should do this.
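
As a rough illustration of that effect, the sketch below formats a list of
passages with and without the proposed tail append. It is a simplified
stand-in, not the actual PassageFormatter source: the Passage class, the
appendTail flag, and the match-offset layout are assumptions made for the
example.

```java
// Hypothetical illustration only; not part of Lucene.
final class PassageFormatSketch {

    /** A passage span [start, end) plus {start, end} match offsets inside it. */
    static final class Passage {
        final int start;
        final int end;
        final int[][] matches;

        Passage(int start, int end, int[][] matches) {
            this.start = start;
            this.end = end;
            this.matches = matches;
        }
    }

    /**
     * Emits each passage with its matches wrapped in <b> tags.
     * When appendTail is true, it also appends everything after the last
     * passage, which is what the proposed patch does.
     */
    static String format(String content, Passage[] passages, boolean appendTail) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        for (Passage p : passages) {
            // Mark a gap between non-adjacent passages, but not before the first one.
            if (p.start > pos && pos > 0) {
                sb.append("... ");
            }
            pos = p.start;
            for (int[] m : p.matches) {
                if (m[0] > pos) {
                    sb.append(content, pos, m[0]); // text before the match
                }
                if (m[1] > pos) {
                    sb.append("<b>").append(content, Math.max(pos, m[0]), m[1]).append("</b>");
                    pos = m[1];
                }
            }
            sb.append(content, pos, Math.max(pos, p.end)); // rest of the passage
            pos = Math.max(pos, p.end);
        }
        if (appendTail && pos < content.length()) {
            sb.append(content.substring(pos)); // the remainder of the document
        }
        return sb.toString();
    }
}
```

With a single passage covering only the first sentence, appendTail = false
yields just that sentence, while appendTail = true yields the sentence
followed by every remaining character of the document, however long it is.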


                
> PassageFormatter in PostingsHighlighter trunk the message returned
> ------------------------------------------------------------------
>
>                 Key: LUCENE-4816
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4816
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>    Affects Versions: 4.1
>         Environment: NA
>            Reporter: Sebastien Dionne
>         Attachments: package.html, PassageFormatter.java, 
> PassageFormatter-PATCH.java
>
>
> When I try to highlight the word zero [0] in the file
> org\apache\lucene\search\postingshighlight\package.html,
> the last 2 lines weren't returned. There are 4 passages:
> 2-65
> 277-434
> 434-735
> 735-968
> but the length of the file is 984.
> PassageFormatter.format(...) should return all the original content
> with the words highlighted.
> PATCH
> Add this at the end of the method:
> // at line 91, add this
> if(pos<content.length()){
>     sb.append(content.substring(pos));
> }
>     
> return sb.toString();
