[ 
https://issues.apache.org/jira/browse/LUCENE-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053245#comment-13053245
 ] 

Mike Sokolov commented on LUCENE-3080:
--------------------------------------

There could be a good reason, though, for using byte offsets in highlighting. I 
have in mind an optimization that would pull in text from an external file or 
other source, enabling highlighting without stored fields.  For best 
performance the snippet should be read from the external source using random 
access to storage, which requires byte offsets.  I think this could be a big 
win for large field values.
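
To make the random-access idea concrete, here is a minimal sketch in plain 
Java. It is not Lucene code: the class and method names are hypothetical, and 
it assumes the external source is a local UTF-8 file and that both byte 
offsets fall on character boundaries.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class ByteOffsetSnippet {

    /**
     * Reads the bytes in [startByte, endByte) from the file at the given path
     * and decodes them as UTF-8. Only the requested range is read, so the cost
     * does not depend on the total size of the file.
     *
     * Assumption: both offsets fall on UTF-8 character boundaries; otherwise
     * the decoded snippet will contain replacement characters at the edges.
     */
    static String readSnippet(String path, long startByte, long endByte) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[(int) (endByte - startByte)];
            raf.seek(startByte);   // jump straight to the snippet
            raf.readFully(buf);    // read exactly the requested range
            return new String(buf, StandardCharsets.UTF_8);
        }
    }
}
```

With char offsets the same lookup would have to decode the file from the 
beginning to find where the snippet starts, which is the cost the byte 
offsets are meant to avoid.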

This could only be done if the highlighter doesn't need to perform any text 
manipulation itself, so it's not really appropriate for Highlighter, as Robert 
said, but it might be possible to implement for FVH.  I'm looking into this, 
but before I get in too deep, can anyone comment on the feasibility of using 
byte offsets?  I'm unclear on what offsets get used for other than 
highlighting.  Would it cause problems to have a CharFilter that returns 
"corrected" offsets, such that char positions in the analyzed text are 
translated into byte positions in the source text?
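
As a rough illustration of the translation itself (independent of any 
CharFilter API), the sketch below builds a table from char positions to UTF-8 
byte positions. The class and method names are hypothetical, and it assumes 
the source text is well-formed UTF-16 held in memory and stored externally as 
UTF-8. It does not address whether other consumers of offsets would tolerate 
byte positions.

```java
public class CharToByteOffsets {

    /**
     * Returns a table where table[i] is the UTF-8 byte offset of the char at
     * index i, and table[text.length()] is the total encoded length.
     * Both halves of a surrogate pair map to the start of their code point.
     *
     * Assumption: the text contains no unpaired surrogates.
     */
    static int[] utf8ByteOffsets(String text) {
        int[] table = new int[text.length() + 1];
        int bytes = 0;
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int width = Character.charCount(cp); // 1 or 2 UTF-16 code units
            for (int k = 0; k < width; k++) {
                table[i + k] = bytes;
            }
            bytes += utf8Length(cp);
            i += width;
        }
        table[text.length()] = bytes;
        return table;
    }

    /** Number of bytes UTF-8 uses to encode the given code point. */
    static int utf8Length(int cp) {
        if (cp < 0x80) return 1;
        if (cp < 0x800) return 2;
        if (cp < 0x10000) return 3;
        return 4;
    }
}
```

A token reported at char offsets [start, end) would then correspond to bytes 
[table[start], table[end]) in the source.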

> cutover highlighter to BytesRef
> -------------------------------
>
>                 Key: LUCENE-3080
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3080
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/highlighter
>            Reporter: Michael McCandless
>
> Highlighter still uses char[] terms (consumes tokens from the analyzer as 
> char[] not as BytesRef), which is causing problems for merging SOLR-2497 to 
> trunk.
