[ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470327 ]
Mark Harwood commented on LUCENE-794:
-------------------------------------

>>Sorry about all that Mark H

No need for any apologies - all help is gratefully received! I don't mean to criticise your efforts or seem picky - I just wanted to record my findings somewhere useful in case we consider working a solution up from this "test code" rather than tweaking the current highlighter - I'm still uncertain about the best approach. I also thought it might be useful to point out the potential issues to you if you were already relying on this code somewhere.

>>I need to read the TokenStream at least twice
>>I used the horribly hackey but quick-for-me method of adding a method to
>>MemoryIndex that accepts a List of Tokens. Any ideas?

I'm not sure about modifying MemoryIndex. It should be easy enough to create a subclass of TokenStream ("CachedTokenStream" perhaps?) which takes a real TokenStream in its constructor and, on first use, delegates all "next" calls to it while also recording the returned tokens in a List. It can then be "rewound" and re-used to run through the same set of tokens held in the list from the first run.

>>if position increment equals 0 skip printing out the token...but I am not
>>totally confident it is perfect yet.

I think it's possible some of the more Byzantine analyzers may have a position increment >0 but overlap in terms of their byte offsets. I'd need to check the old JUnit tests to be sure on this. Welcome to my hell!

Thanks again for your help.
Mark H

> Beginnings of a span based highlighter
> --------------------------------------
>
> Key: LUCENE-794
> URL: https://issues.apache.org/jira/browse/LUCENE-794
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Other
> Reporter: Mark Miller
> Priority: Minor
> Attachments: DefaultEncoder.java, Encoder.java, Formatter.java,
> Highlighter.java, Highlighter.java, HighlighterTest.java,
> HighlighterTest.java, MemoryIndex.java, QuerySpansExtractor.java,
> SimpleFormatter.java
>
>
> This is some test code to start the work of adding a span based highlighting
> approach to the existing highlighter in contrib. See
> http://issues.apache.org/jira/browse/LUCENE-403 for some background.
> There is a dependency on MemoryIndex.
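
For reference, the "CachedTokenStream" idea from the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the issue's attachments: Token and TokenStream here are simplified stand-ins rather than Lucene's real classes, next() is assumed to return null at end of stream, and the names rewind(), fromWords() and drain() are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for a Lucene token: term text plus start/end offsets.
class Token {
    final String text;
    final int startOffset;
    final int endOffset;

    Token(String text, int startOffset, int endOffset) {
        this.text = text;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

// Simplified stand-in for TokenStream; assumes next() returns null when exhausted.
abstract class TokenStream {
    abstract Token next() throws IOException;

    void close() throws IOException {}
}

// Delegates to the wrapped stream on first use, recording each token,
// and replays the recorded tokens after rewind().
class CachedTokenStream extends TokenStream {
    private final TokenStream input;
    private final List<Token> cache = new ArrayList<>();
    private boolean exhausted = false; // true once the wrapped stream returned null
    private int pos = 0;               // index of the next token to hand out

    CachedTokenStream(TokenStream input) {
        this.input = input;
    }

    @Override
    Token next() throws IOException {
        if (pos < cache.size()) {
            return cache.get(pos++);   // replay a token seen on an earlier pass
        }
        if (exhausted) {
            return null;               // nothing left in the cache or the source
        }
        Token t = input.next();        // first pass: pull from the real stream
        if (t == null) {
            exhausted = true;
            return null;
        }
        cache.add(t);
        pos++;
        return t;
    }

    // Hypothetical method name: restart iteration from the first token.
    void rewind() {
        pos = 0;
    }

    @Override
    void close() throws IOException {
        input.close();
    }
}

class CachedTokenStreamDemo {
    // Fixed-list source standing in for an analyzer's output.
    static TokenStream fromWords(String... words) {
        final Iterator<String> it = Arrays.asList(words).iterator();
        return new TokenStream() {
            private int offset = 0;

            @Override
            Token next() {
                if (!it.hasNext()) {
                    return null;
                }
                String w = it.next();
                Token t = new Token(w, offset, offset + w.length());
                offset += w.length() + 1; // assume one separator character between words
                return t;
            }
        };
    }

    // Reads the stream to its end and returns the token texts in order.
    static List<String> drain(TokenStream stream) {
        List<String> out = new ArrayList<>();
        try {
            for (Token t = stream.next(); t != null; t = stream.next()) {
                out.add(t.text);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        CachedTokenStream stream =
            new CachedTokenStream(fromWords("span", "based", "highlighter"));
        System.out.println(drain(stream)); // first pass reads the wrapped stream
        stream.rewind();
        System.out.println(drain(stream)); // second pass replays the cached tokens
    }
}
```

In this sketch a rewind() before the first pass has finished replays what was cached so far and then continues pulling from the wrapped stream, so the underlying tokens are only ever produced once. Whether that behaviour suits the highlighter, and how it interacts with MemoryIndex, is left open here.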