[ http://issues.apache.org/jira/browse/LUCENE-627?page=comments#action_12438333 ]

Kerang Lv commented on LUCENE-627:
----------------------------------

Hi Mark, sorry for being away so long!

Here is the test. It fails again with the latest version (Revision 450719):

Expected: A<B>BC</B>DE<B>FG</B>HIJ
Actual:   A<B>BCDEFG</B>HIJ

    public void testOverlapAnalyzer4() throws Exception {
        String s = "ABCDEFGHIJ";
        // the token stream for the string above:
        TokenStream ts = new TokenStream() {
            Iterator iter;
            {
                List lst = new ArrayList();
                Token t;
                t = new Token("AB", 0, 2);
                lst.add(t);
                t = new Token("BC", 1, 3);
                lst.add(t);
                t = new Token("CD", 2, 4);
                lst.add(t);
                t = new Token("DE", 3, 5);
                lst.add(t);
                t = new Token("EF", 4, 6);
                lst.add(t);
                t = new Token("FG", 5, 7);
                lst.add(t);
                t = new Token("GH", 6, 8);
                lst.add(t);
                t = new Token("HI", 7, 9);
                lst.add(t);
                t = new Token("IJ", 8, 10);
                lst.add(t);
                iter = lst.iterator();
            }

            public Token next() throws IOException {
                return iter.hasNext() ? (Token) iter.next() : null;
            }
        };

        String srchkey = "BC FG";

        QueryParser parser = new QueryParser("text", new WhitespaceAnalyzer());
        Query query = parser.parse(srchkey);

        Highlighter highlighter = new Highlighter(new QueryScorer(query));

        // Get 3 best fragments and separate with a "..."
        String result = highlighter.getBestFragments(ts, s, 3, "...");
        String expectedResult = "A<B>BC</B>DE<B>FG</B>HIJ";
        assertEquals(expectedResult, result);
    }

> highlighter problems with overlapping tokens
> --------------------------------------------
>
> Key: LUCENE-627
> URL: http://issues.apache.org/jira/browse/LUCENE-627
> Project: Lucene - Java
> Issue Type: Bug
> Components: Other
> Affects Versions: 2.0.1
> Reporter: Yonik Seeley
> Fix For: 2.0.1
>
> Attachments: highlight_overlap.diff, Highlighter.java.diff
>
>
> The lucene highlighter has problems when tokens that overlap are generated.
> For example, if analysis of iPod generates the tokens "i", "pod", "ipod"
> (with pod and ipod in the same position),
> then the highlighter will output this as iipod, regardless of if any of those
> tokens are highlighted.
> Discovered via http://issues.apache.org/jira/browse/SOLR-24
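
For illustration only, the expected output of the test can be reproduced without Lucene by working from the token offsets rather than from token order. The sketch below is not the Highlighter's algorithm and not the attached patch. The class `OverlapHighlightSketch`, the `Tok` type and the `highlight` method are hypothetical names made up for this example. It marks every character covered by a token whose term is in the query, then wraps each contiguous marked run in <B>...</B>, so overlapping tokens never cause text to be emitted twice.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class OverlapHighlightSketch {

    /** Hypothetical stand-in for a token: term text plus character offsets. */
    static final class Tok {
        final String term;
        final int start; // inclusive start offset
        final int end;   // exclusive end offset

        Tok(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Wraps every contiguous run of characters covered by a matching token
     * in <B>...</B>. Each character of the text is emitted exactly once,
     * regardless of how many tokens overlap it.
     */
    static String highlight(String text, Tok[] tokens, Set<String> queryTerms) {
        boolean[] hit = new boolean[text.length()];
        for (Tok t : tokens) {
            if (queryTerms.contains(t.term)) {
                int from = Math.max(0, t.start);
                int to = Math.min(text.length(), t.end);
                for (int i = from; i < to; i++) {
                    hit[i] = true;
                }
            }
        }

        StringBuilder out = new StringBuilder();
        boolean open = false;
        for (int i = 0; i < text.length(); i++) {
            if (hit[i] && !open) {
                out.append("<B>");
                open = true;
            } else if (!hit[i] && open) {
                out.append("</B>");
                open = false;
            }
            out.append(text.charAt(i));
        }
        if (open) {
            out.append("</B>");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same overlapping bigram tokens as in the test: AB(0,2), BC(1,3), ...
        String s = "ABCDEFGHIJ";
        Tok[] tokens = new Tok[s.length() - 1];
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = new Tok(s.substring(i, i + 2), i, i + 2);
        }
        Set<String> terms = new HashSet<>(Arrays.asList("BC", "FG"));
        System.out.println(highlight(s, tokens, terms)); // A<B>BC</B>DE<B>FG</B>HIJ
    }
}
```

One consequence of this approach is that adjacent or overlapping matching tokens are merged into a single highlighted run. Whether that is the desired behaviour for the real Highlighter is a separate design question.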