[ http://issues.apache.org/jira/browse/LUCENE-627?page=comments#action_12420920 ]
Mark Harwood commented on LUCENE-627:
-------------------------------------

The problem appears to be that the "pod" token advances the start position to 1, while the next token, "ipod", takes a step back (to 0).

I've found that if you just arrange the tokens to be emitted in start-position order, all is fine - see below:

    public void testOverlapAnalyzer2() throws Exception {
        String s = "iPod foo";
        // the token stream for the string above:
        TokenStream ts = new TokenStream() {
            Iterator iter;
            {
                List lst = new ArrayList();
                Token t;
                // moved this token to start
                t = new Token("ipod",0,4);
                lst.add(t);
                t = new Token("i",0,1);
                lst.add(t);
                t = new Token("pod",1,4);
                t.setPositionIncrement(0);
                lst.add(t);
                t = new Token("foo",5,8);
                lst.add(t);
                iter = lst.iterator();
            }
            public Token next() throws IOException {
                return iter.hasNext() ? (Token)iter.next() : null;
            }
        };

        String srchkey = "foo";
        QueryParser parser=new QueryParser("text",new WhitespaceAnalyzer());
        Query query = parser.parse(srchkey);
        Highlighter highlighter = new Highlighter(new QueryScorer(query));

        // Get 3 best fragments and separate with a "..."
        String result = highlighter.getBestFragments(ts, s, 3, "...");
        // had to upper case the P in the test here
        String expectedResult="iPod <B>foo</B>";
        assertEquals(expectedResult,result);
    }

> highlighter problems with overlapping tokens
> --------------------------------------------
>
>          Key: LUCENE-627
>          URL: http://issues.apache.org/jira/browse/LUCENE-627
>      Project: Lucene - Java
>         Type: Bug
>   Components: Other
>     Versions: 2.0.1
>     Reporter: Yonik Seeley
>
> The lucene highlighter has problems when tokens that overlap are generated.
> For example, if analysis of iPod generates the tokens "i", "pod", "ipod"
> (with pod and ipod in the same position),
> then the highlighter will output this as iipod, regardless of if any of those
> tokens are highlighted.
> Discovered via http://issues.apache.org/jira/browse/SOLR-24
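
The reordering described in the comment can also be done mechanically instead of by hand. The following is only a minimal sketch under stated assumptions: `Tok`, `inStartOffsetOrder`, and `stepsBack` are hypothetical names invented for illustration, not part of the Lucene API, and the sketch ignores position increments, which a real TokenStream wrapper would also have to recompute after reordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TokenOrderSketch {

    /** Hypothetical stand-in for a token: text plus start/end character offsets. */
    static final class Tok {
        final String text;
        final int start;
        final int end;

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + "[" + start + "," + end + ")";
        }
    }

    /** Returns a copy sorted by ascending start offset; on ties the longer token comes first. */
    static List<Tok> inStartOffsetOrder(List<Tok> tokens) {
        List<Tok> sorted = new ArrayList<>(tokens);
        sorted.sort(Comparator.comparingInt((Tok t) -> t.start)
                .thenComparing(Comparator.comparingInt((Tok t) -> t.end).reversed()));
        return sorted;
    }

    /** True if some token starts before the token emitted just ahead of it. */
    static boolean stepsBack(List<Tok> tokens) {
        for (int i = 1; i < tokens.size(); i++) {
            if (tokens.get(i).start < tokens.get(i - 1).start) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Emission order from the issue report: "ipod" follows "pod" and steps back to 0.
        List<Tok> emitted = new ArrayList<>();
        emitted.add(new Tok("i", 0, 1));
        emitted.add(new Tok("pod", 1, 4));
        emitted.add(new Tok("ipod", 0, 4));
        emitted.add(new Tok("foo", 5, 8));

        System.out.println("steps back before sorting: " + stepsBack(emitted));
        List<Tok> sorted = inStartOffsetOrder(emitted);
        System.out.println("sorted: " + sorted);
        System.out.println("steps back after sorting: " + stepsBack(sorted));
    }
}
```

Breaking ties by putting the longer token first reproduces the hand-arranged order in the test ("ipod", "i", "pod", "foo"); whether the highlighter needs that particular tie-break, or only non-decreasing start offsets, is not established by the comment.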