thihy created LUCENE-6200:
-----------------------------
Summary: Highlight with cjkanalyzer went wrong
Key: LUCENE-6200
URL: https://issues.apache.org/jira/browse/LUCENE-6200
Project: Lucene - Core
Issue Type: Bug
Components: modules/highlighter
Affects Versions: 4.10.2
Reporter: thihy
I have written a test case for this. I expected "<B>游戏</B>是<B>游戏</B>", but got
"<B>游戏是游戏</B>":
{code:java}
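import java.io.IOException;

import org.apache.lucene.analysis.cjk.CJKAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.Scorer;

// The class name is arbitrary; it only wraps the reporter's main method.
public class CjkHighlightTest {
    public static void main(String[] args) throws IOException, InvalidTokenOffsetsException {
        String text = "游戏是游戏";
        String query = "游戏";
        CJKAnalyzer analyzer = new CJKAnalyzer();
        // Score fragments against a single-term query on "field".
        Scorer fragmentScorer = new QueryScorer(new TermQuery(new Term("field", query)));
        Highlighter highlighter = new Highlighter(fragmentScorer);
        String fragment = highlighter.getBestFragment(analyzer.tokenStream("field", text), text);
        analyzer.close();
        System.out.println(fragment); // prints: <B>游戏是游戏</B>
    }
}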
{code}
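One possible reading of this result, offered as an assumption and not a confirmed diagnosis: CJKAnalyzer emits overlapping two-character tokens for Han text, so if the highlighter treats tokens with overlapping offsets as a single group, one matching token could cause the whole run to be wrapped in one <B> tag. The sketch below is a plain-Java model of that idea. It does not call Lucene, and the names BigramOverlapSketch, bigrams, and mergeOverlapping are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model, NOT Lucene code: it only mimics overlapping bigram offsets.
class BigramOverlapSketch {

    // Returns the [start, end) offsets of every two-character window in the text.
    static List<int[]> bigrams(String text) {
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i + 2 <= text.length(); i++) {
            spans.add(new int[] {i, i + 2});
        }
        return spans;
    }

    // Merges each span into the current group when it starts before that group ends.
    // Assumption: this stands in for how overlapping tokens might be grouped.
    static List<int[]> mergeOverlapping(List<int[]> spans) {
        List<int[]> groups = new ArrayList<>();
        for (int[] span : spans) {
            int[] last = groups.isEmpty() ? null : groups.get(groups.size() - 1);
            if (last != null && span[0] < last[1]) {
                last[1] = Math.max(last[1], span[1]);
            } else {
                groups.add(new int[] {span[0], span[1]});
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String text = "游戏是游戏";
        for (int[] s : bigrams(text)) {
            System.out.println(text.substring(s[0], s[1]) + " [" + s[0] + "," + s[1] + ")");
        }
        for (int[] g : mergeOverlapping(bigrams(text))) {
            System.out.println("group [" + g[0] + "," + g[1] + ")");
        }
    }
}
```

In this model the five-character text yields four windows, at offsets [0,2), [1,3), [2,4), and [3,5), and they collapse into a single group [0,5), which matches the span that was highlighted in the report.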