I am using Lucene 4.0 and trying to use its highlighting feature. I am not getting the desired result due to some mistake that I am not able to identify. My source code looks like
String sourceText = "liver disease kidney transplant"; String termString ="\"transplant\""; SimpleAnalyzer simpleAnalyzer = new SimpleAnalyzer(Version.LUCENE_40); Query query = new QueryParser(Version.LUCENE_40,"contents", simpleAnalyzer).parse(termString); TokenStream tokenStream = simpleAnalyzer.tokenStream("contents", new StringReader(sourceText)); QueryScorer scorer = new QueryScorer(query,"contents"); scorer.setExpandMultiTermQuery(true); Fragmenter fragmenter = new SimpleSpanFragmenter(scorer); SimpleHTMLFormatter simpleHTMLFormatter = new SimpleHTMLFormatter( "*", "*") ; Highlighter highlighter = new Highlighter(simpleHTMLFormatter, scorer ); highlighter.setTextFragmenter(fragmenter); highlighter.setMaxDocCharsToAnalyze(10000); String resultString = highlighter.getBestFragments(tokenStream,sourceText,1000, "..."); System.out.println("Source Text1 = "+sourceText); System.out.println("Result Text1 = "+resultString); sourceText = "for liver transplantation."; tokenStream = simpleAnalyzer.tokenStream("contents", new StringReader(sourceText)); resultString = highlighter.getBestFragments(tokenStream,sourceText,1000, "..."); System.out.println("Source Text2 = "+sourceText); System.out.println("Result Text2 = "+resultString); For the first text, I am getting the result properly but not for the second one Source Text1 = liver disease kidney transplant Result Text1 = liver disease kidney *transplant* Source Text2 = for liver transplantation. Result Text2 = I am expecting the result for second one like for liver *transplant*ation or for liver *transplantation* What is wrong in my code? -- View this message in context: http://lucene.472066.n3.nabble.com/highlighting-tp542569p3178841.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org