Re: getBestFragments with SimpleSpanFragmenter
If you open the source, you will see that the overload you are calling internally calls this.getBestFragments(tokenStream, text, maxNumFragments), which in turn calls this.getBestTextFragments(tokenStream, text, true, maxNumFragments). The true flag is mergeContiguousFragments, so contiguous fragments are merged automatically.

Regards.
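To illustrate why that flag matters, here is a minimal, Lucene-free sketch. It is not Lucene's implementation: the class FragmentMergeSketch and its methods split, mergeContiguous and render are hypothetical names, and the chunking is a naive fixed-width split rather than what SimpleSpanFragmenter really does. It only shows that once adjacent chunks are merged, the returned text can be much longer than the requested fragment size.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (assumption): fixed-width chunks stand in for highlighter fragments.
final class FragmentMergeSketch {

    /** Splits [0, textLength) into consecutive {start, end} spans of at most fragmentSize chars. */
    static List<int[]> split(int textLength, int fragmentSize) {
        List<int[]> spans = new ArrayList<>();
        for (int start = 0; start < textLength; start += fragmentSize) {
            spans.add(new int[] {start, Math.min(start + fragmentSize, textLength)});
        }
        return spans;
    }

    /** Merges spans that touch (previous end == next start); input must be sorted by start. */
    static List<int[]> mergeContiguous(List<int[]> spans) {
        List<int[]> merged = new ArrayList<>();
        for (int[] span : spans) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && last[1] == span[0]) {
                last[1] = span[1];                           // extend the previous span
            } else {
                merged.add(new int[] {span[0], span[1]});    // copy, so the input is not modified
            }
        }
        return merged;
    }

    /** Cuts the text according to the given spans. */
    static List<String> render(String text, List<int[]> spans) {
        List<String> out = new ArrayList<>();
        for (int[] span : spans) {
            out.add(text.substring(span[0], span[1]));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "War and Peace";
        List<int[]> spans = split(text.length(), 5);
        System.out.println(render(text, spans));                  // [War a, nd Pe, ace]
        System.out.println(render(text, mergeContiguous(spans))); // [War and Peace]
    }
}
```

In this model the unmerged chunks respect the size of 5, while the merged result is the whole 13-character string, which is the same kind of effect the merge flag has on the fragments a fragmenter produces.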
Re: getBestFragments with SimpleSpanFragmenter
Thanks for the advice. However, I asked about the semantics of a different API:

String fragments = highlighter.getBestFragments(tokenStream, value, 3, FRAGMENT_DELIMITER);

when

fragmenter = new SimpleSpanFragmenter(queryScorer, 5);
highlighter.setTextFragmenter(fragmenter);

Regards

On Thu, Oct 13, 2016 at 7:37 PM, lukes wrote:
> Please pass false to mergeContiguousFragments in
> getBestTextFragments(TokenStream tokenStream, String text, boolean
> mergeContiguousFragments, int maxNumFragments) and it should work as
> expected.
>
> Regards.
Re: getBestFragments with SimpleSpanFragmenter
Please pass false to mergeContiguousFragments in getBestTextFragments(TokenStream tokenStream, String text, boolean mergeContiguousFragments, int maxNumFragments) and it should work as expected.

Regards.
getBestFragments with SimpleSpanFragmenter
Hi all,

I have the following two indexed values for the field title_t_en:

"\"War and Peace\" by \"Leo Tolstoy\""
"\"Three sisters\" by \"Anton Chekhov\""

I am searching by:

+((title_t_en:war) (title_t_en:sister))

For the indexed value (value) of every document found, the following code is called:

SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
QueryScorer queryScorer = new QueryScorer(luceneQuery);
Highlighter highlighter = new Highlighter(htmlFormatter, queryScorer);
SimpleSpanFragmenter fragmenter = new SimpleSpanFragmenter(queryScorer, 5);
String bestFragments = highlighter.getBestFragments(tokenStream, value, 3, FRAGMENT_DELIMITER);

The code produces the following bestFragments for the found values:

"\"War and Peace\" by \"Leo Tolstoy\""
"\"Three sisters\" by \"Anton Chekhov\""

Question: Why does bestFragments contain more than 5 bytes? Should getBestFragments() return 3 fragments joined by the delimiter, where each fragment does not exceed 5 bytes?

Regards,
Vlad