getBestFragments with SimpleSpanFragmenter

Vladimir Svetov Thu, 13 Oct 2016 18:28:38 -0700

Hi all,

I have the following two values indexed for the field title_t_en:

       "\"War and Peace\" by \"Leo Tolstoy\""
       "\"Three sisters\" by \"Anton Chekhov\""

I am searching with:  +((title_t_en:war) (title_t_en:sister))

For each matching document, the following code is called with the indexed *value*:

   SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
   QueryScorer queryScorer = new QueryScorer(luceneQuery);
   Highlighter highlighter = new Highlighter(htmlFormatter, queryScorer);
   SimpleSpanFragmenter fragmenter = new SimpleSpanFragmenter(queryScorer, 5);
   String bestFragments = highlighter.getBestFragments(tokenStream, value, 3, FRAGMENT_DELIMITER);

The code produces the following bestFragments for the two values:

       "\"<B>War</B> and Peace\" by \"Leo Tolstoy\""
       "\"Three <B>sisters</B>\" by \"Anton Chekhov\""

Question:
   Why does bestFragments contain more than 5 bytes?
   Should getBestFragments() return 3 fragments, separated by the
   delimiter, where each fragment does not exceed 5 bytes?
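
One possible lead, offered as an assumption rather than a confirmed diagnosis: the code above constructs a SimpleSpanFragmenter but never passes it to the Highlighter. Highlighter.setTextFragmenter(...) is the call that installs a fragmenter. Without it, the Highlighter keeps its default SimpleFragmenter, whose default fragment size is 100, and both values here are shorter than that. The sketch below wires the fragmenter in. The class name FragmenterSketch, the helper highlight(...), the " ... " delimiter, the single-term query, and the use of StandardAnalyzer are illustrative assumptions and not part of the original code. It needs the Lucene highlighter and analyzer jars on the classpath.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.search.highlight.SimpleSpanFragmenter;

// Hypothetical helper class, for illustration only.
public final class FragmenterSketch {

    // Same steps as the original snippet, plus the setTextFragmenter call.
    static String highlight(Query luceneQuery, TokenStream tokenStream,
                            String value, String delimiter)
            throws IOException, InvalidTokenOffsetsException {
        SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
        QueryScorer queryScorer = new QueryScorer(luceneQuery);
        Highlighter highlighter = new Highlighter(htmlFormatter, queryScorer);

        SimpleSpanFragmenter fragmenter = new SimpleSpanFragmenter(queryScorer, 5);
        // Install the fragmenter; otherwise the Highlighter uses its default one.
        highlighter.setTextFragmenter(fragmenter);

        // Ask for at most 3 fragments, joined by the delimiter.
        return highlighter.getBestFragments(tokenStream, value, 3, delimiter);
    }

    public static void main(String[] args) throws Exception {
        String field = "title_t_en";
        String value = "\"War and Peace\" by \"Leo Tolstoy\"";
        // Assumption: a single-term query stands in for the original boolean query.
        Query query = new TermQuery(new Term(field, "war"));

        try (Analyzer analyzer = new StandardAnalyzer()) {
            TokenStream tokenStream = analyzer.tokenStream(field, value);
            System.out.println(highlight(query, tokenStream, value, " ... "));
        }
    }
}
```

Even with the fragmenter installed, the fragment boundary is decided token by token, so a fragment is not guaranteed to come out at exactly 5 characters.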

Regards,
Vlad
