Re: getBestFragments with SimpleSpanFragmenter

2016-10-13 Thread lukes
If you open the source of Highlighter, you will see that
getBestFragments(tokenStream, text, maxNumFragments, separator) internally
calls

this.getBestFragments(tokenStream, text, maxNumFragments), which in turn
calls

this.getBestTextFragments(tokenStream, text, true, maxNumFragments) (*with
flag true*), which merges contiguous fragments automatically.
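
The call chain above can be sketched with a toy stand-in. This is not Lucene's source: the class name OverloadChainSketch and the mergeFlagSeen field are made up for illustration, the TokenStream parameter is dropped, and fragmenting and scoring are replaced by a placeholder. It only shows that the convenience overloads never let the caller choose the merge flag.

```java
import java.util.Arrays;
import java.util.List;

// Toy stand-in, NOT Lucene's Highlighter: only the overload chain described
// above is modelled; real fragmenting and scoring are left out.
class OverloadChainSketch {
    // Remembers the flag seen by the innermost method, purely for demonstration.
    static boolean mergeFlagSeen;

    // Separator overload: joins the fragments returned by the next overload.
    static String getBestFragments(String text, int maxNumFragments, String separator) {
        return String.join(separator, getBestFragments(text, maxNumFragments));
    }

    // Middle overload: hard-codes mergeContiguousFragments = true.
    static List<String> getBestFragments(String text, int maxNumFragments) {
        return getBestTextFragments(text, true, maxNumFragments);
    }

    // Innermost method: the only place where the caller can choose the flag.
    static List<String> getBestTextFragments(String text,
                                             boolean mergeContiguousFragments,
                                             int maxNumFragments) {
        mergeFlagSeen = mergeContiguousFragments;
        // Placeholder result: the whole text as a single fragment.
        return Arrays.asList(text);
    }

    public static void main(String[] args) {
        getBestFragments("War and Peace", 3, "...");
        System.out.println("merge flag seen: " + mergeFlagSeen);
    }
}
```

In this sketch, a caller who wants unmerged fragments has to call the innermost method directly and pass false.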

Regards.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/getBestFragments-with-SimpleSpanFragmenter-tp4301065p4301069.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: getBestFragments with SimpleSpanFragmenter

2016-10-13 Thread Vladimir Svetov
Thanks for the advice.
However, I asked about a different API's semantics:

String fragments = highlighter.getBestFragments(tokenStream, value,
3, FRAGMENT_DELIMITER);  // maxNumFragments = 3

when

  fragmenter = new SimpleSpanFragmenter(queryScorer, 5);  // fragmentSize = 5
  highlighter.setTextFragmenter(fragmenter);

Regards


On Thu, Oct 13, 2016 at 7:37 PM, lukes wrote:

> Please pass false to mergeContiguousFragments in
> getBestTextFragments(TokenStream tokenStream, String text, boolean
> mergeContiguousFragments, int maxNumFragments) and it should work as
> expected.
>
> Regards.


Re: getBestFragments with SimpleSpanFragmenter

2016-10-13 Thread lukes
Please pass false to mergeContiguousFragments in
getBestTextFragments(TokenStream tokenStream, String text, boolean
mergeContiguousFragments, int maxNumFragments) and it should work as
expected.
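
A minimal sketch of what the merge flag changes, under loud assumptions: this is a toy model, not Lucene code. The class MergeFlagSketch and its bestFragments method are hypothetical, the text is cut into plain character chunks (Lucene's fragmenters work on tokens), and scoring is omitted, so the first maxNumFragments chunks stand in for the "best" ones.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT Lucene: shows how merging contiguous fragments can turn
// several small fragments back into one long one.
class MergeFlagSketch {
    static List<String> bestFragments(String text, int fragmentSize,
                                      int maxNumFragments, boolean merge) {
        // 1. Cut the text into consecutive [start, end) ranges of at most
        //    fragmentSize characters, keeping at most maxNumFragments of them.
        List<int[]> ranges = new ArrayList<>();
        for (int start = 0;
             start < text.length() && ranges.size() < maxNumFragments;
             start += fragmentSize) {
            ranges.add(new int[] {start, Math.min(start + fragmentSize, text.length())});
        }

        // 2. If requested, merge every range that starts exactly where the
        //    previous one ends.
        if (merge) {
            List<int[]> merged = new ArrayList<>();
            for (int[] r : ranges) {
                int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
                if (last != null && last[1] == r[0]) {
                    last[1] = r[1];          // extend the previous range
                } else {
                    merged.add(new int[] {r[0], r[1]});
                }
            }
            ranges = merged;
        }

        // 3. Turn the ranges back into strings.
        List<String> out = new ArrayList<>();
        for (int[] r : ranges) {
            out.add(text.substring(r[0], r[1]));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "War and Peace";
        // merge = true: the three 5-character chunks collapse into one fragment.
        System.out.println(String.join(" ... ", bestFragments(text, 5, 3, true)));
        // merge = false: the chunks stay separate and the delimiter shows up.
        System.out.println(String.join(" ... ", bestFragments(text, 5, 3, false)));
    }
}
```

In this model the merged call prints the text unchanged, while the unmerged call prints "War a ... nd Pe ... ace". With the real API, the unmerged result comes back as an array of TextFragment objects, so the caller has to join them with the delimiter.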

Regards.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/getBestFragments-with-SimpleSpanFragmenter-tp4301065p4301066.html



getBestFragments with SimpleSpanFragmenter

2016-10-13 Thread Vladimir Svetov
Hi all,

I have the following two indexed values for the field title_t_en:

   "\"War and Peace\" by \"Leo Tolstoy\""
   "\"Three sisters\" by \"Anton Chekhov\""

I am searching with:  +((title_t_en:war) (title_t_en:sister))

For each matching document, the following code is called with the field's
indexed value (value below):

   SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
   QueryScorer queryScorer = new QueryScorer(luceneQuery);
   Highlighter highlighter = new Highlighter(htmlFormatter, queryScorer);
   // fragmentSize = 5
   SimpleSpanFragmenter fragmenter = new SimpleSpanFragmenter(queryScorer, 5);
   highlighter.setTextFragmenter(fragmenter);
   // maxNumFragments = 3, fragments joined by FRAGMENT_DELIMITER
   String bestFragments = highlighter.getBestFragments(tokenStream, value,
       3, FRAGMENT_DELIMITER);

The code produces the following bestFragments for the found values:

   "\"War and Peace\" by \"Leo Tolstoy\""
   "\"Three sisters\" by \"Anton Chekhov\""

Question:
Why does bestFragments contain more than 5 bytes?
Should getBestFragments() return 3 fragments separated by the delimiter,
where each fragment does not exceed 5 bytes?

Regards,
Vlad