[
https://issues.apache.org/jira/browse/LUCENE-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143528#comment-13143528
]
Steven Rowe commented on LUCENE-2587:
-------------------------------------
Hi Terje,
Can you upload your IMSentenceFragmenter.java file again, but this time click
on the radio button next to "Grant license to ASF for inclusion in ASF works
(as per the Apache License §5)"?
Thanks,
Steve
> Highlighter picks wrong offset for fragment boundaries
> ------------------------------------------------------
>
> Key: LUCENE-2587
> URL: https://issues.apache.org/jira/browse/LUCENE-2587
> Project: Lucene - Java
> Issue Type: Bug
> Components: modules/highlighter
> Affects Versions: 3.0.2
> Environment: Java 6 + Lucene 3.0.2
> Reporter: Terje Eggestad
> Priority: Trivial
> Labels: newdev
> Attachments: IMSentenceFragmenter.java, LUCENE-2587.patch
>
>
> I have written a new Fragmenter because we need hitline fragments to fall
> on sentence boundaries and not cross paragraphs.
> When using it with org.apache.lucene.search.highlight.Highlighter, I get
> hitlines that start with ". ", "? ", "! "...
> Consider the text "A b c d e. F g h i j! K l m n o. "
> which becomes the token stream: (A) (b) (c) (d) (e) (F) (g) (h) (i) (j) (K)
> (l) (m) (n) (o)
> If the fragmenter returns isNewFragment() = true on F and K, and Highlighter
> picks the middle fragment (let's say we search on "g"), the hitline becomes:
> ". F <B>g</B> h i j"
> The reason, it seems, is that the fragment boundaries are found by
> taking the endOffset of the last token in a fragment,
> not the startOffset of the first token in the next one.
> TJ
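
The offset arithmetic described in the report can be reproduced without
Lucene. The sketch below is only a model of the symptom, not the Highlighter's
actual code: the Tok class, the tokenize() helper and both fragment methods
are hypothetical names made up for illustration, and the tokenizer simply
treats each run of letters as one token.

```java
import java.util.ArrayList;
import java.util.List;

public class FragmentOffsetDemo {
    /** Hypothetical stand-in for a token with character offsets (not a Lucene class). */
    static final class Tok {
        final int start; // offset of the first character of the token
        final int end;   // offset one past the last character of the token
        Tok(int start, int end) { this.start = start; this.end = end; }
    }

    /** Assumed tokenizer: every run of letters is one token, punctuation is dropped. */
    static List<Tok> tokenize(String text) {
        List<Tok> toks = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (Character.isLetter(text.charAt(i))) {
                int s = i;
                while (i < text.length() && Character.isLetter(text.charAt(i))) i++;
                toks.add(new Tok(s, i));
            } else {
                i++;
            }
        }
        return toks;
    }

    /** Fragment start taken from the endOffset of the last token of the previous fragment. */
    static String fragmentFromPreviousEnd(String text, List<Tok> toks, int first, int last) {
        int start = first == 0 ? 0 : toks.get(first - 1).end;
        int end = toks.get(last).end;
        return text.substring(start, end);
    }

    /** Fragment start taken from the startOffset of the fragment's own first token. */
    static String fragmentFromOwnStart(String text, List<Tok> toks, int first, int last) {
        int start = toks.get(first).start;
        int end = last + 1 < toks.size() ? toks.get(last + 1).start : text.length();
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "A b c d e. F g h i j! K l m n o. ";
        List<Tok> toks = tokenize(text);
        // Tokens 5..9 are F g h i j, i.e. the middle fragment from the report.
        System.out.println("[" + fragmentFromPreviousEnd(text, toks, 5, 9) + "]");
        System.out.println("[" + fragmentFromOwnStart(text, toks, 5, 9) + "]");
    }
}
```

With the text from the report, the first method yields a fragment that begins
with ". " (the reported symptom), while the second begins at "F" and keeps the
sentence's own closing punctuation.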