[ https://issues.apache.org/jira/browse/LUCENE-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936810#comment-15936810 ]
David Smiley commented on LUCENE-7729:
--------------------------------------
Nice work, Amrit. Just curious: is this feature driven by a search-app
requirement or ...
One issue I see with the implementation: if it starts to match the separator
but the match ultimately fails, the position is not reset to the start of the
attempted match plus 1. Hypothetically, this means a string separator of
{{ab}} would not be found in the input {{aab}}. I didn't try this with your
patch, but do you concur?
I'm also a little concerned about possible overhead in this mode. Maybe
subclassing to override advanceForward and advanceBackward would make sense.
If the string handling were done in an inner class, then a factory method
could be used instead of a constructor. I think CustomSeparatorBreakIterator
should continue to accept a single char constructor arg; I imagine most uses
of this will remain a single character.
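The partial-match concern can be reproduced in isolation. The following is a minimal sketch only, not the code from the patch: the class and method names are hypothetical, and it scans a plain String rather than the CharacterIterator that a BreakIterator works on. It contrasts a scan that never rewinds after a failed partial match with one that restarts at the attempted match start plus 1.

```java
// Hypothetical illustration; none of these names come from the patch or from Lucene.
final class SeparatorScanSketch {

    /**
     * Naive scan: after a partial match fails, it keeps going from the current
     * character instead of rewinding. Assumes sep is non-empty.
     * Returns the index just past the separator, or -1 if it is not found.
     */
    static int naiveFollowing(String text, String sep, int from) {
        int matched = 0;
        for (int i = from; i < text.length(); i++) {
            if (text.charAt(i) == sep.charAt(matched)) {
                matched++;
                if (matched == sep.length()) {
                    return i + 1;
                }
            } else {
                matched = 0; // position is NOT reset to (match start + 1)
            }
        }
        return -1;
    }

    /**
     * Resetting scan: on a mismatch, restart at the attempted match start plus 1.
     * Assumes sep is non-empty.
     */
    static int resettingFollowing(String text, String sep, int from) {
        int start = from;
        int matched = 0;
        while (start + matched < text.length()) {
            if (text.charAt(start + matched) == sep.charAt(matched)) {
                matched++;
                if (matched == sep.length()) {
                    return start + matched;
                }
            } else {
                start++;
                matched = 0;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(naiveFollowing("aab", "ab", 0));     // prints -1
        System.out.println(resettingFollowing("aab", "ab", 0)); // prints 3
    }
}
```

The resetting scan costs more in the worst case because characters can be re-read after a failed attempt, which is one way the overhead mentioned above could show up.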
Nitpick: most Lucene/Solr code is stylistically different from yours. Please
always use braces, even where they are optional in Java. And please always put
spaces around keywords and around curly braces. If by chance you use
IntelliJ, then "ant idea" should have the formatting already configured.
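As a rough illustration of that style (a hypothetical helper, not taken from the patch or from Lucene), braces are written even for single-statement bodies, and keywords and curly braces are surrounded by spaces:

```java
// Hypothetical helper used only to illustrate the formatting conventions.
final class StyleSketch {

    // Discouraged form, shown as a comment for contrast:
    //   for(int i=0;i<text.length();i++)if(text.charAt(i)==sep)return i;

    /** Returns the index of the first occurrence of sep in text, or -1. */
    static int indexOfChar(String text, char sep) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == sep) {
                return i;
            }
        }
        return -1;
    }
}
```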
> Support for string type separator for CustomSeparatorBreakIterator
> ------------------------------------------------------------------
>
> Key: LUCENE-7729
> URL: https://issues.apache.org/jira/browse/LUCENE-7729
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/highlighter
> Reporter: Amrit Sarkar
> Attachments: LUCENE-7729.patch
>
>
> LUCENE-6485: currently CustomSeparatorBreakIterator breaks the text when the
> _char_ passed is found.
> Improved CustomSeparatorBreakIterator so that it now supports string
> separators of arbitrary length.