[ 
https://issues.apache.org/jira/browse/LUCENE-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937734#comment-15937734
 ] 

Amrit Sarkar commented on LUCENE-7729:
--------------------------------------

bq. len > 0 (as a comment) but in all cases you probably mean len > 1?
Yes, that is correct.

bq. Let me give a better example of length 3: aab would fail to match aaab. I 
just wrote a test for that to confirm it failed. Here's another example of 
length 4 that may be more clear: A separator of acab would fail to be detected 
in acacab.
I see. The implementation is flawed: the algorithm I had in mind is incomplete, 
though I am sure some minor tweaking will make it work. I never considered 
repetitive patterns in the separator.
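
To make the two failing cases concrete, here is a minimal sketch. It is not 
the code from the attached patch: naiveIndexOf is only an assumption about 
the kind of single-pass scan that loses partial matches, and kmpIndexOf is 
the standard Knuth-Morris-Pratt fallback, named here as one possible fix. 
The class and method names are hypothetical, and both methods assume a 
non-empty separator.

```java
public class SeparatorMatchSketch {

    // Hypothetical flawed scan: on a mismatch it throws away the whole partial
    // match and only retries the current char against the first separator char.
    // Assumes sep is non-empty.
    static int naiveIndexOf(String text, String sep) {
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == sep.charAt(matched)) {
                matched++;
            } else {
                matched = (text.charAt(i) == sep.charAt(0)) ? 1 : 0;
            }
            if (matched == sep.length()) {
                return i - sep.length() + 1;
            }
        }
        return -1;
    }

    // KMP failure table: fail[i] is the length of the longest proper prefix of
    // sep[0..i] that is also a suffix of it.
    static int[] failureTable(String sep) {
        int[] fail = new int[sep.length()];
        int k = 0;
        for (int i = 1; i < sep.length(); i++) {
            while (k > 0 && sep.charAt(i) != sep.charAt(k)) {
                k = fail[k - 1];
            }
            if (sep.charAt(i) == sep.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    // KMP scan: on a mismatch it falls back to the longest prefix of sep that
    // is still a suffix of the text read so far, so repeats are not lost.
    static int kmpIndexOf(String text, String sep) {
        if (sep.isEmpty()) {
            return 0;
        }
        int[] fail = failureTable(sep);
        int matched = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (matched > 0 && c != sep.charAt(matched)) {
                matched = fail[matched - 1];
            }
            if (c == sep.charAt(matched)) {
                matched++;
            }
            if (matched == sep.length()) {
                return i - sep.length() + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(naiveIndexOf("aaab", "aab"));    // -1 (separator missed)
        System.out.println(kmpIndexOf("aaab", "aab"));      // 1
        System.out.println(naiveIndexOf("acacab", "acab")); // -1 (separator missed)
        System.out.println(kmpIndexOf("acacab", "acab"));   // 2
    }
}
```

In both examples the text contains a prefix of the separator that overlaps 
the real match, which is exactly the case a reset-to-zero scan cannot handle.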

bq.  To be clear, I never asked or recommended. 
David, I completely understand and am aware of that; I just pointed out the 
conversation that motivated me to look into it. I am thankful to you for taking 
the time to provide healthy insights and feedback on the patch. I will not get 
discouraged if some of my work doesn't get into the main project; I want to 
contribute work that is useful, not flawed.

With that, I will check out SimplePatternTokenizer and the Automaton part. 
Thank you again for your time; I really appreciate it. Should I leave this JIRA 
as it is, or instead at least fix the implementation?

> Support for string type separator for CustomSeparatorBreakIterator
> ------------------------------------------------------------------
>
>                 Key: LUCENE-7729
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7729
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>            Reporter: Amrit Sarkar
>         Attachments: LUCENE-7729.patch, LUCENE-7729.patch
>
>
> LUCENE-6485: currently CustomSeparatorBreakIterator breaks the text when the 
> _char_ passed is found.
> Improved CustomSeparatorBreakIterator so that it now supports a string 
> separator of arbitrary length.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
