[ 
https://issues.apache.org/jira/browse/LUCENE-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5111:
--------------------------------

    Attachment: LUCENE-5111.patch

Here is a patch. It's not super-optimized, but the 3 common conditions (no 
delimiters, all delimiters, just one word surrounded by delimiters) are just as 
fast as before. For the concatenation+parts stuff I used captureState (we can 
avoid it; it was just about correctness for me).
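
As a rough illustration of the three common conditions named above, the 
sketch below classifies a term by where its delimiters sit. It is not the code 
from the patch: the class name, the enum, and the rule that any non-letter, 
non-digit character is a delimiter are all assumptions made for the example. 
The real filter also splits on case changes and letter/digit transitions, 
which this ignores.

```java
// Hypothetical sketch, not taken from LUCENE-5111.patch.
final class FastPathSketch {
    enum Shape { NO_DELIMITERS, ALL_DELIMITERS, SINGLE_WORD, NEEDS_SPLITTING }

    // Assumption: any character that is not a letter or digit is a delimiter.
    static boolean isDelimiter(char c) {
        return !Character.isLetterOrDigit(c);
    }

    static Shape classify(String term) {
        int start = 0;
        int end = term.length();
        // Skip leading and trailing delimiters.
        while (start < end && isDelimiter(term.charAt(start))) start++;
        while (end > start && isDelimiter(term.charAt(end - 1))) end--;
        // Nothing left: the term was only delimiters (or empty).
        if (start == end) return Shape.ALL_DELIMITERS;
        // A delimiter inside the trimmed region means real splitting work.
        for (int i = start; i < end; i++) {
            if (isDelimiter(term.charAt(i))) return Shape.NEEDS_SPLITTING;
        }
        // No inner delimiters: either untouched, or one word that was wrapped.
        return (start == 0 && end == term.length())
                ? Shape.NO_DELIMITERS
                : Shape.SINGLE_WORD;
    }
}
```

Only the NEEDS_SPLITTING case would have to emit parts and concatenations, 
which is where saved token state matters. The other three can be answered 
with a single pass over the term.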

I think this is fairly important to fix so users can use e.g. postings 
highlighter and don't hit bugs like 
http://stackoverflow.com/questions/20324016/shingle-filter-factory-startoffset-must-be-non-negative-and-endoffset-must-be
 

> Fix WordDelimiterFilter
> -----------------------
>
>                 Key: LUCENE-5111
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5111
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>         Attachments: LUCENE-5111.patch
>
>
> WordDelimiterFilter is documented as broken in TestRandomChains 
> (LUCENE-4641). Given how widely used it is, we should try to fix it.



