[ 
https://issues.apache.org/jira/browse/LUCENE-4984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651026#comment-13651026
 ] 

Robert Muir commented on LUCENE-4984:
-------------------------------------

Tokenizing from a BreakIterator can get a little tricky.

We had some support for this (it should be re-reviewed) in the initial kuromoji 
integration (SegmentingTokenizerBase.java and its test), but we ended up adding 
a streaming Viterbi search, so we didn't need it anymore:

http://svn.apache.org/viewvc?view=revision&revision=1230748
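
For illustration only, here is a minimal sketch of turning BreakIterator
boundaries into tokens with offsets. It is not the SegmentingTokenizerBase
code and not part of the patch. It uses only the JDK's java.text.BreakIterator,
and the class and method names (BreakIteratorSketch, Token, tokenize) are
hypothetical. It assumes the whole text is already in memory, which sidesteps
the buffering a streaming tokenizer would have to handle.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakIteratorSketch {

    /** Hypothetical token holder: the text plus its [start, end) offsets. */
    public static final class Token {
        public final String text;
        public final int start;
        public final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + " [" + start + "," + end + ")";
        }
    }

    /** Splits text at word boundaries, keeping spans with a letter or digit. */
    public static List<Token> tokenize(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        List<Token> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE;
                start = end, end = words.next()) {
            // Boundaries also delimit whitespace and punctuation; skip those.
            if (containsLetterOrDigit(text, start, end)) {
                tokens.add(new Token(text.substring(start, end), start, end));
            }
        }
        return tokens;
    }

    private static boolean containsLetterOrDigit(String s, int start, int end) {
        for (int i = start; i < end; ) {
            int cp = s.codePointAt(i);
            if (Character.isLetterOrDigit(cp)) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("hello, world", Locale.ROOT)) {
            System.out.println(t);
        }
    }
}
```

Because each token is a substring of the unmodified input, its offsets always
point back into the original text. How well a given locale (Thai, for example)
is segmented depends on the BreakIterator implementation in use.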
                
> Fix ThaiWordFilter
> ------------------
>
>                 Key: LUCENE-4984
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4984
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>         Attachments: LUCENE-4984.patch
>
>
> ThaiWordFilter is an offender in TestRandomChains because it creates 
> positions and updates offsets.

