[
https://issues.apache.org/jira/browse/LUCENE-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated LUCENE-2219:
--------------------------------
Attachment: LUCENE-2219.patch
I merged Koji's CJKTokenizer fix and tests from LUCENE-2207 into this patch, and
improved CJKTokenizer's tests to always use assertAnalyzesTo, for better
checking.
I plan to commit soon.
> improve BaseTokenStreamTestCase to test end()
> ---------------------------------------------
>
> Key: LUCENE-2219
> URL: https://issues.apache.org/jira/browse/LUCENE-2219
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis, contrib/analyzers
> Affects Versions: 2.9, 3.0
> Reporter: Robert Muir
> Assignee: Robert Muir
> Fix For: 2.9.2, 3.0.1, 3.1
>
> Attachments: LUCENE-2219.patch, LUCENE-2219.patch, LUCENE-2219.patch
>
>
> If offsetAtt/end() is not implemented correctly, then there can be problems
> with highlighting: see LUCENE-2207 for an example with CJKTokenizer.
> In my opinion you currently have to write too much code to test this.
> This patch does the following:
> * adds an optional Integer finalOffset (can be null for no checking) to
> assertTokenStreamContents
> * in assertAnalyzesTo, automatically fills this with the input String's length()
> In my opinion this is correct. For assertTokenStreamContents the check
> should be optional, since the stream may not even have a Tokenizer. If you are
> using assertTokenStreamContents with a Tokenizer, simply provide the extra
> expected value to check it.
> For assertAnalyzesTo, a Tokenizer is implied, so the final offset should
> always be checked.
> The tests pass for core, but there are failures in contrib even besides
> CJKTokenizer (apply Koji's patch from LUCENE-2207; it is correct).
> Specifically, ChineseTokenizer has a similar problem.
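
To illustrate the idea described above, here is a minimal, self-contained sketch in plain Java. It does not use Lucene and is not the code from the patch: the class FinalOffsetSketch, the Tok and Result types, and the toy whitespace tokenize() method are all hypothetical stand-ins. The two helpers only borrow the names assertTokenStreamContents and assertAnalyzesTo to show the intended split: a nullable finalOffset on the first, and the input length filled in automatically by the second. The "buggy" mode assumes one plausible kind of mistake (end() reporting the last token's end offset instead of the input length); it is not a claim about the actual CJKTokenizer bug.

```java
import java.util.ArrayList;
import java.util.List;

public class FinalOffsetSketch {

    /** A token plus its character offsets into the original input. */
    public static final class Tok {
        final String term;
        final int start;
        final int end;

        Tok(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /** Hypothetical stand-in for a consumed stream: its tokens and the offset reported by end(). */
    public static final class Result {
        final List<Tok> tokens;
        final int finalOffset;

        Result(List<Tok> tokens, int finalOffset) {
            this.tokens = tokens;
            this.finalOffset = finalOffset;
        }
    }

    /**
     * Toy whitespace tokenizer. With correctEnd == false it mimics an assumed bug:
     * the final offset is the end of the last token rather than the input length.
     */
    public static Result tokenize(String input, boolean correctEnd) {
        List<Tok> tokens = new ArrayList<>();
        int i = 0;
        int lastEnd = 0;
        while (i < input.length()) {
            // Skip whitespace between tokens.
            while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one run of non-whitespace characters.
            while (i < input.length() && !Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(new Tok(input.substring(start, i), start, i));
                lastEnd = i;
            }
        }
        return new Result(tokens, correctEnd ? input.length() : lastEnd);
    }

    /** Checks the terms; checks the final offset only when finalOffset is non-null. */
    public static void assertTokenStreamContents(Result r, String[] expected, Integer finalOffset) {
        if (r.tokens.size() != expected.length) {
            throw new AssertionError("token count: expected " + expected.length
                    + " but was " + r.tokens.size());
        }
        for (int i = 0; i < expected.length; i++) {
            if (!expected[i].equals(r.tokens.get(i).term)) {
                throw new AssertionError("term " + i + ": expected " + expected[i]
                        + " but was " + r.tokens.get(i).term);
            }
        }
        if (finalOffset != null && finalOffset.intValue() != r.finalOffset) {
            throw new AssertionError("finalOffset: expected " + finalOffset
                    + " but was " + r.finalOffset);
        }
    }

    /** A tokenizer is implied here, so the final offset is always checked against input.length(). */
    public static void assertAnalyzesTo(String input, boolean correctEnd, String[] expected) {
        assertTokenStreamContents(tokenize(input, correctEnd), expected, input.length());
    }

    public static void main(String[] args) {
        String input = "hello world  "; // trailing whitespace exposes the assumed bug
        String[] expected = {"hello", "world"};

        assertAnalyzesTo(input, true, expected); // correct end(): passes

        try {
            assertAnalyzesTo(input, false, expected); // buggy end(): should fail
            System.out.println("bug was NOT detected");
        } catch (AssertionError e) {
            System.out.println("bug detected: " + e.getMessage());
        }
    }
}
```

Passing null as finalOffset to assertTokenStreamContents skips the check entirely, which matches the case where the stream under test has no Tokenizer behind it.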