[ 
https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721457#action_12721457
 ] 

Robert Muir commented on LUCENE-1692:
-------------------------------------

Michael, I think it would be nice to fix the Thai offset bug so that the 
highlighter will work. This is a safe one-line fix, and it's an obvious error.

The SmartChineseAnalyzer empty token bug is pretty serious. I think indexing 
empty tokens for every piece of punctuation could really hurt similarity 
computation (am I wrong? I have never tried it).

The Thai .type() bug is something that could be fixed later; I don't think the 
token type being ALPHANUM versus NUM is really hurting anyone.

The issue where DutchAnalyzer doesn't do what it claims is, I think, not 
really hurting anyone, and users can switch to the snowball version if they 
want accurate snowball behavior.
I do think the huge files in DutchAnalyzer that aren't being used can be 
removed if you want to save 1MB, but I'm not sure how important that is.

Let me know your thoughts. 

> Contrib analyzers need tests
> ----------------------------
>
>                 Key: LUCENE-1692
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1692
>             Project: Lucene - Java
>          Issue Type: Test
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Assignee: Michael McCandless
>             Fix For: 2.9
>
>         Attachments: LUCENE-1692.txt, LUCENE-1692.txt, LUCENE-1692.txt, 
> LUCENE-1692.txt
>
>
> The analyzers in contrib need tests, preferably ones that test the behavior 
> of all the Token 'attributes' involved (offsets, type, etc) and not just what 
> they do with token text.
> This way, they can be converted to the new api without breakage.
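
As a rough illustration of the kind of test the description asks for, the 
sketch below checks token text, start/end offsets, and type together, and 
also rejects empty tokens. It deliberately does not use Lucene's API: Tok, 
tokenize, and check are hypothetical stand-ins invented for this example, 
and the toy whitespace tokenizer only assumes that digit-only tokens are 
typed NUM and everything else ALPHANUM.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenAttributeCheck {

    /** Hypothetical stand-in for a token: text, offsets, and a type label. */
    public static final class Tok {
        public final String text;
        public final int start;
        public final int end;
        public final String type;

        public Tok(String text, int start, int end, String type) {
            this.text = text;
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Toy whitespace tokenizer (an assumption, not a real analyzer). */
    public static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<Tok>();
        int i = 0;
        int n = input.length();
        while (i < n) {
            // Skip whitespace, then consume one run of non-whitespace.
            while (i < n && Character.isWhitespace(input.charAt(i))) i++;
            int start = i;
            while (i < n && !Character.isWhitespace(input.charAt(i))) i++;
            if (i > start) {
                String text = input.substring(start, i);
                out.add(new Tok(text, start, i, isAllDigits(text) ? "NUM" : "ALPHANUM"));
            }
        }
        return out;
    }

    static boolean isAllDigits(String s) {
        if (s.length() == 0) return false;
        for (int k = 0; k < s.length(); k++) {
            if (!Character.isDigit(s.charAt(k))) return false;
        }
        return true;
    }

    /** Returns null if every attribute matches, else a description of the first problem. */
    public static String check(List<Tok> tokens, String[] texts,
                               int[] starts, int[] ends, String[] types) {
        if (tokens.size() != texts.length) {
            return "expected " + texts.length + " tokens but got " + tokens.size();
        }
        for (int i = 0; i < texts.length; i++) {
            Tok t = tokens.get(i);
            if (t.text.length() == 0) return "empty token at position " + i;
            if (!t.text.equals(texts[i])) return "wrong text at position " + i;
            if (t.start != starts[i] || t.end != ends[i]) return "wrong offsets at position " + i;
            if (!t.type.equals(types[i])) return "wrong type at position " + i;
        }
        return null;
    }

    public static void main(String[] args) {
        String problem = check(tokenize("lucene 29 tests"),
                new String[] {"lucene", "29", "tests"},
                new int[] {0, 7, 10},
                new int[] {6, 9, 15},
                new String[] {"ALPHANUM", "NUM", "ALPHANUM"});
        System.out.println(problem == null ? "ok" : problem);
    }
}
```

Checking offsets and type alongside the text is the point here: a test that 
compares token text alone would pass even with wrong offsets or a wrong 
type label.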

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


