Add katakana filter to better deal with katakana spelling variants
------------------------------------------------------------------

                 Key: LUCENE-3901
                 URL: https://issues.apache.org/jira/browse/LUCENE-3901
             Project: Lucene - Java
          Issue Type: New Feature
          Components: modules/analysis
            Reporter: Christian Moen
             Fix For: 3.6, 4.0


Many Japanese katakana words end in a long sound mark (ー) that is sometimes optional.

For example, パーティー and パーティ are both perfectly valid for "party".  Similarly we 
have センター and センタ that are variants of "center" as well as サーバー and サーバ for 
"server".

I propose adding a katakana stemmer that removes this trailing long sound from 
terms longer than a configurable length.  Adding the variant as a synonym is 
also possible, but I think stemming is preferable from a ranking point of 
view.
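
To make the proposal concrete, here is a minimal standalone sketch of the stemming rule in plain Java. It is not the proposed filter itself and does not use any Lucene API: the class name KatakanaStemSketch, the stem method, and the lengthThreshold parameter are all hypothetical names chosen for illustration. It also assumes that "longer than a configurable length" means a strict comparison on the number of characters, and that only all-katakana terms should be touched.

```java
class KatakanaStemSketch {
    // U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK, the trailing "long sound".
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    // True if every character of the term is in the Unicode Katakana block
    // (U+30A0..U+30FF, which also contains the prolonged sound mark).
    static boolean isKatakana(String term) {
        if (term.isEmpty()) {
            return false;
        }
        for (int i = 0; i < term.length(); i++) {
            if (Character.UnicodeBlock.of(term.charAt(i)) != Character.UnicodeBlock.KATAKANA) {
                return false;
            }
        }
        return true;
    }

    // Removes one trailing prolonged sound mark from an all-katakana term,
    // but only when the term is strictly longer than lengthThreshold
    // (hypothetical parameter; the strict comparison is an assumption).
    static String stem(String term, int lengthThreshold) {
        int len = term.length();
        if (len > lengthThreshold
                && isKatakana(term)
                && term.charAt(len - 1) == PROLONGED_SOUND_MARK) {
            return term.substring(0, len - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        int threshold = 3; // illustrative value only
        System.out.println(stem("パーティー", threshold)); // prints パーティ
        System.out.println(stem("センター", threshold));   // prints センタ
        System.out.println(stem("サーバー", threshold));   // prints サーバ
        System.out.println(stem("コピー", threshold));     // prints コピー (too short, unchanged)
    }
}
```

With this rule both spellings of each example word reduce to the same term, so they match each other at query time without a synonym entry. The threshold keeps short words such as コピー intact, where the long sound is less likely to be optional.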





