Add katakana filter to better deal with katakana spelling variants
------------------------------------------------------------------

                 Key: LUCENE-3901
                 URL: https://issues.apache.org/jira/browse/LUCENE-3901
             Project: Lucene - Java
          Issue Type: New Feature
          Components: modules/analysis
            Reporter: Christian Moen
             Fix For: 3.6, 4.0


Many Japanese katakana words end in a long sound (written ー) that is sometimes optional. For example, パーティー and パーティ are both perfectly valid spellings of "party". Similarly, センター and センタ are variants of "center", and サーバー and サーバ are variants of "server".

I'm proposing that we add a katakana stemmer that removes this trailing long sound when a term is longer than a configurable length.

It's also possible to add the variant as a synonym, but I think stemming is preferable from a ranking point of view.
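
To make the proposal concrete, here is a minimal sketch of the stemming rule in plain Java. It is not the Lucene implementation and does not use any Lucene API: the class name KatakanaStemSketch, the method names, and the threshold value of 3 are all hypothetical, chosen for illustration only. It also assumes that "katakana term" means every character lies in the Unicode Katakana block (U+30A0 to U+30FF), which the issue does not specify. String literals are written as Unicode escapes so the file compiles regardless of source encoding.

```java
// Illustrative sketch only; names and threshold are assumptions, not Lucene code.
final class KatakanaStemSketch {

    // KATAKANA-HIRAGANA PROLONGED SOUND MARK, the trailing "long sound" character.
    static final char PROLONGED_SOUND_MARK = '\u30FC';

    // True if the term is non-empty and every char is in the Katakana block.
    static boolean isKatakana(String term) {
        if (term.isEmpty()) {
            return false;
        }
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c < '\u30A0' || c > '\u30FF') {
                return false;
            }
        }
        return true;
    }

    // Drops one trailing prolonged sound mark from katakana terms that are
    // strictly longer than the given threshold; returns other terms unchanged.
    static String stem(String term, int threshold) {
        if (term.length() > threshold
                && isKatakana(term)
                && term.charAt(term.length() - 1) == PROLONGED_SOUND_MARK) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        int threshold = 3; // hypothetical value for the configurable length
        String party = "\u30D1\u30FC\u30C6\u30A3\u30FC";  // "party", 5 chars
        String center = "\u30BB\u30F3\u30BF\u30FC";       // "center", 4 chars
        String key = "\u30AD\u30FC";                      // 2 chars, below threshold
        System.out.println(stem(party, threshold));
        System.out.println(stem(center, threshold));
        System.out.println(stem(key, threshold));
    }
}
```

With this rule, both spellings of "party" map to the same term, so a query for either one matches documents containing the other. The length threshold is what keeps very short words from being altered: a two-character term such as キー is left as it is.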