[ https://issues.apache.org/jira/browse/SOLR-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640615#action_12640615 ]
Todd Feak commented on SOLR-814:
--------------------------------

Yes, they are used differently. However, a word written in Hiragana is the *same* word when written in Katakana. Same meaning. Furthermore, it's not always cut and dried which to use. For example, a movie title may be written in Hiragana or Katakana, depending on the director's preference. The user (searcher) may not remember which one the director chose, and so may search using the other. Without this normalization they would get a search miss.

I don't doubt your experience at Ultraseek, but this feature was explicitly asked for by native-speaking Japanese engineers at Sony. I *just* (literally) double-checked with a couple of onsite native-speaking Japanese engineers, and both agree that this is useful, at least for our searches.

I would say that it should be up to the schema developer to decide whether this functionality is useful for their situation. Either way, I offer it up to the community for their decision.

> Add new Japanese Hiragana Filter and Factory
> --------------------------------------------
>
>                 Key: SOLR-814
>                 URL: https://issues.apache.org/jira/browse/SOLR-814
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Todd Feak
>            Priority: Minor
>         Attachments: SOLR-814.patch
>
>
> Japanese Hiragana and Katakana character sets can be easily translated
> between. This filter normalizes all Hiragana characters to their Katakana
> counterpart, allowing for indexing and searching using either.
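
For illustration, the normalization described in the issue can be sketched in plain Java. This is not the code in SOLR-814.patch, and it does not use the Lucene/Solr filter API. The class and method names are hypothetical. It relies only on the Unicode layout, where the Hiragana letters U+3041..U+3096 sit exactly 0x60 below the Katakana letters U+30A1..U+30F6, and the same offset holds for the two iteration marks.

```java
// Hypothetical helper, not taken from the SOLR-814 patch.
final class HiraganaToKatakana {
    // Distance between a Hiragana code point and its Katakana counterpart.
    private static final int OFFSET = 0x60;

    // Returns a copy of the input with every Hiragana letter (and the two
    // Hiragana iteration marks) replaced by the matching Katakana character.
    // All other characters are passed through unchanged.
    static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean hiraganaLetter = c >= 0x3041 && c <= 0x3096;
            boolean iterationMark = c == 0x309D || c == 0x309E;
            out.append(hiraganaLetter || iterationMark ? (char) (c + OFFSET) : c);
        }
        return out.toString();
    }
}
```

If the same mapping is applied at both index time and query time, a title indexed in Hiragana and a query typed in Katakana (or the reverse) reduce to the same token text, which is the search-miss case described above.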