[ 
https://issues.apache.org/jira/browse/SOLR-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640618#action_12640618
 ] 

Todd Feak commented on SOLR-815:
--------------------------------

The normalized form is only a hidden storage format in the index. As long as 
indexing and searching normalize the same way, the choice between half-width 
and full-width is a coin toss.

For this particular case, full-width was chosen as the underlying format 
because the majority of the Japanese text and searches that we are seeing 
actually use the full-width versions of both the Katakana and Latin 
characters. This is probably due to the platform we are on. Normalizing to 
full-width therefore means fewer conversions occur. Admittedly, it's a minor 
performance choice, but this is what we have. 

I'm not stuck on it being one way or the other, and changing it should be easy.
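
To illustrate the point about index and search agreeing, here is a minimal 
sketch of a full-width normalization in Python. It is not the code in 
SOLR-815.patch, which is a Java Filter; the name to_full_width is hypothetical, 
and converting all printable ASCII (digits and punctuation included) rather 
than only Latin letters is an assumption made for brevity.

```python
import re
import unicodedata

# ASCII '!'..'~' (U+0021-U+007E) map onto the full-width forms
# U+FF01-U+FF5E at a constant offset of 0xFEE0.
_ASCII_TO_FULL = {cp: cp + 0xFEE0 for cp in range(0x21, 0x7F)}

# Half-width Katakana block, including its punctuation and the
# half-width voiced/semi-voiced sound marks (U+FF61-U+FF9F).
_HALF_KATAKANA_RUN = re.compile("[\uFF61-\uFF9F]+")


def to_full_width(text: str) -> str:
    """Return text with half-width Latin and Katakana made full-width."""
    text = text.translate(_ASCII_TO_FULL)
    # NFKC is applied per run rather than per character so that a base
    # kana followed by a half-width sound mark composes into a single
    # full-width character (KA + voiced mark becomes GA).
    return _HALF_KATAKANA_RUN.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )


# The same function is applied at index time and at query time, so a
# half-width document and a full-width query end up as the same string.
indexed = to_full_width("\uFF76\uFF9E\uFF72\uFF84\uFF9E Solr")
query = to_full_width("\u30AC\u30A4\u30C9 \uFF33\uFF4F\uFF4C\uFF52")
assert indexed == query
```

Normalizing in the other direction would work just as well for matching, as 
long as both sides use the same function; only the number of characters that 
actually need converting changes.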

> Add new Japanese half-width/full-width normalization Filter and Factory
> -----------------------------------------------------------------------
>
>                 Key: SOLR-815
>                 URL: https://issues.apache.org/jira/browse/SOLR-815
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Todd Feak
>            Priority: Minor
>         Attachments: SOLR-815.patch
>
>
> Japanese Katakana and Latin alphabet characters exist in both "half-width" 
> and "full-width" versions. This new Filter normalizes to the full-width 
> version so that text in either form can be indexed and searched.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
