[jira] Created: (SOLR-1410) remove deprecated custom encoding support in russian/greek analysis

Robert Muir (JIRA) Thu, 03 Sep 2009 14:29:23 -0700

remove deprecated custom encoding support in russian/greek analysis
-------------------------------------------------------------------


                 Key: SOLR-1410
                 URL: https://issues.apache.org/jira/browse/SOLR-1410
             Project: Solr
          Issue Type: Task
          Components: Analysis
            Reporter: Robert Muir


The Russian and Greek analyzers have custom encoding support that has been 
deprecated in Lucene.

For example, someone using CP1251 in the Russian analyzer is simply storing Ж 
as the byte 0xC6, so it ends up represented as Æ (U+00C6) rather than as Ж.

LUCENE-1793: Deprecate the custom encoding support in the Greek and Russian
    Analyzers. If you need to index text in these encodings, please use Java's
    character set conversion facilities (InputStreamReader, etc) during I/O, 
    so that Lucene can analyze this text as Unicode instead.
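
A minimal sketch of what the changelog entry above recommends, assuming the 
input is a CP1251 byte stream. The class name Cp1251Demo and the helper 
decode() are hypothetical and exist only for illustration; the only real APIs 
used are java.io.InputStreamReader and friends. It also assumes the JVM ships 
the windows-1251 charset, which standard JDK builds normally do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical demo class, not part of Lucene or Solr.
class Cp1251Demo {
    // Decode raw bytes into a Unicode String using the named charset.
    static String decode(byte[] bytes, String charsetName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), charsetName)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = { (byte) 0xC6 }; // the CP1251 byte for Ж

        // Decoded with the correct charset: U+0416 (Ж).
        System.out.println(Integer.toHexString(decode(raw, "windows-1251").charAt(0))); // prints 416

        // Treated as Latin-1, the same byte becomes U+00C6 (Æ).
        System.out.println(Integer.toHexString(decode(raw, "ISO-8859-1").charAt(0))); // prints c6
    }
}
```

With the conversion done at I/O time like this, the analyzer only ever sees 
Unicode text and needs no per-encoding configuration.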

I noticed that in Solr, the factories for these token streams still allow 
these configuration options, which are deprecated in Lucene 2.9 and are to be 
removed in 3.0.

Let me know the policy (how exactly do you deprecate a config option in Solr: 
log a warning, etc.?) and I'd be happy to create a patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
