+1

On 11/16/09, Robert Muir (JIRA) <j...@apache.org> wrote:
>
>     [
> https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778378#action_12778378
> ]
>
> Robert Muir commented on LUCENE-1689:
> -------------------------------------
>
> a couple people have asked me about this issue lately, I would prefer to
> spin off smaller issues rather than create large patches that become out of
> date.
>
> also I think Simon is interested in working on some of this, so more jira
> spam but i think easier to make progress.
>
>
>> supplementary character handling
>> --------------------------------
>>
>>                 Key: LUCENE-1689
>>                 URL: https://issues.apache.org/jira/browse/LUCENE-1689
>>             Project: Lucene - Java
>>          Issue Type: Improvement
>>            Reporter: Robert Muir
>>            Priority: Minor
>>             Fix For: 3.1
>>
>>         Attachments: LUCENE-1689.patch, LUCENE-1689.patch,
>> LUCENE-1689.patch, LUCENE-1689_lowercase_example.txt,
>> testCurrentBehavior.txt
>>
>>
>> for Java 5. Java 5 is based on unicode 4, which means variable-width
>> encoding.
>> supplementary character support should be fixed for code that works with
>> char/char[]
>> For example:
>> StandardAnalyzer, SimpleAnalyzer, StopAnalyzer, etc should at least be
>> changed so they don't actually remove suppl characters, or modified to
>> look for surrogates and behave correctly.
>> LowercaseFilter should be modified to lowercase suppl. characters
>> correctly.
>> CharTokenizer should either be deprecated or changed so that isTokenChar()
>> and normalize() use int.
>> in all of these cases code should remain optimized for the BMP case, and
>> suppl characters should be the exception, but still work.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>
>
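
For anyone following along, here is a minimal sketch of the lowercasing point in the quoted description. It is not the Lucene code or any of the attached patches; the class and method names (CodePointLowercase, lowerCase, lowerCaseByChar) are made up for illustration, and it only uses the java.lang.Character code point methods that exist since Java 5. It contrasts lowercasing one char at a time, which leaves surrogate pairs untouched, with lowercasing one code point at a time.

```java
// Hypothetical illustration only; not taken from Lucene or from LUCENE-1689.patch.
final class CodePointLowercase {

    // Lowercases by code point, so supplementary characters
    // (encoded as surrogate pairs in UTF-16) are mapped as a whole.
    static String lowerCase(CharSequence in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = Character.codePointAt(in, i);          // reads one or two chars
            out.appendCodePoint(Character.toLowerCase(cp)); // int overload of toLowerCase
            i += Character.charCount(cp);                   // advance by 1 (BMP) or 2
        }
        return out.toString();
    }

    // Lowercases char by char. Each half of a surrogate pair is handled
    // on its own and has no case mapping, so supplementary characters
    // pass through unchanged.
    static String lowerCaseByChar(CharSequence in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            out.append(Character.toLowerCase(in.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+10400 is DESERET CAPITAL LETTER LONG I, a supplementary character.
        String s = "A" + new String(Character.toChars(0x10400));
        System.out.println(Integer.toHexString(lowerCase(s).codePointAt(1)));
        System.out.println(Integer.toHexString(lowerCaseByChar(s).codePointAt(1)));
    }
}
```

The sketch builds a new string rather than editing a char[] in place, which sidesteps the question of whether a case mapping could change the number of chars. A filter that wants to stay fast for the BMP case, as the description asks, would presumably work on the term buffer directly.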