[ https://issues.apache.org/jira/browse/LUCENE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-2399:
--------------------------------
    Attachment: LUCENE-2399.patch

This patch is very simple, and it takes the place of 3 hairy tokenfilters.

I would like to commit tomorrow (upgrading our icu.jar in contrib/icu to 4.4), unless there are any objections.

> Add support for ICU's Normalizer2
> ---------------------------------
>
>                 Key: LUCENE-2399
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2399
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/*
>    Affects Versions: 3.1
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 3.1
>
>         Attachments: LUCENE-2399.patch
>
>
> While there are separate Case Folding, Normalization, and Ignorable-removal
> filters in LUCENE-1488, the new ICU Normalizer2 API does all of this at once
> with nfkc_cf (based on the new NFKC_Casefold property in Unicode).
> This is great, because it provides a ton of Unicode functionality that is
> really needed.
> And the new Normalizer2 API takes CharSequence and writes to Appendable...
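
For readers unfamiliar with what the three separate steps do, here is a minimal sketch that chains them by hand using only JDK classes. It is not taken from the patch and does not use ICU. The class name FoldSketch and the method fold are hypothetical, lowercasing stands in for real Unicode case folding (so, for example, "ß" is left alone, whereas full case folding maps it to "ss"), and dropping format characters (general category Cf) is only a rough stand-in for the Default_Ignorable_Code_Point set. With ICU4J 4.4 the single-pass equivalent should be reachable through Normalizer2.getInstance(null, "nfkc_cf", Normalizer2.Mode.COMPOSE) and its normalize(CharSequence, Appendable) method.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper: approximates NFKC_Casefold in three explicit passes.
class FoldSketch {

    static String fold(CharSequence input) {
        // Step 1: compatibility normalization (e.g. fullwidth letters, ligatures).
        String s = Normalizer.normalize(input, Normalizer.Form.NFKC);

        // Step 2: lowercasing as an approximation of case folding.
        s = s.toLowerCase(Locale.ROOT);

        // Step 3: drop format characters (category Cf), such as the soft hyphen.
        StringBuilder out = new StringBuilder(s.length());
        s.codePoints()
         .filter(cp -> Character.getType(cp) != Character.FORMAT)
         .forEach(out::appendCodePoint);

        // Re-normalize, since the earlier steps may leave text that is no longer in NFKC.
        return Normalizer.normalize(out, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Fullwidth "Full", the "fi" ligature, and a soft hyphen between letters.
        System.out.println(fold("\uFF26\uFF55\uFF4C\uFF4C"));
        System.out.println(fold("\uFB01"));
        System.out.println(fold("A\u00ADB"));
    }
}
```

The point of the comparison is the one made in the issue description: each pass above copies the text again, while the nfkc_cf data lets Normalizer2 do the equivalent work in a single call that reads a CharSequence and writes to an Appendable.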