Hi Yuta,

Are you looking for a specific analyzer like CJKAnalyzer, or for TokenStreams such as LowerCaseFilter? A fair number of the token filters have already been converted to handle surrogate pairs correctly. If you need help figuring out how to use things from trunk, I'm happy to help.
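To illustrate what "handling surrogate pairs correctly" means inside a filter, here is a minimal sketch in plain Java. It is not the actual Lucene code: the class and method names are hypothetical, and it only shows the general idea of walking a term buffer by code point instead of by char, using the standard java.lang.Character methods.

```java
// Hypothetical helper, for illustration only; not a Lucene class.
class SurrogateAwareLowerCase {

    // Lowercases the first `length` chars of `buffer` in place, one code
    // point at a time, so a surrogate pair is never treated as two chars.
    // Assumes the lowercased code point needs as many chars as the original.
    static void lowerCaseInPlace(char[] buffer, int length) {
        for (int i = 0; i < length; ) {
            // Reads a full code point, combining a high/low surrogate pair
            int codePoint = Character.codePointAt(buffer, i, length);
            int lowered = Character.toLowerCase(codePoint);
            // toChars writes one or two chars and returns how many it wrote
            i += Character.toChars(lowered, buffer, i);
        }
    }

    // Convenience wrapper that works on a String copy.
    static String lowerCase(String s) {
        char[] buffer = s.toCharArray();
        lowerCaseInPlace(buffer, buffer.length);
        return new String(buffer);
    }

    public static void main(String[] args) {
        // U+10400 (DESERET CAPITAL LETTER LONG I) needs a surrogate pair
        String input = new String(Character.toChars(0x10400)) + "A";
        String output = lowerCase(input);
        System.out.println("chars: " + output.length()
                + ", code points: " + output.codePointCount(0, output.length()));
    }
}
```

A loop that calls Character.toLowerCase(char) on each char would leave the two surrogates of U+10400 untouched, which is the kind of problem an analyzer without surrogate support runs into.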
simon

On Fri, Mar 12, 2010 at 5:27 AM, Yuta Kawadai <yuta...@gmail.com> wrote:
> Thank you.
>
> Now I use own Analyzer which based on "MeCab" (It's open source
> Japanese morphological analyzer).
> I try to modify it to support surrogate pairs.
>
> And I'm expecting the next release!
>
> Yuta
>
> 2010/3/11 Robert Muir <rcm...@gmail.com>:
>> On Wed, Mar 10, 2010 at 6:52 PM, Yuta Kawadai <yuta...@gmail.com> wrote:
>>> Hi
>>>
>>> Can Lucene use surrogate pairs (and its term positions or length) ?
>>>
>>> Thanks,
>>> Yuta
>>
>> Yes, just make sure you use an Analyzer that supports them...
>> unfortunately most of the ones included with released versions of
>> Lucene (e.g. CJKAnalyzer) will not do the right thing, hopefully in
>> the next release they will.
>>
>> --
>> Robert Muir
>> rcm...@gmail.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org