Re: surrogate pairs

2010-03-12 Thread David Leangen
Hi, Yuta-san,

>> I currently use my own Analyzer, based on "MeCab" (an open source
>> Japanese morphological analyzer).
>> I am trying to modify it to support surrogate pairs.
>>
>> And I'm looking forward to the next release!

Cool! I look forward to that. Is there a link somewhere to your project? I am very

Re: surrogate pairs

2010-03-11 Thread Simon Willnauer
Hi Yuta,

Are you looking for a specific analyzer like CJKAnalyzer, or for token streams such as a lowercase token filter? A fair number of the token filters have already been converted to handle surrogate pairs correctly. If you need help figuring out how to use stuff from trunk, I'm happy to
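To illustrate what "handling surrogate pairs correctly" can mean for a lowercasing filter, here is a minimal sketch in plain Java. It is not the code from Lucene trunk; the class and method names (`CodePointLowerCase`, `lowerCaseNaive`, `lowerCase`) are made up for this example. It assumes, as an in-place filter must, that lowercasing a code point never changes how many `char`s it occupies.

```java
public class CodePointLowerCase {

    // Naive variant: lowercases one UTF-16 code unit at a time.
    // Each half of a surrogate pair is passed to Character.toLowerCase(char)
    // on its own and comes back unchanged, so supplementary letters are missed.
    static void lowerCaseNaive(char[] buffer, int length) {
        for (int i = 0; i < length; i++) {
            buffer[i] = Character.toLowerCase(buffer[i]);
        }
    }

    // Code point variant: reads a whole code point (one or two chars),
    // lowercases it, and writes it back at the same position.
    // Assumption: the lowercase form needs the same number of chars.
    static void lowerCase(char[] buffer, int length) {
        for (int i = 0; i < length; ) {
            int codePoint = Character.codePointAt(buffer, i, length);
            int lower = Character.toLowerCase(codePoint);
            // toChars returns how many chars were written (1 or 2).
            i += Character.toChars(lower, buffer, i);
        }
    }

    public static void main(String[] args) {
        // U+10400 DESERET CAPITAL LETTER LONG I, followed by 'A'.
        char[] naive = "\uD801\uDC00A".toCharArray();
        char[] aware = "\uD801\uDC00A".toCharArray();
        lowerCaseNaive(naive, naive.length);
        lowerCase(aware, aware.length);
        // Print the second code unit of each buffer to show the difference.
        System.out.println(Integer.toHexString(naive[1]));
        System.out.println(Integer.toHexString(aware[1]));
    }
}
```

The naive loop leaves the Deseret letter untouched and only lowercases the `A`, while the code point loop maps U+10400 to U+10428 as well.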

Re: surrogate pairs

2010-03-11 Thread Yuta Kawadai
Thank you.

I currently use my own Analyzer, based on "MeCab" (an open source Japanese morphological analyzer). I am trying to modify it to support surrogate pairs.

And I'm looking forward to the next release!

Yuta

2010/3/11 Robert Muir:
> On Wed, Mar 10, 2010 at 6:52 PM, Yuta Kawadai wrote:
>> Hi
>>
>> Can Lu

Re: surrogate pairs

2010-03-11 Thread Yuta Kawadai
Sorry for not giving enough detail. I am trying to handle text that contains surrogate pairs in Lucene. So I want to confirm whether Lucene (the core, Analyzers, TokenFilters and so on) can handle terms that contain surrogate pairs.

Thanks,
Yuta

2010/3/11 Erick Erickson:
> Please describe the

Re: surrogate pairs

2010-03-11 Thread Simon Willnauer
On Thu, Mar 11, 2010 at 2:28 AM, Robert Muir wrote:
> On Wed, Mar 10, 2010 at 6:52 PM, Yuta Kawadai wrote:
>> Hi
>>
>> Can Lucene use surrogate pairs (and its term positions or length)?
>>
>> Thanks,
>> Yuta
>
> Yes, just make sure you use an Analyzer that supports them...
> unfortunately most o

Re: surrogate pairs

2010-03-10 Thread Robert Muir
On Wed, Mar 10, 2010 at 6:52 PM, Yuta Kawadai wrote:
> Hi
>
> Can Lucene use surrogate pairs (and its term positions or length)?
>
> Thanks,
> Yuta

Yes, just make sure you use an Analyzer that supports them... unfortunately most of the ones included with released versions of Lucene (e.g. CJKAnal
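For readers wondering why term positions and lengths come into it: Java strings are UTF-16, so a character outside the Basic Multilingual Plane takes two `char`s (a surrogate pair). The sketch below uses only the JDK, not Lucene; the class `SurrogatePairDemo` and its methods are hypothetical names for this example. It shows how length in code units differs from length in code points, and where each code point starts.

```java
import java.util.ArrayList;
import java.util.List;

public class SurrogatePairDemo {

    // Length in UTF-16 code units, which is what String.length() reports.
    static int codeUnitLength(String s) {
        return s.length();
    }

    // Length in Unicode code points; a surrogate pair counts once.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Char offsets at which each code point begins. Code that splits
    // text into tokens should only cut at these offsets, otherwise it
    // can separate the two halves of a surrogate pair.
    static List<Integer> codePointStarts(String s) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < s.length(); i += Character.charCount(s.codePointAt(i))) {
            starts.add(i);
        }
        return starts;
    }

    public static void main(String[] args) {
        // U+20B9F (a CJK ideograph outside the BMP) followed by U+308B.
        String text = "\uD842\uDF9F\u308B";
        System.out.println(codeUnitLength(text));   // 3
        System.out.println(codePointLength(text));  // 2
        System.out.println(codePointStarts(text));  // [0, 2]
    }
}
```

An analyzer that walks the text one `char` at a time would see three units here and could emit a token boundary at offset 1, in the middle of the pair.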

Re: surrogate pairs

2010-03-10 Thread Erick Erickson
Please describe the problem you're trying to solve, what *you* mean by "surrogate pairs" and how you'd like Lucene to use them. The lack of these details forces any responder to guess, almost certainly wrongly.

Best
Erick

On Wed, Mar 10, 2010 at 6:52 PM, Yuta Kawadai wrote:
> Hi
>
> Can Lucene