Hi Yuta,
Are you looking for a specific analyzer like CJKAnalyzer, or are you
looking for token filters like LowerCaseFilter etc.?
A fair number of the token filters in trunk have already been converted
to handle surrogate pairs correctly. If you need help figuring out how
to use stuff from trunk, I'm happy to help.
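
For reference, here is a minimal sketch in plain Java of what "handling
surrogate pairs correctly" means for a filter. This is not Lucene code; the
class and method names are made up for illustration. The idea is to walk the
text by code point rather than by char, so that a pair is never split and
lengths are counted in characters rather than UTF-16 units.

```java
public class SurrogatePairSketch {

    // Number of Unicode code points, as opposed to UTF-16 char units.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Lowercase one code point at a time so a surrogate pair is never split.
    static String lowerCaseByCodePoint(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);                 // reads a full pair if present
            out.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);              // advance by 1 or 2 chars
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 'A' followed by U+20B9F, a CJK ideograph outside the BMP.
        String s = "A\uD842\uDF9F";
        System.out.println(s.length());          // UTF-16 units
        System.out.println(codePointLength(s));  // code points
        System.out.println(lowerCaseByCodePoint(s).equals("a\uD842\uDF9F"));
    }
}
```

A char-by-char loop would see the two halves of the pair as separate
"characters", which is where offsets and lengths usually go wrong.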

simon

On Fri, Mar 12, 2010 at 5:27 AM, Yuta Kawadai <[email protected]> wrote:
> Thank you.
>
> I currently use my own Analyzer, which is based on "MeCab" (an open source
> Japanese morphological analyzer).
> I will try to modify it to support surrogate pairs.
>
> And I'm looking forward to the next release!
>
> Yuta
>
> 2010/3/11 Robert Muir <[email protected]>:
>> On Wed, Mar 10, 2010 at 6:52 PM, Yuta Kawadai <[email protected]> wrote:
>>> Hi
>>>
>>> Can Lucene handle surrogate pairs (and their term positions or lengths)?
>>>
>>> Thanks,
>>> Yuta
>>
>> Yes, just make sure you use an Analyzer that supports them.
>> Unfortunately, most of the ones included with released versions of
>> Lucene (e.g. CJKAnalyzer) will not do the right thing. Hopefully they
>> will in the next release.
>>
>> --
>> Robert Muir
>> [email protected]
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>>
>
