Hi all,

I am currently working on a Japanese analyzer and on patches for the
Lucene tools. I have already written a tokenizer that uses a Japanese
morphological analyzer internally, because Japanese text is not
separated by whitespace. I plan to put them in the Lucene sandbox, but
the patches for the tools will take more time.

From: Peter Carlson <[EMAIL PROTECTED]>
Subject: Call for features in next release
Date: Mon, 20 May 2002 11:19:15 -0700
Message-ID: <[EMAIL PROTECTED]>
> I would like be able to have the features that we are planning on having for
> the next release.

I would like to request the following:

1, Selecting a language-specific analyzer according to a locale.

At present we have to rewrite parts of the Lucene code in order to use
a different analyzer. It would be useful to be able to select an
analyzer without touching the code.
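
A rough sketch of the idea, in present-day Java: a small registry that
maps a locale's language code to an analyzer factory and falls back to
a default. The class AnalyzerRegistry and its methods are hypothetical
names for illustration only, not existing Lucene API, and the type
parameter A stands in for whatever analyzer type the tools use.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical registry, not part of Lucene. A is a placeholder for the
// analyzer type (for example org.apache.lucene.analysis.Analyzer).
public class AnalyzerRegistry<A> {
    // Factories keyed by ISO 639 language code, e.g. "ja".
    private final Map<String, Supplier<A>> byLanguage = new HashMap<>();
    // Used when no factory is registered for the requested language.
    private final Supplier<A> fallback;

    public AnalyzerRegistry(Supplier<A> fallback) {
        this.fallback = fallback;
    }

    // Register a factory for the language of the given locale.
    public void register(Locale locale, Supplier<A> factory) {
        byLanguage.put(locale.getLanguage(), factory);
    }

    // Create an analyzer for the locale, ignoring country and variant.
    public A forLocale(Locale locale) {
        return byLanguage.getOrDefault(locale.getLanguage(), fallback).get();
    }
}
```

With something like this, a tool could call forLocale(Locale.getDefault())
and the choice of analyzer would become a matter of registration rather
than of editing the tool's source.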

2, Adding an "-encoding" option and encoding-aware methods to the tools.

The current tools need minor changes to work in a Japanese (or other
language) environment: adding an "-encoding" option with an argument,
using Reader/Writer classes instead of InputStream/OutputStream
classes, etc.
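
As a minimal sketch of the kind of change I mean (present-day Java,
standard library only): read the charset name that follows the proposed
"-encoding" option and wrap the byte stream in a Reader that decodes
with it. The class EncodingOption and its method names are hypothetical;
no existing Lucene tool is assumed to have this option.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper, not part of Lucene.
public class EncodingOption {
    // Return the charset named after "-encoding" in args,
    // or the platform default when the option is absent.
    public static Charset parseEncoding(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-encoding".equals(args[i])) {
                return Charset.forName(args[i + 1]);
            }
        }
        return Charset.defaultCharset();
    }

    // Decode bytes into characters explicitly instead of
    // passing the raw InputStream around.
    public static Reader open(InputStream in, Charset charset) {
        return new BufferedReader(new InputStreamReader(in, charset));
    }

    // Read the whole Reader into a String.
    public static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

A tool would then be run with, say, "-encoding EUC-JP" on a Japanese
system, provided the Java runtime supports that charset.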

Thanks,

Kazuhiro Kazama ([EMAIL PROTECTED])       NTT Network Innovation Laboratories

