Hi all,

I am currently working on a Japanese analyzer and on patches for the Lucene tools. I have already written a tokenizer that uses a Japanese morphological analyzer internally, because Japanese text is not separated by whitespace. I plan to contribute them to the Lucene sandbox, but the patches for the tools will take more time.
From: Peter Carlson <[EMAIL PROTECTED]>
Subject: Call for features in next release
Date: Mon, 20 May 2002 11:19:15 -0700
Message-ID: <[EMAIL PROTECTED]>

> I would like be able to have the features that we are planning on having for
> the next release.

I would like to request the following:

1. Selecting a language-specific analyzer according to a locale.
   At present we have to rewrite parts of the Lucene code in order to use
   another analyzer. It would be useful to be able to select an analyzer
   without touching the code.

2. Adding an "-encoding" option and encoding-sensitive methods to the tools.
   The current tools need minor changes to work in a Japanese (or other
   non-English) environment: adding an "-encoding" option and its argument,
   using Reader/Writer classes instead of InputStream/OutputStream classes,
   etc.

Thanks,
Kazuhiro Kazama ([EMAIL PROTECTED])
NTT Network Innovation Laboratories
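
To illustrate request 2, here is a minimal sketch of how a command-line tool might honor an "-encoding" option by wrapping its input in a Reader. It is not the actual patch: the class name EncodingOption and the helpers parseEncoding, openReader, and readAll are hypothetical, and falling back to the platform default encoding when the option is absent is an assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper class; none of these names come from Lucene itself.
public class EncodingOption {

    /** Returns the value following "-encoding", or the platform default if absent (assumed fallback). */
    static String parseEncoding(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-encoding".equals(args[i])) {
                return args[i + 1];
            }
        }
        return Charset.defaultCharset().name();
    }

    /** Wraps a byte stream in a Reader that decodes with the given encoding. */
    static Reader openReader(InputStream in, String encoding) throws IOException {
        // Throws UnsupportedEncodingException (an IOException) for unknown names.
        return new BufferedReader(new InputStreamReader(in, encoding));
    }

    /** Reads all characters from the Reader into a String. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String encoding = parseEncoding(args);
        // Sample text ("nihongo" in kanji), written with escapes so the
        // source file itself stays ASCII.
        String sample = "\u65e5\u672c\u8a9e";
        byte[] bytes = sample.getBytes(encoding);
        // Decode the bytes through a Reader instead of handling raw bytes.
        String decoded = readAll(openReader(new ByteArrayInputStream(bytes), encoding));
        System.out.println(encoding + ": round trip ok = " + sample.equals(decoded));
    }
}
```

A tool built this way would be run as, for example, `java EncodingOption -encoding EUC-JP`. Doing the decoding once, where the stream is opened, means the rest of the tool works on characters and does not need to know which encoding the files use.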
