Try the version from SVN; I just applied Cheolgoo's patch.

Otis

--- Youngho Cho <[EMAIL PROTECTED]> wrote:

> Hello,
>
> Is there any plan to add this patch to Lucene core?
> I am using CJKAnalyzer, but I hope to switch to the StandardAnalyzer.
>
> Thanks,
>
> Youngho
>
> ----- Original Message -----
> From: "Cheolgoo Kang (JIRA)" <[EMAIL PROTECTED]>
> To: <java-dev@lucene.apache.org>
> Sent: Tuesday, October 04, 2005 11:26 PM
> Subject: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean characters
>
> > StandardTokenizer loses Korean characters
> > -----------------------------------------
> >
> >          Key: LUCENE-444
> >          URL: http://issues.apache.org/jira/browse/LUCENE-444
> >      Project: Lucene - Java
> >         Type: Bug
> >   Components: Analysis
> >     Reporter: Cheolgoo Kang
> >     Priority: Minor
> >
> > When StandardAnalyzer, specifically its StandardTokenizer, is used
> > with a Korean text stream, StandardTokenizer ignores the Korean
> > characters. This is because the definition of the CJK token in the
> > StandardTokenizer.jj JavaCC file does not cover the range of Korean
> > syllables described in the Unicode character map.
> >
> > This patch adds one line, 0xAC00~0xD7AF (the Korean syllables
> > range), to the StandardTokenizer.jj code.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
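
For anyone who wants to check whether their text falls in the range mentioned in the quoted report, here is a minimal Java sketch. It only tests characters against 0xAC00~0xD7AF, the bounds quoted from LUCENE-444; it is not the JavaCC grammar change itself, and the class and method names (HangulRangeCheck, isHangulSyllable, countHangulSyllables) are hypothetical and not part of Lucene.

```java
public class HangulRangeCheck {
    // Bounds taken from the LUCENE-444 description: 0xAC00 to 0xD7AF.
    static final char HANGUL_START = '\uAC00';
    static final char HANGUL_END = '\uD7AF';

    /** Returns true if c lies inside the 0xAC00..0xD7AF range (inclusive). */
    static boolean isHangulSyllable(char c) {
        return c >= HANGUL_START && c <= HANGUL_END;
    }

    /** Counts how many chars of text fall inside the range. */
    static int countHangulSyllables(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (isHangulSyllable(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Two Hangul syllables (U+D55C, U+AE00) followed by ASCII text.
        String sample = "\uD55C\uAE00 Lucene";
        System.out.println(countHangulSyllables(sample)); // prints 2
    }
}
```

If the count is nonzero for your input, those are the characters a tokenizer without this range would be expected to drop, going by the report above.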