Hi Daniel,
On 09/22/2008 at 12:49 AM, Daniel Noll wrote:
I have a question about Korean tokenisation. Currently there
is a rule in StandardTokenizerImpl.jflex which looks like this:
ALPHANUM = ({LETTER}|{DIGIT}|{KOREAN})+
LUCENE-1126 https://issues.apache.org/jira/browse/LUCENE-1126
Steven A Rowe wrote:
Korean has been treated differently from Chinese and Japanese since
LUCENE-461 https://issues.apache.org/jira/browse/LUCENE-461. The
grouping of Hangul with digits was introduced in this issue.
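
To make the effect of that grouping concrete, here is a rough sketch in plain Java of what the ALPHANUM rule quoted above does to mixed Hangul/digit text. It uses java.util.regex rather than JFlex, and the three character classes are simplified assumptions for illustration only (ASCII letters, ASCII digits, and the Hangul Syllables and Hangul Jamo blocks); the real definitions in StandardTokenizerImpl.jflex may differ. The class name AlphanumSketch and the tokens() helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlphanumSketch {
    // Assumed, simplified stand-ins for the grammar's macros (not copied from the .jflex file).
    private static final String LETTER = "A-Za-z";
    private static final String DIGIT = "0-9";
    private static final String KOREAN = "\\uAC00-\\uD7AF\\u1100-\\u11FF";

    // Mirrors the shape of: ALPHANUM = ({LETTER}|{DIGIT}|{KOREAN})+
    private static final Pattern ALPHANUM =
            Pattern.compile("[" + LETTER + DIGIT + KOREAN + "]+");

    // Returns every maximal run of letters, digits, and Hangul as one token.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = ALPHANUM.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // "\uD55C\uAD6D\uC5B4" is the three Hangul syllables of "Korean (language)".
        // Because Hangul and digits share one rule, the syllables and "2008"
        // come out as a single token, followed by "lucene".
        System.out.println(tokens("\uD55C\uAD6D\uC5B42008 lucene"));
    }
}
```

With these assumed classes, the Hangul run and the digits that follow it are emitted as one token rather than two, which is the behaviour under discussion.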
Certainly I found LUCENE-461 during my search, and certainly grouping