DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=27182>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=27182

           Summary: Thai Analysis Enhancement
           Product: Lucene
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Enhancement
          Priority: Other
         Component: Analysis
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

Unlike many other languages, Thai does not have clear word boundaries within a sentence. Words are written consecutively without delimiters. The Lucene StandardTokenizer currently cannot tokenize a Thai sentence and returns the whole sentence as a single token. A special tokenizer that breaks Thai sentences into words is required.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
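As one possible approach (not part of the original report), Thai word segmentation can be sketched with the JDK's dictionary-based `java.text.BreakIterator` for the Thai locale; the class name `ThaiWordSegmenter` below is hypothetical, and a real Lucene tokenizer would additionally implement the `Tokenizer`/`TokenStream` API:

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ThaiWordSegmenter {

    // Splits Thai text into words using the JDK's dictionary-based
    // word BreakIterator for the Thai locale. Whitespace-only
    // segments are dropped so only word tokens are returned.
    public static List<String> segment(String text) {
        BreakIterator it = BreakIterator.getWordInstance(new Locale("th"));
        it.setText(text);
        List<String> words = new ArrayList<String>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE;
             start = end, end = it.next()) {
            String word = text.substring(start, end).trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Example sentence written, as usual in Thai, with no spaces
        // between words; each segment prints on its own line.
        for (String w : segment("ภาษาไทย")) {
            System.out.println(w);
        }
    }
}
```

A Lucene tokenizer built on this idea would emit each returned segment as a `Token` with its start/end offsets, letting downstream filters (lowercasing, stop words) work unchanged.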