[ 
http://issues.apache.org/jira/browse/LUCENE-503?page=comments#action_12377071 ] 

Daniel Naber commented on LUCENE-503:
-------------------------------------

Thanks for your contribution. We're currently preparing Lucene 2.0, and since 
feature updates are only planned for the release after 2.0, it will take some 
more time to integrate this. 

Two remarks:

- It uses the English stop words. Does that make sense for a Thai analyzer?
- Could you write some test cases, perhaps similar to those for the French 
  analyzer?


> Contrib: ThaiAnalyzer to enable Thai full-text search in Lucene
> ---------------------------------------------------------------
>
>          Key: LUCENE-503
>          URL: http://issues.apache.org/jira/browse/LUCENE-503
>      Project: Lucene - Java
>         Type: New Feature
>   Components: Analysis
>     Versions: 1.4
>     Reporter: Samphan Raruenrom
>  Attachments: ThaiAnalyzer.java, ThaiWordFilter.java
>
> Thai text doesn't have spaces between words. Usually, a dictionary-based 
> algorithm is used to break a string into words. For Lucene to be usable for 
> Thai, an Analyzer that knows how to break Thai words is needed.
> I've implemented such an Analyzer, ThaiAnalyzer, using ICU4J's 
> DictionaryBasedBreakIterator for word breaking. I'll upload the code later.
> I'm normally a C++ programmer and very new to Java. Please review the code 
> for any problems. One possible problem is that it requires ICU4J. I don't know 
> whether this is OK.
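
For readers unfamiliar with break-iterator-based word splitting, the idea in
the description above can be sketched roughly as follows. This is not the
attached ThaiAnalyzer code: it uses the JDK's java.text.BreakIterator as a
stand-in for ICU4J's DictionaryBasedBreakIterator, and the class name
WordBreakSketch and the method words() are made up for illustration. Whether
the JDK actually segments Thai text depends on the locale data shipped with
the runtime, so no particular Thai output is claimed here.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: split text into words with a locale-aware BreakIterator.
// Assumption: the JDK iterator stands in for ICU4J's dictionary-based one.
public class WordBreakSketch {

    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        // Walk consecutive boundaries; each [start, end) span is one segment.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            // Keep only segments containing a letter or digit, dropping
            // whitespace and punctuation runs.
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Text with spaces: boundaries fall at the whitespace.
        System.out.println(words("hello world", Locale.ENGLISH));
        // Thai text without spaces: the result depends on the runtime's Thai data.
        System.out.println(words("ภาษาไทยไม่มีช่องว่าง", Locale.forLanguageTag("th")));
    }
}
```

In a Lucene analyzer, segments like these would be emitted as tokens by a
filter sitting behind a tokenizer, which appears to be the role of the
attached ThaiWordFilter.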

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


