On Tuesday 26 July 2005 22:14, Valliappan Annamalai wrote:

> I would like to work on "Design and build code that will combine
> information on prefixes and suffixes".

The thesaurus would benefit from code that can find the base form of any 
word, e.g. walked -> walk, children -> child. This could be plugged into 
the existing thesaurus code easily; it's basically just one method like 
"getBaseform(String)". Of course it would need to support several 
languages. Some languages are very irregular, and this also needs to be 
handled efficiently.
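
A minimal sketch of what such a method might look like, assuming English 
only. The class name "BaseformFinder", the exception table and the suffix 
rules are all illustrative assumptions, not part of the existing thesaurus 
code. Irregular forms are checked first with a single hash lookup; regular 
forms fall through to a short list of suffix rules.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not taken from the existing thesaurus code. */
final class BaseformFinder {

    // Irregular forms (assumed sample entries): one hash lookup per word.
    private static final Map<String, String> IRREGULAR_EN = new HashMap<>();
    static {
        IRREGULAR_EN.put("children", "child");
        IRREGULAR_EN.put("mice", "mouse");
        IRREGULAR_EN.put("went", "go");
    }

    // Illustrative English suffix rules: {suffix, replacement}, tried in order.
    private static final String[][] SUFFIX_RULES_EN = {
        {"ies", "y"},
        {"ing", ""},
        {"ed", ""},
        {"s", ""},
    };

    static String getBaseform(String word) {
        String w = word.toLowerCase(Locale.ROOT);

        // 1. Exception table for irregular words.
        String irregular = IRREGULAR_EN.get(w);
        if (irregular != null) {
            return irregular;
        }

        // 2. Suffix stripping; keep at least three letters of stem
        //    so that short words such as "red" are left alone.
        for (String[] rule : SUFFIX_RULES_EN) {
            String suffix = rule[0];
            int stemLength = w.length() - suffix.length();
            if (w.endsWith(suffix) && stemLength >= 3) {
                return w.substring(0, stemLength) + rule[1];
            }
        }

        // 3. No rule matched: treat the word as its own base form.
        return w;
    }
}
```

Supporting several languages would presumably mean one exception table and 
one rule list per language, selected by a locale argument. Rules this 
simple will get many words wrong (e.g. "running" becomes "runn"), so a 
real implementation would need a larger dictionary or better rules.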

BTW, an easier thing to start with would be to check how the thesaurus code 
can be modified so it supports UTF-8. A standalone version of the 
thesaurus code is available at 
http://lingucomponent.openoffice.org/thesaurus.html

Regards
 Daniel

-- 
http://www.danielnaber.de
