Hello all,

I'm using lucene to search through a couple of documents to find interesting documents. Most documents are in Dutch language. I saw that the default snowball stemmer wasn't doing well on text written in a foreign language. Lucky i found a Dutch text analyzer in de lucene sandbox project.

I've read the javadoc and found out it needs a stemdictionary. You can load this dictionary with the following function:
DutchAnalyzer.setStemDictionary(File f)


The format needs to be a tab separator list (word [tab] stem).

To be sure i do everything correctly i've got a question about the dictonary:
Can i just get:
<http://snowball.tartarus.org/dutch/diffs.txt>
and convert it to a tab separated list and then "feed" it to the setStemDictionary() function?


Kind regards,
Twan Kogels




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to