Dutch Analyzer dictionary format?

Twan Kogels Fri, 26 Nov 2004 01:38:37 -0800

Hello all,

I'm using Lucene to search through a set of documents to find the interesting ones. Most of the documents are written in Dutch. I saw that the default Snowball stemmer wasn't doing well on non-English text. Luckily I found a Dutch text analyzer in the Lucene sandbox project.

I've read the javadoc and found out that it needs a stem dictionary. You can load this dictionary with the following method: DutchAnalyzer.setStemDictionary(File f)

The format needs to be a tab-separated list (word [tab] stem), one entry per line.

To be sure I do everything correctly, I have a question about the dictionary: can I just take <http://snowball.tartarus.org/dutch/diffs.txt>, convert it to a tab-separated list, and then "feed" it to the setStemDictionary() method?
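
To make the question concrete, here is a rough sketch of the conversion I have in mind. It assumes that each line of diffs.txt holds exactly two whitespace-separated columns (the word, then its stem) and that the file is UTF-8 encoded; I have not verified either assumption. The class name StemDictConverter and its methods are made up for this example and are not part of Lucene. It uses java.nio.file, so it needs Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene: rewrites a two-column,
// whitespace-separated word list as "word<TAB>stem" lines.
final class StemDictConverter {

    // Converts one input line to "word<TAB>stem".
    // Returns null when the line does not have exactly two columns
    // (for example a blank line), so the caller can skip it.
    static String toTabSeparated(String line) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length != 2) {
            return null;
        }
        return cols[0] + "\t" + cols[1];
    }

    // Reads the input file line by line and writes the converted entries.
    // Returns the number of entries written.
    // Assumption: both files are UTF-8; change the charset if the
    // downloaded file turns out to use a different encoding.
    static int convert(Path in, Path out) throws IOException {
        int written = 0;
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String entry = toTabSeparated(line);
                if (entry != null) {
                    writer.write(entry);
                    writer.newLine();
                    written++;
                }
            }
        }
        return written;
    }

    // Usage: java StemDictConverter <input file> <output file>
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: StemDictConverter <input> <output>");
            return;
        }
        int count = convert(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Wrote " + count + " entries");
    }
}
```

The resulting file would then be handed to DutchAnalyzer.setStemDictionary(File f), provided the analyzer really accepts a dictionary produced this way, which is exactly what I would like to have confirmed.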

Kind regards, Twan Kogels

