Hi,

Dave Kor wrote:
> 
> > Hm, a dictionary solution, I think.
> 
> FYI, I am currently working on a dictionary library
> (based on Lucene of course) for tokenizing and
> stemming the Chinese language. From what has been
> mentioned in this thread, it may be useful for the
> German language too.

Sure. I have written a German stemmer. It works dictionary-independent,
but resolving compound words is not possible without a dictionary.

The stemming should remain dictionary-independent (for the German
language), but a dictionary for preprocessing compound nouns would
greatly improve search precision.
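
To illustrate what such a preprocessing step could look like, below is a
minimal sketch; it is not taken from my stemmer. The class name
CompoundSplitter and everything else in it are made up for this example,
and the dictionary is assumed to be a plain set of lower-cased German
base words. It tries dictionary prefixes, longest first, recurses on the
remainder, and also skips a linking "s" (Fugen-s) between parts.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CompoundSplitter {

    // Assumption: the dictionary holds lower-cased German base words.
    private final Set<String> dictionary;

    // Reject very short fragments to avoid spurious splits.
    private static final int MIN_PART_LENGTH = 3;

    public CompoundSplitter(Set<String> dictionary) {
        this.dictionary = dictionary;
    }

    // Returns the parts of a compound noun, or null if the word
    // cannot be fully decomposed into dictionary entries.
    public List<String> split(String word) {
        word = word.toLowerCase();
        if (dictionary.contains(word)) {
            List<String> parts = new ArrayList<String>();
            parts.add(word);
            return parts;
        }
        // Try every dictionary prefix, longest first, then recurse
        // on the remainder, e.g. "Fussballspiel" -> fussball + spiel.
        for (int i = word.length() - MIN_PART_LENGTH; i >= MIN_PART_LENGTH; i--) {
            String head = word.substring(0, i);
            if (dictionary.contains(head)) {
                String rest = word.substring(i);
                List<String> tail = split(rest);
                // Skip a linking "s" (Fugen-s) between parts,
                // e.g. "Arbeitsamt" -> arbeit + amt.
                if (tail == null && rest.startsWith("s")) {
                    tail = split(rest.substring(1));
                }
                if (tail != null) {
                    List<String> parts = new ArrayList<String>();
                    parts.add(head);
                    parts.addAll(tail);
                    return parts;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<String>();
        dict.add("fussball");
        dict.add("spiel");
        System.out.println(new CompoundSplitter(dict).split("Fussballspiel"));
        // prints [fussball, spiel]
    }
}

Each recovered part could then be stemmed and indexed alongside the
compound, so that a query for "Spiel" also matches documents containing
"Fussballspiel".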

> I must warn you that it is not complete, bug-tested,
> nor full-featured. I don't really have the time to
> work on it for now because I'm busy with another
> project. Would anyone like to try adapting it for
> German or any other language?

Send me the sources; I'll see if I can use it for the German
stemmer.


Gerhard
