Hi,
Dmitry Serebrennikov wrote:
> Stemming by itself couldn't solve this problem, it seems, because I don't think it
> is designed for splitting compound words. Yet, this seems like a common issue that
> people would run into constantly. So I was wondering:
> - Do German stemmers typically split compound words as well as chopping them down to
> a root form?
No, not typically, but it can be implemented.
> - Does this processing require dictionary-based approaches or are there enough clues
> in the word structure to allow words to be split algorithmically (à la Porter
> stemmer)?
It's not possible without a dictionary. There are some rules for how
certain words are compounded, but no general rule that holds for all
compounds. And there are many traps.
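
To make the dictionary idea concrete, here is a minimal sketch in Python. It is only an illustration, not what any particular stemmer or search engine does: the word list, the list of linking elements, the minimum part length and the function name `split_compound` are all assumptions made up for this example.

```python
from typing import List, Optional, Set

# Tiny illustrative word list (an assumption); a real system would load a
# full German lexicon instead.
LEXICON: Set[str] = {
    "donau", "dampf", "schiff", "fahrt",
    "arbeit", "amt", "kinder", "garten",
}

# A few common linking elements that can sit between the parts of a
# compound (assumed list, not exhaustive). The empty string means "no linker".
LINKERS = ("", "s", "es", "n", "en")

# Parts shorter than this are ignored to avoid meaningless splits.
MIN_PART = 3


def split_compound(word: str, lexicon: Set[str] = LEXICON) -> Optional[List[str]]:
    """Return the dictionary words that make up `word`, or None if none is found."""
    word = word.lower()
    if word in lexicon:
        return [word]
    # Try the longest possible head first, then shorter ones.
    for end in range(len(word) - MIN_PART, MIN_PART - 1, -1):
        head = word[:end]
        if head not in lexicon:
            continue
        rest = word[end:]
        # Optionally skip a linking element before splitting the remainder.
        for linker in LINKERS:
            if rest.startswith(linker):
                tail = split_compound(rest[len(linker):], lexicon)
                if tail is not None:
                    return [head] + tail
    return None


print(split_compound("Donaudampfschifffahrt"))
print(split_compound("Arbeitsamt"))
```

The sketch returns the first split it finds, which is exactly where the traps come in: a word like "Staubecken" can be read as Stau + Becken or as Staub + Ecken, and the dictionary alone cannot tell which one is meant.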
> - How is this problem typically solved, in terms of smaller search engines and in
> terms of Yahoos and Googles of the German landscape?
Hm, a dictionary solution, I think. Or pattern matching.
> Thanks very much for any information to help with this!
> - Dmitry
Greets,
Gerhard
_______________________________________________
Lucene-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/lucene-users