Hi,

Dmitry Serebrennikov wrote:
> Stemming by itself couldn't solve this problem, it seems, because I don't think it
> is designed for splitting compound words. Yet, this seems like a common issue that
> people would run into constantly. So I was wondering:
> - Do German stemmers typically split compound words as well as chopping them down to
> a root form?

No, not typically, but it can be implemented.

> - Does this processing require dictionary-based approaches, or are there enough clues
> in the word structure to allow words to be split algorithmically (a la Porter
> stemmer)?

It's not possible without a dictionary. There are some rules for how
words are compounded, but no general rule that is valid for all compounds.
And there are many traps.
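
To illustrate, here is a minimal sketch of a dictionary-based splitter. It is not how any particular stemmer or search engine does it. The tiny LEXICON, the LINKERS list of linking elements, and the function name split_compound are all assumptions made up for this example; a real system would need a full German word list.

```python
# Hypothetical toy lexicon; a real system would load a full German word list.
LEXICON = {"donau", "dampf", "schiff", "arbeit", "amt",
           "wach", "stube", "wachs", "tube"}

# Assumed, non-exhaustive list of linking elements allowed between parts.
# The empty string means the parts are joined directly.
LINKERS = ("", "s", "es", "n", "en")


def split_compound(word, lexicon=LEXICON, min_len=3):
    """Return every split of `word` into lexicon entries.

    Parts may be joined directly or by one of LINKERS. The linking
    element itself is dropped from the returned parts.
    """
    word = word.lower()
    results = []

    def walk(start, parts):
        # Try every lexicon entry that begins at position `start`.
        for end in range(start + min_len, len(word) + 1):
            head = word[start:end]
            if head not in lexicon:
                continue
            if end == len(word):
                # The last part reaches the end of the word: a full split.
                results.append(parts + [head])
                continue
            for linker in LINKERS:
                nxt = end + len(linker)
                # A linker must be followed by at least one more part.
                if word.startswith(linker, end) and nxt < len(word):
                    walk(nxt, parts + [head])

    walk(0, [])
    return results


print(split_compound("Donaudampfschiff"))  # [['donau', 'dampf', 'schiff']]
print(split_compound("Arbeitsamt"))        # [['arbeit', 'amt']]
print(split_compound("Wachstube"))         # several candidate splits
```

The last call shows one of the traps: "Wachstube" can be read as Wach + Stube or as Wachs + Tube, and the lexicon alone cannot tell which one is meant. The sketch returns all candidates and leaves the choice to the caller.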

> - How is this problem typically solved, in terms of smaller search engines and in
> terms of the Yahoos and Googles of the German landscape?

Hm, a dictionary solution, I think. Or pattern matching.

> Thanks very much for any information to help with this!
> - Dmitry

Greets,
Gerhard

_______________________________________________
Lucene-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/lucene-users
