Hi,
I have now played around with the Snowball Porter stemmer, and it definitely
feels really good (I used German2 as suggested).
For some cases (e.g. product types like top/tops, bermuda/bermudas or
hoody/hoodies) we additionally need synonyms. At first I thought it
would be good to use synonyms only
Hi Daniel,
thanks for your suggestions; being able to export a large synonyms.txt
sounds very good!
Thanks, cheers,
Martin
On Wed, 2007-10-10 at 23:38 +0200, Daniel Naber wrote:
On Wednesday 10 October 2007 12:00, Martin Grotzke wrote:
Basically I see two options: stemming and the usage of
Martin Grotzke wrote:
Try the SnowballPorterFilterFactory with German2 as the language attribute
first, and use synonyms for compound words, e.g. Herrenhose = Herren,
Hose.
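
For readers who want to try this, here is a minimal sketch of how the suggestion might look in a Solr schema.xml. Only SnowballPorterFilterFactory with language="German2" and the use of a synonyms file come from this thread. The field type name text_de, the tokenizer, the lowercasing step, and the order of the filters are assumptions made for illustration.

```xml
<!-- Hypothetical field type; the name "text_de" is not from the thread. -->
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: synonyms are applied before stemming, so synonyms.txt
         holds unstemmed, lowercase words. -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>
```

If the synonym filter were placed after the stemmer instead, the entries in synonyms.txt would have to be written in their stemmed form.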
so you use a combined approach?
Yes, we define the relevant parts of compound words (keywords only) as
synonyms
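
As a rough illustration of this combined approach, the following Python sketch turns a hand-maintained table of compound words into lines in the Solr synonyms.txt explicit-mapping format (`a => b, c`). The function name compound_synonym_lines and the table layout are hypothetical; only the Herrenhose example comes from the thread. It also assumes that matching is done on lowercase tokens.

```python
def compound_synonym_lines(compounds):
    """Build synonyms.txt lines of the form 'compound => compound, part1, part2'."""
    lines = []
    for compound, parts in sorted(compounds.items()):
        # Keep the compound itself on the right-hand side so exact matches still work.
        targets = [compound] + list(parts)
        lines.append(f"{compound.lower()} => {', '.join(t.lower() for t in targets)}")
    return lines


# Only Herrenhose comes from the thread; a real table would be maintained by hand.
compounds = {"Herrenhose": ["Herren", "Hose"]}
lines = compound_synonym_lines(compounds)
print("\n".join(lines))
```

The resulting lines could then be written to the synonyms.txt file that the synonym filter reads.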
Hello,
in our application we have the issue that we get different
results for singular and plural searches (German language).
E.g. for hose we get 1,000 documents back, but for hosen
we get 10,000 docs. The same applies to t-shirt or t-shirts,
or e.g. hut and hüte - lots of cases :)
This is
in short: use stemming
Try the SnowballPorterFilterFactory with German2 as the language attribute
first, and use synonyms for compound words, e.g. Herrenhose = Herren,
Hose.
With stemming you may get some interesting results, but it
is much better to live with them than to have no or
On Wednesday 10 October 2007 12:00, Martin Grotzke wrote:
Basically I see two options: stemming and the usage of synonyms. Are
there others?
A large list of German words and their forms is available from a Windows
program called Morphy
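
A word-form list like this could be turned into a synonyms.txt along the following lines. This is only a sketch: the input format (one "form<TAB>lemma" pair per line) is an assumption made for illustration and is not the documented Morphy export format, and the function name forms_to_synonym_lines is hypothetical. Each output line is a comma-separated group of equivalent words, which is the plain equivalence format of Solr's synonyms.txt.

```python
from collections import defaultdict


def forms_to_synonym_lines(rows):
    """Group word forms by lemma and return one 'a, b, c' line per group."""
    groups = defaultdict(set)
    for row in rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comments
        form, lemma = row.split("\t")
        # Store the lemma together with each of its inflected forms.
        groups[lemma.lower()].update({form.lower(), lemma.lower()})
    # A group with a single word carries no synonym information, so drop it.
    return [", ".join(sorted(forms))
            for _, forms in sorted(groups.items()) if len(forms) > 1]


# Sample rows built from the examples in this thread (assumed input format).
rows = ["Hosen\tHose", "Hüte\tHut", "Hut\tHut", "Tops\tTop"]
synonym_lines = forms_to_synonym_lines(rows)
print("\n".join(synonym_lines))
```

Irregular plurals such as hut/hüte are the cases where a generated list like this may help most, since they are the ones a stemmer is most likely to miss.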