Hello,

You could use a synonym analyzer, but that means a lot of work for
whoever has to maintain the synonym list. You would probably also want
multi-word synonyms (something for this was added to the Lucene trunk
recently, but IMO it is not really usable ATM), which makes it much
harder.

org.apache.lucene.analysis.compound might work better, though you
need a good (extensive) Dutch dictionary, see [1]. I am not sure exactly
how it works, so I can't help you out there. You might be able to
extract the dictionary from the Lucene index terms themselves. The only
problem I can imagine is that the indexed terms are stemmed words (for
example 'dier' instead of 'dieren').
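
To give an idea of what dictionary-based decompounding does, here is a
minimal sketch in plain Java. It is not the Lucene implementation: the
class CompoundSketch, the method subwords and the tiny dictionary are
made up for illustration, and I am only assuming the real filter works
along similar lines. It emits every dictionary word found inside a token,
so that the parts can be indexed next to the original word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: a brute-force illustration of
// dictionary-based decompounding.
class CompoundSketch {

    // Returns every dictionary word that occurs as a substring of the token,
    // considering only substrings of minSubwordSize..maxSubwordSize characters.
    static List<String> subwords(String token, Set<String> dictionary,
                                 int minSubwordSize, int maxSubwordSize) {
        String lower = token.toLowerCase(Locale.ROOT);
        List<String> parts = new ArrayList<>();
        for (int start = 0; start + minSubwordSize <= lower.length(); start++) {
            for (int len = minSubwordSize;
                 len <= maxSubwordSize && start + len <= lower.length();
                 len++) {
                String candidate = lower.substring(start, start + len);
                if (dictionary.contains(candidate)) {
                    parts.add(candidate);
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Assumed toy dictionary; a real one would be a full Dutch word list.
        Set<String> dictionary = Set.of("dier", "genees", "middel", "middelen");
        // prints [dier, genees, middel, middelen]
        System.out.println(subwords("diergeneesmiddelen", dictionary, 3, 15));
    }
}
```

If those parts are indexed together with 'diergeneesmiddelen', a plain
search for 'genees' matches without any wildcards. The quality depends
entirely on the dictionary, which is why a stemmed term list is a
questionable source for it.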

Anyway, you might just want to play with it a little. The synonym approach
doesn't seem feasible to me, and either way it will be a lot of work. Try
to temper your customer's expectations, because they might be too high
(you do not want to end up having to implement LSI (latent semantic
indexing) kinds of things, where a search for Tiger Woods is a search
about golf, not about tigers and wood).

You might also want to take a look at n-grams. They help you find
similar hits.
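
As a rough illustration of why n-grams help here, the sketch below splits
a term into character trigrams in plain Java. The class NGramSketch and
the method ngrams are hypothetical names, and this is not how Lucene's
own n-gram tokenizers are wired into an analyzer; it only shows the idea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits a term into
// overlapping character n-grams.
class NGramSketch {

    // Returns all substrings of length n, in order of their start position.
    // A term shorter than n yields an empty list.
    static List<String> ngrams(String term, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= term.length(); i++) {
            grams.add(term.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        List<String> query = ngrams("genees", 3);
        List<String> indexed = ngrams("geneesmiddelen", 3);
        // prints [gen, ene, nee, ees]
        System.out.println(query);
        // prints true: every trigram of the query occurs in the indexed word
        System.out.println(indexed.containsAll(query));
    }
}
```

Because every trigram of 'genees' also occurs in 'geneesmiddelen', a
query on the trigrams finds the compound. The price is a larger index
and more false positives, since unrelated words can share trigrams too.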


[1]
http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/analysis/compound/DictionaryCompoundWordTokenFilter.html

> 
> Hi Ard,
>  
> I have a question about lucene and word parts:
>  
> If you look for the Dutch word 'genees', Lucene won't find 
> products with the words 'geneesmiddelen' or 'diergeneesmiddelen' in them. 
> Is it possible (without using "*XXXX*" in the search) to find 
> word parts in a product like the example above?
> Any advice will do :)
>  
> PS. I thought of using the synonym analyzer or 
> org.apache.lucene.analysis.compound. 
>  
> Thanks in advance,
>  
> Best regards,
> Amon
> /********************************************
> Hippocms-dev: Hippo CMS development public mailinglist
> 
> Searchable archives can be found at:
> MarkMail: http://hippocms-dev.markmail.org
> Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
> 
> 