RE: Highlight Wildcard Queries: Scores

Uwe Schindler Wed, 26 Jan 2011 08:34:32 -0800

You can always decompose, because QueryParser will decompose the query terms
as well and will do the right thing (internally using a PhraseQuery - don't
hurt me, Robert).
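
To illustrate what decompounding at index time buys you, here is a minimal plain-Java sketch. It is a toy and not Lucene code: the class name, the three-word dictionary and the minimum subword length are assumptions made up for the example, and it does not model how QueryParser combines the parts of a decomposed query term. It only shows the idea that the original token stays in the index together with the dictionary words found inside it, so a plain term query for "Maschine" hits "Kaffeemaschine" without any wildcards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class DecompoundSketch {

    // Hypothetical mini dictionary; a real one would be a full German word list.
    static final Set<String> DICTIONARY =
            new HashSet<>(Arrays.asList("kaffee", "maschine", "satz"));

    // Shortest subword that is considered (an assumed tuning value).
    static final int MIN_SUBWORD_LENGTH = 4;

    /** Returns the lowercased word followed by every dictionary entry found inside it. */
    static List<String> decompound(String word) {
        String lower = word.toLowerCase(Locale.GERMAN);
        List<String> terms = new ArrayList<>();
        terms.add(lower); // keep the original token so whole-word matches still work
        for (int start = 0; start < lower.length(); start++) {
            for (int end = start + MIN_SUBWORD_LENGTH; end <= lower.length(); end++) {
                String candidate = lower.substring(start, end);
                if (DICTIONARY.contains(candidate) && !terms.contains(candidate)) {
                    terms.add(candidate);
                }
            }
        }
        return terms;
    }

    /** True if a plain term query (no wildcards) would hit the indexed word. */
    static boolean matches(String queryTerm, String indexedWord) {
        return decompound(indexedWord).contains(queryTerm.toLowerCase(Locale.GERMAN));
    }

    public static void main(String[] args) {
        System.out.println(decompound("Kaffeemaschine"));
        System.out.println(matches("Maschine", "Kaffeemaschine"));
        System.out.println(matches("Maschine", "Kaffee"));
    }
}
```

In a real analysis chain the same job would be done by a dictionary-based compound TokenFilter rather than by hand-written code like this.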


Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Wulf Berschin [mailto:bersc...@dosco.de]
> Sent: Wednesday, January 26, 2011 5:07 PM
> To: java-user@lucene.apache.org
> Subject: Re: Highlight Wildcard Queries: Scores
> 
> Hello Uwe,
> 
> yes, thanks for the hint, that sounds good, but it seems to me I would then
> need more fields for all our search modes:
> 
> Now we have the fields "contents" without stopwords and with stemming
> and "contents-unstemmed" without stemming.
> 
> The search options are:
> - whole word (search "contents", no asterisks are being added before
> search)
> - exact match (search "contents-unstemmed", implies whole word)
> 
> When decomposition comes into play I will need a third field
> "contents-undecomposed" (sorry) to perform the whole word search.
> Furthermore, the contents-unstemmed field should not be decomposed either.
> 
> Would you still prefer this approach?
> 
> Best regards from Heidelberg
> Wulf
> 
> On 26.01.2011 16:00, Uwe Schindler wrote:
> > Hi Wulf,
> >
> > You should consider decompounding! There are filters based on
> > dictionaries that support decompounding German words. It's a
> > TokenFilter to be put into your analysis chain.
> > There is a simple Lucene rule: whenever you need wildcards, think about
> > your analysis; you probably did something wrong :-) Add stemming,
> > decompounding, synonyms,...
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> >> -----Original Message-----
> >> From: Wulf Berschin [mailto:bersc...@dosco.de]
> >> Sent: Wednesday, January 26, 2011 3:56 PM
> >> To: java-user@lucene.apache.org
> >> Subject: Re: ****SPAM(5.0)**** Re: Highlight Wildcard Queries: Scores
> >>
> >> Hi Erick,
> >>
> >> good points, but:
> >>
> >> our index is fed with German text. In German (in contrast to English)
> >> nouns are just appended to create new words. E.g.
> >>
> >> Kaffee
> >> Kaffeemaschine
> >> Kaffeemaschinensatzbehälter
> >>
> >> In our scenario a standard full-text search on "Maschine" shall present
> >> all of these nouns. That's why we add * before and after each term.
> >>
> >> Of course we provide an option "full words only" which finds none of
> >> these.
> >>
> >> Since we do not wrap * around words shorter than 4 characters, we
> >> have not yet run into the too many clauses exception.
> >>
> >> Greetings
> >> Wulf
> >>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org


