You can always decompose because QueryParser will also decompose and will do-the-right-thing (internal using a PhraseQuery - don't hurt me, Robert).
Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -----Original Message----- > From: Wulf Berschin [mailto:bersc...@dosco.de] > Sent: Wednesday, January 26, 2011 5:07 PM > To: java-user@lucene.apache.org > Subject: Re: Highlight Wildcard Queries: Scores > > Hallo Uwe, > > yes, thanks for the hint, that sounds good, but it seems to me I would then > need more fields for all our search modes: > > Now we have the fields "contents" without stoppwords and with stemming > and "contents-unstemmed" whithout stemming. > > The search options are: > - whole word (search "contents", no asterisks are being added before > search) > - exact match (search "contents-unstemmed", implies whole word) > > When decomposition comes into play I will need a third field "contents- > undecomposed" (sorry) to perform the whole word search. > Furthermore the contents-unstemmed should not be decomposed as well. > > Would you still prefer this approach? > > Viele Grüße aus Heidelberg > Wulf > > > > > > > Am 26.01.2011 16:00, schrieb Uwe Schindler: > > Hi Wulf, > > > > You should consider decompounding! There are filters based on > > dictionaries that support decompounding german words. Its a > > TokenFilter to be put into your analysis chain. > > There is a simple Lucene-Rule: Whenever you need wildcards think about > > your analysis, you probably did something wrong :-) Add stemming, > > decompounding, synonyms,... > > > > Uwe > > > > ----- > > Uwe Schindler > > H.-H.-Meier-Allee 63, D-28213 Bremen > > http://www.thetaphi.de > > eMail: u...@thetaphi.de > > > > > >> -----Original Message----- > >> From: Wulf Berschin [mailto:bersc...@dosco.de] > >> Sent: Wednesday, January 26, 2011 3:56 PM > >> To: java-user@lucene.apache.org > >> Subject: Re: ****SPAM(5.0)**** Re: Highlight Wildcard Queries: Scores > >> > >> Hi Erick, > >> > >> good points, but: > >> > >> our index is fed with german text. In german (in contrast to english) > > nouns > >> are just appended to create new words. E.g. > >> > >> Kaffee > >> Kaffeemaschine > >> Kaffeemaschinensatzbehälter > >> > >> In our scenario standard fulltext search on "Maschine" shall present > >> all > > of > >> these nouns. That's why we add * before and after on each term. > >> > >> Of course we provide an option "full words only" which finds none of > > these. > >> > >> Since we do not wrap * around words shorter than 4 characters we > >> weren't yet faced with the too many clauses exception. > >> > >> Greetings > >> Wulf > >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org