I'm interested to this analyzer.. it had escaped me and solves an old problem!
Could you report about its usage:
- did you have to feed words in a dictionary?
- does anyone have user-measures already?
... and the last question for the research fun: is there any approach towards preferring Überwachunggesetz as a concept than, say, Fleischüberwachung? (again, that could be based on a dictionary probably).

thanks in advance

paul


Le 21-oct.-09 à 04:00, Robert Muir a écrit :

hi, it will work because it will also decompound "Rindfleish" into Rind and
fleish, with posIncr=0

so if you index Rindfleischüberwachungsgesetz, then query with "Rindfleish", its matching because Rindfleish also gets decompounded into Rind and fleish.

On Tue, Oct 20, 2009 at 8:35 PM, Benjamin Douglas
<bbdoug...@basistech.com>wrote:

Hello,

I've found a number of posts in different places talking about how to
perform decompounding, but I haven't found too many discussing how to use the results of decompounding. If anyone can answer this question or point me
to an existing discussion it would be very helpful.

In the description of the org.apache.lucene.analysis.compound package, it
gives the following example:

      Rindfleischüberwachungsgesetz, 0, 29
      Rind, 0, 4, posIncr=0
      fleisch, 4, 11, posIncr=0
      überwachung, 11, 22, posIncr=0
      gesetz, 23, 29, posIncr=0

And I see how this allows me to find single components such as "gesetz" or
"Rind". But what if I want to find combinations of components such as
"Rindfleisch" or "überwachungsgesetz"? It seems that the pattern of using posIncr=0 for all components excludes the possibility of finding sub-strings
that are made up of multiple components.

Any comments or thoughts would be appreciated.

Ben Douglas

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org




--
Robert Muir
rcm...@gmail.com

Attachment: smime.p7s
Description: S/MIME cryptographic signature

Reply via email to