You could create a custom analyzer that splits compound words into
their parts. That is, applying the analyzer to the word "bergbahn"
would yield the terms "berg" and "bahn".

Splitting compound words can be done quite effectively simply by using
a large wordlist. I have done this for Swedish.
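
A minimal sketch of wordlist-based splitting, in plain Java with no
Lucene dependency. The class name CompoundSplitter, the method split,
and the minimum part length of 3 are assumptions made for illustration;
this is not the code used for Swedish, and wiring it into a custom
analyzer is not shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class CompoundSplitter {
    // Assumed lower bound on part length, to avoid matching tiny fragments.
    private static final int MIN_PART_LENGTH = 3;

    /**
     * Returns the parts of a compound if the whole word can be covered by
     * two or more wordlist entries; otherwise returns the lowercased word
     * as a single-element list.
     */
    public static List<String> split(String word, Set<String> wordlist) {
        String lower = word.toLowerCase(Locale.ROOT);
        List<String> parts = new ArrayList<>();
        if (splitFrom(lower, 0, wordlist, parts) && !parts.isEmpty()) {
            return parts;
        }
        return Collections.singletonList(lower);
    }

    // Backtracking search: try the longest wordlist match at each position.
    private static boolean splitFrom(String word, int start,
                                     Set<String> wordlist, List<String> parts) {
        if (start == word.length()) {
            return true; // every character is covered by some part
        }
        for (int end = word.length(); end >= start + MIN_PART_LENGTH; end--) {
            if (start == 0 && end == word.length()) {
                continue; // the whole word is not a split
            }
            String candidate = word.substring(start, end);
            if (wordlist.contains(candidate)) {
                parts.add(candidate);
                if (splitFrom(word, end, wordlist, parts)) {
                    return true;
                }
                parts.remove(parts.size() - 1); // dead end, try a shorter part
            }
        }
        return false;
    }
}
```

With a wordlist containing "berg" and "bahn", split("bergbahn", wordlist)
gives the parts "berg" and "bahn". The sketch does not handle linking
elements between parts (such as the "s" in some German compounds), and
you would probably want to index the parts in addition to the original
term rather than instead of it.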

/magnus


Tino Schöllhorn wrote:

Hi,

I have a problem which I'd like to understand - and perhaps it is also possible to solve it ;-).

I built an index using Lucene with the GermanAnalyzer. Now I have the following phenomenon:

- when searching for "bahn", the results contain hardly any documents with "bergbahn"

I am aware that the Lucene query API supports wildcards, but as far as I know I cannot put a * in front of a query term.

Do you have any suggestions how I could find "bergbahn" with the query "bahn"? (The same applies to other compound words.)

With regards
Tino


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


