You could create a custom analyzer that splits compound words into their parts. That is, applying the analyzer to the word "bergbahn" would yield the terms "berg" and "bahn".
Splitting compound words can be done quite effectively simply by using a large wordlist. I have done this for Swedish.
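As a rough illustration of the wordlist idea, here is a minimal sketch in plain Java. It is not the code I used for Swedish and it does not touch the Lucene API; the class name CompoundSplitter, the split method and the minimum part length of 3 are all assumptions made up for this example. It tries the longest known word first and backtracks if the remainder cannot be split.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: splits a compound word
// into parts that all occur in a given wordlist.
final class CompoundSplitter {

    // Assumption: parts shorter than this are ignored to avoid noise.
    private static final int MIN_PART_LENGTH = 3;

    // Returns the parts if the whole word decomposes into two or more
    // wordlist entries, otherwise a single-element list with the word.
    static List<String> split(String word, Set<String> wordlist) {
        String lower = word.toLowerCase(Locale.ROOT);
        List<String> parts = new ArrayList<>();
        if (decompose(lower, 0, wordlist, parts) && parts.size() > 1) {
            return parts;
        }
        return Collections.singletonList(lower);
    }

    // Backtracking search: try the longest candidate first, then
    // recurse on the rest of the word; undo the choice on failure.
    private static boolean decompose(String word, int start,
                                     Set<String> wordlist, List<String> parts) {
        if (start == word.length()) {
            return true;
        }
        for (int end = word.length(); end >= start + MIN_PART_LENGTH; end--) {
            String candidate = word.substring(start, end);
            if (wordlist.contains(candidate)) {
                parts.add(candidate);
                if (decompose(word, end, wordlist, parts)) {
                    return true;
                }
                parts.remove(parts.size() - 1);
            }
        }
        return false;
    }
}
```

With a wordlist containing "berg" and "bahn", CompoundSplitter.split("bergbahn", wordlist) gives the two parts, which the analyzer could then emit as terms next to the original word. The sketch ignores linking elements such as the German "s" between parts, so a real wordlist-based splitter would need to handle those as well.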
/magnus
Tino Schöllhorn wrote:
Hi,
I have a problem which I'd like to understand - and perhaps it is also possible to solve it ;-).
I built an index using Lucene with the GermanAnalyzer. Now I have the following phenomenon:
- when searching for "bahn" the result contains hardly any "bergbahn"
I am aware that the Lucene query API supports wildcards, but as far as I know I cannot add a * in front of a query term.
Do you have any suggestions how I could find "bergbahn" with the query "bahn"? (this applies to other compound words as well).
With regards Tino
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
