> From: Valeriy Felberg
>
> If you want that query "jacke" matches a document containing the word
> "windjacke" or "kinderjacke", you could use a custom update processor.
> This processor could search the indexed text for words matching the
> pattern ".*jacke" and inject the word "jacke" into an additional field
> which you can search against. You would need a whole list of possible
> suffixes, of course.

Thanks, Valeriy - I agree on the feasibility of such an approach. The
list would likely have to be composed of the most frequently used terms
for your specific domain.

In our case, it's things people would buy in shops. Reducing overly
complicated and convoluted product descriptions to proper basic terms -
that would do the job. It's like going to a restaurant boasting fancy
and unintelligible names for the dishes you may order when they are
really just ordinary stuff like pork and potatoes.

Thinking some more about it, giving sufficient boost to the attached
category data might also do the job. That would shift the burden of
supplying proper semantics to the guys doing the categorization.

> It would slow down the update process but you don't need to split
> words during search.
>
> On 12 Apr 2012 at 11:52, Michael Ludwig wrote:
>
>> Given an input of "Windjacke" (probably "wind jacket" in English),
>> I'd like the code that prepares the data for the index (tokenizer
>> etc) to understand that this is a "Jacke" ("jacket") so that a
>> query for "Jacke" would include the "Windjacke" document in its
>> result set.

A query for "Windjacke" or "Kinderjacke" would probably not have to be
de-specialized to "Jacke" because, well, that's the user input and users
looking for specific things are probably doing so for a reason. If no
matches are found you can still tell them to just broaden their search.

Michael
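
For illustration, the suffix-injection idea quoted above could be
sketched roughly as follows. This is plain Python, not an actual Solr
update processor; the field names ("name", "base_terms"), the function
names and the term list are all hypothetical, and a real list would have
to be curated for the domain.

```python
import re

# Hypothetical list of basic terms for a shop catalogue (an assumption).
BASE_TERMS = ["jacke", "hose", "schuh"]

_WORD = re.compile(r"\w+")


def base_terms_for(text, base_terms=BASE_TERMS):
    """Return the base terms that end a longer word in `text`.

    Terms are returned in order of first appearance, without duplicates.
    A word equal to the base term itself is skipped, since it is already
    searchable as-is.
    """
    found = []
    for word in _WORD.findall(text.lower()):
        for term in base_terms:
            if word.endswith(term) and len(word) > len(term) and term not in found:
                found.append(term)
    return found


def enrich(doc, source_field="name", target_field="base_terms"):
    """Return a copy of `doc` with the injected terms in an extra field."""
    enriched = dict(doc)
    enriched[target_field] = base_terms_for(doc.get(source_field, ""))
    return enriched
```

The extra field would then be searched alongside the original text, so
nothing needs to be split at query time. Note that naive suffix matching
also fires on "Handschuh" (glove) for "schuh" (shoe), which is one reason
the list needs care.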