D.L.B. wrote:
> Given that this is the case, I don't think it's possible to come up
> with a solution that will cover every case. That said, I believe it
> is still worthwhile to try to do something reasonable to cover most
> cases.
>
> The company I work for has public text searchable websites in the
> following languages: English, Danish, Spanish, French, Dutch,
> Norwegian, Finnish, and Swedish. The approach we took, as I
> mentioned in an earlier mail, was to only stem prefix and "suffix"
> queries (of the form *someText). In these cases, don't pass the
> wildcard character to the stemmer and only use the stemmed result if
> it is a single word.
> [...]
> It turns out that this wildcard policy works well for us -- the users
> tend to get the results they expect. Whatever solution falls out of
> this argument, I just wanted to mention what is working for us. I'm
> thinking that adding a suffix term notion, parallel to prefix term in
> QueryParser.jj, creating subclassable methods to handle these, maybe
> providing a subclass that performs the imperfect stemming solution
> mentioned above, might be enough to please a lot of users.
This might be the way to go. Perhaps we could extend this and provide a
special flag such as "%men*", or simply let users enclose the query in
double quotes, to prevent automatic stemming. That way one would also be
able to find "meningitis".

Christoph
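
To make the discussion concrete, here is a minimal sketch of the wildcard policy described in the quoted message, plus the opt-out flag suggested in the reply. It is only an illustration under stated assumptions: the class WildcardStemPolicy, the method rewrite, the constant NO_STEM_FLAG and the toy stemmer are all hypothetical names, none of this is existing QueryParser behaviour, and a real stemmer would take the place of TOY_STEMMER.

```java
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Lucene: decides how a single query term
// is rewritten before it is turned into a prefix or "suffix" query.
final class WildcardStemPolicy {

    // Assumed opt-out marker taken from the "%men*" suggestion; purely illustrative.
    static final char NO_STEM_FLAG = '%';

    // Toy stand-in for a real stemmer: strips one trailing "s" from longer words.
    static final UnaryOperator<String> TOY_STEMMER =
            w -> (w.length() > 3 && w.endsWith("s")) ? w.substring(0, w.length() - 1) : w;

    static String rewrite(String term, UnaryOperator<String> stemmer) {
        if (term.isEmpty()) {
            return term;
        }
        // Opt-out: "%men*" is passed through as "men*" without stemming.
        if (term.charAt(0) == NO_STEM_FLAG) {
            return term.substring(1);
        }
        boolean prefix = term.endsWith("*");   // e.g. men*
        boolean suffix = term.startsWith("*"); // e.g. *men
        // Only pure prefix or pure suffix terms are stemmed.
        if (prefix == suffix) {
            return term;
        }
        // Do not pass the wildcard character to the stemmer.
        String body = prefix ? term.substring(0, term.length() - 1) : term.substring(1);
        if (body.isEmpty() || body.indexOf('*') >= 0 || body.indexOf('?') >= 0) {
            return term;
        }
        String stemmed = stemmer.apply(body);
        // Use the stemmed result only if it is a single word.
        if (stemmed == null || !stemmed.matches("\\S+")) {
            return term;
        }
        return prefix ? stemmed + "*" : "*" + stemmed;
    }

    public static void main(String[] args) {
        System.out.println(rewrite("cars*", TOY_STEMMER));  // car*
        System.out.println(rewrite("*cars", TOY_STEMMER));  // *car
        System.out.println(rewrite("%cars*", TOY_STEMMER)); // cars*
        System.out.println(rewrite("c*rs", TOY_STEMMER));   // c*rs
    }
}
```

Terms with a wildcard in the middle, or with wildcards at both ends, are deliberately left untouched here, since the quoted policy only covers prefix and suffix queries. Treating a quoted query as unstemmed would need a separate check in the parser and is not shown.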
