On Thu, Dec 18, 2003 at 05:59:34PM +0100, Viparthi, Kiran (AFIS) wrote:
> We want to provide "did you mean" search suggestions on our search results
> pages. Most of the "did you mean" searches will be derived from synonyms,
> translations and other information from our ontology(KAON).

Just a comment; I'm not really answering the questions you ask.

It would seem that frequency and, possibly, spelling play a big role in
Google's "did you mean" strategy. For example, when I search for
"lucen index java", it suggests "lucent index java" rather than
"lucene index java", which is what I was actually looking for, because
"lucent" shows up much more often than "lucene". For the most part this
strategy seems to work fine.

I know that it's not too hard to get a list of words and their frequencies,
but I'm not sure what the performance implications would be. Also, you'd
need to figure out a misspelling strategy, but I suspect that's not too hard.

>
> 1. It would be nice to be able to navigate the Query object created by the
> QueryParser.parse(String) and modify the Query expanding certain clauses
> prior to calling Query.toString() to create the "did you mean" searches.
> This would require accessor methods to navigate the query clauses and
> methods to actually change the Query. These do not appear to be present in
> the current API. To our minds the inferior alternative is to modify the
> QueryParser itself to do the expansion and build in an expand/nonexpand
> instruction into the QueryParser grammar. Does anyone have better ideas?
>
> 2. A related issue is that we are basically happy with the standard Lucene
> QueryParser though we need to make some minor changes to the grammar. In
> this case it would be convenient to create an equivalent of the
> Query.toString() method to serialize conforming to new grammar outside of
> the Query class. The problem here is there don't appear to be enough
> accessor methods in the Query classes to write a new X.toString(Query).
>
> Richard and Kiran
>

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com
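
A minimal sketch of the frequency-plus-spelling idea described above, in plain Java and without any Lucene calls. It assumes a term-to-frequency map is already available (the counts below are made up for illustration) and uses Levenshtein edit distance as the misspelling strategy, which is an assumption since the post does not name one. The class and method names (`DidYouMean`, `suggest`, `editDistance`) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: not part of Lucene or of any existing API.
class DidYouMean {

    /** Levenshtein edit distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the word unchanged if it is a known term. Otherwise returns
     * the most frequent known term within maxDistance edits, or the word
     * itself when no such term exists.
     */
    static String suggest(String word, Map<String, Integer> termFrequencies,
                          int maxDistance) {
        if (termFrequencies.containsKey(word)) {
            return word;
        }
        String best = word;
        int bestFreq = 0;
        for (Map.Entry<String, Integer> e : termFrequencies.entrySet()) {
            if (e.getValue() > bestFreq
                    && editDistance(word, e.getKey()) <= maxDistance) {
                best = e.getKey();
                bestFreq = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Illustrative, made-up frequencies.
        Map<String, Integer> freqs = new HashMap<>();
        freqs.put("lucent", 5000);
        freqs.put("lucene", 300);
        freqs.put("index", 9000);
        freqs.put("java", 8000);

        StringBuilder sb = new StringBuilder();
        for (String w : "lucen index java".split(" ")) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(suggest(w, freqs, 1));
        }
        System.out.println(sb); // prints "lucent index java"
    }
}
```

With these counts, "lucen" maps to "lucent" rather than "lucene": both are one edit away and "lucent" is the more frequent term, which mirrors the behaviour described above. The scan is linear in the size of the vocabulary for every unknown word, so the performance concern is real for a large index.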
