Re: autocomplete with multiple terms

karl wettin Thu, 22 Feb 2007 02:17:12 -0800


On 22 Feb 2007, at 10:09, Martin Braun wrote:

The only thing I have found on the list so far concerning this subject
is http://issues.apache.org/jira/browse/LUCENE-625, but I'm not sure if
it does what I want.

I am not sure we get enough queries to build a search index based
on user queries.

If the content of your corpus is static enough, then time is the friend that will enable you to gather enough user queries to build the suggestion data set.

Otherwise you have to produce simulated user queries by reducing your data set to the most common information. Perhaps Markov chains, or the top n paths of terms found with Dijkstra or similar, could be an easy way out. You can also start looking at the documents people choose to inspect, and use these as the base for phrase training.
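
As a rough illustration of the Markov chain idea, here is a minimal sketch that counts which term follows which in the corpus and then walks the most frequent transitions to produce a simulated query. The function names (build_bigram_model, simulate_query) and the whitespace tokenization are assumptions made for this example, not anything from Lucene or LUCENE-625; a real setup would use the same analyzer as the index.

```python
from collections import Counter, defaultdict


def build_bigram_model(documents):
    """Count how often each term is followed by each other term."""
    transitions = defaultdict(Counter)
    for text in documents:
        # Assumption: naive lowercase/whitespace tokenization.
        terms = text.lower().split()
        for current, following in zip(terms, terms[1:]):
            transitions[current][following] += 1
    return transitions


def simulate_query(transitions, start, length=3):
    """Greedy walk: at each step follow the most frequent successor."""
    phrase = [start]
    while len(phrase) < length:
        successors = transitions.get(phrase[-1])
        if not successors:
            break  # no known continuation for this term
        phrase.append(successors.most_common(1)[0][0])
    return " ".join(phrase)


corpus = [
    "apache lucene search",
    "apache lucene index",
    "apache lucene search engine",
]
model = build_bigram_model(corpus)
print(simulate_query(model, "apache"))
```

The greedy walk only yields the single most likely phrase per start term. Keeping the top n successors at each step instead would give the "top n paths" variant.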

I think you will get further by considering this from a behavioral psychology angle rather than as a problem of how to access the corpus. Also, navigating a reduced data set (such as the trie in LUCENE-625, compared to the corpus it makes suggestions for) will save you a lot of system resources.
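
To show why a reduced data set is cheap to navigate, here is a minimal sketch of a character trie over previously seen queries, with suggestions ranked by how often each query occurred. This is not the LUCENE-625 implementation; the class names (TrieNode, SuggestionTrie) and the ranking rule are assumptions made for the example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.count = 0      # times a query ending at this node was seen


class SuggestionTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, query, count=1):
        """Record a query, e.g. one taken from the search log."""
        node = self.root
        for ch in query.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def suggest(self, prefix, limit=5):
        """Return up to `limit` stored queries starting with `prefix`."""
        prefix = prefix.lower()
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # nothing stored under this prefix
        found = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.count:
                found.append((current.count, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Most frequent first, ties broken alphabetically.
        found.sort(key=lambda item: (-item[0], item[1]))
        return [text for _, text in found[:limit]]


trie = SuggestionTrie()
trie.add("lucene search", 5)
trie.add("lucene index", 3)
trie.add("lucid", 1)
print(trie.suggest("luc"))
```

A lookup only touches the nodes under the typed prefix, so its cost depends on the number of stored queries sharing that prefix rather than on the size of the corpus.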


Hope this helps some.

--
karl





