Hi Fredrik,

Thanks for your reply :) It is true that you can recommend the top-n most popular queries on each indexed field. See this example: http://www.business.com/index.asp?p=true (please select the "Job" tab).
However, I think betherebesquare.com is a bit different. My goal is to recommend a multi-field query of the form <what><when><where>, and I think a recommended query of that kind is quite meaningful. Is that possible in Nutch? Or is there something I should refer to?

/Jack

On 12/12/05, Fredrik Andersson <[EMAIL PROTECTED]> wrote:
> Hi again, Jack.
>
> I don't see the problem with saving separate statistics for each field in
> your query. In my applications, I pass the query string down to the
> statistics index prior to QueryParser, i.e. I just save "foo bar", not
> "field1:foo field1:bar field2:foo field2:bar". If you have something
> like betherebesquare.com, it shouldn't be a problem to tuck the different
> fields (name, date and location) into three statistical indices and do a
> simultaneous (threaded) lookup on the three when getting a new query, to
> make suggestions.
> Speaking from experience, you might want to separate the working copy and
> the live copy of this statistical index, since you will sometimes want
> exclusive read access to the live index without someone writing to it
> (and locking it). Each low-traffic period, copy the built-up statistical
> index, optimize() it, and replace the current live index with the new
> copy.
>
> Good luck,
> Fredrik
>
> On 12/12/05, Jack Tang <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > The approach is great for a single query field. How about multiple
> > fields? Say I want to make some recommendations (or show hot searches)
> > for the event search engine http://betherebesquare.com/ .
> >
> > Any great thoughts?
> >
> > /Jack
> >
> > On 9/29/05, Fredrik Andersson <[EMAIL PROTECTED]> wrote:
> > > Hi Jack!
> > >
> > > I like these things to be driven by statistics rather than by the
> > > content of the index. If you run a search engine and want any kind
> > > of feedback, you will at least save all queries entered. You can
> > > store these in an index or database, and run a Levenshtein metric on
> > > the potentially misspelled query. If my memory serves me right, a
> > > Lucene FuzzyQuery uses this metric, so a good approach would be to
> > > keep a Lucene index with |query,frequency| tuples (updated nightly,
> > > weekly, or whatever), simply search this index with a FuzzyQuery
> > > with some defined similarity, and pick the most frequent query as
> > > the suggestion.
> > >
> > > Fredrik
> > >
> > > On 9/29/05, Jack Tang <[EMAIL PROTECTED]> wrote:
> > > > Hi
> > > >
> > > > I really like Google's "Did you mean", and I notice that Nutch
> > > > does not currently provide this function.
> > > >
> > > > In this article, http://today.java.net/lpt/a/211 , author Tim
> > > > White implemented suggestions using n-grams to generate a
> > > > suggestion index. Do you think that is a good fit for Nutch? I
> > > > mean, the index in Nutch will be really huge. Or should we just
> > > > provide some dictionaries, like jazzy (LGPL) does?
> > > >
> > > > Thanks
> > > > /Jack

--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
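
To make the scheme in the quoted thread concrete, here is a rough sketch of the idea as I read it: keep one |query,frequency| store per field, match the incoming value against it with a Levenshtein distance, and suggest the most frequent close match. It is plain Python rather than Lucene or Nutch code. The names `FieldStats`, `record_query`, `suggest_query`, the field names and the `max_distance` threshold are all made up for illustration, and the linear scan stands in for what a FuzzyQuery against a real index would do. The lookups run one after another here instead of in threads.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


class FieldStats:
    """Hypothetical stand-in for one per-field statistics index."""

    def __init__(self) -> None:
        self.freq = Counter()  # normalized value -> times it was queried

    def record(self, value: str) -> None:
        self.freq[value.strip().lower()] += 1

    def suggest(self, value: str, max_distance: int = 2):
        """Return the most frequent close match, or None if there is no
        close match that is more popular than the value as typed."""
        value = value.strip().lower()
        candidates = [(q, n) for q, n in self.freq.items()
                      if q != value and levenshtein(q, value) <= max_distance]
        if not candidates:
            return None
        best, count = max(candidates, key=lambda c: c[1])
        return best if count > self.freq.get(value, 0) else None


FIELDS = ("what", "when", "where")  # assumed field names


def new_stats() -> dict:
    """One independent statistics store per field."""
    return {field: FieldStats() for field in FIELDS}


def record_query(stats: dict, query: dict) -> None:
    """Log each non-empty field of a query into that field's store."""
    for field in FIELDS:
        if query.get(field):
            stats[field].record(query[field])


def suggest_query(stats: dict, query: dict) -> dict:
    """Build a suggested multi-field query, keeping a field unchanged
    when its store has nothing better to offer."""
    return {field: stats[field].suggest(query[field]) or query[field]
            for field in FIELDS if query.get(field)}
```

With a log where "jazz concert" / "boston" were searched often and "jaz concert" / "bostn" once, `suggest_query(stats, {"what": "jaz concert", "where": "bostn"})` would come back with the popular spellings for both fields. Whether combining per-field suggestions like this gives a meaningful <what><when><where> recommendation, as opposed to logging whole multi-field queries together, is exactly the open question above.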
