Hi Fredrik,

Thanks for your reply :) It is true that you can recommend the top-n most
popular queries for each indexed field. See this example:
http://www.business.com/index.asp?p=true (please select the "Job" tab).
However, I think betherebesquare.com is a bit different: my goal is to
recommend <what><when><where>, i.e. a complete multi-field query. I think
that kind of recommended query is quite meaningful. Is that possible in
Nutch? Or is there something I should refer to?

/Jack

On 12/12/05, Fredrik Andersson <[EMAIL PROTECTED]> wrote:
> Hi again, Jack.
>
> I don't see the problem with saving separate statistics for each field in
> your query. In my applications, I pass the query string down to the
> statistics index prior to QueryParser, i.e. I just save "foo bar", not
> "field1:foo field1:bar field2:foo field2:bar". If you have something
> like betherebesquare.com, it shouldn't be a problem to tuck the different
> fields (name, date and location) into three statistical indices and do a
> simultaneous (threaded) lookup on the three when getting a new query, to
> make suggestions.
> Speaking from experience, you might want to separate the working copy and
> the live copy of this statistical index, since you will want to have
> exclusive read access to the live index without someone writing to it
> (and locking it) from time to time. Each low-traffic period, copy the
> built-up statistical index, optimize() it, and replace the current live
> index with the new copy.
>
> Good luck,
> Fredrik
>
>
> On 12/12/05, Jack Tang <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > The approach is great for a single query field. How about multiple
> > fields? Say I want to make some recommendations (or show hot searches)
> > for the event search engine http://betherebesquare.com/ .
> >
> > Any great thoughts?
> >
> > /Jack
> >
> > On 9/29/05, Fredrik Andersson <[EMAIL PROTECTED]> wrote:
> > > Hi Jack!
> > >
> > > I like these things to be driven by statistics rather than by the
> > > content of the index. If you run a search engine, and want any kind
> > > of feedback, you will at least save all queries entered. You can
> > > store these in an index or database, and run a Levenshtein metric on
> > > the, potentially misspelled, query.
> > > If my memory serves me right, a Lucene FuzzyQuery uses this metric,
> > > so a good approach would be to keep a Lucene index with
> > > |query,frequency| tuples (updated nightly, weekly, or whatever), and
> > > simply search this index with a FuzzyQuery with some defined
> > > similarity, and pick the most frequent query for suggestion.
> > >
> > > Fredrik
> > >
> > > On 9/29/05, Jack Tang <[EMAIL PROTECTED]> wrote:
> > > > Hi
> > > >
> > > > I really like Google's "Did you mean", and I notice that Nutch
> > > > does not currently provide this function.
> > > >
> > > > In this article http://today.java.net/lpt/a/211 , author Tim White
> > > > implemented suggestions using n-grams to generate a suggestion
> > > > index. Do you think that is a good fit for Nutch? I mean, the index
> > > > in Nutch will be really huge. Or should it just provide some
> > > > dictionaries like jazzy (LGPL) does?
> > > >
> > > > Thanks
> > > > /Jack
> > > > --
> > > > Keep Discovering ... ...
> > > > http://www.jroller.com/page/jmars

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
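
For illustration, the |query,frequency| idea from Fredrik's 9/29/05 message can be sketched in plain Java. This is only a rough sketch under stated assumptions: an in-memory Map stands in for the Lucene statistics index, a hand-written Levenshtein distance stands in for FuzzyQuery, and the names QuerySuggester, suggest and maxDistance are hypothetical, not part of Nutch or Lucene.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: an in-memory stand-in for the statistics index.
final class QuerySuggester {

    /** Levenshtein edit distance (insert, delete, substitute), two-row DP. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Among logged queries within maxDistance edits of the input (the input
     * itself excluded), returns the one with the highest logged frequency.
     */
    static Optional<String> suggest(String query,
                                    Map<String, Integer> queryLog,
                                    int maxDistance) {
        String best = null;
        int bestFreq = -1;
        for (Map.Entry<String, Integer> e : queryLog.entrySet()) {
            String candidate = e.getKey();
            if (candidate.equals(query)) {
                continue; // never suggest what the user already typed
            }
            if (levenshtein(query, candidate) <= maxDistance
                    && e.getValue() > bestFreq) {
                best = candidate;
                bestFreq = e.getValue();
            }
        }
        return Optional.ofNullable(best);
    }
}
```

A call such as QuerySuggester.suggest("lucnee", log, 2) scans every logged query, so it is only reasonable for a small log; a real statistics index would let the fuzzy search narrow the candidates first.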

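
The multi-field lookup described in Fredrik's 12/12/05 reply (one statistics index per field, queried simultaneously) might look roughly like the following. Again this is a sketch under assumptions: each per-field index is modelled as a plain Map of term to frequency, the field names "what", "when" and "where" are taken from the question, and HotQueries, topN and suggestPerField are made-up names rather than Nutch or Lucene API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: one in-memory frequency map per query field.
final class HotQueries {

    /** The n most frequent terms of one field, most frequent first (ties by term). */
    static List<String> topN(Map<String, Integer> stats, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(stats.entrySet());
        entries.sort((x, y) -> {
            int byFreq = Integer.compare(y.getValue(), x.getValue());
            return byFreq != 0 ? byFreq : x.getKey().compareTo(y.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    /** Looks up every field's statistics on its own thread and gathers the results. */
    static Map<String, List<String>> suggestPerField(
            Map<String, Map<String, Integer>> statsByField, int n) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, statsByField.size()));
        try {
            // Submit one lookup per field so they run concurrently.
            Map<String, Future<List<String>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, Map<String, Integer>> e : statsByField.entrySet()) {
                Map<String, Integer> stats = e.getValue();
                pending.put(e.getKey(), pool.submit(() -> topN(stats, n)));
            }
            // Collect results in the original field order.
            Map<String, List<String>> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<String>>> e : pending.entrySet()) {
                try {
                    result.put(e.getKey(), e.getValue().get());
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(ex);
                } catch (ExecutionException ex) {
                    throw new IllegalStateException(ex.getCause());
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

Combining the per-field results into a single <what><when><where> suggestion is left open here; keeping a fourth map keyed on the whole query string would be one way to rank complete combinations, but that goes beyond what the thread describes.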