Search engine results, including Nutch's, are based on scores. There is an
index-time score and a query-time score that get combined (multiplied)
to produce a final score. Documents are returned by descending score by default.
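
As a toy illustration of that combination (this is not Nutch's actual
scoring code; the URLs, field names, and score values are made up for the
example), the final score is the product of the two parts, and hits are
sorted by it in descending order:

```python
# Toy model of the scoring described above: final = index-time * query-time.
# All URLs and numbers are invented for illustration.

def final_score(index_score: float, query_score: float) -> float:
    """Combine the index-time and query-time scores by multiplication."""
    return index_score * query_score

docs = [
    {"url": "http://example.com/a", "index_score": 1.0, "query_score": 0.5},
    {"url": "http://example.com/b", "index_score": 2.0, "query_score": 0.4},
    {"url": "http://example.com/c", "index_score": 0.5, "query_score": 0.9},
]

# Highest final score first, mirroring the default "score desc" ordering.
ranked = sorted(
    docs,
    key=lambda d: final_score(d["index_score"], d["query_score"]),
    reverse=True,
)
ranked_urls = [d["url"] for d in ranked]
```

Here page b wins because its larger index-time score outweighs its weaker
query match, which is why changing either factor can reorder the results.
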
For what you are asking, you could write a MapReduce job that runs through
a list of common queries on the major search engines, saves the highest
scoring pages, and then adds those scores to the Nutch documents at index
time. If you want to do it dynamically, you could run your query on the
other search engines first and then use a query plugin to boost certain
URLs, determined from the other search engines, at query time.
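
A minimal sketch of the boosting idea, assuming you already have the
ordered list of URLs another engine returned for the query. The function
names, the linear boost formula, and max_boost are hypothetical choices
for illustration; this is plain Python, not the Nutch plugin API:

```python
# Sketch: boost local hits that also appear in another engine's results.
# The boost formula and all names here are assumptions, not Nutch code.

def external_boosts(external_urls, max_boost=2.0):
    """Map each external URL to a boost: the top rank gets max_boost,
    lower ranks decay linearly towards (but stay above) 1.0."""
    n = len(external_urls)
    boosts = {}
    for rank, url in enumerate(external_urls):
        # setdefault keeps the best (first) rank if a URL repeats.
        boosts.setdefault(url, 1.0 + (max_boost - 1.0) * (n - rank) / n)
    return boosts

def rerank(local_hits, boosts):
    """Multiply each local score by its boost (1.0 if the URL was not
    returned externally) and sort by the boosted score, descending."""
    rescored = [(url, score * boosts.get(url, 1.0)) for url, score in local_hits]
    return sorted(rescored, key=lambda hit: hit[1], reverse=True)

# Local results as (url, score), best first; values are invented.
local_hits = [
    ("http://example.com/a", 0.8),
    ("http://example.com/b", 0.5),
    ("http://example.com/c", 0.45),
]
# What another engine returned for the same query, best first.
external = ["http://example.com/c", "http://example.com/b"]

reranked_urls = [url for url, _ in rerank(local_hits, external_boosts(external))]
```

The same boost table could instead be computed offline for common queries
and folded into the index-time score, which is the MapReduce variant.
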
Dennis
On 06/15/2010 04:36 AM, Massimo Schiavon wrote:
Is there a way to reorder the results returned by Nutch based on
results returned by other search engines for the same (or similar) query?