Re: how to combine two run's result for search

Renaud Richardet Tue, 05 Sep 2006 07:09:28 -0700

@Dennis,

Can you explain how to set up distributed search while storing the two indexes on the same local machine (if possible)?


@Feng,

We created a shell script to merge two runs; let us know if it works for you.

http://wiki.apache.org/nutch/MergeCrawl
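
For reference, the merge route Dennis describes below could be sketched roughly as follows. This is not the wiki script itself, only a minimal sketch. It assumes each run lives in its own directory with crawldb/ and segments/ subdirectories and that Nutch is started via bin/nutch. The function name print_merge_plan, the NUTCH variable and the layout of the output directory are hypothetical. It only prints the commands, so they can be reviewed before anything is run:

```shell
#!/bin/sh
# Print (do not run) the Nutch 0.8 commands that merge two crawl runs.
# Assumptions: each run has crawldb/ and segments/ subdirectories, and
# NUTCH (default: bin/nutch) points at the nutch launcher script.

print_merge_plan() {
    pmp_crawl1=$1   # first run, e.g. crawl1
    pmp_crawl2=$2   # second run, e.g. crawl2
    pmp_out=$3      # directory that will hold the merged data
    pmp_nutch=${NUTCH:-bin/nutch}

    # 1. Merge the two crawldbs into a single crawldb.
    printf '%s\n' "$pmp_nutch mergedb $pmp_out/crawldb $pmp_crawl1/crawldb $pmp_crawl2/crawldb"

    # 2. Merge the segments of both runs (the globs expand when the line is run).
    printf '%s\n' "$pmp_nutch mergesegs $pmp_out/segments $pmp_crawl1/segments/* $pmp_crawl2/segments/*"

    # 3. Rebuild the linkdb from the merged segments.
    printf '%s\n' "$pmp_nutch invertlinks $pmp_out/linkdb -dir $pmp_out/segments"

    # 4. Build one index from the merged crawldb, linkdb and segments.
    printf '%s\n' "$pmp_nutch index $pmp_out/indexes $pmp_out/crawldb $pmp_out/linkdb $pmp_out/segments/*"
}
```

Calling "print_merge_plan crawl1 crawl2 merged" prints the four commands in order. Once they look right for your layout, the output can be piped to sh from the Nutch install directory.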

Renaud


Dennis Kubes wrote:

You can keep the indexes separate and use the distributed search server, one per index, or you can use the mergedb and mergesegs commands to merge the two runs into a single crawldb and a single segments directory, then re-run invertlinks and index to create a single index which can then be searched.
Dennis

Feng Ji wrote:
Hi there,

In Nutch 0.8, I have crawled from two webDBs independently.

For each run, I ran invertlinks and index, so each one is searchable.
Now I want to combine them together for search. I tried the "merge" command to merge the two indexes, but a search against the resulting index output dir returns null.
Do I need to put the output dir in the same directory as the two crawl/ directories above?
I wonder what the proper steps are to combine two separate runs into one search
result. Do I need to combine the two webdbs, merge the two segments, and then run
invertlinks and index?

Thanks for your time,

Michael,


--
Renaud Richardet
COO America
Wyona    -   Open Source Content Management   -   Apache Lenya
office +1 857 776-3195                  mobile +1 617 230 9112
renaud.richardet <at> wyona.com           http://www.wyona.com
