@Dennis,
Can you explain how to set up distributed search while storing the two indexes on the same local machine (if possible)?

@Feng,
We created a shell script to merge two runs; let us know if that works for you.
http://wiki.apache.org/nutch/MergeCrawl

Renaud


Dennis Kubes wrote:
You can keep the indexes separate and use the distributed search server, one per index. Alternatively, you can use the mergedb and mergesegs commands to merge the two runs into a single crawldb and a single segments directory, then re-run invertlinks and index to create a single index, which can then be searched.

Dennis
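
The merge route described above can be sketched as a small dry-run script. It only prints the commands, so it can be reviewed before anything is run. The command names (mergedb, mergesegs, invertlinks, index) come from the message above; the directory names crawl1, crawl2 and merged, the crawldb/segments/linkdb/indexes layout, and the argument order are assumptions based on the usual Nutch 0.8 usage messages. Running each bin/nutch command without arguments should print its actual usage for your version.

```shell
#!/bin/sh
# Dry-run sketch: prints the Nutch commands for merging two crawl runs.
# Assumed (not from the thread): directory names and the per-crawl layout
# crawldb/, segments/, linkdb/, indexes/. Check argument order against the
# usage text printed by "bin/nutch <command>" in your installation.
NUTCH="${NUTCH:-bin/nutch}"   # path to the nutch launcher script

merge_plan() {
  a="$1"; b="$2"; out="$3"
  # 1. Merge the two crawldbs into a single crawldb.
  echo "$NUTCH mergedb $out/crawldb $a/crawldb $b/crawldb"
  # 2. Merge every segment of both runs into one segments directory.
  #    The * is left unexpanded here; the shell expands it when the
  #    printed command is actually run.
  echo "$NUTCH mergesegs $out/segments $a/segments/* $b/segments/*"
  # 3. Rebuild the link database from the merged segments.
  echo "$NUTCH invertlinks $out/linkdb -dir $out/segments"
  # 4. Build a single index from the merged crawldb, linkdb and segments.
  echo "$NUTCH index $out/indexes $out/crawldb $out/linkdb $out/segments/*"
}

# Print the plan for two hypothetical crawl directories.
merge_plan crawl1 crawl2 merged
```

To execute the plan instead of printing it, pipe the output to sh (for example, sh merge_plan.sh | sh) once the paths have been checked.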

Feng Ji wrote:
Hi there,

In Nutch 0.8, I have crawled from two webDBs independently.

For each run, I did invertlinks and index. So each one is searchable.

Now I want to combine them together for search. I tried the "merge" command to merge the two indexes, but the search on the resulting index output dir is dull.
Do I need to put the output dir in the same directory as the two crawl/ directories above?

I wonder what the proper steps are to combine two separate runs into one search
result. Do I need to combine the two webdbs, merge the two segments, and then
run invertlinks and index?

Thanks for your time,

Michael,



--
Renaud Richardet
COO America
Wyona    -   Open Source Content Management   -   Apache Lenya
office +1 857 776-3195                  mobile +1 617 230 9112
renaud.richardet <at> wyona.com           http://www.wyona.com
