@Dennis,
Can you explain how to set up distributed search while storing the two
indexes on the same local machine (if possible)?
@Feng,
We wrote a shell script that merges two runs; let us know if it works
for you.
http://wiki.apache.org/nutch/MergeCrawl
Renaud
Dennis Kubes wrote:
You can keep the indexes separate and run a distributed search
server, one per index. Alternatively, you can use the mergedb and
mergesegs commands to merge the two runs into a single crawldb and a
single set of segments, then re-run invertlinks and index to create a
single index that can then be searched.
Dennis
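A rough sketch of the merge route described above, as a small shell
function. It only prints the commands, so nothing runs until the output
is piped to sh. The directory layout (crawldb, segments, linkdb and
indexes under each crawl directory), the NUTCH variable and the name
plan_merge are assumptions for illustration, and the argument order
should be checked against the usage message each bin/nutch command
prints in your version. Paths containing spaces are not handled.

```shell
#!/bin/sh
# Assumed location of the Nutch launcher script; override via the environment.
NUTCH="${NUTCH:-bin/nutch}"

# plan_merge CRAWL1 CRAWL2 OUT (hypothetical helper)
# Prints, one per line, the commands that would merge two crawl
# directories into OUT. The globs are printed literally and are only
# expanded when the output is later run by a shell.
plan_merge() {
    c1=$1; c2=$2; out=$3
    # 1. Merge the two crawldbs into a single crawldb.
    printf '%s\n' "$NUTCH mergedb $out/crawldb $c1/crawldb $c2/crawldb"
    # 2. Merge all segments from both runs.
    printf '%s\n' "$NUTCH mergesegs $out/segments $c1/segments/* $c2/segments/*"
    # 3. Rebuild the link database from the merged segments.
    printf '%s\n' "$NUTCH invertlinks $out/linkdb -dir $out/segments"
    # 4. Index the merged data.
    printf '%s\n' "$NUTCH index $out/indexes $out/crawldb $out/linkdb $out/segments/*"
}
```

To review the plan, run "plan_merge crawl1 crawl2 merged"; once it
looks right, "plan_merge crawl1 crawl2 merged | sh" executes it from
the Nutch install directory.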
Feng Ji wrote:
Hi there,
In Nutch 0.8, I have crawled from two webDBs independently.
For each run, I did invertlinks and index, so each one is searchable.
Now I want to combine them together for search. I tried the "merge"
command to merge the two indexes, but a search against the resulting
index output dir comes back empty.
Do I need to put the output dir in the same directory as the two
crawl/ directories above?
I wonder what the proper steps are to combine two separate runs into
one search result. Do I need to combine the two webDBs, merge the two
segments, and then do invertlinks and index?
Thanks for your time,
Michael
--
Renaud Richardet
COO America
Wyona - Open Source Content Management - Apache Lenya
office +1 857 776-3195 mobile +1 617 230 9112
renaud.richardet <at> wyona.com http://www.wyona.com