Hi:

Assuming you have

index 1 at /data/crawl1
index 2 at /data/crawl2

In nutch-site.xml, set
searcher.dir = /data

Under /data you need a text file called search-servers.txt (I think; please
check the searcher.dir description in the Nutch configuration).

The text file lists one search server per line:

hostname1 portnumber
hostname2 portnumber

For example:
localhost 1234
localhost 5678
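
A minimal sketch of writing that file from the shell. It assumes the file is named search-servers.txt (the name I believe the searcher.dir description uses; please verify it against your Nutch version). The function name write_search_servers is only illustrative, and the two entries are the example servers above.

```shell
# Write the example server list into the directory that searcher.dir
# points to. The file name search-servers.txt is an assumption; check it
# against the searcher.dir description in your Nutch configuration.
write_search_servers() {
    dir="$1"
    # One "hostname port" pair per line, one line per search server.
    printf '%s\n' "localhost 1234" "localhost 5678" > "$dir/search-servers.txt"
}
```

With the layout above you would call it as: write_search_servers /data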

Then you need to start

bin/nutch server 1234 /data/crawl1 &

and

bin/nutch server 5678 /data/crawl2 &
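
If you have more than a couple of indexes, the two commands above can be wrapped in a small loop. This is only a sketch: the function start_search_servers, the "port:directory" argument format, and the NUTCH override variable are my own conventions, not part of Nutch. It runs the same "bin/nutch server <port> <dir> &" command shown above for each pair.

```shell
# Start one search server per "port:crawl-dir" argument.
# NUTCH names the launcher (default bin/nutch); setting NUTCH=echo gives a
# dry run that only prints the arguments each server would be started with.
start_search_servers() {
    nutch="${NUTCH:-bin/nutch}"
    for pair in "$@"; do
        port="${pair%%:*}"   # text before the first ':'
        dir="${pair#*:}"     # text after the first ':'
        # Left unquoted on purpose so a launcher with arguments still works.
        $nutch server "$port" "$dir" &
    done
}
```

For the example above, run it from the Nutch directory as: start_search_servers 1234:/data/crawl1 5678:/data/crawl2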

Now try a search (here for the term www):

bin/nutch org.apache.nutch.searcher.NutchBean www

You should see results :-)

Cheers

On 9/5/06, Renaud Richardet <[EMAIL PROTECTED]> wrote:
> @Dennis,
> Can you explain how to set up distributed search while storing the 2
> indexes on the same local machine (if possible)?
>
> @Feng,
> We created a shell script to merge 2 runs, let us know if that works for
> you.
> http://wiki.apache.org/nutch/MergeCrawl
>
> Renaud
>
>
> Dennis Kubes wrote:
> > You can keep the indexes separate and use the distributed search
> > server, one per index or you can use the mergedb and mergesegs
> > commands to merge the two runs into a single crawldb and a single
> > segments then re-run the invertlinks and index to create a single
> > index file which can then be searched.
> >
> > Dennis
> >
> > Feng Ji wrote:
> >> Hi there,
> >>
> >> In Nutch 0.8, I have crawled from two webDBs independently.
> >>
> >> For each run, I did invertlinks and index, so each one is searchable.
> >>
> >> Now I want to combine them together for search. I tried the "merge"
> >> command to merge the two indexes, but the search for the result index
> >> output dir is dull. Do I need to put the output dir in the same
> >> directory as the above two crawl/ ?
> >>
> >> I wonder what the proper steps are to combine two separate runs into
> >> one search result. Do I need to combine the two webdbs, merge the two
> >> segments, and then do invertlinks and index?
> >>
> >> Thanks for your time,
> >>
> >> Michael,
> >>
> >
>
> --
> Renaud Richardet
> COO America
> Wyona    -   Open Source Content Management   -   Apache Lenya
> office +1 857 776-3195                  mobile +1 617 230 9112
> renaud.richardet <at> wyona.com           http://www.wyona.com
>
>
