Are those like the shuttle boards? Smaller 1/4 size boxes?

Dennis

Zaheed Haque wrote:
> Renaud:
>
> Yes or No! I have done some testing as Dennis Kubes suggested and got
> results similar to his test. In short, I had 4 Nutch search servers
> in one box but on 4 different disks, with in my case 0.75 million docs
> per disk. I had about 4 GB of memory and 1 AMD 64 processor, and it
> worked out rather OK. I need to do more testing to fine-tune this
> because this really brings up the issue of cost. I have also thought
> about doing some testing with VIA EPIA boards. Maybe in the future :-)
>
> The problem I encountered is more this
>
> http://issues.apache.org/jira/browse/NUTCH-92
>
> but this will be solved sooner or later, just a matter of time.
>
> Cheers
>
>
> On 9/5/06, Renaud Richardet <[EMAIL PROTECTED]> wrote:
>> Zaheed,
>>
>> Thank you, that works well. Do you know if there is a big performance
>> overhead with starting 2 servers? As an alternative, could we use
>> Lucene's MultiSearcher?
>>
>> -- Renaud
>>
>>
>> Zaheed Haque wrote:
>> > Hi:
>> >
>> > Assuming you have
>> >
>> > index 1 at /data/crawl1
>> > index 2 at /data/crawl2
>> >
>> > In nutch-site.xml
>> > searcher.dir = /data
>> >
>> > Under /data you have a text file called search-servers.txt (I think;
>> > please check the searcher.dir description in nutch-site)
>> >
>> > In the text file you will have the following
>> >
>> > hostname1 portnumber
>> > hostname2 portnumber
>> >
>> > example
>> > localhost 1234
>> > localhost 5678
>> >
>> > Then you need to start
>> >
>> > bin/nutch server 1234 /data/crawl1 &
>> >
>> > and
>> >
>> > bin/nutch server 5678 /data/crawl2 &
>> >
>> > now try
>> >
>> > bin/nutch org.apache.nutch.searcher.NutchBean www
>> >
>> > you should see results :-)
>> >
>> > Cheers
>> >
>> > On 9/5/06, Renaud Richardet <[EMAIL PROTECTED]> wrote:
>> >> @Dennis,
>> >> Can you explain how to set up distributed search while storing the 2
>> >> indexes on the same local machine (if possible)?
>> >>
>> >> @Feng,
>> >> We created a shell script to merge 2 runs, let us know if that works
>> >> for you.
>> >> http://wiki.apache.org/nutch/MergeCrawl
>> >>
>> >> Renaud
>> >>
>> >>
>> >> Dennis Kubes wrote:
>> >> > You can keep the indexes separate and use the distributed search
>> >> > server, one per index, or you can use the mergedb and mergesegs
>> >> > commands to merge the two runs into a single crawldb and a single
>> >> > segments, then re-run the invertlinks and index to create a single
>> >> > index file which can then be searched.
>> >> >
>> >> > Dennis
>> >> >
>> >> > Feng Ji wrote:
>> >> >> Hi there,
>> >> >>
>> >> >> In Nutch 0.8, I have crawled down from two webDBs independently.
>> >> >>
>> >> >> For each run, I did invertlinks and index. So each one is
>> >> >> searchable.
>> >> >>
>> >> >> Now I want to combine them together for search. I tried the
>> >> >> "merge" command to merge two indexes, but the search for the
>> >> >> result index output dir is dull.
>> >> >> Do I need to put the output dir in the same directory as the
>> >> >> above two crawl/ ?
>> >> >>
>> >> >> I wonder what the proper steps are to combine two separate runs
>> >> >> into one search result. Do I need to combine two webdbs, merge
>> >> >> two segments and do invertlinks and do index?
>> >> >>
>> >> >> thanks for your time,
>> >> >>
>> >> >> Michael,
>> >> >>
>> >> >
>> >>
>> >> --
>> >> Renaud Richardet
>> >> COO America
>> >> Wyona - Open Source Content Management - Apache Lenya
>> >> office +1 857 776-3195 mobile +1 617 230 9112
>> >> renaud.richardet <at> wyona.com http://www.wyona.com
>> >>
>> >

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
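
For reference, the two-index setup Zaheed describes in the thread can be sketched as a small shell script. This is only a sketch under assumptions: the ports (1234, 5678) and index names (crawl1, crawl2) are the ones from his example, DATA_DIR is a hypothetical variable standing in for the /data directory that searcher.dir points to, and the file name search-servers.txt is what I believe Nutch 0.8 reads, so check the searcher.dir description in your own config before relying on it.

```shell
#!/bin/sh
# Sketch only: DATA_DIR is a hypothetical stand-in for /data (searcher.dir).
# It defaults to a throwaway temp directory so the script is safe to try.
DATA_DIR="${DATA_DIR:-$(mktemp -d)}"
SERVERS_FILE="$DATA_DIR/search-servers.txt"   # assumed file name, see note above

# One "hostname port" pair per line, one line per search server.
printf '%s\n' "localhost 1234" "localhost 5678" > "$SERVERS_FILE"

# Start one search server per index, but only when run from a Nutch install
# directory (i.e. when bin/nutch exists and is executable).
if [ -x bin/nutch ]; then
    bin/nutch server 1234 "$DATA_DIR/crawl1" &
    bin/nutch server 5678 "$DATA_DIR/crawl2" &
fi

# Show what the search front end will read.
cat "$SERVERS_FILE"
```

To try it against real indexes, run it from the Nutch directory with DATA_DIR=/data, then query through NutchBean as in Zaheed's message.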
