Re: how to combine two run's result for search

2006-09-18 Thread Tomi NA
On 9/18/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: Hi: I have just checked your flash movie. Quick observation: you are running tomcat 4.1.31 and there is nothing you are doing that seems wrong. Anyway, after starting the servers, can you search using the following command: bin/nutch org.apache.n

Re: how to combine two run's result for search

2006-09-18 Thread Zaheed Haque
Hi: I have just checked your flash movie. Quick observation: you are running tomcat 4.1.31 and there is nothing you are doing that seems wrong. Anyway, after starting the servers, can you search using the following command: bin/nutch org.apache.nutch.search.NutchBean bobdocs What do you get .. and

Re: how to combine two run's result for search

2006-09-18 Thread Tomi NA
On 9/16/06, Tomi NA <[EMAIL PROTECTED]> wrote: On 9/15/06, Tomi NA <[EMAIL PROTECTED]> wrote: > On 9/14/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: > > > That's the way I set it up at first. > > > This time, I started with a blank slate, unpacked nutch and tomcat, > > > unpacked nutch-0.8.war into

Re: how to combine two run's result for search

2006-09-16 Thread Tomi NA
On 9/15/06, Tomi NA <[EMAIL PROTECTED]> wrote: On 9/14/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: > > That's the way I set it up at first. > > This time, I started with a blank slate, unpacked nutch and tomcat, > > unpacked nutch-0.8.war into the webapps/ROOT and left the deployed app > > untouch

Re: how to combine two run's result for search

2006-09-15 Thread Tomi NA
On 9/14/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: > That's the way I set it up at first. > This time, I started with a blank slate, unpacked nutch and tomcat, > unpacked nutch-0.8.war into the webapps/ROOT and left the deployed app > untouched. The above means that you have an empty nutch-site.

Re: how to combine two run's result for search

2006-09-14 Thread Zaheed Haque
That's the way I set it up at first. This time, I started with a blank slate, unpacked nutch and tomcat, unpacked nutch-0.8.war into the webapps/ROOT and left the deployed app untouched. The above means that you have an empty nutch-site.xml under webapps/ROOT and you have a nutch-default.xml with

Re: how to combine two run's result for search

2006-09-14 Thread Tomi NA
On 9/14/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: On 9/14/06, Tomi NA <[EMAIL PROTECTED]> wrote: > On 9/5/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: > > Hi: > > I have a problem or two with the described procedure... > > > Assuming you have > > > > index 1 at /data/crawl1 > > index 2 at /data/

Re: how to combine two run's result for search

2006-09-14 Thread Zaheed Haque
On 9/14/06, Tomi NA <[EMAIL PROTECTED]> wrote: On 9/5/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: > Hi: I have a problem or two with the described procedure... > Assuming you have > > index 1 at /data/crawl1 > index 2 at /data/crawl2 Used ./bin/nutch crawl urls -dir /home/myhome/crawls/mycrawl

Re: how to combine two run's result for search

2006-09-14 Thread Tomi NA
On 9/5/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: Hi: I have a problem or two with the described procedure... Assuming you have index 1 at /data/crawl1 index 2 at /data/crawl2 Used ./bin/nutch crawl urls -dir /home/myhome/crawls/mycrawldir to generate an index: luke says the index is vali

Re: how to combine two run's result for search

2006-09-06 Thread Tomi NA
On 9/6/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: On 9/6/06, Tomi NA <[EMAIL PROTECTED]> wrote: > On 9/5/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: > > Hi: > > > In the text file you will have the following > > > > hostname1 portnumber > > hostname2 portnumber > > > > example > > localhost 1234

Re: how to combine two run's result for search

2006-09-06 Thread Zaheed Haque
On 9/6/06, Tomi NA <[EMAIL PROTECTED]> wrote: On 9/5/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: > Hi: > In the text file you will have the following > > hostname1 portnumber > hostname2 portnumber > > example > localhost 1234 > localhost 5678 > Does this work with nutch 0.7.2 or is it specific

Re: how to combine two run's result for search

2006-09-06 Thread Zaheed Haque
On 9/6/06, Dennis Kubes <[EMAIL PROTECTED]> wrote: Are those like the shuttle boards? Smaller 1/4 size boxes? Yes I was actually thinking about the following: http://www.via.com.tw/en/initiatives/spearhead/clusterserver/ But put 4 boards in 1U like these guys did.. http://linitx.com/product

Re: how to combine two run's result for search

2006-09-05 Thread Tomi NA
On 9/5/06, Zaheed Haque <[EMAIL PROTECTED]> wrote: Hi: In the text file you will have the following hostname1 portnumber hostname2 portnumber example localhost 1234 localhost 5678 Does this work with nutch 0.7.2 or is it specific to the 0.8 release? t.n.a.
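The server list described above can be sketched as a small shell snippet. This is only an illustration: the directory is a throwaway temp directory standing in for the thread's /data, and the file name search-servers.txt is the name I believe the searcher.dir property description uses (the original poster was unsure, so confirm it in your own nutch-default.xml). The hosts and ports are the example values from the message.

```shell
#!/bin/sh
# Hypothetical location: a temp dir stands in for the searcher.dir (/data in the thread).
SEARCHER_DIR="${SEARCHER_DIR:-$(mktemp -d)}"

# One "hostname port" pair per line, one line per search server.
# File name is an assumption; check the searcher.dir description.
cat > "$SEARCHER_DIR/search-servers.txt" <<'EOF'
localhost 1234
localhost 5678
EOF

# Read the entries back to check that each line splits into host and port.
while read -r host port; do
  echo "search server: host=$host port=$port"
done < "$SEARCHER_DIR/search-servers.txt"
```

Each listed port would need a search server already listening on it, one per index, before the web application can query them. How those servers are started is not shown in this thread, so check the bin/nutch usage output for your release.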

Re: how to combine two run's result for search

2006-09-05 Thread Dennis Kubes
Are those like the shuttle boards? Smaller 1/4 size boxes? Dennis Zaheed Haque wrote: Renaud: Yes or no! I have done some testing as Dennis Kubes suggested and got results similar to his. In short, I had 4 nutch search servers in one box, each on a different disk, with (in my case) 0.75

Re: how to combine two run's result for search

2006-09-05 Thread Feng Ji
Thanks, Renaud: I worked out the same scenario as your script, and it works well. Michael On 9/5/06, Renaud Richardet <[EMAIL PROTECTED]> wrote: @Dennis, Can you explain how to set up distributed search while storing the 2 indexes on the same local machine (if possible)? @Feng, We created a shel

Re: how to combine two run's result for search

2006-09-05 Thread Zaheed Haque
Renaud: Yes or no! I have done some testing as Dennis Kubes suggested and got results similar to his. In short, I had 4 nutch search servers in one box, each on a different disk, with (in my case) 0.75 million docs per disk. I had about 4 GB of memory and 1 AMD 64 processor and it worked out rathe

Re: how to combine two run's result for search

2006-09-05 Thread Renaud Richardet
Zaheed, Thank you, that works well. Do you know if there is a big performance overhead with starting 2 servers? As an alternative, could we use Lucene's MultiSearcher? -- Renaud Zaheed Haque wrote: Hi: Assuming you have index 1 at /data/crawl1 index 2 at /data/crawl2 In nutch-site.xml s

Re: how to combine two run's result for search

2006-09-05 Thread Zaheed Haque
Hi: Assuming you have index 1 at /data/crawl1 index 2 at /data/crawl2 In nutch-site.xml searcher.dir = /data Under /data you have a text file called search-servers.txt (I think; please do check the searcher.dir description in nutch-site) In the text file you will have the following hostname1 portnumber

Re: how to combine two run's result for search

2006-09-05 Thread Renaud Richardet
@Dennis, Can you explain how to set up distributed search while storing the 2 indexes on the same local machine (if possible)? @Feng, We created a shell script to merge 2 runs, let us know if that works for you. http://wiki.apache.org/nutch/MergeCrawl Renaud Dennis Kubes wrote: You can keep

Re: how to combine two run's result for search

2006-09-04 Thread Dennis Kubes
You can keep the indexes separate and use the distributed search server, one per index or you can use the mergedb and mergesegs commands to merge the two runs into a single crawldb and a single segments then re-run the invertlinks and index to create a single index file which can then be search
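The merge-and-reindex route mentioned above could look roughly like the dry-run sketch below, which only prints the four steps rather than executing them. Everything here is an assumption to verify: the run directories reuse the /data/crawl1 and /data/crawl2 example from elsewhere in the thread, /data/merged is a hypothetical output location, and the argument order for each command should be checked against the usage message it prints when run without arguments.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps instead of running them.
# Paths are hypothetical; argument forms are assumptions that vary by Nutch release.
NUTCH="${NUTCH:-bin/nutch}"
RUN1=/data/crawl1
RUN2=/data/crawl2
MERGED=/data/merged

merge_plan() {
  # 1. Merge the two crawldbs into a single crawldb.
  echo "$NUTCH mergedb $MERGED/crawldb $RUN1/crawldb $RUN2/crawldb"
  # 2. Merge the segments of both runs into a single segments directory.
  echo "$NUTCH mergesegs $MERGED/segments $RUN1/segments/* $RUN2/segments/*"
  # 3. Rebuild the link database from the merged segments
  #    (some releases may expect -dir before the segments directory).
  echo "$NUTCH invertlinks $MERGED/linkdb $MERGED/segments"
  # 4. Build a single index over the merged data.
  echo "$NUTCH index $MERGED/indexes $MERGED/crawldb $MERGED/linkdb $MERGED/segments/*"
}

merge_plan
```

Printing the plan first makes it easy to compare each line with the usage output of your installed release before running anything against real crawl data.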

how to combine two run's result for search

2006-09-04 Thread Feng Ji
Hi there, In Nutch 0.8, I have run two crawls from two webDBs independently. For each run, I did invertlinks and index, so each one is searchable. Now I want to combine them together for search. I tried the "merge" command to merge the two indexes, but the search for the result index output dir is dull. Do