You would want something like this:

  bin/nutch generate -topN 1000000
  segment=`ls -d segments/2* | tail -1`
  bin/nutch fetch $segment
  bin/nutch updatedb db $segment

Of course, replace the -topN value with the number of URLs you wish to fetch. You could also wrap this in a for loop to run it as many times as you need.

-byron

-----Original Message-----
From: Richard Anderson <[EMAIL PROTECTED]>
To: [email protected]
Date: Thu, 21 Apr 2005 09:23:31 -0400
Subject: Running nutch on new segments

> For nightly indexing how do you select the current segment to fetch,
> updatedb, analyze, and index?
>
> I wrote the following script that shows what I need to do to update the
> index nightly.
>
> $cat nutch-update.sh
>
> nutch fetch /webapps/nutch/search-dir/segments/20050421085829/
> nutch updatedb /webapps/nutch/search-dir/db/
> /webapps/nutch/search-dir/segments/20050421085829/
> nutch analyze /webapps/nutch/search-dir/db/ 2
> nutch index /webapps/nutch/search-dir/segments/20050421085829/
>
> Am I completely crazy? I can't find any docs on automating the indexing
> process in regards to using segments.
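
The loop suggested in the reply could be sketched roughly as below. This is only an illustration, not a tested nightly script: the variables NUTCH, DB, and SEGMENTS and the function names are made up for the example, and the generate call passes the db and segments directories explicitly, which the reply does not show, so treat that argument form as an assumption and check it against your Nutch version.

```shell
#!/bin/sh
# Sketch only: NUTCH, DB, SEGMENTS and all function names are assumptions
# introduced for this example, not part of Nutch itself.
NUTCH="${NUTCH:-bin/nutch}"
DB="${DB:-db}"
SEGMENTS="${SEGMENTS:-segments}"

# Print the newest segment directory. Segment names are timestamps, so the
# lexicographically last entry is the most recent one.
latest_segment() {
    ls -d "$SEGMENTS"/2* | tail -1
}

# Run one generate/fetch/updatedb round for up to $1 URLs.
# Assumption: generate takes the db and segments directories before -topN.
crawl_round() {
    "$NUTCH" generate "$DB" "$SEGMENTS" -topN "$1" || return 1
    segment=$(latest_segment)
    "$NUTCH" fetch "$segment" || return 1
    "$NUTCH" updatedb "$DB" "$segment"
}

# Repeat the round $1 times with a -topN of $2, stopping at the first failure.
crawl_rounds() {
    i=0
    while [ "$i" -lt "$1" ]; do
        crawl_round "$2" || return 1
        i=$((i + 1))
    done
}
```

After sourcing the file, something like `crawl_rounds 3 1000000` would run three rounds. Because the segment is looked up after each generate, nothing in the script hard-codes a timestamped path such as the one in the quoted nutch-update.sh.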
