Hi, any ideas, guys?
Thanks

> From: mbel...@msn.com
> To: nutch-user@lucene.apache.org
> Subject: RE: recrawl.sh stopped at depth 7/10 without error
> Date: Fri, 27 Nov 2009 20:11:12 +0000
>
> Hi,
>
> This is the main loop of my recrawl.sh:
>
> do
>
>   echo "--- Beginning crawl at depth `expr $i + 1` of $depth ---"
>   $NUTCH_HOME/bin/nutch generate $crawl/crawldb $crawl/segments $topN \
>       -adddays $adddays
>   if [ $? -ne 0 ]
>   then
>     echo "runbot: Stopping at depth $depth. No more URLs to fetch."
>     break
>   fi
>   segment=`ls -d $crawl/segments/* | tail -1`
>
>   $NUTCH_HOME/bin/nutch fetch $segment -threads $threads
>   if [ $? -ne 0 ]
>   then
>     echo "runbot: fetch $segment at depth `expr $i + 1` failed."
>     echo "runbot: Deleting segment $segment."
>     rm $RMARGS $segment
>     continue
>   fi
>
>   $NUTCH_HOME/bin/nutch updatedb $crawl/crawldb $segment
>
> done
>
> echo "----- Merge Segments (Step 3 of $steps) -----"
>
> In my log file I never find the message "----- Merge Segments (Step 3 of
> $steps) -----", so it breaks the loop and the whole process stops.
>
> I don't understand why it stops at depth 7 without any errors!
>
> > From: mbel...@msn.com
> > To: nutch-user@lucene.apache.org
> > Subject: recrawl.sh stopped at depth 7/10 without error
> > Date: Wed, 25 Nov 2009 15:43:33 +0000
> >
> > Hi,
> >
> > I'm running recrawl.sh and it stops every time at depth 7/10 without any
> > error. But when I run bin/crawl with the same crawl-urlfilter and the
> > same seeds file, it finishes cleanly in 1h50.
> >
> > I checked hadoop.log and didn't find any error there. I only find the
> > last URL it was parsing.
> >
> > Does fetching or crawling have a timeout?
> >
> > My recrawl runs for 2 hours before it stops. I set the fetch interval to
> > 24 hours and I'm running generate with adddays = 1.
> >
> > Best regards
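
One way to narrow down where the quoted loop stops is to record the exit status of every step in a separate log. The following is only a minimal sketch under stated assumptions: the helper name run_step and the log file recrawl-debug.log are hypothetical and are not part of recrawl.sh or Nutch. The commented usage lines reuse the variables from the quoted script.

```shell
#!/bin/sh
# Hypothetical debugging helper (not part of the original recrawl.sh).
# Log file name is an assumption; override it by setting LOG_FILE.
LOG_FILE="${LOG_FILE:-recrawl-debug.log}"

# run_step LABEL COMMAND [ARGS...]
# Runs COMMAND, appends a timestamped line with its exit status to
# $LOG_FILE, and returns that same exit status to the caller.
run_step() {
  label=$1
  shift
  "$@"
  rc=$?
  echo "$(date '+%Y-%m-%d %H:%M:%S') $label exit=$rc" >> "$LOG_FILE"
  return $rc
}

# Intended use inside the loop (left commented out so the sketch runs
# without a Nutch installation):
#
#   run_step generate $NUTCH_HOME/bin/nutch generate $crawl/crawldb \
#       $crawl/segments $topN -adddays $adddays || break
#   segment=`ls -d $crawl/segments/* | tail -1`
#   run_step fetch $NUTCH_HOME/bin/nutch fetch $segment -threads $threads
#   run_step updatedb $NUTCH_HOME/bin/nutch updatedb $crawl/crawldb $segment
```

If the last line in the log is a fetch or updatedb entry and nothing follows it, the script was probably stopped from outside (for example a closed terminal or a killed process) rather than by one of its own break or continue branches.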