I still want to know the reason.

2009/12/2 BELLINI ADAM <mbel...@msn.com>
> hi,
>
> any idea guys ??
>
> thanx
>
> > From: mbel...@msn.com
> > To: nutch-user@lucene.apache.org
> > Subject: RE: recrawl.sh stopped at depth 7/10 without error
> > Date: Fri, 27 Nov 2009 20:11:12 +0000
> >
> > hi,
> >
> > this is the main loop of my recrawl.sh
> >
> > do
> >
> >   echo "--- Beginning crawl at depth `expr $i + 1` of $depth ---"
> >   $NUTCH_HOME/bin/nutch generate $crawl/crawldb $crawl/segments $topN \
> >     -adddays $adddays
> >   if [ $? -ne 0 ]
> >   then
> >     echo "runbot: Stopping at depth $depth. No more URLs to fetch."
> >     break
> >   fi
> >   segment=`ls -d $crawl/segments/* | tail -1`
> >
> >   $NUTCH_HOME/bin/nutch fetch $segment -threads $threads
> >   if [ $? -ne 0 ]
> >   then
> >     echo "runbot: fetch $segment at depth `expr $i + 1` failed."
> >     echo "runbot: Deleting segment $segment."
> >     rm $RMARGS $segment
> >     continue
> >   fi
> >
> >   $NUTCH_HOME/bin/nutch updatedb $crawl/crawldb $segment
> >
> > done
> >
> > echo "----- Merge Segments (Step 3 of $steps) -----"
> >
> > in my log file i never find the message "----- Merge Segments (Step 3 of $steps) -----" ! so it breaks the loop and stops the process.
> >
> > i don't understand why it stops at depth 7 without any errors !
> >
> > > From: mbel...@msn.com
> > > To: nutch-user@lucene.apache.org
> > > Subject: recrawl.sh stopped at depth 7/10 without error
> > > Date: Wed, 25 Nov 2009 15:43:33 +0000
> > >
> > > hi,
> > >
> > > i'm running recrawl.sh and it stops every time at depth 7/10 without any error ! but when i run bin/crawl with the same crawl-urlfilter and the same seeds file it finishes smoothly in 1h50
> > >
> > > i checked the hadoop.log, and don't find any error there... i just find the last url it was parsing
> > > does fetching or crawling have a timeout ?
> > > my recrawl takes 2 hours before it stops.
> > > i set the fetch interval to 24 hours and i'm running generate with adddays = 1
> > >
> > > best regards
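
One observation on the quoted loop: in a shell script, `break` only leaves the loop, so the "Merge Segments" echo right after `done` would still be printed. Since that line never shows up in the log, the script itself may be ending inside the loop rather than breaking out of it. That is only an inference from the snippet, not a diagnosis. A minimal sketch of how each step could be wrapped so the log shows which command ran last and with what exit status follows. `run_step` is a hypothetical helper, not part of recrawl.sh or Nutch, and `true` stands in for the real nutch commands.

```shell
#!/bin/sh
# Hypothetical helper (not part of recrawl.sh or Nutch): run one command,
# print its exit status and duration, and return that status to the caller.
run_step() {
    label=$1
    shift
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "step=$label status=$status seconds=$((end - start))"
    return $status
}

# Stand-in loop: 'true' takes the place of the generate and fetch
# invocations from the quoted script. depth=3 is an arbitrary demo value.
i=0
depth=3
while [ "$i" -lt "$depth" ]; do
    n=$((i + 1))
    if ! run_step "generate-$n" true; then
        echo "generate returned non-zero at depth $n, leaving the loop"
        break
    fi
    if ! run_step "fetch-$n" true; then
        echo "fetch failed at depth $n, skipping to the next depth"
        i=$n
        continue
    fi
    i=$n
done

# Printed after a normal finish and also after 'break'.
echo "loop exited after $i of $depth iterations"
```

If the final "loop exited" line is also missing from the log, the last `step=` line printed would at least show which command was running when the script stopped.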