I still want to know the reason.
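
One way to narrow this down might be to record the exit status of every step and to log how the script itself ends. The sketch below is only an assumption about what could help, not a known fix: `run_step`, `log` and the `recrawl-debug.log` file name are hypothetical helpers, not part of Nutch or of recrawl.sh. The signal traps are there because a script that ends silently may have received a signal (for example SIGHUP when the launching terminal goes away); whether that is the case here is a guess.

```shell
#!/bin/sh
# Hypothetical debug log; override by exporting LOGFILE before running.
LOGFILE="${LOGFILE:-recrawl-debug.log}"

# Append a timestamped message to the debug log.
log() {
  echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOGFILE"
}

# Run a command, log it together with its exit status,
# and return that status to the caller.
run_step() {
  log "START: $*"
  run_step_status=0
  "$@" || run_step_status=$?
  log "END (exit status $run_step_status): $*"
  return "$run_step_status"
}

# Log how the script ends, including termination by common signals.
trap 'log "script exiting with status $?"' EXIT
trap 'log "received SIGHUP"; exit 129' HUP
trap 'log "received SIGINT"; exit 130' INT
trap 'log "received SIGTERM"; exit 143' TERM
```

With these definitions placed at the top of the script, each command in the loop can be prefixed with `run_step`, for example `run_step $NUTCH_HOME/bin/nutch fetch $segment -threads $threads`, and the existing `if [ $? -ne 0 ]` checks keep working because `run_step` returns the command's own status. The last lines of the debug log should then show which step was running when the script stopped.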

2009/12/2 BELLINI ADAM <mbel...@msn.com>

>
> hi,
>
> any ideas, guys?
>
>
>
> thanks
>
> > From: mbel...@msn.com
> > To: nutch-user@lucene.apache.org
> > Subject: RE: recrawl.sh stopped at depth 7/10 without error
> > Date: Fri, 27 Nov 2009 20:11:12 +0000
> >
> >
> >
> > hi,
> >
> > this is the main loop of my recrawl.sh
> >
> >
> > do
> >
> >   echo "--- Beginning crawl at depth `expr $i + 1` of $depth ---"
> >   $NUTCH_HOME/bin/nutch generate $crawl/crawldb $crawl/segments $topN \
> >       -adddays $adddays
> >   if [ $? -ne 0 ]
> >   then
> >     echo "runbot: Stopping at depth $depth. No more URLs to fetch."
> >     break
> >   fi
> >   segment=`ls -d $crawl/segments/* | tail -1`
> >
> >   $NUTCH_HOME/bin/nutch fetch $segment -threads $threads
> >   if [ $? -ne 0 ]
> >   then
> >     echo "runbot: fetch $segment at depth `expr $i + 1` failed."
> >     echo "runbot: Deleting segment $segment."
> >     rm $RMARGS $segment
> >     continue
> >   fi
> >
> >   $NUTCH_HOME/bin/nutch updatedb $crawl/crawldb $segment
> >
> > done
> >
> > echo "----- Merge Segments (Step 3 of $steps) -----"
> >
> >
> >
> > In my log file I never find the message "----- Merge Segments (Step 3 of $steps) -----", so the script never gets past the loop and the whole process stops.
> >
> > I don't understand why it stops at depth 7 without any errors!
> >
> >
> > > From: mbel...@msn.com
> > > To: nutch-user@lucene.apache.org
> > > Subject: recrawl.sh stopped at depth 7/10 without error
> > > Date: Wed, 25 Nov 2009 15:43:33 +0000
> > >
> > >
> > >
> > > hi,
> > >
> > > I'm running recrawl.sh and it stops every time at depth 7/10 without any error! But when I run bin/crawl with the same crawl-urlfilter and the same seeds file, it finishes smoothly in 1h50.
> > >
> > > I checked hadoop.log and don't find any error there... I just find the last URL it was parsing.
> > > Does fetching or crawling have a timeout?
> > > My recrawl takes 2 hours before it stops. I set the fetch interval to 24 hours and I'm running generate with adddays = 1.
> > >
> > > best regards
> > >
> >
>
