Hi,

Any ideas, guys?
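
One thing I notice: a `break` in the loop would still fall through to the "Merge Segments" echo after `done`, so the missing message suggests the script itself ends (or hangs) somewhere inside the loop. Below is a minimal sketch of how each step's exit status could be logged to narrow that down. The helper name `run_step` and the log file `recrawl-steps.log` are hypothetical, not part of recrawl.sh, and the nutch calls are only shown as commented usage.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original recrawl.sh.
# Assumed log path; override by setting LOG_FILE before running.
LOG_FILE="${LOG_FILE:-recrawl-steps.log}"

# run_step NAME COMMAND [ARGS...]
# Runs COMMAND, writes a START and an END line (with the exit status)
# to $LOG_FILE, and returns the command's exit status unchanged.
run_step() {
  step_name=$1
  shift
  echo "$(date '+%Y-%m-%d %H:%M:%S') START $step_name" >> "$LOG_FILE"
  "$@"
  status=$?
  echo "$(date '+%Y-%m-%d %H:%M:%S') END $step_name exit=$status" >> "$LOG_FILE"
  return $status
}

# Possible use inside the loop (same commands as in the script, wrapped):
#   run_step generate $NUTCH_HOME/bin/nutch generate $crawl/crawldb $crawl/segments $topN -adddays $adddays
#   run_step fetch    $NUTCH_HOME/bin/nutch fetch $segment -threads $threads
#   run_step updatedb $NUTCH_HOME/bin/nutch updatedb $crawl/crawldb $segment
```

If the last line in the log is a START without a matching END, the script ended while that command was still running; if it is an END, the exit status shows whether that step failed.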



Thanks

> From: mbel...@msn.com
> To: nutch-user@lucene.apache.org
> Subject: RE: recrawl.sh stopped at depth 7/10 without error
> Date: Fri, 27 Nov 2009 20:11:12 +0000
> 
> 
> 
> Hi,
> 
> This is the main loop of my recrawl.sh:
> 
> 
> do
> 
>   echo "--- Beginning crawl at depth `expr $i + 1` of $depth ---"
>   $NUTCH_HOME/bin/nutch generate $crawl/crawldb $crawl/segments $topN \
>       -adddays $adddays
>   if [ $? -ne 0 ]
>   then
>     echo "runbot: Stopping at depth $depth. No more URLs to fetch."
>     break
>   fi
>   segment=`ls -d $crawl/segments/* | tail -1`
> 
>   $NUTCH_HOME/bin/nutch fetch $segment -threads $threads
>   if [ $? -ne 0 ]
>   then
>     echo "runbot: fetch $segment at depth `expr $i + 1` failed."
>     echo "runbot: Deleting segment $segment."
>     rm $RMARGS $segment
>     continue
>   fi
> 
>   $NUTCH_HOME/bin/nutch updatedb $crawl/crawldb $segment
> 
> done
> 
> echo "----- Merge Segments (Step 3 of $steps) -----"
> 
> 
> 
> In my log file I never find the message "----- Merge Segments (Step 3 of 
> $steps) -----", so it seems to leave the loop and the whole process stops. 
> 
> I don't understand why it stops at depth 7 without any errors!
> 
> 
> > From: mbel...@msn.com
> > To: nutch-user@lucene.apache.org
> > Subject: recrawl.sh stopped at depth 7/10 without error
> > Date: Wed, 25 Nov 2009 15:43:33 +0000
> > 
> > 
> > 
> > Hi,
> > 
> > I'm running recrawl.sh and it stops every time at depth 7/10 without any 
> > error! But when I run bin/crawl with the same crawl-urlfilter and the 
> > same seeds file, it finishes smoothly in 1h50.
> > 
> > I checked hadoop.log and didn't find any errors there; I only found the 
> > last URL it was parsing.
> > Does fetching or crawling have a timeout?
> > My recrawl runs for 2 hours before it stops. I set the fetch interval to 24 
> > hours and I'm running generate with adddays = 1.
> > 
> > best regards