[Nutch-general] Re: Problems with MapRed-

Mike Smith Sun, 29 Jan 2006 15:56:23 -0800

I forgot to mention the namenode log file gives me thousands of these:

060129 155553 Zero targets found, forbidden1.size=2allowSameHostTargets=false
forbidden2.size()=0
060129 155553 Zero targets found, forbidden1.size=2allowSameHostTargets=false
forbidden2.size()=0


Thanks, Mike


On 1/29/06, Mike Smith <[EMAIL PROTECTED]> wrote:
>
> I have the same problem, and it is killing me. I have tried all
> sorts of configurations and tricks.
>
> I have 3 machines; all three are datanodes and one is also the jobtracker. It
> successfully fetches 300,000 pages, but when I try to fetch more than that
> by injecting more pages in the first cycle, it always crashes at
> the end of the fetching reduce step:
>
> 060129 142220  reduce 95%
> 060129 142347  reduce 96%
> 060129 143401  reduce 100%
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:308)
>         at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:347)
>         at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:381)
>
>
> This happened on one of the tasktrackers:
>
> 060129 172145 task_r_ca2dxi 0.8677622% reduce > reduce
> 060129 172146 task_r_ca2dxi 0.868171% reduce > reduce
> 060129 173149 Task task_r_ca2dxi timed out.  Killing.
> 060129 173149 Server connection on port 50050 from 164.67.195.26: exiting
> 060129 173149 task_r_ca2dxi Child Error
> java.io.IOException: Task process exit with nonzero status.
>         at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
>         at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
> 060129 173153 task_m_bikodi done; removing files.
>
>
> Any suggestions?
>
> Thanks, Mike
>
> On 1/29/06, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> >
> > Sounds like your tasktracker wasn't able to connect to your
> > jobtracker any more.
> > Are you sure the jobtracker is still running and the tasktracker can still
> > reach the jobtracker box under the same hostname?
> >
> > On 28.01.2006 at 21:21, Rafit Izhak_Ratzin wrote:
> >
> > > Hi,
> > >
> > > I ran the mapreduce starting with 10 URLs into the sixth cycle, where
> > > it fetched 400K pages, and everything was fine.
> > >
> > > 060127 001055 TOTAL urls:       1877326
> > > 060127 001055 avg score:        1.099
> > > 060127 001055 max score:        1666.305
> > > 060127 001055 min score:        1.0
> > > 060127 001055 retry 0:  1865721
> > > 060127 001055 retry 1:  10887
> > > 060127 001055 retry 2:  621
> > > 060127 001055 retry 3:  92
> > > 060127 001055 retry 4:  4
> > > 060127 001055 retry 5:  1
> > > 060127 001055 status 1 (DB_unfetched):  1477634
> > > 060127 001055 status 2 (DB_fetched):    374736
> > > 060127 001055 status 3 (DB_gone):       24956
> > >
> > > Then I tried another scenario starting with 80K URLs. The first
> > > cycle was OK, but the second cycle, where it was supposed to fetch 800K,
> > > failed after 100% reduce.
> > > I ran it with three machines: 1 namenode and 2 datanodes.
> > >
> > > One of my datanodes has the following exception:
> > > 060128 083726 task_r_alfaaq Child Error
> > > java.io.IOException: Task process exit with nonzero status.
> > >        at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
> > >        at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
> > > This appeared more than once.
> > >
> > > The other datanode had the following exception:
> > > 060128 142626 Lost connection to JobTracker [server.name/i.i.i.i:50020].
> > > ex=java.lang.reflect.UndeclaredThrowableException  Retrying...
> > >
> > > Any idea?
> > >
> > > Thanks,
> > > Rafit
> > >
> > >
> >
> > ---------------------------------------------------------------
> > company:        http://www.media-style.com
> > forum:        http://www.text-mining.org
> > blog:             http://www.find23.net
> >
> >
> >
> >
>
