I have a huge disk too, and the /tmp folder was fine with almost 200G free on that partition, but it still fails. I am going to do the same and look for the bad URL that causes the problem. But how can Nutch be so sensitive to a particular URL that it fails? It might be because of the parser plugins.
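In case it helps anyone else, this is roughly how I plan to narrow it down -- just a sketch, assuming the seeds are one URL per line in a file called seeds.txt (the file name and the crawl command are only placeholders for whatever your own setup uses):

  # Bisect the seed list to isolate the URL(s) that break the crawl.
  # seeds.txt is a hypothetical name; adjust paths for your setup.
  mkdir -p half1 half2
  lines=$(wc -l < seeds.txt)
  head -n $((lines / 2)) seeds.txt > half1/seeds.txt
  tail -n +$((lines / 2 + 1)) seeds.txt > half2/seeds.txt

  # Run your usual crawl against each half (illustrative command only);
  # whichever half fails gets split again until only a few suspects remain.
  bin/nutch crawl half1 -dir crawl-half1 -depth 1
  bin/nutch crawl half2 -dir crawl-half2 -depth 1

Repeating that a dozen or so times should get an 80,000-URL seed down to a handful of candidates.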
Mike

On 1/30/06, Rafit Izhak_Ratzin <[EMAIL PROTECTED]> wrote:
>
> Hi,
> I don't think it's a problem of disk capacity, since I am working on a huge
> disk and only 10% is used.
>
> What I decided to do is to split the seed into two parts and see if I still
> get this problem. One half finished successfully but the second had the same
> problem, so I continued with the splitting.
>
> I started with a group of 80,000 URLs and now I have a group of 5,000 that
> shows this problem when I run them. I will keep narrowing it down until I
> find the smallest group that has this problem, and I'll let you know about
> the seed.
>
> Thanks,
> Rafit
>
> >From: Ken Krugler <[EMAIL PROTECTED]>
> >Reply-To: [email protected]
> >To: [email protected]
> >Subject: Re: Problems with MapRed-
> >Date: Sun, 29 Jan 2006 16:42:15 -0800
> >
> >>This looks like the namenode has lost connection to one of the datanodes.
> >>The default number of replications in ndfs is 3, and it seems like the
> >>namenode has only 2 in its list, so it logs this warning. As Stefan
> >>suggested, you should check the disk space on your machines. If I recall
> >>correctly, datanodes crash when they run out of disk space.
> >>
> >>This could also explain your problem with the fetching. One datanode runs
> >>out of disk space and crashes while one of the tasktrackers is writing
> >>data to it. You should also check if the partition with /tmp has enough
> >>free space.
> >
> >Yes, that also can happen.
> >
> >Especially if, like us, you accidentally configure Nutch to use a directory
> >on the root volume, but your servers have been configured with a separate
> >filesystem for /data, and that's where all the disk capacity is located.
> >
> >-- Ken
> >
> >>Stefan Groschupf schrieb:
> >>>Maybe the hdds are full?
> >>>Try:
> >>>bin/nutch ndfs -report
> >>>Nutch generates some temporary data during processing.
> >>>
> >>>On 30.01.2006 at 00:54, Mike Smith wrote:
> >>>
> >>>>I forgot to mention the namenode log file gives me thousands of these:
> >>>>
> >>>>060129 155553 Zero targets found,
> >>>>forbidden1.size=2allowSameHostTargets=false
> >>>>forbidden2.size()=0
> >>>>060129 155553 Zero targets found,
> >>>>forbidden1.size=2allowSameHostTargets=false
> >>>>forbidden2.size()=0
> >
> >--
> >Ken Krugler
> >Krugle, Inc.
> >+1 530-470-9200
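P.S. Before blaming a single URL I'm also re-checking free space on every node, along the lines of the ndfs -report suggestion above. A rough sketch, assuming passwordless ssh and a hypothetical nodes.txt listing the datanode hosts:

  # DFS-level view of capacity across the cluster
  bin/nutch ndfs -report

  # Per-node view of the partitions the temp and data dirs live on
  # (nodes.txt is an assumed file; /tmp and /data are just the paths
  # mentioned in this thread)
  for host in $(cat nodes.txt); do
    echo "== $host =="
    ssh "$host" 'df -h /tmp /data'
  done

That way a single nearly-full datanode shows up even when the cluster as a whole looks mostly empty.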
