Very small clusters are often problematic, but your logs look like your cluster has something really hosed going on beyond just processes going missing for a time. I don't know what it is offhand, but it is ugly. Approaching this cold, I would not assume that anything is correct. Thus I would look at network configuration, DNS and other simple things.
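For the DNS part, something along these lines can be run on each node as a first pass. It is only a sketch that uses plain java.net calls. The class name DnsSanityCheck and the particular checks are my own illustration and are not part of Hadoop or HBase. It assumes the usual failure modes are a host name that does not resolve, one that resolves to a loopback address, or an address with no reverse entry.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop or HBase.
public class DnsSanityCheck {

    // Returns human-readable problems found for a host name or IP.
    // An empty list only means that none of these particular checks failed.
    static List<String> check(String host) {
        List<String> problems = new ArrayList<>();
        InetAddress addr;
        try {
            addr = InetAddress.getByName(host); // forward lookup
        } catch (UnknownHostException e) {
            problems.add(host + " does not resolve");
            return problems;
        }
        if (addr.isLoopbackAddress()) {
            // Other nodes cannot reach a daemon that advertises a loopback address.
            problems.add(host + " resolves to loopback address " + addr.getHostAddress());
            return problems;
        }
        // Reverse lookup; this returns the textual IP when no name is found.
        String reverse = addr.getCanonicalHostName();
        if (reverse.equals(addr.getHostAddress())) {
            problems.add("no reverse DNS entry for " + addr.getHostAddress());
        }
        return problems;
    }

    public static void main(String[] args) throws UnknownHostException {
        // With no arguments, check this machine's own host name.
        String[] hosts = args.length > 0
                ? args
                : new String[] { InetAddress.getLocalHost().getHostName() };
        for (String host : hosts) {
            List<String> problems = check(host);
            System.out.println(host + ": " + (problems.isEmpty() ? "ok" : problems));
        }
    }
}
```

Run it with no arguments to check the local host name, or pass the names of the other nodes to see how they resolve from this machine. A clean result here does not rule out other network problems.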
Can you run small test jobs correctly or does everything mess up?

On Wed, Dec 8, 2010 at 8:26 PM, rajgopalv <[email protected]> wrote:

>
> Ted,
>
> I've tried incrementing my own counter in every map job, but this keep
> happening.
> Kindly look at the log here http://pastebin.com/Xv76mXDJ
>
> One more question,
> I have a small cluster of small computers now. Cluster contains 2 machines,
> each of 2GB ram, dual core. but i've increased the hadoop and hbase heapsize
> to 1.5 gb. will this create any problem ? (other than slowing down the
> process, i dont think this will lead to errors like what is in the log that
> i've given above)
>
>
> Ted Dunning-2 wrote:
> >
> > It looks like your task took a long time to complete (> 10 minutes) and
> > didn't produce any output or report any status to Hadoop during this time.
> >
> > This often happens during indexing tasks where a reducer or mapper builds
> > some off-line data structure for a long time. Can you force your mappers to
> > update a Hadoop counter as they go along? That might be all that is needed.
> >
> > On Tue, Dec 7, 2010 at 5:37 AM, rajgopalv <[email protected]> wrote:
> >
> >> Task attempt_201012071646_0001_m_000025_0 failed to report status for 600
> >> seconds. Killing!
> >>
> >
>
> --
> View this message in context:
> http://old.nabble.com/Zoo-keeper-exception-in-the-middle-of-MR-tp30396344p30412978.html
> Sent from the HBase User mailing list archive at Nabble.com.
