On Thu, Apr 26, 2012 at 10:24 PM, Harsh J <ha...@cloudera.com> wrote:
> Is only the same IP printed in all such messages? Can you check the DN
> log in that machine to see if it reports any form of issues?

All IPs were logged with this message.

> Also, did your jobs fail or keep going despite these hiccups? I notice
> you're threading your clients though (?), but I can't tell if that may
> cause this without further information.

It started with this error message and slowly all the jobs died with
"shortRead" errors. I am not sure about the threading; I am using a Pig
script to read the .gz files (a rough sketch of that kind of script is
below).

> On Fri, Apr 27, 2012 at 5:19 AM, Mohit Anchlia <mohitanch...@gmail.com>
> wrote:
> > I had 20 mappers in parallel reading 20 .gz files, each around 30-40 MB,
> > over 5 Hadoop nodes, and then writing to the analytics database. Almost
> > midway it started to get this error:
> >
> > 2012-04-26 16:13:53,723 [Thread-8] INFO org.apache.hadoop.hdfs.DFSClient -
> > Exception in createBlockOutputStream 17.18.62.192:50010
> > java.io.IOException: Bad connect ack with firstBadLink as 17.18.62.191:50010
> >
> > I am trying to look at the logs but they don't say much. What could be
> > the reason? We are on a pretty closed, reliable network and all machines
> > are up.
>
> --
> Harsh J
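
For context, a rough sketch of the kind of Pig script described above; the
input path, schema, and output location are placeholders for illustration,
not taken from the actual job. Pig reads *.gz input through Hadoop's
compression codecs, and since gzip is not splittable each .gz file is handled
by a single mapper, which would explain 20 mappers for 20 files.

  -- hypothetical sketch, not the actual script: paths and schema are made up
  -- PigStorage decompresses .gz input transparently via Hadoop's codecs;
  -- gzip is non-splittable, so each .gz file is read by one map task
  raw = LOAD '/data/incoming/*.gz' USING PigStorage('\t')
          AS (ts:chararray, id:chararray, payload:chararray);
  -- ... filtering / aggregation would go here ...
  STORE raw INTO '/data/analytics_out' USING PigStorage('\t');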