This is a known issue: https://issues.apache.org/jira/browse/HADOOP-3033
Your best bet for now is to use the 0.16.2 release.

Runping

> -----Original Message-----
> From: Iván de Prado [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 28, 2008 6:08 AM
> To: [email protected]
> Subject: DFS get blocked when writing a file.
>
> Hello,
>
> I'm working with Hadoop 0.16.1. I have an issue with the DFS. Sometimes
> when writing to the HDFS it gets blocked. Sometimes it doesn't happen,
> so it's not easily reproducible.
>
> My cluster have 4 nodes and one master with the NameNode and JobTracker.
> This are the logs that appears when all gets blocked. Look to the block
> blk_7857709233639057851 that seems to be the problematic one. It raises
> the exception:
>
> "Exception in receiveBlock for block java.io.IOException: Trying to
> change block file offset of block blk_7857709233639057851 to 33357824
> but actual size of file is 33353728"
>
> A bigger trace of the logs and a part of the stack trace:
>
> hn3: 2008-03-28 07:34:44,499 INFO org.apache.hadoop.dfs.DataNode:
> Receiving block blk_7857709233639057851 src: /172.16.3.2:46092
> dest: /172.16.3.2:50010
> hn3: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 2 got response for connect ack from downstream datanode with
> firstbadlink as
> hn3: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 2 forwarding connect ack to upstream firstbadlink is
> hn2: 2008-03-28 07:34:44,496 INFO org.apache.hadoop.dfs.DataNode:
> Received block blk_8152094109584962620 of size 67108864 from /172.16.3.2
> hn2: 2008-03-28 07:34:44,496 INFO org.apache.hadoop.dfs.DataNode:
> PacketResponder 2 for block blk_8152094109584962620 terminating
> hn2: 2008-03-28 07:34:44,500 INFO org.apache.hadoop.dfs.DataNode:
> Receiving block blk_7857709233639057851 src: /172.16.3.5:35904
> dest: /172.16.3.5:50010
> hn2: 2008-03-28 07:34:44,502 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 1 got response for connect ack from downstream datanode with
> firstbadlink as
> hn2: 2008-03-28 07:34:44,502 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 1 forwarding connect ack to upstream firstbadlink is
> hn1: 2008-03-28 07:34:44,495 INFO org.apache.hadoop.dfs.DataNode:
> Received block blk_8152094109584962620 of size 67108864 from /172.16.3.4
> hn1: 2008-03-28 07:34:44,495 INFO org.apache.hadoop.dfs.DataNode:
> PacketResponder 1 for block blk_8152094109584962620 terminating
> hn4: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
> Receiving block blk_7857709233639057851 src: /172.16.3.4:36887
> dest: /172.16.3.4:50010
> hn4: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 0 forwarding connect ack to upstream firstbadlink is
> hn4: 2008-03-28 07:34:44,615 INFO org.apache.hadoop.dfs.DataNode:
> Changing block file offset of block blk_7857709233639057851 from 4325376
> to 4325376 meta file offset to 33799
> hn3: 2008-03-28 07:34:45,304 INFO org.apache.hadoop.dfs.DataNode:
> Changing block file offset of block blk_7857709233639057851 from
> 33353728 to 33357824 meta file offset to 260615
> hn3: 2008-03-28 07:34:45,305 INFO org.apache.hadoop.dfs.DataNode:
> Exception in receiveBlock for block java.io.IOException: Trying to
> change block file offset of block blk_7857709233639057851 to 33357824
> but actual size of file is 33353728
> hn1: 2008-03-28 07:35:31,835 INFO org.apache.hadoop.dfs.DataNode:
> BlockReport of 564 blocks got processed in 128 msecs
>
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b19 mixed
> mode):
>
> "ResponseProcessor for block blk_7857709233639057851" prio=10
> tid=0x000000005c557800 nid=0x23ad waiting for monitor entry
> [0x0000000040e15000..0x0000000040e15a10]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream
> $ResponseProcessor.run(DFSClient.java:1771)
> - waiting to lock <0x00002aaab43ad910> (a java.util.LinkedList)
>
> "DataStreamer for file /user/properazzi/test/output/index/_0.cfs block
> blk_7857709233639057851" prio=10 tid=0x000000005c59f000 nid=0x2392
> runnable [0x0000000041219000..0x0000000041219d10]
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketOutputStream.socketWrite0(Native Method)
> at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> at
> java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> at
> java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> - locked <0x00002aaade9b8120> (a java.io.BufferedOutputStream)
> at java.io.DataOutputStream.write(DataOutputStream.java:90)
> - locked <0x00002aaade9b8148> (a java.io.DataOutputStream)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream
> $DataStreamer.run(DFSClient.java:1623)
> - locked <0x00002aaab43ad910> (a java.util.LinkedList)
>
> "[EMAIL PROTECTED]" daemon prio=10
> tid=0x000000005c7f1000 nid=0x2254 waiting on condition
> [0x0000000041118000..0x0000000041118a90]
> java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.dfs.DFSClient
> $LeaseChecker.run(DFSClient.java:597)
> at java.lang.Thread.run(Thread.java:619)
>
> "[EMAIL PROTECTED]" daemon prio=10
> tid=0x000000005c4fec00 nid=0x224f waiting on condition
> [0x0000000040f16000..0x0000000040f16c90]
> java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.dfs.DFSClient
> $LeaseChecker.run(DFSClient.java:597)
> at java.lang.Thread.run(Thread.java:619)
>
> "org.apache.hadoop.io.ObjectWritable Connection Culler" daemon prio=10
> tid=0x000000005c7c5c00 nid=0x224d waiting on condition
> [0x0000000040d14000..0x0000000040d14b90]
> java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.ipc.Client
> $ConnectionCuller.run(Client.java:423)
>
>
> "main" prio=10 tid=0x000000005c417000 nid=0x223b waiting for monitor
> entry [0x0000000040207000..0x0000000040209ed0]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.writeChunk(DFSClient.java:2117)
> - waiting to lock <0x00002aaab43ad910> (a java.util.LinkedList)
> at
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java
> :141)
> at
> org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:100)
> at
> org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
> - locked <0x00002aaab43addd8> (a org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream)
> at org.apache.hadoop.fs.FSDataOutputStream
> $PositionCache.write(FSDataOutputStream.java:41)
> at java.io.DataOutputStream.write(DataOutputStream.java:90)
> - locked <0x00002aaab43aef18> (a
> org.apache.hadoop.fs.FSDataOutputStream)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:151)
> at
> org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1028)
> at
> org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1016)
> at
> org.apache.hadoop.fs.FileSystem.moveFromLocalFile(FileSystem.java:1006)
> at
> org.apache.hadoop.fs.FileSystem.completeLocalOutput(FileSystem.java:1077)
> ...
>
> Any Help with that? Ask for more information if needed.
>
> Thanks, and congratulations for your revolutionary project.
>
> Iván de Prado Alonso
> http://ivandeprado.blogspot.com/
