> "Exception in receiveBlock for block java.io.IOException: Trying to
> change block file offset of block blk_7857709233639057851 to 33357824
> but actual size of file is 33353728"
This was fixed in HADOOP-3033. You can try running the latest 0.16 branch
(svn...hadoop/core/branches/branch-016). The 0.16.2 release is scheduled
for early next week.
This exception does not fully explain the blocked client. If the client
blocks again with the latest 0.16 branch, could you also include stack
traces from the datanodes? You could file a JIRA so that it is convenient
to attach logs and stack traces.
Raghu.
Iván de Prado wrote:
Hello,
I'm working with Hadoop 0.16.1 and have an issue with the DFS: sometimes
a write to HDFS blocks. It doesn't happen every time, so it's not easily
reproducible.
My cluster has 4 nodes and one master running the NameNode and JobTracker.
These are the logs that appear when everything blocks. Look at block
blk_7857709233639057851, which seems to be the problematic one. It raises
the exception:
"Exception in receiveBlock for block java.io.IOException: Trying to
change block file offset of block blk_7857709233639057851 to 33357824
but actual size of file is 33353728"
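For reference, the two numbers in the message differ by exactly 4096
bytes: the datanode was asked to position the block file 4096 bytes past
what is actually on disk. The snippet below is only a sketch of the kind
of check that could produce such a message. The class and method names
(OffsetCheckSketch, gap, checkOffset) are hypothetical and are not taken
from the DataNode source.

```java
import java.io.IOException;

public class OffsetCheckSketch {

    // Difference between the requested offset and the bytes actually on disk.
    static long gap(long requestedOffset, long actualFileSize) {
        return requestedOffset - actualFileSize;
    }

    // Hypothetical check, modeled only on the wording of the logged exception.
    // The real DataNode code may perform this validation differently.
    static void checkOffset(String blockName, long requestedOffset, long actualFileSize)
            throws IOException {
        if (requestedOffset > actualFileSize) {
            throw new IOException("Trying to change block file offset of block "
                    + blockName + " to " + requestedOffset
                    + " but actual size of file is " + actualFileSize);
        }
    }

    public static void main(String[] args) {
        long requested = 33357824L; // offset from the log message
        long actual = 33353728L;    // file size from the log message

        System.out.println("gap in bytes: " + gap(requested, actual)); // prints 4096

        try {
            checkOffset("blk_7857709233639057851", requested, actual);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```
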
A longer excerpt of the logs and part of the stack trace:
hn3: 2008-03-28 07:34:44,499 INFO org.apache.hadoop.dfs.DataNode:
Receiving block blk_7857709233639057851 src: /172.16.3.2:46092
dest: /172.16.3.2:50010
hn3: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
Datanode 2 got response for connect ack from downstream datanode with
firstbadlink as
hn3: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
Datanode 2 forwarding connect ack to upstream firstbadlink is
hn2: 2008-03-28 07:34:44,496 INFO org.apache.hadoop.dfs.DataNode:
Received block blk_8152094109584962620 of size 67108864 from /172.16.3.2
hn2: 2008-03-28 07:34:44,496 INFO org.apache.hadoop.dfs.DataNode:
PacketResponder 2 for block blk_8152094109584962620 terminating
hn2: 2008-03-28 07:34:44,500 INFO org.apache.hadoop.dfs.DataNode:
Receiving block blk_7857709233639057851 src: /172.16.3.5:35904
dest: /172.16.3.5:50010
hn2: 2008-03-28 07:34:44,502 INFO org.apache.hadoop.dfs.DataNode:
Datanode 1 got response for connect ack from downstream datanode with
firstbadlink as
hn2: 2008-03-28 07:34:44,502 INFO org.apache.hadoop.dfs.DataNode:
Datanode 1 forwarding connect ack to upstream firstbadlink is
hn1: 2008-03-28 07:34:44,495 INFO org.apache.hadoop.dfs.DataNode:
Received block blk_8152094109584962620 of size 67108864 from /172.16.3.4
hn1: 2008-03-28 07:34:44,495 INFO org.apache.hadoop.dfs.DataNode:
PacketResponder 1 for block blk_8152094109584962620 terminating
hn4: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
Receiving block blk_7857709233639057851 src: /172.16.3.4:36887
dest: /172.16.3.4:50010
hn4: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
Datanode 0 forwarding connect ack to upstream firstbadlink is
hn4: 2008-03-28 07:34:44,615 INFO org.apache.hadoop.dfs.DataNode:
Changing block file offset of block blk_7857709233639057851 from 4325376
to 4325376 meta file offset to 33799
hn3: 2008-03-28 07:34:45,304 INFO org.apache.hadoop.dfs.DataNode:
Changing block file offset of block blk_7857709233639057851 from
33353728 to 33357824 meta file offset to 260615
hn3: 2008-03-28 07:34:45,305 INFO org.apache.hadoop.dfs.DataNode:
Exception in receiveBlock for block java.io.IOException: Trying to
change block file offset of block blk_7857709233639057851 to 33357824
but actual size of file is 33353728
hn1: 2008-03-28 07:35:31,835 INFO org.apache.hadoop.dfs.DataNode:
BlockReport of 564 blocks got processed in 128 msecs
Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b19 mixed
mode):
"ResponseProcessor for block blk_7857709233639057851" prio=10
tid=0x000000005c557800 nid=0x23ad waiting for monitor entry
[0x0000000040e15000..0x0000000040e15a10]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream
$ResponseProcessor.run(DFSClient.java:1771)
- waiting to lock <0x00002aaab43ad910> (a java.util.LinkedList)
"DataStreamer for file /user/properazzi/test/output/index/_0.cfs block
blk_7857709233639057851" prio=10 tid=0x000000005c59f000 nid=0x2392
runnable [0x0000000041219000..0x0000000041219d10]
java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at
java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at
java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
- locked <0x00002aaade9b8120> (a java.io.BufferedOutputStream)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
- locked <0x00002aaade9b8148> (a java.io.DataOutputStream)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream
$DataStreamer.run(DFSClient.java:1623)
- locked <0x00002aaab43ad910> (a java.util.LinkedList)
"[EMAIL PROTECTED]" daemon prio=10
tid=0x000000005c7f1000 nid=0x2254 waiting on condition
[0x0000000041118000..0x0000000041118a90]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.dfs.DFSClient
$LeaseChecker.run(DFSClient.java:597)
at java.lang.Thread.run(Thread.java:619)
"[EMAIL PROTECTED]" daemon prio=10
tid=0x000000005c4fec00 nid=0x224f waiting on condition
[0x0000000040f16000..0x0000000040f16c90]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.dfs.DFSClient
$LeaseChecker.run(DFSClient.java:597)
at java.lang.Thread.run(Thread.java:619)
"org.apache.hadoop.io.ObjectWritable Connection Culler" daemon prio=10
tid=0x000000005c7c5c00 nid=0x224d waiting on condition
[0x0000000040d14000..0x0000000040d14b90]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.ipc.Client
$ConnectionCuller.run(Client.java:423)
"main" prio=10 tid=0x000000005c417000 nid=0x223b waiting for monitor
entry [0x0000000040207000..0x0000000040209ed0]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.hadoop.dfs.DFSClient
$DFSOutputStream.writeChunk(DFSClient.java:2117)
- waiting to lock <0x00002aaab43ad910> (a java.util.LinkedList)
at
org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:141)
at
org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:100)
at
org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
- locked <0x00002aaab43addd8> (a org.apache.hadoop.dfs.DFSClient
$DFSOutputStream)
at org.apache.hadoop.fs.FSDataOutputStream
$PositionCache.write(FSDataOutputStream.java:41)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
- locked <0x00002aaab43aef18> (a
org.apache.hadoop.fs.FSDataOutputStream)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:151)
at
org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1028)
at
org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1016)
at
org.apache.hadoop.fs.FileSystem.moveFromLocalFile(FileSystem.java:1006)
at
org.apache.hadoop.fs.FileSystem.completeLocalOutput(FileSystem.java:1077)
...
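One way to read the thread dump above: the DataStreamer thread holds the
monitor <0x00002aaab43ad910> (a java.util.LinkedList) while it is inside
a socket write, and both the ResponseProcessor and "main" are BLOCKED
waiting for that same monitor. If the socket write never returns, every
thread that needs the list stalls behind it. The following is a minimal,
self-contained sketch of that pattern, not DFSClient code. All names in
it (HeldLockSketch, dataQueue, peerReads) are made up for illustration,
and a CountDownLatch stands in for the stalled socket.

```java
import java.util.LinkedList;
import java.util.concurrent.CountDownLatch;

public class HeldLockSketch {

    // Returns the state observed for the second thread while the first
    // thread holds the shared monitor.
    static Thread.State observeWaiterState() throws InterruptedException {
        final LinkedList<byte[]> dataQueue = new LinkedList<>();  // shared lock object
        final CountDownLatch lockHeld = new CountDownLatch(1);
        final CountDownLatch peerReads = new CountDownLatch(1);   // simulated socket peer

        // Plays the role of the streamer: takes the monitor, then blocks
        // in a simulated write until the "peer" makes progress.
        Thread streamer = new Thread(() -> {
            synchronized (dataQueue) {
                lockHeld.countDown();
                try {
                    peerReads.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "streamer-sketch");

        // Plays the role of the writer: needs the same monitor to enqueue data.
        Thread writer = new Thread(() -> {
            synchronized (dataQueue) {
                dataQueue.add(new byte[0]);
            }
        }, "writer-sketch");

        streamer.start();
        lockHeld.await();          // streamer now owns the monitor
        writer.start();

        // Poll (up to 5 seconds) until the writer is parked on the monitor.
        Thread.State state = writer.getState();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (state != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(10);
            state = writer.getState();
        }

        peerReads.countDown();     // the simulated peer finally makes progress
        streamer.join();
        writer.join();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("writer state while lock was held: " + observeWaiterState());
    }
}
```

In the sketch the streamer shows up as WAITING because it waits on a
latch, whereas in the real dump the DataStreamer is RUNNABLE inside the
native socketWrite0 call. The effect on the other threads is the same:
they stay BLOCKED until the lock holder returns from its write.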
Any help with this? Ask for more information if needed.
Thanks, and congratulations on your revolutionary project.
Iván de Prado Alonso
http://ivandeprado.blogspot.com/