This is a known issue:
https://issues.apache.org/jira/browse/HADOOP-3033

Your best bet for now is to use the 0.16.2 release.

Runping


> -----Original Message-----
> From: Iván de Prado [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 28, 2008 6:08 AM
> To: [email protected]
> Subject: DFS get blocked when writing a file.
> 
> Hello,
> 
> I'm working with Hadoop 0.16.1 and have an issue with the DFS: writes to
> HDFS sometimes get blocked. It doesn't happen every time, so it's not
> easily reproducible.
> 
> My cluster has 4 nodes plus one master running the NameNode and JobTracker.
> These are the logs that appear when everything gets blocked. Look at block
> blk_7857709233639057851, which seems to be the problematic one. It raises
> the exception:
> 
> "Exception in receiveBlock for block  java.io.IOException: Trying to
> change block file offset of block blk_7857709233639057851 to 33357824
> but actual size of file is 33353728"
> 
> A longer excerpt of the logs and part of the thread dump:
> 
> hn3: 2008-03-28 07:34:44,499 INFO org.apache.hadoop.dfs.DataNode:
> Receiving block blk_7857709233639057851 src: /172.16.3.2:46092
> dest: /172.16.3.2:50010
> hn3: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 2 got response for connect ack  from downstream datanode with
> firstbadlink as
> hn3: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 2 forwarding connect ack to upstream firstbadlink is
> hn2: 2008-03-28 07:34:44,496 INFO org.apache.hadoop.dfs.DataNode:
> Received block blk_8152094109584962620 of size 67108864 from /172.16.3.2
> hn2: 2008-03-28 07:34:44,496 INFO org.apache.hadoop.dfs.DataNode:
> PacketResponder 2 for block blk_8152094109584962620 terminating
> hn2: 2008-03-28 07:34:44,500 INFO org.apache.hadoop.dfs.DataNode:
> Receiving block blk_7857709233639057851 src: /172.16.3.5:35904
> dest: /172.16.3.5:50010
> hn2: 2008-03-28 07:34:44,502 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 1 got response for connect ack  from downstream datanode with
> firstbadlink as
> hn2: 2008-03-28 07:34:44,502 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 1 forwarding connect ack to upstream firstbadlink is
> hn1: 2008-03-28 07:34:44,495 INFO org.apache.hadoop.dfs.DataNode:
> Received block blk_8152094109584962620 of size 67108864 from /172.16.3.4
> hn1: 2008-03-28 07:34:44,495 INFO org.apache.hadoop.dfs.DataNode:
> PacketResponder 1 for block blk_8152094109584962620 terminating
> hn4: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
> Receiving block blk_7857709233639057851 src: /172.16.3.4:36887
> dest: /172.16.3.4:50010
> hn4: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
> Datanode 0 forwarding connect ack to upstream firstbadlink is
> hn4: 2008-03-28 07:34:44,615 INFO org.apache.hadoop.dfs.DataNode:
> Changing block file offset of block blk_7857709233639057851 from 4325376
> to 4325376 meta file offset to 33799
> hn3: 2008-03-28 07:34:45,304 INFO org.apache.hadoop.dfs.DataNode:
> Changing block file offset of block blk_7857709233639057851 from
> 33353728 to 33357824 meta file offset to 260615
> hn3: 2008-03-28 07:34:45,305 INFO org.apache.hadoop.dfs.DataNode:
> Exception in receiveBlock for block  java.io.IOException: Trying to
> change block file offset of block blk_7857709233639057851 to 33357824
> but actual size of file is 33353728
> hn1: 2008-03-28 07:35:31,835 INFO org.apache.hadoop.dfs.DataNode:
> BlockReport of 564 blocks got processed in 128 msecs
> 
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b19 mixed
> mode):
> 
> "ResponseProcessor for block blk_7857709233639057851" prio=10
> tid=0x000000005c557800 nid=0x23ad waiting for monitor entry
> [0x0000000040e15000..0x0000000040e15a10]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream
> $ResponseProcessor.run(DFSClient.java:1771)
>         - waiting to lock <0x00002aaab43ad910> (a java.util.LinkedList)
> 
> "DataStreamer for file /user/properazzi/test/output/index/_0.cfs block
> blk_7857709233639057851" prio=10 tid=0x000000005c59f000 nid=0x2392
> runnable [0x0000000041219000..0x0000000041219d10]
>    java.lang.Thread.State: RUNNABLE
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at
> java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at
> java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>         - locked <0x00002aaade9b8120> (a java.io.BufferedOutputStream)
>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>         - locked <0x00002aaade9b8148> (a java.io.DataOutputStream)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream
> $DataStreamer.run(DFSClient.java:1623)
>         - locked <0x00002aaab43ad910> (a java.util.LinkedList)
> 
> "[EMAIL PROTECTED]" daemon prio=10
> tid=0x000000005c7f1000 nid=0x2254 waiting on condition
> [0x0000000041118000..0x0000000041118a90]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.dfs.DFSClient
> $LeaseChecker.run(DFSClient.java:597)
>         at java.lang.Thread.run(Thread.java:619)
> 
> "[EMAIL PROTECTED]" daemon prio=10
> tid=0x000000005c4fec00 nid=0x224f waiting on condition
> [0x0000000040f16000..0x0000000040f16c90]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.dfs.DFSClient
> $LeaseChecker.run(DFSClient.java:597)
>         at java.lang.Thread.run(Thread.java:619)
> 
> "org.apache.hadoop.io.ObjectWritable Connection Culler" daemon prio=10
> tid=0x000000005c7c5c00 nid=0x224d waiting on condition
> [0x0000000040d14000..0x0000000040d14b90]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.ipc.Client
> $ConnectionCuller.run(Client.java:423)
> 
> 
> "main" prio=10 tid=0x000000005c417000 nid=0x223b waiting for monitor
> entry [0x0000000040207000..0x0000000040209ed0]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.writeChunk(DFSClient.java:2117)
>         - waiting to lock <0x00002aaab43ad910> (a java.util.LinkedList)
>         at
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java
> :141)
>         at
> org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:100)
>         at
> org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
>         - locked <0x00002aaab43addd8> (a org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream)
>         at org.apache.hadoop.fs.FSDataOutputStream
> $PositionCache.write(FSDataOutputStream.java:41)
>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>         - locked <0x00002aaab43aef18> (a
> org.apache.hadoop.fs.FSDataOutputStream)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:151)
>         at
> org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1028)
>         at
> org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1016)
>         at
> org.apache.hadoop.fs.FileSystem.moveFromLocalFile(FileSystem.java:1006)
>         at
> org.apache.hadoop.fs.FileSystem.completeLocalOutput(FileSystem.java:1077)
>       ...
> 
> Any help with this? Ask if you need more information.
> 
> Thanks, and congratulations on your revolutionary project.
> 
> Iván de Prado Alonso
> http://ivandeprado.blogspot.com/
> 
> 
