Hey Raghu,
I never heard back from you about whether any of these fixes are ready
to try out. Things are getting kind of bad here.
Even at three replicas, I found one block which has all three replicas
of length=0. Grepping through the logs, I get things like this:
2008-12-18 22:45:04,680 WARN
org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(172.16.1.121:50010,
storageID=DS-1732140560-172.16.1.121-50010-1228236234012,
infoPort=50075, ipcPort=50020):Got exception while serving
blk_7345861444716855534_7201 to /172.16.1.1:
java.io.IOException: Offset 35307520 and length 10485760 don't match
block blk_7345861444716855534_7201 ( blockLen 0 )
java.io.IOException: Offset 35307520 and length 10485760 don't match
block blk_7345861444716855534_7201 ( blockLen 0 )
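As far as I can tell, that exception is just the datanode's range check tripping over the empty replica. Here's my reading of the condition as a sketch (not the actual HDFS source), plugging in the numbers from the log:

```shell
# Sketch of the sanity check behind that exception (my reading, not the
# real HDFS code): a read is refused when the requested offset + length
# runs past the replica length the datanode sees on disk.
offset=35307520
length=10485760
blockLen=0    # the truncated replica
if [ $((offset + length)) -gt "$blockLen" ]; then
    echo "Offset $offset and length $length don't match block ( blockLen $blockLen )"
fi
```

So any read of that block fails, since every request extends past length 0.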
On the other hand, if I look for the block scanner activity:
2008-12-08 13:59:15,616 INFO
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
succeeded for blk_7345861444716855534_7201
There is indeed a zero-sized file on disk and matching *correct*
metadata:
[r...@node121 ~]# find /hadoop-data/ -name *7345861444716855534* -exec
ls -lh {} \;
-rw-r--r-- 1 root root 7 Dec 3 15:44 /hadoop-data/dfs/data/current/
subdir9/subdir6/blk_7345861444716855534_7201.meta
-rw-r--r-- 1 root root 0 Dec 3 15:44 /hadoop-data/dfs/data/current/
subdir9/subdir6/blk_7345861444716855534
The metadata matches the 0-sized block, not the full one, of course.
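Incidentally, a 7-byte .meta file is exactly what an empty replica should have, if I remember the on-disk format right: just the header, no checksum entries. A rough consistency check, assuming the usual 7-byte header plus one 4-byte CRC per 512-byte chunk (bytesPerChecksum=512; adjust if your config differs):

```shell
# Rough sketch: does a .meta file's size match its block file's length?
# Assumes a 7-byte header plus one 4-byte CRC per 512-byte chunk
# (bytesPerChecksum=512 -- an assumption, check your configuration).
check_meta() {
    blockLen=$(stat -c %s "$1")   # block file
    metaLen=$(stat -c %s "$2")    # its .meta file
    expected=$(( 7 + 4 * ( (blockLen + 511) / 512 ) ))
    if [ "$metaLen" -eq "$expected" ]; then
        echo "consistent"
    else
        echo "MISMATCH: meta=$metaLen expected=$expected for blockLen=$blockLen"
    fi
}
# e.g.: check_meta blk_7345861444716855534 blk_7345861444716855534_7201.meta
```

For a 0-length block the expected meta size is 7 bytes, which is what's on disk here, so the pair is self-consistent even though both are truncated.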
We recently went from 2 replicas to 3 replicas on Dec 11. On Dec 12,
a replica was created on node191:

[r...@node191 ~]# find /hadoop-data/ -name *7345861444716855534* -exec
ls -lh {} \;
-rw-r--r-- 1 root root 7 Dec 12 08:53 /hadoop-data/dfs/data/current/
subdir40/subdir37/subdir42/blk_7345861444716855534_7201.meta
-rw-r--r-- 1 root root 0 Dec 12 08:53 /hadoop-data/dfs/data/current/
subdir40/subdir37/subdir42/blk_7345861444716855534
The corresponding log entries are here:
2008-12-12 08:53:09,014 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_7345861444716855534_7201 src: /172.16.1.121:47799 dest: /
172.16.1.191:50010
2008-12-12 08:53:17,134 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Received block
blk_7345861444716855534_7201 src: /172.16.1.121:47799 dest: /
172.16.1.191:50010 of size 0
So, the incorrectly-sized block had a new copy created, the datanode
reported the incorrect size (!), and the namenode never deleted it
afterward. I unfortunately don't have the namenode logs from this
period.
Brian
On Dec 16, 2008, at 4:10 PM, Raghu Angadi wrote:
Brian Bockelman wrote:
Hey,
I hit a bit of a roadbump in solving the "truncated block issue" at
our site: namely, some of the blocks appear perfectly valid to the
datanode. The block verifies, but it is still the wrong size (it
appears that the metadata is too small too).
What's the best way to proceed? It appears that either (a) the
block scanner needs to report to the datanode the size of the block
it just verified, which is possibly a scaling issue or (b) the
metadata file needs to save the correct block size, which is a
pretty major modification, as it requires a change of the on-disk
format.
This should be detected by the NameNode, i.e., it should detect that
this replica is shorter (either compared to the other replicas or to
the expected size). There are various fixes (recent or in progress)
in this area of the NameNode, and this case is mostly covered by one
of those, or should be soon.
Raghu.
Ideas?
Brian