On Dec 16, 2008, at 4:10 PM, Raghu Angadi wrote:
Brian Bockelman wrote:
Hey,
I hit a bit of a roadbump in solving the "truncated block issue" at
our site: namely, some of the blocks appear perfectly valid to the
datanode. The block verifies, but it is still the wrong size (it
appears that the metadata file is too small as well).
What's the best way to proceed? It appears that either (a) the
block scanner needs to report to the datanode the size of the block
it just verified, which is possibly a scaling issue or (b) the
metadata file needs to save the correct block size, which is a
pretty major modification, as it requires a change of the on-disk
format.
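A minimal sketch of why such a block can still verify, assuming one CRC32 per fixed-size chunk of block data. The 512-byte chunk size, the helper names, and the in-memory "metadata" list are illustrative assumptions, not the datanode's actual code or on-disk format. If the data and the checksum list are both cut at the same chunk boundary, every chunk that remains still matches its stored CRC, so only a comparison against an externally known length reveals the problem:

```python
import zlib
from typing import List

CHUNK = 512  # assumed bytes-per-checksum; illustrative, not read from a real meta file


def make_checksums(data: bytes, chunk: int = CHUNK) -> List[int]:
    """Return one CRC32 per chunk of data (the last chunk may be short)."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def verify(data: bytes, checksums: List[int], chunk: int = CHUNK) -> bool:
    """Scanner-style check: recompute the per-chunk CRCs and compare to the stored ones."""
    return make_checksums(data, chunk) == checksums


full_data = bytes(range(256)) * 8      # 2048 bytes, i.e. 4 chunks
full_meta = make_checksums(full_data)  # stands in for the metadata file

# Hypothetical truncation: data and metadata both lose the last two chunks.
short_data = full_data[:2 * CHUNK]
short_meta = full_meta[:2]

passes_scan = verify(short_data, short_meta)        # checksums are self-consistent
right_length = len(short_data) == len(full_data)    # but the length is wrong
```

Here `passes_scan` is true while `right_length` is false, which is the situation described above: the checksums alone carry no record of how long the block was supposed to be.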
This should be detected by the NameNode, i.e. it should detect that this
replica is shorter (either compared to other replicas or to the
expected size). There are various fixes (recent or being worked on)
in this area of the NameNode, and this case is mostly covered by one of
those, or should be soon.
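The comparison described here can be sketched roughly as follows. This is only an illustration of the idea, not the NameNode's implementation: the function name, the datanode names, and the shape of the length reports are all hypothetical.

```python
from typing import Dict, List, Optional


def short_replicas(replica_lengths: Dict[str, int],
                   expected_length: Optional[int] = None) -> List[str]:
    """Return the nodes whose replica is shorter than the reference length.

    The reference is the expected length if one is known; otherwise it is
    the longest replica reported (an assumption made for this sketch).
    """
    if not replica_lengths:
        return []
    if expected_length is not None:
        reference = expected_length
    else:
        reference = max(replica_lengths.values())
    return sorted(node for node, length in replica_lengths.items()
                  if length < reference)


# Hypothetical reports for one block: datanode name -> replica length in bytes.
reports = {"dn1": 67108864, "dn2": 67108864, "dn3": 41943040}
suspects = short_replicas(reports)  # only dn3 is shorter than the others
```

Note that when no expected length is available and every replica is truncated to the same size, comparing replicas against each other finds nothing, so a recorded expected size is the stronger check.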
Do you know which JIRA tickets I can ask my admins to follow? We'd
like to test these out as soon as the fixes are reasonably stable.
Right now, we've increased # of replicas to 3, but can't maintain this
level of replication forever.
I forgot to give numbers: out of ~200 remaining blocks with this
"truncated" issue, having the block scanner verify blocks after a
transfer failed due to "inconsistent size" solved about 150 of them.
The remaining 50 appear to be due to the issue described above.
Thank you very much Raghu; the Hadoop team has been quick in solving
these issues.
Brian