On Dec 16, 2008, at 4:10 PM, Raghu Angadi wrote:

Brian Bockelman wrote:
Hey,
I hit a bit of a roadbump in solving the "truncated block issue" at our site: namely, some of the truncated blocks appear perfectly valid to the datanode. The block verifies, but it is still the wrong size (it appears that the metadata file is too small as well). What's the best way to proceed? It appears that either (a) the block scanner needs to report to the datanode the size of the block it just verified, which is possibly a scaling issue, or (b) the metadata file needs to save the correct block size, which is a pretty major modification, as it requires a change to the on-disk format.
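[Editor's sketch, not part of the original message.] The failure mode described above can be illustrated with a toy model: if a block is checksummed per fixed-size chunk, and both the data and the checksum list are cut at the same chunk boundary, every remaining chunk still verifies. The chunk size, the function names, and the idea of storing an expected length next to the checksums (option (b)) are assumptions made for illustration; this is not the datanode's actual code or on-disk format.

```python
import zlib
from typing import List

BYTES_PER_CHECKSUM = 512  # assumed chunk size for this sketch


def chunk_checksums(data: bytes) -> List[int]:
    """Compute one CRC32 per fixed-size chunk of the block data."""
    return [
        zlib.crc32(data[i:i + BYTES_PER_CHECKSUM])
        for i in range(0, len(data), BYTES_PER_CHECKSUM)
    ]


def verifies(data: bytes, checksums: List[int]) -> bool:
    """Checksum-only verification: knows nothing about the intended length."""
    return chunk_checksums(data) == checksums


def verifies_with_length(data: bytes, checksums: List[int],
                         expected_length: int) -> bool:
    """Hypothetical option (b): also compare against a stored block length."""
    return len(data) == expected_length and verifies(data, checksums)


# A 2048-byte block (4 chunks) and its per-chunk checksums.
full_block = bytes(range(256)) * 8
full_meta = chunk_checksums(full_block)

# Truncate data and metadata consistently, at a chunk boundary.
truncated_block = full_block[:1024]
truncated_meta = full_meta[:2]

passes_checksums = verifies(truncated_block, truncated_meta)
passes_with_length = verifies_with_length(
    truncated_block, truncated_meta, len(full_block)
)
```

In this model the truncated replica passes the checksum-only check and fails only once an expected length is available, which is why the length has to come from somewhere outside the truncated files.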

This should be detected by the NameNode, i.e. it should detect that this replica is shorter (either compared to other replicas or to the expected size). There are various fixes (recent or being worked on) in this area of the NameNode, and this case is mostly covered by those fixes or should be soon.
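[Editor's sketch, not part of the original message.] The comparison described above, a replica being shorter than the other replicas or than the expected size, might look roughly like the following. The function name, the datanode labels, and the dictionary of reported lengths are hypothetical; this is not the NameNode's implementation.

```python
from typing import Dict, List, Optional


def find_short_replicas(reported_lengths: Dict[str, int],
                        expected_length: Optional[int] = None) -> List[str]:
    """Return the datanodes whose replica is shorter than the reference length.

    The reference is the expected length if one is known; otherwise it is
    the longest length reported by any replica (an assumption of this sketch).
    """
    if not reported_lengths:
        return []
    if expected_length is not None:
        reference = expected_length
    else:
        reference = max(reported_lengths.values())
    return sorted(
        node for node, length in reported_lengths.items()
        if length < reference
    )


# Hypothetical report: two full 64 MB replicas and one truncated replica.
reports = {"dn1": 67108864, "dn2": 67108864, "dn3": 41943040}
short = find_short_replicas(reports)
```

Note that without a known expected size, comparing replicas against each other cannot catch the case where every replica is truncated to the same length.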

Do you know which JIRA tickets I can ask my admins to follow? We'd like to test these out as soon as the fixes are reasonably stable. Right now, we've increased the number of replicas to 3, but we can't maintain this level of replication forever.

I forgot to give numbers: out of ~200 remaining blocks with this "truncated" issue, having the block scanner verify blocks after a transfer failed due to "inconsistent size" solved about 150 of them. The remaining 50 appear to be due to the issue described above.

Thank you very much Raghu; the Hadoop team has been quick in solving these issues.

Brian
