Sriram Rao wrote:
Does this read every block of every file from all replicas and verify
that the checksums are good?
Sriram
The DataBlockScanner thread on every datanode does this for you
automatically. You can tune how fast it reads, but it reads in all
local blocks and verifies them against their stored checksums (CRC32
rather than MD5). It deals with failures by reporting a list of
failures to the namenode after the scan. After that, it's the
namenode's problem how to deal with the corrupt block. In an ideal
system, at least one non-corrupt copy of the block is still live, so
the namenode can re-replicate from it.
The configuration attribute dfs.datanode.scan.period.hours tunes the
scan rate: it sets the period, in hours, over which the scanner works
through the datanode's local blocks.
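
To make the scan-and-report idea concrete, here is a minimal Python sketch. It is not the DataBlockScanner code: the names scan_blocks and chunk_checksums, the in-memory "blocks", and the 512-byte chunk size are all assumptions made for illustration. It only shows the general pattern of re-checksumming each local block and collecting the failures into a list.

```python
import zlib

BYTES_PER_CHECKSUM = 512  # assumed chunk size for this sketch


def chunk_checksums(data: bytes, chunk_size: int = BYTES_PER_CHECKSUM) -> list[int]:
    """Return one CRC32 per fixed-size chunk of data."""
    return [zlib.crc32(data[i:i + chunk_size]) for i in range(0, len(data), chunk_size)]


def scan_blocks(blocks: dict[str, bytes], stored: dict[str, list[int]]) -> list[str]:
    """Re-checksum every local block and return the IDs that fail."""
    failures = []
    for block_id, data in sorted(blocks.items()):
        # A block fails if its checksums differ or no checksums were stored.
        if chunk_checksums(data) != stored.get(block_id):
            failures.append(block_id)
    return failures


# Two hypothetical local blocks with checksums recorded at write time.
blocks = {"blk_1": b"a" * 1300, "blk_2": b"b" * 700}
stored = {bid: chunk_checksums(data) for bid, data in blocks.items()}

# Simulate silent corruption: change the last byte of blk_2 on "disk".
blocks["blk_2"] = b"b" * 699 + b"c"

# The resulting list stands in for what would be reported to the namenode.
print(scan_blocks(blocks, stored))
```

In this toy version the list is just printed; in the real system the report goes to the namenode, which then decides what to do with the bad replica.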