What is the parameter I can use to make that check run more often, say every 3 days?
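
For what it's worth, the property that seems to control the DataNode block scanner interval is dfs.datanode.scan.period.hours; treat the exact name and its default as an assumption for your Hadoop version. A minimal hdfs-site.xml sketch setting it to 3 days:

    <!-- hdfs-site.xml on each DataNode; assumed property name, verify against your release -->
    <property>
      <name>dfs.datanode.scan.period.hours</name>
      <value>72</value> <!-- rescan every local block roughly every 72 hours (3 days) -->
    </property>

The DataNodes would need a restart to pick up the change.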
On Mon, Jun 25, 2012 at 7:33 AM, Kai Voigt <k...@123.org> wrote:

> HDFS has block checksums. Whenever a block is written to the datanodes, a
> checksum is calculated and written with the block to the datanodes' disks.
>
> Whenever a block is requested, the block's checksum is verified against
> the stored checksum. If they don't match, that block is corrupt. But since
> there are additional replicas of the block, chances are high that one replica
> matches the checksum. Corrupt blocks will be scheduled for re-replication.
>
> Also, to prevent bit rot, blocks are checked periodically (weekly by
> default, I believe; you can configure that period) in the background.
>
> Kai
>
> On 25.06.2012 at 13:29, Rita wrote:
>
> > Does Hadoop, HDFS in particular, do any sanity checks of the file before
> > and after balancing/copying/reading the files? We have 20TB of data and I
> > want to make sure that after these operations are completed the data is still
> > in good shape. Where can I read about this?
> >
> > tia
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
>
> --
> Kai Voigt
> k...@123.org

--
--- Get your facts first, then you can distort them as you please.--
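
For the original question about verifying data after balancing or copying: as far as I know, running

    hadoop fsck /

asks the NameNode to report missing, under-replicated, and corrupt blocks for everything under /. Note that fsck reads only block metadata, not the data itself; the checksum verification Kai describes happens on the datanodes when blocks are read or scanned.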