For missing blocks and corrupted blocks, do check that all the DataNode services are up, that none of the disks where HDFS data is stored are inaccessible or failing, and that the hosts are reachable from the NameNode.
If you are able to re-generate the data and write it again, great; otherwise Hadoop cannot correct this itself.

*"Does HDFS check that the data node is up, data disk is mounted, path to the file exists and file can be read?"* -- yes; only after these checks fail will it report missing blocks.

*"Or does it also do a filesystem check on that data disk as well as perhaps a checksum to ensure block integrity?"* -- yes; a checksum is maintained for every file and cross-checked, and if that check fails HDFS reports corrupted blocks.

Hope this helps.

-Sanjeev

On Tue, 20 Oct 2020 at 09:52, TomK <tomk...@mdevsys.com> wrote:
> Hello,
>
> HDFS Missing Blocks / Corrupt Blocks Logic: What are the specific
> checks done to determine a block is bad and needs to be replicated?
>
> Does HDFS check that the data node is up, data disk is mounted, path to
> the file exists and file can be read?
>
> Or does it also do a filesystem check on that data disk as well as
> perhaps a checksum to ensure block integrity?
>
> I've googled on this quite a bit. I don't see the exact answer I'm
> looking for. I would like to know exactly what happens during file
> integrity verification that then constitutes missing blocks or corrupt
> blocks in the reports.
>
> --
> Thank You,
> TK.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: user-h...@hadoop.apache.org
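[Editor's note] The per-chunk checksum cross-check described in the thread can be sketched in Python. This is a minimal illustration under stated assumptions, not HDFS's actual code: HDFS stores a checksum per fixed-size chunk (512 bytes by default, `dfs.bytes-per-checksum`) in a block replica's `.meta` file and compares them on read and during the periodic block scanner pass. HDFS uses CRC32C by default; `zlib.crc32` stands in here for illustration.

```python
import zlib

CHUNK_SIZE = 512  # HDFS default bytes-per-checksum (dfs.bytes-per-checksum)

def compute_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Compute a CRC per fixed-size chunk, like a replica's .meta file.
    (HDFS actually uses CRC32C; zlib's CRC32 stands in for illustration.)"""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def verify_block(data: bytes, checksums: list[int],
                 chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Return the indices of chunks whose stored CRC no longer matches.
    A non-empty result is the condition HDFS reports as a corrupt replica."""
    return [n for n, i in enumerate(range(0, len(data), chunk_size))
            if zlib.crc32(data[i:i + chunk_size]) != checksums[n]]

block = bytes(range(256)) * 8           # a 2 KiB pretend block replica
meta = compute_checksums(block)         # checksums stored alongside the block
assert verify_block(block, meta) == []  # intact replica: no mismatches

corrupted = bytearray(block)
corrupted[600] ^= 0xFF                  # flip one byte inside the second chunk
assert verify_block(bytes(corrupted), meta) == [1]  # chunk 1 flagged corrupt
```

On a real cluster, replicas that fail this comparison show up as corrupt blocks in `hdfs fsck <path>` output, and `hdfs fsck <path> -list-corruptfileblocks` lists the affected files.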