Pranav,

Can you work through the following checklist before confirming the
missing/corrupt blocks?


Missing block: a block is marked missing if none of its replicas have been
reported to the NameNode.

Corrupt block: a block is marked corrupt if all of its replicas are
corrupted, or none of them have been reported to the NameNode.
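
Before starting on the checklist, it helps to capture exactly which files
and blocks fsck is flagging. The path "/" below is only an example; point
it at the affected directory:

  hdfs fsck / -list-corruptfileblocks
  hdfs fsck / -files -blocks -locations   # also shows which datanodes should hold each block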


1. Check if all datanodes in the cluster are running (dfsadmin example
after the list)

2. Check if you see dead datanodes

3. Check for disk failures on multiple datanodes

4. Check if disks are out of space on multiple datanodes (df example after
the list)

5. Check if the block report is rejected by the NameNode (it appears as a
warning/error in the NameNode log; grep example after the list)

6. Check if you changed any config groups

7. Check if the block physically exists on the local filesystem or was
removed by users unknowingly. Ex: find <dfs.datanode.data.dir> -type f
-iname "<blkid>*" (quote the pattern so the shell does not expand it).
Repeat the same step on all datanodes; a loop sketch follows the list.

8. Check if too many blocks are hosted on a single datanode

9. Check if the block report fails because it exceeds the max RPC size
(64 MB by default). You will see this warning in the NameNode log:
"Protocol message was too large. May be malicious." (config note after
the list)

10. Check if a mount point was unmounted because of a filesystem failure
(mount/df check after the list)

11. Check if blocks were written to the root volume because a disk was
auto-unmounted. The data might be hidden if you remount the filesystem on
top of an existing datanode dir.
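
For steps 1 and 2, the quickest check is the datanode summary from the
NameNode (recent 2.x releases also accept a -dead filter):

  hdfs dfsadmin -report          # capacity, last contact, per-datanode state
  hdfs dfsadmin -report -dead    # list only datanodes the NameNode sees as dead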
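
For steps 4, 10 and 11, run these on each datanode. /data/hdfs/data is a
placeholder; substitute your dfs.datanode.data.dir value:

  df -h /data/hdfs/data         # free space on the filesystem backing the data dir
  mount | grep /data/hdfs/data  # confirm the expected device is still mounted there
  du -sh /data/hdfs/data        # if a disk auto-unmounted, blocks may have landed on
                                # the root volume and be hidden once it is remounted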
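
For step 5, grep the NameNode log for block report warnings/errors. The log
location below is a placeholder; adjust it for your install:

  grep -iE "block ?report" /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log | grep -iE "warn|error"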
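
For step 7, a small loop avoids repeating the find by hand on every
datanode. The hostnames, data dir and block id below are made up; use your
own:

  for h in dn1 dn2 dn3; do
    ssh "$h" 'find /data/hdfs/data -type f -iname "blk_1073741825*"'
  done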
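
For step 9, the 64 MB limit is ipc.maximum.data.length (default 67108864)
in core-site.xml on the NameNode. Raising it and restarting the NameNode
lets an oversized block report through; the value below is only an example,
size it for your cluster:

  <property>
    <name>ipc.maximum.data.length</name>
    <value>134217728</value>
  </property>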


Thanks,

Karthik

On Mon, Mar 4, 2019 at 3:27 AM pranav.puri <pranav.p...@orkash.com.invalid>
wrote:

> Hi
>
> I have an Accumulo cluster set up over a Hadoop cluster (ver. 2.9.0). Since
> I had to make changes to the Accumulo config files, I had to shut down
> Hadoop multiple times.
>
> After some changes, Hadoop started showing this when the fsck command is
> run:
>
> Total size:    986079103985 B (Total open files size: 372 B)
>  Total dirs:    2011
>  Total files:    15530
>  Total symlinks:        0 (Files currently being written: 4)
>  Total blocks (validated):    19663 (avg. block size 50148965 B) (Total
> open file blocks (not validated): 4)
>   ********************************
>   UNDER MIN REPL'D BLOCKS:    19663 (100.0 %)
>   dfs.namenode.replication.min:    1
>   CORRUPT FILES:    15310
>   MISSING BLOCKS:    19663
>   MISSING SIZE:        986079103985 B
>   CORRUPT BLOCKS:     19663
>
> All the changes were made in the Accumulo config files; no Hadoop config
> files were changed. What are the troubleshooting steps?
>
> Regards
> Pranav
>
>
>
>

-- 
Thank you,
*Karthik Palanisamy*
Bangalore, *India*
Mobile : +91 9940089181
Skype : karthik.p01
