[ 
https://issues.apache.org/jira/browse/HDFS-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270042#comment-15270042
 ] 

Arpit Agarwal commented on HDFS-10359:
--------------------------------------

Hi [~Tao Jie], processing full block reports is an expensive operation for the 
NameNode, and it gets more expensive as the cluster size and data grow. You will 
effectively launch a denial-of-service attack on your own NameNode if you 
trigger full block reports every time you issue setrep. The default block 
report interval is 6 hours for a good reason.
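
As a rough illustration of the scaling argument, here is a back-of-the-envelope sketch. The cluster size, replica count, and function names are hypothetical and chosen only for illustration; the model assumes every setrep call would trigger one extra full report from every datanode, on top of the default 6-hour schedule.

```python
# Back-of-the-envelope estimate only; all cluster numbers are hypothetical.
DEFAULT_INTERVAL_HOURS = 6  # default full block report interval


def replicas_per_full_round(num_datanodes: int, replicas_per_datanode: int) -> int:
    """Block replicas the NameNode processes when every datanode sends one full report."""
    return num_datanodes * replicas_per_datanode


def load_multiplier(setrep_calls_per_hour: float,
                    interval_hours: float = DEFAULT_INTERVAL_HOURS) -> float:
    """Full-report work relative to the default schedule, assuming each setrep
    call triggers one additional full report from every datanode."""
    # Default schedule: 1 round per interval. Extra rounds in the same window:
    # setrep_calls_per_hour * interval_hours.
    return 1 + setrep_calls_per_hour * interval_hours


if __name__ == "__main__":
    per_round = replicas_per_full_round(num_datanodes=1000,
                                        replicas_per_datanode=500_000)
    print(f"replicas processed per full round: {per_round}")
    for calls in (0, 1, 10):
        print(f"{calls} setrep call(s)/hour -> {load_multiplier(calls)}x default load")
```

Even in this simplified model, the extra work grows linearly with both the cluster size and the rate of setrep calls, which is why the periodic schedule is preferable.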

bq. However, the namenode would not notice the missing blocks until the next 
block report in 6 hours. In this case, we are supposed to trigger a block report 
for all datanodes before setrep -w. Furthermore, if we want to set the 
replication of blocks to 1, some blocks may become corrupt.
You should never set the replication factor of a file to 1 unless you are okay 
with losing the data or it can be trivially regenerated.

bq. It is OK to use a script to trigger block reports from all datanodes, or 
just restart the namenode.
Neither is necessary or recommended. You should trust the self-healing 
mechanisms of HDFS to detect and deal with lost blocks and let go of the 
expectation that all blocks will have exactly the expected number of replicas 
at all times. Under- and over-replication are common in any real cluster as 
disks fail, network links get congested, or nodes go away and come back.

> Allow trigger block report from all datanodes
> ---------------------------------------------
>
>                 Key: HDFS-10359
>                 URL: https://issues.apache.org/jira/browse/HDFS-10359
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Tao Jie
>
> Since HDFS-7278 allows triggering a block report from a specific 
> datanode, it would be helpful to add an option to this command to trigger 
> block reports from all datanodes.
> The command might look like this:
> *hdfs dfsadmin -triggerBlockReport \[-incremental\] 
> <datanode_host:ipc_port|all>*



