I have run into the following problem: when a disk I/O error occurs on a
datanode machine, the datanode is reported as dead, but the datanode daemon
on that machine is still running, and I cannot stop it in order to restart
the datanode. When I check the processes, the Linux command
"du -sk path/to/datadir" appears to be hung. This seems to be what makes the
datanode dead, and I can neither stop the datanode normally nor kill the
datanode process with "kill -9 datanode-process". Is this a bug? Maybe we
should set a timeout on the Linux command "du" for the case where it never
returns to the datanode.
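
To make the timeout idea concrete, here is a minimal Python sketch. It is not how the datanode actually invokes du; the names run_with_timeout and disk_usage_kb and the 10-second default are assumptions of mine for illustration only. It starts du as a child process, waits a bounded time, and gives up if there is no result. Note that if the child is stuck in uninterruptible disk I/O, the kill signal may not take effect until the I/O returns, so the sketch deliberately does not wait for the child after the timeout. The timeout only keeps the caller from hanging; it does not free the stuck process.

```python
import subprocess
import tempfile


def run_with_timeout(argv, timeout_s):
    """Run argv and return its stdout as bytes, or None on timeout or failure.

    Hypothetical helper. Output goes to a temporary file rather than a pipe,
    so the caller never blocks on a read from a child that has stopped.
    """
    with tempfile.TemporaryFile() as out:
        proc = subprocess.Popen(argv, stdout=out, stderr=subprocess.DEVNULL)
        try:
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Best effort only: a process blocked in disk I/O may ignore
            # SIGKILL until the I/O completes, so do not wait for it here.
            proc.kill()
            return None
        if proc.returncode != 0:
            return None
        out.seek(0)
        return out.read()


def disk_usage_kb(path, timeout_s=10.0):
    """Return the size reported by `du -sk path` in KB, or None if unavailable."""
    output = run_with_timeout(["du", "-sk", path], timeout_s)
    if not output:
        return None
    # `du -sk` prints "<kilobytes>\t<path>"; the first field is the size.
    return int(output.split()[0])
```

A caller could treat a None result as a sign that the volume is unhealthy and stop using it, instead of blocking forever on the disk usage check.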