What distro/version of Hadoop are you using? This was a bug fixed quite a while ago.

On Fri, Apr 20, 2012 at 7:29 AM, Johnson Chengwu <johnsonchen...@gmail.com> wrote:
> I have run into a case where, after a disk I/O error on a datanode machine,
> the datanode becomes dead, but the datanode daemon on that machine is
> still alive, and I cannot stop it in order to restart the datanode. When I
> check the processes, the Linux command "du -sk path/to/datadir" appears to
> be hung. This is what makes the datanode dead, and I can neither stop the
> datanode normally nor kill the datanode process with
> "kill -9 datanode-process". Is this a bug? Maybe we should set a timeout on
> the Linux command "du" for when it does not return to the datanode.

--
Harsh J
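
For illustration only, the timeout idea from the quoted message could look roughly like the sketch below. It is not how Hadoop implements or fixed this; the function name `disk_usage_kb`, the 10-second default, and the `cmd` parameter are all made up for the example. The point it tries to show is that the caller stops waiting after the timeout and does not block on the child afterwards, since a process stuck in uninterruptible disk I/O may ignore SIGKILL until the I/O returns.

```python
import subprocess


def disk_usage_kb(path, timeout_s=10.0, cmd=("du", "-sk")):
    """Return the usage of `path` in KB, or None if the command fails or times out.

    Hypothetical helper: the name, the default timeout, and the `cmd`
    parameter are assumptions for this sketch, not part of Hadoop.
    """
    proc = subprocess.Popen(
        [*cmd, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort: send SIGKILL but do NOT wait for the child to exit.
        # If it is blocked in uninterruptible I/O, waiting here would hang
        # the caller in exactly the way we are trying to avoid.
        proc.kill()
        proc.stdout.close()
        return None
    if proc.returncode != 0 or not out.strip():
        return None
    # `du -sk` prints "<kilobytes><TAB><path>"; take the first field.
    return int(out.split()[0])
```

A caller would treat `None` as "usage unknown" and keep its last good value, for example `used = disk_usage_kb("path/to/datadir") or last_known_used`, where `last_known_used` is again a hypothetical variable.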