[ 
https://issues.apache.org/jira/browse/HADOOP-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503010#comment-13503010
 ] 

liaowenrui commented on HADOOP-9085:
------------------------------------

1. When the namenode is stopped successfully, we should delete the namenode pid file.
2. When we start the namenode, we should check for the process with ps instead of kill -0.
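
Point 1 could be sketched roughly as follows. This is only an illustration, not the actual stop logic in hadoop-daemon.sh: the function name stop_and_clean, the sc_ variable names and the five-second wait are all assumptions.

```shell
# Hypothetical helper, not taken from hadoop-daemon.sh.
# Stops the process named in the pid file ($1) and removes the file
# only once the process is really gone.
stop_and_clean() {
  sc_pidfile=$1
  # Nothing to do if there is no pid file.
  [ -f "$sc_pidfile" ] || return 0
  sc_pid=$(cat "$sc_pidfile")
  # Ask the process to terminate; ignore the error if it is already gone.
  kill "$sc_pid" 2>/dev/null || true
  # Wait up to about 5 seconds (assumed timeout) for it to exit.
  sc_i=0
  while kill -0 "$sc_pid" 2>/dev/null && [ "$sc_i" -lt 5 ]; do
    sleep 1
    sc_i=$((sc_i + 1))
  done
  # Still alive: report failure and keep the pid file.
  if kill -0 "$sc_pid" 2>/dev/null; then
    return 1
  fi
  # Stopped (or was never running): remove the stale pid file.
  rm -f "$sc_pidfile"
}
```

With the pid file removed after a successful stop, a later start has no stale pid to check.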

Current code in hadoop-daemon.sh:

    if [ -f $pid ]; then
      if kill -0 `cat $pid` > /dev/null 2>&1; then
        echo $command running as process `cat $pid`.  Stop it first.
        exit 1
      fi
    fi

We will change it like this:
    if [ -f $pid ]; then
      tmppid=`cat $pid`
      curpid=`ps -ww -eo pid,user,euid,cmd | grep "org.apache.hadoop.hdfs." |
        grep "$command" | grep $tmppid | grep -v "grep" | awk '{print $1}'`
      if [ -n "$curpid" ]; then
        echo $command running as process `cat $pid`.  Stop it first.
        exit 1
      fi
    fi
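
The same idea can also be sketched without the grep chain, by asking ps for the command line of exactly one pid and matching it against the daemon name. This is a sketch under assumptions, not the proposed patch: the helper name pid_matches_command and the pmc_ variable names are hypothetical, and it relies on the POSIX options ps -p and -o args=.

```shell
# Hypothetical helper, not part of hadoop-daemon.sh.
# Returns 0 only if the pid stored in the pid file ($1) belongs to a live
# process whose command line contains the pattern ($2).
pid_matches_command() {
  pmc_pidfile=$1
  pmc_pattern=$2
  [ -f "$pmc_pidfile" ] || return 1
  pmc_pid=$(cat "$pmc_pidfile")
  # Reject empty or non-numeric pid file content.
  case $pmc_pid in
    ''|*[!0-9]*) return 1 ;;
  esac
  # "-p" selects a single process id; "-o args=" prints its command line
  # without a header. ps exits non-zero if no such process exists.
  pmc_args=$(ps -p "$pmc_pid" -o args= 2>/dev/null) || return 1
  # Succeed only if the command line mentions the expected daemon.
  case $pmc_args in
    *"$pmc_pattern"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

In the start path it might be used as: if pid_matches_command "$pid" "$command"; then echo "Stop it first."; exit 1; fi. Matching on the command line means a pid that has been reused by an unrelated process no longer blocks the start.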

                
> start namenode failure,bacause pid of namenode pid file is other process pid 
> or thread id before start namenode
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9085
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9085
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: bin
>    Affects Versions: 2.0.1-alpha, 2.0.3-alpha
>         Environment: NA
>            Reporter: liaowenrui
>             Fix For: 2.0.1-alpha, 2.0.2-alpha, 2.0.3-alpha
>
>
> If the namenode pid file contains the pid of another process, or a thread
> id, before the namenode is started, starting the namenode fails. The
> hadoop-daemon.sh script checks the pid from the namenode pid file with
> kill -0 before starting the namenode. When that pid belongs to another
> process or is a thread id, kill -0 still returns success, which the script
> takes to mean that the namenode is running. In reality, the namenode is not
> running.
> In the example below, 2338 is the pid of the dead namenode and 2305 is the
> datanode pid; 2338 now exists only as a thread id (tid) of the datanode:
> cqn2:/tmp # kill -0 2338
> cqn2:/tmp # ps -wweLo pid,ppid,tid | grep 2338
>  2305     1  2338
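
The ps output above shows that 2338 is a thread of process 2305. On Linux, one way to tell a process id from a thread id is to compare the id with the Tgid field in /proc/<id>/status. The sketch below assumes a Linux /proc filesystem; the helper name is_process_id and the ip_ variable names are hypothetical.

```shell
# Hypothetical helper; Linux-only because it reads /proc.
# Returns 0 if the id ($1) is a process id (thread group leader),
# and 1 if it does not exist or is only a thread id.
is_process_id() {
  ip_id=$1
  # /proc/<id>/status is readable for both pids and tids.
  [ -r "/proc/$ip_id/status" ] || return 1
  # Tgid is the id of the process that owns this task.
  ip_tgid=$(awk '/^Tgid:/ {print $2}' "/proc/$ip_id/status")
  # For a real process the id equals its own Tgid; for a thread it does not.
  [ "$ip_tgid" = "$ip_id" ]
}
```

Under these assumptions the check would fail for an id such as 2338 in the example, because its Tgid would be the datanode pid rather than the id itself.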
