Author: mbautin
Date: Mon Feb 27 11:37:59 2012
New Revision: 1294115

URL: http://svn.apache.org/viewvc?rev=1294115&view=rev
Log:
[master] porting "ghost pid" issue to hbase-daemon.sh

Summary:
Much like the already-corrected issue in hadoop-daemon.sh, we sometimes see
false alarms for hbase processes when a thread owned by another process
responds to kill -0 $pid. The start script then refuses to start the daemon,
which can lead to clusters starting up slower than ideal.
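
For illustration, the two-step check that avoids the false alarm described
above can be sketched as a small shell function. This is a minimal sketch, not
the code committed here: the function name pidfile_is_live is hypothetical, and
it assumes a ps that accepts "-e -o pid=" and lists one row per process rather
than per thread (as procps does on Linux by default).

```shell
# Hypothetical helper, not part of hbase-daemon.sh.
# Returns 0 only if the pid stored in the given pidfile belongs to a
# process that ps lists; returns 1 otherwise.
pidfile_is_live() {
  pidfile=$1

  # No pidfile means nothing is recorded as running.
  [ -f "$pidfile" ] || return 1

  target=$(cat "$pidfile")

  # Reject empty or non-numeric contents before passing them to kill.
  case $target in
    ''|*[!0-9]*) return 1 ;;
  esac

  # kill -0 delivers no signal; it only tests whether the id can be
  # signalled. On Linux this can also succeed for a thread id.
  kill -0 "$target" >/dev/null 2>&1 || return 1

  # Second check: the id must appear in the process list. "pid=" drops
  # the header line, tr strips the column padding, and grep -x requires
  # a whole-line match so that 47 does not match 4735.
  ps -e -o pid= | tr -d ' ' | grep -qx "$target"
}
```

A caller could refuse to start when pidfile_is_live "$pid" succeeds and remove
the stale pidfile when it fails, which mirrors what the patch below does inline.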

hadoop-daemon.sh patch where I stole this from:
https://phabricator.fb.com/D350607

I would like someone from hbase-eng to shepherd this patch to the various other
places it needs to be committed and take care of any merging to the open source
processes.

prefixing with [master] as per kannan

Test Plan:
found a host which was exhibiting this behaviour, applied the new .sh file

[[email protected] ~]# hadoopctl start
INFO: start SEARCHHBASE002-ASH3-HBASE:regionserver (timeout=180s)
WARN: Non-zero code 1 received from start script.
      Stdout:
          regionserver running as process 4735. Stop it first.
      Stderr:

Modified:
    hbase/branches/0.89-fb/bin/hbase-daemon.sh

Modified: hbase/branches/0.89-fb/bin/hbase-daemon.sh
URL: 
http://svn.apache.org/viewvc/hbase/branches/0.89-fb/bin/hbase-daemon.sh?rev=1294115&r1=1294114&r2=1294115&view=diff
==============================================================================
--- hbase/branches/0.89-fb/bin/hbase-daemon.sh (original)
+++ hbase/branches/0.89-fb/bin/hbase-daemon.sh Mon Feb 27 11:37:59 2012
@@ -131,8 +131,17 @@ case $startStop in
     mkdir -p "$HBASE_PID_DIR"
     if [ -f $pid ]; then
       if kill -0 `cat $pid` > /dev/null 2>&1; then
-        echo $command running as process `cat $pid`.  Stop it first.
-        exit 1
+        # On Linux, process pids and thread pids are indistinguishable to
+        # signals. It's possible that the pid in our pidfile is now a thread
+        # owned by another process. Let's check to make sure our pid is
+        # actually a running process.
+        ps -e -o pid | egrep "^`cat $pid`$" >/dev/null 2>&1
+        if [ $? -eq 0 ]; then
+          echo $command running as process `cat $pid`.  Stop it first.
+          exit 1
+        else
+          rm $pid
+        fi
       fi
     fi
 

