** Description changed:

  Binary package hint: ocfs2-tools
  
  Ubuntu release:
  Description:    Ubuntu 10.04.1 LTS
  Release:        10.04
  Package version:
  ocfs2-tools                      1.4.3-1
  
  The script /etc/init.d/o2cb exits with an error when stopped, and the
  services do not stop.
  Here is the error message:
  
  /etc/init.d/o2cb stop
  Stopping O2CB cluster ocfs2: Failed
  Unable to stop cluster as heartbeat region still active
  
  I have identified a first error in the script. In the function
  clean_heartbeat, the following if test:
  
  if [ ! -f "$(configfs_path)/cluster/${CLUSTER}/heartbeat/*" ]
-     then
-         return
+     then
+         return
  fi
  
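  is always true, because the quoted pattern is not glob-expanded and
  the test looks for a file literally named "*". The function therefore
  always returns early. If the intention was to check that the directory
  exists, the code should be: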
  
  if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" ]
-     then
-         echo "OK"
-         return
+     then
+         echo "OK"
+         return
  fi
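  
  The difference between the two tests can be reproduced outside the
  init script. This is only a minimal sketch: the temporary directory,
  the REGION1 entry and the two helper functions are hypothetical
  stand-ins for the configfs heartbeat directory and are not part of
  o2cb.

```shell
#!/bin/sh
# Hypothetical demo directory standing in for the configfs heartbeat directory.
demo_dir="$(mktemp -d)"
trap 'rm -rf "${demo_dir}"' EXIT
mkdir "${demo_dir}/REGION1"    # stands in for one heartbeat region

# Quoted pattern: the shell does not expand "*", so -f looks for a file
# literally named "*" and the negated test succeeds.
quoted_glob_missing() {
    [ ! -f "$1/*" ]
}

# Directory test: -d is true for an existing directory, so the negated
# test fails and the caller would carry on instead of returning early.
dir_missing() {
    [ ! -d "$1/" ]
}

if quoted_glob_missing "${demo_dir}"
then
    echo "quoted glob test: true (function would return early)"
fi

if ! dir_missing "${demo_dir}"
then
    echo "directory test: false (function would continue)"
fi
```

  With the directory present, the first test still succeeds while the
  second does not, which matches the behaviour described above.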
  
  An error persists even after this change.
  
  /etc/init.d/o2cb stop
  Cleaning heartbeat on ocfs2: Failed
  At least one heartbeat region still active
  
  I added some debugging lines, changing the function as follows:
  
  #
  # clean_heartbeat()
  # Removes the inactive heartbeat regions
  #
  clean_heartbeat()
  {
-     if [ "$#" -lt "1" -o -z "$1" ]
-     then
-         echo "clean_heartbeat(): Requires an argument" >&2
-         return 1
-     fi
-     CLUSTER="$1"
+     if [ "$#" -lt "1" -o -z "$1" ]
+     then
+         echo "clean_heartbeat(): Requires an argument" >&2
+         return 1
+     fi
+     CLUSTER="$1"
  
-     if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" ]
-     then
-         echo "OK"
-         return
-     fi
+     if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" ]
+     then
+         echo "OK"
+         return
+     fi
  
-     echo -n "Cleaning heartbeat on ${CLUSTER}: "
+     echo -n "Cleaning heartbeat on ${CLUSTER}: "
  
-     ls -1 "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" | while read HBUUID
-     do
-         if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/${HBUUID}" ]
-         then
-             continue
-         fi
+     ls -1 "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" | while read HBUUID
+     do
+         if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/${HBUUID}" ]
+         then
+             continue
+         fi
  
  echo
  echo "DEBUG ocfs2_hb_ctl -I -u ${HBUUID} 2>&1"
-         OUTPUT="`ocfs2_hb_ctl -I -u ${HBUUID} 2>&1`"
-         if [ $? != 0 ]
-         then
-             echo "Failed"
-             echo "${OUTPUT}" >&2
-             exit 1
-         fi
+         OUTPUT="`ocfs2_hb_ctl -I -u ${HBUUID} 2>&1`"
+         if [ $? != 0 ]
+         then
+             echo "Failed"
+             echo "${OUTPUT}" >&2
+             exit 1
+         fi
  
  echo "DEBUG ${OUTPUT}"
-         REF="`echo ${OUTPUT} | awk '/refs/ {print $2; exit;}' 2>&1`"
+         REF="`echo ${OUTPUT} | awk '/refs/ {print $2; exit;}' 2>&1`"
  echo "DEBUG REF=$REF"
-         if [ $REF != 0 ]
-         then
-            echo "Failed"
-            echo "At least one heartbeat region still active" >&2
-            exit 1
-         else
-            OUTPUT="`ocfs2_hb_ctl -K -u ${HBUUID} 2>&1`"
-         fi
-     done
-     if [ $? = 1 ]
-     then
-         exit 1
-     fi
-     echo "OK"
+         if [ $REF != 0 ]
+         then
+            echo "Failed"
+            echo "At least one heartbeat region still active" >&2
+            exit 1
+         else
+            OUTPUT="`ocfs2_hb_ctl -K -u ${HBUUID} 2>&1`"
+         fi
+     done
+     if [ $? = 1 ]
+     then
+         exit 1
+     fi
+     echo "OK"
  }
  
  The new output is:
  
  /etc/init.d/o2cb stop
- Cleaning heartbeat on ocfs2: 
+ Cleaning heartbeat on ocfs2:
  DEBUG ocfs2_hb_ctl -I -u FC046AD7B2584E7EB12A7293993C81B0 2>&1
  DEBUG FC046AD7B2584E7EB12A7293993C81B0: 2 refs
  DEBUG REF=2
  Failed
  At least one heartbeat region still active
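  
  How REF ends up as 2 can be checked in isolation. The sketch below
  reuses the awk expression from the function on the sample line from
  the debug output. The parse_refs helper is a hypothetical name, and
  only the "<UUID>: <N> refs" shape seen above is assumed.

```shell
#!/bin/sh
# Extract the reference count from an "ocfs2_hb_ctl -I" style output line.
# Only the "<UUID>: <N> refs" shape shown in the debug output is assumed.
parse_refs() {
    echo "$1" | awk '/refs/ {print $2; exit;}'
}

# Sample line copied from the debug output.
OUTPUT="FC046AD7B2584E7EB12A7293993C81B0: 2 refs"
REF="$(parse_refs "${OUTPUT}")"

# Same comparison as in clean_heartbeat, with REF quoted so that an
# empty value does not break the test.
if [ "${REF}" != 0 ]
then
    echo "region still has ${REF} refs"
else
    echo "region can be killed"
fi
```

  Any non-zero count takes the "Failed" branch, so the stop fails for
  as long as the region reports references.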
  
  At this point I checked the source code of ocfs2_hb_ctl. The command
  ocfs2_hb_ctl -I -u ${HBUUID} returns the number of references in a
  semaphore used by the programs that manage ocfs filesystems. In the
  source file libo2cb/o2cb_api.c:
  - the function o2cb_mutex_down increases the second semaphore;
  - the function o2cb_mutex_up decreases the first semaphore;
  - the function __o2cb_get_ref increases the first semaphore;
  - the function __o2cb_drop_ref decreases the first semaphore.
  
  I have not found the point where the second semaphore is decreased. This
  could be the cause of the error.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/613793

Title:
  o2cb stopping Failed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ocfs2-tools/+bug/613793/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
