** Description changed:
Binary package hint: ocfs2-tools
Ubuntu release:
Description: Ubuntu 10.04.1 LTS
Release: 10.04
Package version:
ocfs2-tools 1.4.3-1
The script /etc/init.d/o2cb exits with an error when stopped and the services
do not stop.
Here is the error message:
/etc/init.d/o2cb stop
Stopping O2CB cluster ocfs2: Failed
Unable to stop cluster as heartbeat region still active
I have identified a first error in the script. In the function
clean_heartbeat the following if:
if [ ! -f "$(configfs_path)/cluster/${CLUSTER}/heartbeat/*" ]
- then
- return
+ then
+ return
fi
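is always true (the quoted glob is never expanded, so the test looks for a
file literally named "*") and the function returns immediately. If the
intention was to check that the directory exists, the code should be: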
if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" ]
- then
- echo "OK"
- return
+ then
+ echo "OK"
+ return
fi
An error persists even after this change.
/etc/init.d/o2cb stop
Cleaning heartbeat on ocfs2: Failed
At least one heartbeat region still active
I added some debugging lines, changing the function as follows:
#
# clean_heartbeat()
# Removes the inactive heartbeat regions
#
clean_heartbeat()
{
- if [ "$#" -lt "1" -o -z "$1" ]
- then
- echo "clean_heartbeat(): Requires an argument" >&2
- return 1
- fi
- CLUSTER="$1"
+ if [ "$#" -lt "1" -o -z "$1" ]
+ then
+ echo "clean_heartbeat(): Requires an argument" >&2
+ return 1
+ fi
+ CLUSTER="$1"
- if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" ]
- then
- echo "OK"
- return
- fi
+ if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" ]
+ then
+ echo "OK"
+ return
+ fi
- echo -n "Cleaning heartbeat on ${CLUSTER}: "
+ echo -n "Cleaning heartbeat on ${CLUSTER}: "
- ls -1 "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" | while read HBUUID
- do
- if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/${HBUUID}" ]
- then
- continue
- fi
+ ls -1 "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" | while read HBUUID
+ do
+ if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/${HBUUID}" ]
+ then
+ continue
+ fi
echo
echo "DEBUG ocfs2_hb_ctl -I -u ${HBUUID} 2>&1"
- OUTPUT="`ocfs2_hb_ctl -I -u ${HBUUID} 2>&1`"
- if [ $? != 0 ]
- then
- echo "Failed"
- echo "${OUTPUT}" >&2
- exit 1
- fi
+ OUTPUT="`ocfs2_hb_ctl -I -u ${HBUUID} 2>&1`"
+ if [ $? != 0 ]
+ then
+ echo "Failed"
+ echo "${OUTPUT}" >&2
+ exit 1
+ fi
echo "DEBUG ${OUTPUT}"
- REF="`echo ${OUTPUT} | awk '/refs/ {print $2; exit;}' 2>&1`"
+ REF="`echo ${OUTPUT} | awk '/refs/ {print $2; exit;}' 2>&1`"
echo "DEBUG REF=$REF"
- if [ $REF != 0 ]
- then
- echo "Failed"
- echo "At least one heartbeat region still active" >&2
- exit 1
- else
- OUTPUT="`ocfs2_hb_ctl -K -u ${HBUUID} 2>&1`"
- fi
- done
- if [ $? = 1 ]
- then
- exit 1
- fi
- echo "OK"
+ if [ $REF != 0 ]
+ then
+ echo "Failed"
+ echo "At least one heartbeat region still active" >&2
+ exit 1
+ else
+ OUTPUT="`ocfs2_hb_ctl -K -u ${HBUUID} 2>&1`"
+ fi
+ done
+ if [ $? = 1 ]
+ then
+ exit 1
+ fi
+ echo "OK"
}
The new output is:
/etc/init.d/o2cb stop
- Cleaning heartbeat on ocfs2:
+ Cleaning heartbeat on ocfs2:
DEBUG ocfs2_hb_ctl -I -u FC046AD7B2584E7EB12A7293993C81B0 2>&1
DEBUG FC046AD7B2584E7EB12A7293993C81B0: 2 refs
DEBUG REF=2
Failed
At least one heartbeat region still active
At this point I checked the source code of ocfs2_hb_ctl. The command
ocfs2_hb_ctl -I -u ${HBUUID} returns the number of references held in a
semaphore used by the programs that manage OCFS2 filesystems. In the source
file libo2cb/o2cb_api.c:
- the function o2cb_mutex_down increases the second semaphore;
- the function o2cb_mutex_up decreases the first semaphore;
- the function __o2cb_get_ref increases the first semaphore;
- the function __o2cb_drop_ref decreases the first semaphore.
I have not found the point where the second semaphore is decreased. This
could be the cause of the error.
--
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to ocfs2-tools in Ubuntu.
https://bugs.launchpad.net/bugs/613793
Title:
o2cb stopping Failed
Status in ocfs2-tools package in Ubuntu:
Confirmed
Bug description:
Binary package hint: ocfs2-tools
Ubuntu release:
Description: Ubuntu 10.04.1 LTS
Release: 10.04
Package version:
ocfs2-tools 1.4.3-1
The script /etc/init.d/o2cb exits with an error when stopped and the services
do not stop.
Here is the error message:
/etc/init.d/o2cb stop
Stopping O2CB cluster ocfs2: Failed
Unable to stop cluster as heartbeat region still active
I have identified a first error in the script. In the function
clean_heartbeat the following if:
if [ ! -f "$(configfs_path)/cluster/${CLUSTER}/heartbeat/*" ]
then
return
fi
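is always true (the quoted glob is never expanded, so the test looks for a
file literally named "*") and the function returns immediately. If the
intention was to check that the directory exists, the code should be: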
if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" ]
then
echo "OK"
return
fi
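The difference between the two tests can be reproduced outside the init script. The following is a minimal sketch, not part of ocfs2-tools: HB_DIR is a hypothetical temporary directory standing in for $(configfs_path)/cluster/${CLUSTER}/heartbeat, and the variables QUOTED_GLOB and DIR_TEST exist only for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for $(configfs_path)/cluster/${CLUSTER}/heartbeat
HB_DIR="$(mktemp -d)"
# Create one fake heartbeat region so the directory is not empty.
mkdir "${HB_DIR}/FC046AD7B2584E7EB12A7293993C81B0"

# Quoted glob: the shell passes the literal string ".../*" to test.
# No file has that name, so "! -f" succeeds even though a region exists.
if [ ! -f "${HB_DIR}/*" ]; then
    QUOTED_GLOB="always true"
else
    QUOTED_GLOB="false"
fi

# Directory test: succeeds only when the heartbeat directory is missing.
if [ ! -d "${HB_DIR}/" ]; then
    DIR_TEST="missing"
else
    DIR_TEST="present"
fi

echo "quoted glob test: ${QUOTED_GLOB}"
echo "directory test: ${DIR_TEST}"
rm -rf "${HB_DIR}"
```

With a region present, the quoted-glob test still takes the early-return branch, while the -d test falls through to the cleanup loop.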
An error persists even after this change.
/etc/init.d/o2cb stop
Cleaning heartbeat on ocfs2: Failed
At least one heartbeat region still active
I added some debugging lines, changing the function as follows:
#
# clean_heartbeat()
# Removes the inactive heartbeat regions
#
clean_heartbeat()
{
    if [ "$#" -lt "1" -o -z "$1" ]
    then
        echo "clean_heartbeat(): Requires an argument" >&2
        return 1
    fi
    CLUSTER="$1"

    if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" ]
    then
        echo "OK"
        return
    fi

    echo -n "Cleaning heartbeat on ${CLUSTER}: "

    ls -1 "$(configfs_path)/cluster/${CLUSTER}/heartbeat/" | while read HBUUID
    do
        if [ ! -d "$(configfs_path)/cluster/${CLUSTER}/heartbeat/${HBUUID}" ]
        then
            continue
        fi
        echo
        echo "DEBUG ocfs2_hb_ctl -I -u ${HBUUID} 2>&1"
        OUTPUT="`ocfs2_hb_ctl -I -u ${HBUUID} 2>&1`"
        if [ $? != 0 ]
        then
            echo "Failed"
            echo "${OUTPUT}" >&2
            exit 1
        fi
        echo "DEBUG ${OUTPUT}"
        REF="`echo ${OUTPUT} | awk '/refs/ {print $2; exit;}' 2>&1`"
        echo "DEBUG REF=$REF"
        if [ $REF != 0 ]
        then
            echo "Failed"
            echo "At least one heartbeat region still active" >&2
            exit 1
        else
            OUTPUT="`ocfs2_hb_ctl -K -u ${HBUUID} 2>&1`"
        fi
    done
    if [ $? = 1 ]
    then
        exit 1
    fi
    echo "OK"
}
The new output is:
/etc/init.d/o2cb stop
Cleaning heartbeat on ocfs2:
DEBUG ocfs2_hb_ctl -I -u FC046AD7B2584E7EB12A7293993C81B0 2>&1
DEBUG FC046AD7B2584E7EB12A7293993C81B0: 2 refs
DEBUG REF=2
Failed
At least one heartbeat region still active
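The reference count in the DEBUG lines above can be checked in isolation. This is a small sketch that feeds the sample line from this report through the same awk expression used in clean_heartbeat(); the STATE variable and the quoting of ${REF} in the test are illustrative additions, not part of the shipped script.

```shell
#!/bin/sh
# Sample line copied from the DEBUG output in this report. On a live
# system it would come from: ocfs2_hb_ctl -I -u "${HBUUID}"
OUTPUT="FC046AD7B2584E7EB12A7293993C81B0: 2 refs"

# Same awk expression as in clean_heartbeat(): on the first line that
# contains "refs", print the second whitespace-separated field.
REF="$(echo "${OUTPUT}" | awk '/refs/ {print $2; exit;}')"
echo "REF=${REF}"

# Quoting ${REF} (an assumption, the script leaves it unquoted) keeps the
# test well-formed even if awk printed nothing.
if [ "${REF}" != 0 ]; then
    STATE="active"
else
    STATE="idle"
fi
echo "region is ${STATE}"
```

Any non-zero count sends the function down the "still active" branch, so the region is never passed to ocfs2_hb_ctl -K.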
At this point I checked the source code of ocfs2_hb_ctl. The command
ocfs2_hb_ctl -I -u ${HBUUID} returns the number of references held in a
semaphore used by the programs that manage OCFS2 filesystems. In the source
file libo2cb/o2cb_api.c:
- the function o2cb_mutex_down increases the second semaphore;
- the function o2cb_mutex_up decreases the first semaphore;
- the function __o2cb_get_ref increases the first semaphore;
- the function __o2cb_drop_ref decreases the first semaphore.
I have not found the point where the second semaphore is decreased.
This could be the cause of the error.