[devel] [PATCH 1/1] base: Try again for opensafd stop [#2459]

2017-06-08 Thread Rafael Odzakow
Internally opensafd creates a mutex during start/stop to avoid parallel
execution. Make the mutex more robust and add a short retry if the mutex
is taken.
---
 src/nid/opensafd.in | 155 +---
 1 file changed, 88 insertions(+), 67 deletions(-)

diff --git a/src/nid/opensafd.in b/src/nid/opensafd.in
index d316967c5..57d374361 100644
--- a/src/nid/opensafd.in
+++ b/src/nid/opensafd.in
@@ -196,19 +196,56 @@ check_transport() {
fi
 }
 
+# Create a mutex for start/stop on the filesystem. Will use trap if available.
+mutex_create() {
+   timeout=10
+   interval=2
+   while [ $timeout -gt 0 ]; do
+   if mkdir "$lockfile_inprogress"; then
+   trap 'rmdir "$lockfile_inprogress"; exit $?' INT TERM EXIT 2> /dev/null
+   return 0
+   else
+   # lockfile exist, try again until timeout
+   if [ $timeout -eq 10 ]; then  # log only one time
+   log_warning_msg "opensafd start/stop in progress. Waiting for lockfile to be removed"
+   logger -t $osafprog "opensafd start/stop in progress. Waiting for lockfile to be removed"
+   fi
+   sleep $interval
+   timeout=$((timeout-interval))
+   fi
+   done
+
+   log_warning_msg "opensafd start/stop already in progress. Unable to continue"
+   logger -t $osafprog "opensafd start/stop already in progress. Unable to continue"
+   log_warning_msg "To forcefully start/stop OpenSAF remove $lockfile_inprogress"
+   logger -t $osafprog "To forcefully start/stop OpenSAF remove $lockfile_inprogress"
+   return 1
+}
+
+mutex_remove() {
+   rmdir "$lockfile_inprogress" 2> /dev/null
+   trap - INT TERM EXIT 2> /dev/null
+}
+
 start() {
+   if ! mutex_create; then
+   return 1
+   fi
+
export LD_LIBRARY_PATH=$pkglibdir:$LD_LIBRARY_PATH
 pidofproc -p $amfnd_pid $amfnd_bin > /dev/null 2>&1
lsb_status=$?
if [ $lsb_status -eq 0 ]; then
-   RETVAL=0
+   RETVAL=0
log_success_msg
+   mutex_remove
return $RETVAL
fi
 
 
[ -x $daemon ] || exit 5
 
+# Does more than check ...
check_env
check_transport
 
@@ -218,85 +255,69 @@ start() {
#enable_coredump
 
echo -n "Starting OpenSAF Services (Using $MDS_TRANSPORT):"
-   if [ -e "$lockfile_inprogress" ]; then
-   RETVAL=1
-   log_warning_msg "opensafd start/stop already in progress. Unable to continue"
-   logger -t $osafprog "opensafd start/stop already in progress. Unable to continue"
-   log_warning_msg "To forcefully start/stop OpenSAF remove $lockfile_inprogress"
-   logger -t $osafprog "To forcefully start/stop OpenSAF remove $lockfile_inprogress"
+   start_daemon $binary $args
+   RETVAL=$?
+   if [ $RETVAL -eq 0 ]; then
+   logger -t $osafprog "OpenSAF($osafversion - $osafcshash) services successfully started"
+   touch $lockfile
+   log_success_msg
else
-   touch "$lockfile_inprogress"
-   start_daemon $binary $args
-   RETVAL=$?
-   if [ $RETVAL -eq 0 ]; then
-   logger -t $osafprog "OpenSAF($osafversion - $osafcshash) services successfully started"
-   touch $lockfile
-   log_success_msg
+   final_clean
+   log_failure_msg
+   if [ $REBOOT_ON_FAIL_TIMEOUT -ne 0 ]; then
+   logger -t $osafprog "Starting OpenSAF failed, rebooting..."
+   sleep $REBOOT_ON_FAIL_TIMEOUT
+   mutex_remove
+   /sbin/reboot &
else
-   final_clean
-   log_failure_msg
-   if [ $REBOOT_ON_FAIL_TIMEOUT -ne 0 ]; then
-   logger -t $osafprog "Starting OpenSAF failed, rebooting..."
-   sleep $REBOOT_ON_FAIL_TIMEOUT
-   rm -f "$lockfile_inprogress"
-   /sbin/reboot &
-   else
-   logger -t $osafprog "Starting OpenSAF failed"
-   fi
+   logger -t $osafprog "Starting OpenSAF failed"
fi
-   rm -f "$lockfile_inprogress"
fi
+   mutex_remove
return $RETVAL
 }
 
 stop() {
-   logger -t $osafprog "Stopping OpenSAF Services"
+   if ! mutex_create; then
+   return 1
+   fi
 
-   if [ -e "$lockfile_inprogress" ]; then
-   RETVAL=1
-   log_warning_msg "opensafd start/stop 
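
For readers following the patch above, here is a minimal standalone sketch of the same idea: take the lock with mkdir, retry for a short while, and clean up from a trap. It is an illustration under assumptions, not the committed opensafd code. The lockdir variable and its default path are hypothetical, and the optional timeout/interval arguments are added here only to make the function easy to try out.

```shell
#!/bin/sh
# Hypothetical lock location, for illustration only.
lockdir="${LOCKDIR:-${TMPDIR:-/tmp}/example_start_stop.lock}"

# mutex_create [TIMEOUT] [INTERVAL]
# mkdir either creates the directory or fails, so the existence test and
# the creation happen in a single step.
mutex_create() {
    timeout=${1:-10}
    interval=${2:-2}
    while [ "$timeout" -gt 0 ]; do
        if mkdir "$lockdir" 2>/dev/null; then
            # Remove the lock when the script exits, however it exits.
            trap 'rmdir "$lockdir" 2>/dev/null' EXIT
            # Turn signals into a normal exit so the EXIT trap runs.
            trap 'exit 1' INT TERM
            return 0
        fi
        # Lock is held by someone else: wait and try again.
        sleep "$interval"
        timeout=$((timeout - interval))
    done
    return 1
}

mutex_remove() {
    rmdir "$lockdir" 2>/dev/null
    trap - INT TERM EXIT
}

# Usage:
#   mutex_create || exit 1
#   ... critical section ...
#   mutex_remove
```

Unlike the patch, this sketch splits cleanup (EXIT) from signal handling (INT, TERM). That is a stylistic choice made for the example, not something the thread asks for.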

Re: [devel] [PATCH 1/1] base: Try again for opensafd stop [#2459]

2017-06-02 Thread Anders Widell
Note that we should perhaps avoid adding a new dependency on the
inotifywait tool...


regards,

Anders Widell
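
One way to act on the concern above is to treat inotifywait as optional and fall back to polling when it is not installed. The sketch below is only an illustration. The function name wait_for_removal and its arguments are hypothetical. It relies on the inotifywait man page's description of exit status 2 as meaning that the -t timeout expired.

```shell
#!/bin/sh
# wait_for_removal PATH TIMEOUT_SECONDS  (hypothetical helper)
# Returns 0 once PATH no longer exists, 1 if it is still there after the
# timeout.
wait_for_removal() {
    path=$1
    remaining=$2

    # Optional fast path: only used when the tool is installed.
    # A timeout of 0 would make inotifywait wait forever, so skip it then.
    if command -v inotifywait >/dev/null 2>&1 \
        && [ "$remaining" -gt 0 ] && [ -e "$path" ]; then
        inotifywait -q -t "$remaining" -e delete_self "$path" >/dev/null 2>&1
        if [ $? -eq 2 ]; then
            remaining=0    # timed out: do not poll on top of that
        fi
    fi

    # Fallback (and safety net after an inotifywait error): plain polling.
    while [ -e "$path" ] && [ "$remaining" -gt 0 ]; do
        sleep 1
        remaining=$((remaining - 1))
    done

    [ ! -e "$path" ]
}

# Usage:
#   wait_for_removal "$lockfile_inprogress" 10 || echo "still locked"
```

Waiting like this only tells the caller that the lock went away. Taking the lock afterwards still needs an atomic create step.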


On 06/02/2017 02:38 PM, Rafael Odzakow wrote:
Thanks, I am trying to implement the suggestions and will send out a
new patch soon.



On 06/02/2017 08:51 AM, Hans Nordebäck wrote:

Hi Rafael,

I forgot one thing. This is not to say that you should change your
patch; it is only an example, since we already use inotify in OpenSAF.
The following may also work:

wait_for_lockfile_clear() {
  inotifywait -q -t 20 -e delete_self $lockfile_inprogress
  if [ $? = 2 ]; then
    return 1
  else
    touch $lockfile_inprogress
    return 0
  fi
}

/Hans

On 06/02/2017 07:27 AM, Hans Nordebäck wrote:

ack, code review only.

Minor comment: in the current opensafd script, the test and the creation
of the lockfile are not atomic, so there is a window for a race. Perhaps
we can move the creation of the lockfile into the function
"wait_for_lockfile_clear" and make the test and creation of the lockfile
atomic?

/HansN
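
To illustrate the race window mentioned above: a separate "[ -e file ]" followed by "touch file" lets two callers both pass the test before either creates the file. One way to keep a plain lock file while making test-and-create a single step is the shell's noclobber option. This is a sketch under assumptions, not code from the patch. The name try_lock is hypothetical, and it assumes the shell implements noclobber with an exclusive create, as common shells such as bash and dash do.

```shell
#!/bin/sh
# try_lock FILE  (hypothetical helper)
# With noclobber (set -C), ">" refuses to overwrite an existing file, so
# the redirection both tests for and creates the lock file in one step.
# The subshell keeps the noclobber setting from leaking to the caller.
try_lock() {
    ( set -C; : > "$1" ) 2>/dev/null
}

# Usage:
#   if try_lock "$lockfile_inprogress"; then
#       ... critical section ...
#       rm -f "$lockfile_inprogress"
#   else
#       echo "start/stop already in progress"
#   fi
```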


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 1/1] base: Try again for opensafd stop [#2459]

2017-06-02 Thread Rafael Odzakow
Thanks, I am trying to implement the suggestions and will send out a new
patch soon.





[devel] [PATCH 1/1] base: Try again for opensafd stop [#2459]

2017-06-01 Thread Rafael Odzakow
Internally opensafd creates a lock file during start/stop to avoid
parallel execution. Wait for this lockfile to be released when a call to
opensafd stop is done.
---
 src/nid/opensafd.in | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/nid/opensafd.in b/src/nid/opensafd.in
index e7683bd7e..df90331a6 100644
--- a/src/nid/opensafd.in
+++ b/src/nid/opensafd.in
@@ -196,6 +196,22 @@ check_transport() {
fi
 }
 
+wait_for_lockfile_clear() {
+local timeout=10
+local interval=2
+while [ $timeout -gt 0 ]; do
+if [ -e "$lockfile_inprogress" ]; then
+log_warning_msg "opensafd start/stop in progress. Wait for lockfile to be removed"
+logger -t $osafprog "opensafd start/stop in progress. Wait for lockfile to be removed"
+sleep $interval
+else
+return 0
+fi
+timeout=`expr $timeout - $interval`
+done
+return 1
+}
+
 start() {
export LD_LIBRARY_PATH=$pkglibdir:$LD_LIBRARY_PATH
 pidofproc -p $amfnd_pid $amfnd_bin > /dev/null 2>&1
@@ -251,8 +267,7 @@ start() {
 
 stop() {
logger -t $osafprog "Stopping OpenSAF Services"
-
-   if [ -e "$lockfile_inprogress" ]; then
+if ! wait_for_lockfile_clear; then
RETVAL=1
log_warning_msg "opensafd start/stop already in progress. Unable to continue"
logger -t $osafprog "opensafd start/stop already in progress. Unable to continue"
-- 
2.11.0

