[jira] [Comment Edited] (AMQ-8471) activemq stop command fails with no or outdated process id

Consultant Leon (Jira) Sat, 17 Sep 2022 19:33:06 -0700


    [ 
https://issues.apache.org/jira/browse/AMQ-8471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606228#comment-17606228
 ]


Consultant Leon edited comment on AMQ-8471 at 9/18/22 2:32 AM:
---------------------------------------------------------------

Decided to spend some weekend time on this issue as the state of the script 
bothered me and failing to gracefully stop activemq 5.16.5 on our Solaris 10 
envs seems unacceptable.

If I attach my now smooth running version, can others confirm that this is not 
just a 'Solaris 10' fix but a general improvement that could be included in a 
future service pack of 5.16 ?

At least it can help individuals with similar troubles.

When I update {{bin/activemq}} to the attached and enable the management 
connector in {{conf/activemq.xml}} (why is this not be default enabled? Without 
this enabled there's no graceful shutdown possible, which could include the 
risk on data loss or corruptions during broker restarts I'd say)
{code:java}
       <managementContext>
          <managementContext createConnector="true" />
        </managementContext>{code}
Then start and stop works smooth for me and logging reports more clearly on the 
console.

Let me explain each change I made:
{code:java}
1c1
< #!/bin/bash
—
> #!/bin/sh{code}
*  /bin/sh is bourne shell on Solaris and that doesn't understand bash scripts 
very well! Is there any particular reason not to use /bin/bash in the first 
place?

 
{code:java}
115c115
< ACTIVEMQ_CLASSPATH="$ACTIVEMQ_BASE/lib/:$ACTIVEMQ_USER_CLASSPATH"
—
> ACTIVEMQ_CLASSPATH="$ACTIVEMQ_BASE/../lib/:$ACTIVEMQ_USER_CLASSPATH"{code}
 

* going out the $ACTIVEMQ_BASE to ../lib can't be anything but a mistake? Not 
sure this change is needed but it struck me as odd at least!

 
{code:java}
162,165d161
< # Max time to wait before killing the activemq process
< if [ -z "$ACTIVEMQ_KILL_MAXSECONDS" ]; then
<   ACTIVEMQ_KILL_MAXSECONDS=60
< fi{code}
 

* as ACTIVEMQ_KILL_MAXSECONDS is a mandatory env, make sure it gets initialized 
somewhere{{{}{}}}

 
{code:java}
481c477
<    local pidfile="${1}"
—
>     local pidfile="${1}"
483,489c479,486
<    if [ -f "$pidfile" ]; then
<       local activemq_pid="`cat "$pidfile"`"
<       if  [ -z "$activemq_pid" ];then
<          echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
<          return 2
<       fi
<       if [ "$(ps -p $activemq_pid -o fname=)" = "java" ]; then
—
>     if [ -f $pidfile ]; then
>        if  [ -z "`cat "$pidfile"`" ];then
>         echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
>         return 2
>        fi
>        local activemq_pid="`cat "$pidfile"`"
>        local RET="`ps -eo "pid,args" | grep "^\s*$activemq_pid\s.*java"`"
>        if [ -n "$RET" ];then
491c488
<       else
—
>        else
493,496c490,493
<       fi
<    else
<       return 1;
<    fi
—
>        fi
>     else
>          return 1;
>     fi{code}
 
 * good practice to quote filenames in {{if [ -f "$pidfile" ]}}
 * {{local activemq_pid="`cat "$pidfile"`"}} --> prevent multiple cat's of 
pidfile
 * {{if [ "$(ps -p $activemq_pid -o fname=)" = "java" ]; then}} --> big fan of 
using $( .. ) rather than backticks, most importantly this is the only 
construct that worked for me on solaris and it's more efficient with -p than 
grepping on all processes. {{-o fname= }}gets the process's name ('java' for 
java) without any other path or formatting. Adding the {{'=' }}suppresses the 
headers so that the result should match exactly "java". I think is perfectly 
POSIX compliant and should in theory run 'everywhere' (just tested , ok on 
ubuntu, not on mingw, not on alpine busybox ps... anyway those are hard to 
please)

{code:java}
invoke_stop(){
    RET="1"
    if ( checkRunning "$ACTIVEMQ_PIDFILE" );then
       ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS $ACTIVEMQ_SSL_OPTS"
       COMMANDLINE_ARGS="$COMMANDLINE_ARGS $ACTIVEMQ_SUNJMX_CONTROL"       
ACTIVEMQ_PID="`cat $ACTIVEMQ_PIDFILE`"       if ! invokeJar "$ACTIVEMQ_PIDFILE" 
"stop"; then
          echo "WARNING: Failed to invoke activemq.jar stop command"
          invoke_kill
          return $?
       fi       FOUND="0"
       i=0
       echo "INFO: Waiting max $ACTIVEMQ_KILL_MAXSECONDS seconds for broker to 
shutdown gracefully"
       while [ "$i" -le "$ACTIVEMQ_KILL_MAXSECONDS" ] && (checkRunning 
"${ACTIVEMQ_PIDFILE}"); do
         if ! checkRunning "${ACTIVEMQ_PIDFILE}.stop"; then
            if ! invokeJar "$ACTIVEMQ_PIDFILE" "stop"; then
               break
            fi
         fi
         sleep 1
         i=$(( $i + 1 ))
         if ((i % 5 == 0 )); then
            echo "INFO: Waiting for broker to shutdown ..."
         fi
       done
       if checkRunning "$ACTIVEMQ_PIDFILE"; then
         echo "WARNING: Broker failed to shutdown gracefully"
         invoke_kill
         RET="$?"
       else
         echo "INFO: Broker has shutdown gracefully"
         RET="0"
       fi
    elif [ -f "$ACTIVEMQ_PIDFILE" ];then
       echo "ERROR: No or outdated process id in '$ACTIVEMQ_PIDFILE'"
       echo
       echo "INFO: Removing $ACTIVEMQ_PIDFILE"
    else
       echo "ActiveMQ not running"
       exit 0
    fi
    rm -f "$ACTIVEMQ_PIDFILE" >/dev/null 2>&1
    rm -f "${ACTIVEMQ_PIDFILE}.stop" >/dev/null 2>&1
    exit $RET
}{code}
 

 

 Reworked the invoke_stop method:
 * one time invokeJar "stop" . If that returns non-zero then invoke_kill 
immediately, no point to carry a $RET value around or go into the loop in such 
case
 * change INFO log from  "Waiting at least ..." to "Waiting max ..."
 * main loop checks for {{$ACTIVEMQ_KILL_MAXSECONDS}} and as long as the broker 
pid ({{{}$ACTIVEMQ_PID{}}} ) is still running. We don't really have have to 
wait for backgroun stop invocation (pid in file {{{}${ACTIVEMQ_PID}.stop{}}}) 
to end. The opposite: we need to worry when we're still waiting and this 
process has already ended. In that case it seems better to relaunch invokeJar 
stop! I had a case where I started and immediately attempted to stop. The old 
script would see the activemq stop exiting very quickly as it could not connect 
to the broker pid (as it's still in its startup phase!), and would then issue a 
kill. Where if you reattempt activemq stop it'll succeed to connect to the 
broker in a second attempt (once the broker fully started), and perform a much 
preferable graceful shutdown.
 * instead of logging '.' characters which get intermixed with the text printed 
by activemq stop background process, it seems better to start logging {{"INFO: 
Waiting for broker to shutdown ..." }}after at least 5 seconds and only repeat 
it once every 5 seconds. {{$(( ... ))}} construct is great for modulo math, if 
{{(( ...  ))}} great for arithmethic evaluations.
 * aligned the logs to mention more clearly when shutdown is 'graceful' (via 
activemq stop )
{{}}

 
{code:java}
 652c652
<     rm -f "${ACTIVEMQ_PIDFILE}.stop" >/dev/null 2>&1
—
>     rm -f "$ACTIVEMQ_DATA/stop.pid" >/dev/null 2>&1{code}
 

There doesn't seem to be any use of a file named {{stop.pid}} , I guess it 
could be a good safe cleanup to remove {{${ACTIVEMQ_PIDFILE}.stop}}

I feel that the whole script is somewhat rusty and could do with a major 
makeover; understood that activemq 5.x is end of life and any change risks 
breaking something on any of x different platforms...

(also doubtful about using {{exit }}in a function, wouldn't it be better to use 
return in functions and exit at the top level for premature script exits?)


was (Author: consultantleon):
Decided to spend some weekend time on this as the state of the script bothered 
me. If I attach my version can others confirm that this is not just a 'solaris 
10' fix but a general improvement that could be included in a future service 
pack of 5.16 maybe? Or at least help individuals with similar troubles as mine.

When I update bin/activemq to the attached and enable the jmx connector in 
conf/activemq.xml :
{quote}       <managementContext>
          <managementContext createConnector="true" />
        </managementContext>
{quote}
 

Then start and stop works smooth for me and logging reports more clearly on the 
console.

Let me explain each change I made:
{quote}1c1
< #!/bin/bash
---
> #!/bin/sh
{quote} *  /bin/sh is bourne shell on Solaris and that doesn't understand bash 
scripts very well! Is there any particular reason not to use /bin/bash in the 
first place?

 
{quote}115c115
< ACTIVEMQ_CLASSPATH="$ACTIVEMQ_BASE/lib/:$ACTIVEMQ_USER_CLASSPATH"
---
> ACTIVEMQ_CLASSPATH="$ACTIVEMQ_BASE/../lib/:$ACTIVEMQ_USER_CLASSPATH"
{quote} * going out the $ACTIVEMQ_BASE to ../lib can't be anything but a 
mistake? Not sure this change is needed but it struck me as odd at least!

{quote}162,165d161
< # Max time to wait before killing the activemq process
< if [ -z "$ACTIVEMQ_KILL_MAXSECONDS" ]; then
<   ACTIVEMQ_KILL_MAXSECONDS=60
< fi
{quote} * as ACTIVEMQ_KILL_MAXSECONDS is a mandatory env, make sure it gets 
initialized somewhere

{quote}481c477
<    local pidfile="${1}"
---
>     local pidfile="${1}"
483,489c479,486
<    if [ -f "$pidfile" ]; then
<       local activemq_pid="`cat "$pidfile"`"
<       if  [ -z "$activemq_pid" ];then
<          echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
<          return 2
<       fi
<       if [ "$(ps -p $activemq_pid -o fname=)" = "java" ]; then
---
>     if [ -f $pidfile ]; then
>        if  [ -z "`cat "$pidfile"`" ];then
>         echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
>         return 2
>        fi
>        local activemq_pid="`cat "$pidfile"`"
>        local RET="`ps -eo "pid,args" | grep "^\s*$activemq_pid\s.*java"`"
>        if [ -n "$RET" ];then
491c488
<       else
---
>        else
493,496c490,493
<       fi
<    else
<       return 1;
<    fi
---
>        fi
>     else
>          return 1;
>     fi
{quote} * good practice to quote filenames in if [ -f "$pidfile" ]
 * local activemq_pid="`cat "$pidfile"`" --> prevent multiple cat's of pidfile
 * if [ "$(ps -p $activemq_pid -o fname=)" = "java" ]; then --> big fan of 
using $( .. ) rather than backticks, most importantly this is the only 
construct that worked for me on solaris and it's more efficient with -p than 
grepping on all processes. -o fname= gets the process's name ('java' for java) 
without any other path or formatting. Adding the '=' suppresses the headers so 
that the result should match exactly "java". I think is perfectly POSIX 
compliant and should in theory run 'everywhere' (just tested , ok on ubuntu, 
not on mingw, not on alpine busybox ps... anyway those are hard to please)

{quote} 

{{invoke_stop(){}}
{{    RET="1"}}
{{    if ( checkRunning "$ACTIVEMQ_PIDFILE" );then}}
{{       ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS $ACTIVEMQ_SSL_OPTS"}}
{{       COMMANDLINE_ARGS="$COMMANDLINE_ARGS $ACTIVEMQ_SUNJMX_CONTROL"}}{{      
 ACTIVEMQ_PID="`cat $ACTIVEMQ_PIDFILE`"}}{{       if ! invokeJar 
"$ACTIVEMQ_PIDFILE" "stop"; then}}
{{          echo "WARNING: Failed to invoke activemq.jar stop command"}}
{{          invoke_kill}}
{{          return $?}}
{{       fi}}{{       FOUND="0"}}
{{       i=0}}
{{       echo "INFO: Waiting max $ACTIVEMQ_KILL_MAXSECONDS seconds for broker 
to shutdown gracefully"}}
{{       while [ "$i" -le "$ACTIVEMQ_KILL_MAXSECONDS" ] && (checkRunning 
"${ACTIVEMQ_PIDFILE}"); do}}
{{         if ! checkRunning "${ACTIVEMQ_PIDFILE}.stop"; then}}
{{            if ! invokeJar "$ACTIVEMQ_PIDFILE" "stop"; then}}
{{               break}}
{{            fi}}
{{         fi}}
{{         sleep 1}}
{{         i=$(( $i + 1 ))}}
{{         if ((i % 5 == 0 )); then}}
{{            echo "INFO: Waiting for broker to shutdown ..."}}
{{         fi}}
{{       done}}
{{       if checkRunning "$ACTIVEMQ_PIDFILE"; then}}
{{         echo "WARNING: Broker failed to shutdown gracefully"}}
{{         invoke_kill}}
{{         RET="$?"}}
{{       else}}
{{         echo "INFO: Broker has shutdown gracefully"}}
{{         RET="0"}}
{{       fi}}
{{    elif [ -f "$ACTIVEMQ_PIDFILE" ];then}}
{{       echo "ERROR: No or outdated process id in '$ACTIVEMQ_PIDFILE'"}}
{{       echo}}
{{       echo "INFO: Removing $ACTIVEMQ_PIDFILE"}}
{{    else}}
{{       echo "ActiveMQ not running"}}
{{       exit 0}}
{{    fi}}
{{    rm -f "$ACTIVEMQ_PIDFILE" >/dev/null 2>&1}}
{{    rm -f "${ACTIVEMQ_PIDFILE}.stop" >/dev/null 2>&1}}
{{    exit $RET}}
{{}}}
{quote} * Reworked the invoke_stop method.
 ** one time invokeJar "stop" . If that returns non-zero then invoke_kill 
immediately, no point to carry $RET values around or go into the loop in such 
case
 ** change INFO log from  "Waiting at least ..." to "Waiting max ..."
 ** main loop checks for ACTIVEMQ_KILL_MAXSECONDS and as long as the broker pid 
(ACTIVEMQ_PID ) is still running. We don't really have have to wait for 
backgroun stop invocation (pid in file ${ACTIVEMQ_PID}.stop) to end. The 
opposite: we need to worry when we're still waiting and this process has 
already ended. In that case it seems better to relaunch invokeJar stop! I had a 
case where I started and immediately attempted to stop. The old script would 
see the activemq stop exiting very quickly as it could not connect to the 
broker pid (as it's still in its startup phase!), and would then issue a kill. 
Where if you reattempt activemq stop it'll succeed to connect to the broker in 
a second attempt (once the broker fully started), and perform a much preferable 
graceful shutdown.
 ** instead of logging '.' characters which get intermixed with the text 
printed by activemq stop background process, it seems better to start logging 
"INFO: Waiting for broker to shutdown ..." after at least 5 seconds and only 
repeat it once every 5 seconds. $(( ... )) construct is great for modulo math, 
if (( ...  )) great for arithmethic evaluations.
 ** aligned the logs to mention more clearly when shutdown is 'graceful' (via 
activemq stop )

 
{quote} 652c652
<     rm -f "${ACTIVEMQ_PIDFILE}.stop" >/dev/null 2>&1
---
>     rm -f "$ACTIVEMQ_DATA/stop.pid" >/dev/null 2>&1
{quote}
There doesn't seem to be any use of a stop.pid , I guess it could be a good 
safe cleanup to remove ${ACTIVEMQ_PIDFILE}.stop

 

Feel that the whole script is somewhat rusty and could do with a major 
makeover; understood that activemq 5.x is end of life and any change risks 
breaking something on any of x different platforms...

 

(also doubtful about using exit in a function, wouldn't it be better to use 
return in functions and exit at the top level for premature script exits?)

> activemq stop command fails with no or outdated process id
> ----------------------------------------------------------
>
>                 Key: AMQ-8471
>                 URL: https://issues.apache.org/jira/browse/AMQ-8471
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.16.2
>            Reporter: shrihari
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>         Attachments: activemq, image-2022-02-07-10-21-19-913.png, 
> image-2022-02-07-10-21-39-782.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some AIX/Linux environments, the activemq stop command fails with the 
> below error. 
> bash-4.4# ./activemq stop
> INFO: Loading '/AMQ/message-broker/bin/env'
> INFO: Using java '/usr/java8_64/bin/java'
> ERROR: No or outdated process id in '/AMQ/message-broker//data/activemq.pid'
> INFO: Removing /AMQ/message-broker//data/activemq.pid
>  
> The fix provided in https://issues.apache.org/jira/browse/AMQ-8425 doesn't 
> work in such environments as the issue is not due to the user/terminal 
> instance.
>  
> Workaround: 
> Some AIX/linux environments are highly sensitive to the *acute/backquot* 
> *character `* 
> Backquot character reference : 
> [https://www.computerhope.com/jargon/b/backquot.htm]
> Update the file <AMQ_HOME>/message-broker/bin/activemq as below :
> Change the line by removing backquot character :
> RET="`ps -eo "pid,args" | grep "\s*$activemq_pid\s.java"`" to RET="ps -eo 
> "pid,args" | grep "\s$activemq_pid\s.*java""
> This will be inside checkRunning() function.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Comment Edited] (AMQ-8471) activemq stop command fails with no or outdated process id

Reply via email to