[
https://issues.apache.org/jira/browse/AMQ-8471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606228#comment-17606228
]
Consultant Leon edited comment on AMQ-8471 at 9/18/22 2:32 AM:
---------------------------------------------------------------
Decided to spend some weekend time on this issue as the state of the script
bothered me and failing to gracefully stop activemq 5.16.5 on our Solaris 10
envs seems unacceptable.
If I attach my now smooth running version, can others confirm that this is not
just a 'Solaris 10' fix but a general improvement that could be included in a
future service pack of 5.16 ?
At least it can help individuals with similar troubles.
When I update {{bin/activemq}} to the attached and enable the management
connector in {{conf/activemq.xml}} (why is this not be default enabled? Without
this enabled there's no graceful shutdown possible, which could include the
risk on data loss or corruptions during broker restarts I'd say)
{code:java}
<managementContext>
<managementContext createConnector="true" />
</managementContext>{code}
Then start and stop works smooth for me and logging reports more clearly on the
console.
Let me explain each change I made:
{code:java}
1c1
< #!/bin/bash
—
> #!/bin/sh{code}
* /bin/sh is bourne shell on Solaris and that doesn't understand bash scripts
very well! Is there any particular reason not to use /bin/bash in the first
place?
{code:java}
115c115
< ACTIVEMQ_CLASSPATH="$ACTIVEMQ_BASE/lib/:$ACTIVEMQ_USER_CLASSPATH"
—
> ACTIVEMQ_CLASSPATH="$ACTIVEMQ_BASE/../lib/:$ACTIVEMQ_USER_CLASSPATH"{code}
* going out the $ACTIVEMQ_BASE to ../lib can't be anything but a mistake? Not
sure this change is needed but it struck me as odd at least!
{code:java}
162,165d161
< # Max time to wait before killing the activemq process
< if [ -z "$ACTIVEMQ_KILL_MAXSECONDS" ]; then
< ACTIVEMQ_KILL_MAXSECONDS=60
< fi{code}
* as ACTIVEMQ_KILL_MAXSECONDS is a mandatory env, make sure it gets initialized
somewhere{{{}{}}}
{code:java}
481c477
< local pidfile="${1}"
—
> local pidfile="${1}"
483,489c479,486
< if [ -f "$pidfile" ]; then
< local activemq_pid="`cat "$pidfile"`"
< if [ -z "$activemq_pid" ];then
< echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
< return 2
< fi
< if [ "$(ps -p $activemq_pid -o fname=)" = "java" ]; then
—
> if [ -f $pidfile ]; then
> if [ -z "`cat "$pidfile"`" ];then
> echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
> return 2
> fi
> local activemq_pid="`cat "$pidfile"`"
> local RET="`ps -eo "pid,args" | grep "^\s*$activemq_pid\s.*java"`"
> if [ -n "$RET" ];then
491c488
< else
—
> else
493,496c490,493
< fi
< else
< return 1;
< fi
—
> fi
> else
> return 1;
> fi{code}
* good practice to quote filenames in {{if [ -f "$pidfile" ]}}
* {{local activemq_pid="`cat "$pidfile"`"}} --> prevent multiple cat's of
pidfile
* {{if [ "$(ps -p $activemq_pid -o fname=)" = "java" ]; then}} --> big fan of
using $( .. ) rather than backticks, most importantly this is the only
construct that worked for me on solaris and it's more efficient with -p than
grepping on all processes. {{-o fname= }}gets the process's name ('java' for
java) without any other path or formatting. Adding the {{'=' }}suppresses the
headers so that the result should match exactly "java". I think is perfectly
POSIX compliant and should in theory run 'everywhere' (just tested , ok on
ubuntu, not on mingw, not on alpine busybox ps... anyway those are hard to
please)
{code:java}
invoke_stop(){
RET="1"
if ( checkRunning "$ACTIVEMQ_PIDFILE" );then
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS $ACTIVEMQ_SSL_OPTS"
COMMANDLINE_ARGS="$COMMANDLINE_ARGS $ACTIVEMQ_SUNJMX_CONTROL"
ACTIVEMQ_PID="`cat $ACTIVEMQ_PIDFILE`" if ! invokeJar "$ACTIVEMQ_PIDFILE"
"stop"; then
echo "WARNING: Failed to invoke activemq.jar stop command"
invoke_kill
return $?
fi FOUND="0"
i=0
echo "INFO: Waiting max $ACTIVEMQ_KILL_MAXSECONDS seconds for broker to
shutdown gracefully"
while [ "$i" -le "$ACTIVEMQ_KILL_MAXSECONDS" ] && (checkRunning
"${ACTIVEMQ_PIDFILE}"); do
if ! checkRunning "${ACTIVEMQ_PIDFILE}.stop"; then
if ! invokeJar "$ACTIVEMQ_PIDFILE" "stop"; then
break
fi
fi
sleep 1
i=$(( $i + 1 ))
if ((i % 5 == 0 )); then
echo "INFO: Waiting for broker to shutdown ..."
fi
done
if checkRunning "$ACTIVEMQ_PIDFILE"; then
echo "WARNING: Broker failed to shutdown gracefully"
invoke_kill
RET="$?"
else
echo "INFO: Broker has shutdown gracefully"
RET="0"
fi
elif [ -f "$ACTIVEMQ_PIDFILE" ];then
echo "ERROR: No or outdated process id in '$ACTIVEMQ_PIDFILE'"
echo
echo "INFO: Removing $ACTIVEMQ_PIDFILE"
else
echo "ActiveMQ not running"
exit 0
fi
rm -f "$ACTIVEMQ_PIDFILE" >/dev/null 2>&1
rm -f "${ACTIVEMQ_PIDFILE}.stop" >/dev/null 2>&1
exit $RET
}{code}
Reworked the invoke_stop method:
* one time invokeJar "stop" . If that returns non-zero then invoke_kill
immediately, no point to carry a $RET value around or go into the loop in such
case
* change INFO log from "Waiting at least ..." to "Waiting max ..."
* main loop checks for {{$ACTIVEMQ_KILL_MAXSECONDS}} and as long as the broker
pid ({{{}$ACTIVEMQ_PID{}}} ) is still running. We don't really have have to
wait for backgroun stop invocation (pid in file {{{}${ACTIVEMQ_PID}.stop{}}})
to end. The opposite: we need to worry when we're still waiting and this
process has already ended. In that case it seems better to relaunch invokeJar
stop! I had a case where I started and immediately attempted to stop. The old
script would see the activemq stop exiting very quickly as it could not connect
to the broker pid (as it's still in its startup phase!), and would then issue a
kill. Where if you reattempt activemq stop it'll succeed to connect to the
broker in a second attempt (once the broker fully started), and perform a much
preferable graceful shutdown.
* instead of logging '.' characters which get intermixed with the text printed
by activemq stop background process, it seems better to start logging {{"INFO:
Waiting for broker to shutdown ..." }}after at least 5 seconds and only repeat
it once every 5 seconds. {{$(( ... ))}} construct is great for modulo math, if
{{(( ... ))}} great for arithmethic evaluations.
* aligned the logs to mention more clearly when shutdown is 'graceful' (via
activemq stop )
{{}}
{code:java}
652c652
< rm -f "${ACTIVEMQ_PIDFILE}.stop" >/dev/null 2>&1
—
> rm -f "$ACTIVEMQ_DATA/stop.pid" >/dev/null 2>&1{code}
There doesn't seem to be any use of a file named {{stop.pid}} , I guess it
could be a good safe cleanup to remove {{${ACTIVEMQ_PIDFILE}.stop}}
I feel that the whole script is somewhat rusty and could do with a major
makeover; understood that activemq 5.x is end of life and any change risks
breaking something on any of x different platforms...
(also doubtful about using {{exit }}in a function, wouldn't it be better to use
return in functions and exit at the top level for premature script exits?)
was (Author: consultantleon):
Decided to spend some weekend time on this as the state of the script bothered
me. If I attach my version can others confirm that this is not just a 'solaris
10' fix but a general improvement that could be included in a future service
pack of 5.16 maybe? Or at least help individuals with similar troubles as mine.
When I update bin/activemq to the attached and enable the jmx connector in
conf/activemq.xml :
{quote} <managementContext>
<managementContext createConnector="true" />
</managementContext>
{quote}
Then start and stop works smooth for me and logging reports more clearly on the
console.
Let me explain each change I made:
{quote}1c1
< #!/bin/bash
---
> #!/bin/sh
{quote} * /bin/sh is bourne shell on Solaris and that doesn't understand bash
scripts very well! Is there any particular reason not to use /bin/bash in the
first place?
{quote}115c115
< ACTIVEMQ_CLASSPATH="$ACTIVEMQ_BASE/lib/:$ACTIVEMQ_USER_CLASSPATH"
---
> ACTIVEMQ_CLASSPATH="$ACTIVEMQ_BASE/../lib/:$ACTIVEMQ_USER_CLASSPATH"
{quote} * going out the $ACTIVEMQ_BASE to ../lib can't be anything but a
mistake? Not sure this change is needed but it struck me as odd at least!
{quote}162,165d161
< # Max time to wait before killing the activemq process
< if [ -z "$ACTIVEMQ_KILL_MAXSECONDS" ]; then
< ACTIVEMQ_KILL_MAXSECONDS=60
< fi
{quote} * as ACTIVEMQ_KILL_MAXSECONDS is a mandatory env, make sure it gets
initialized somewhere
{quote}481c477
< local pidfile="${1}"
---
> local pidfile="${1}"
483,489c479,486
< if [ -f "$pidfile" ]; then
< local activemq_pid="`cat "$pidfile"`"
< if [ -z "$activemq_pid" ];then
< echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
< return 2
< fi
< if [ "$(ps -p $activemq_pid -o fname=)" = "java" ]; then
---
> if [ -f $pidfile ]; then
> if [ -z "`cat "$pidfile"`" ];then
> echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
> return 2
> fi
> local activemq_pid="`cat "$pidfile"`"
> local RET="`ps -eo "pid,args" | grep "^\s*$activemq_pid\s.*java"`"
> if [ -n "$RET" ];then
491c488
< else
---
> else
493,496c490,493
< fi
< else
< return 1;
< fi
---
> fi
> else
> return 1;
> fi
{quote} * good practice to quote filenames in if [ -f "$pidfile" ]
* local activemq_pid="`cat "$pidfile"`" --> prevent multiple cat's of pidfile
* if [ "$(ps -p $activemq_pid -o fname=)" = "java" ]; then --> big fan of
using $( .. ) rather than backticks, most importantly this is the only
construct that worked for me on solaris and it's more efficient with -p than
grepping on all processes. -o fname= gets the process's name ('java' for java)
without any other path or formatting. Adding the '=' suppresses the headers so
that the result should match exactly "java". I think is perfectly POSIX
compliant and should in theory run 'everywhere' (just tested , ok on ubuntu,
not on mingw, not on alpine busybox ps... anyway those are hard to please)
{quote}
{{invoke_stop(){}}
{{ RET="1"}}
{{ if ( checkRunning "$ACTIVEMQ_PIDFILE" );then}}
{{ ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS $ACTIVEMQ_SSL_OPTS"}}
{{ COMMANDLINE_ARGS="$COMMANDLINE_ARGS $ACTIVEMQ_SUNJMX_CONTROL"}}{{
ACTIVEMQ_PID="`cat $ACTIVEMQ_PIDFILE`"}}{{ if ! invokeJar
"$ACTIVEMQ_PIDFILE" "stop"; then}}
{{ echo "WARNING: Failed to invoke activemq.jar stop command"}}
{{ invoke_kill}}
{{ return $?}}
{{ fi}}{{ FOUND="0"}}
{{ i=0}}
{{ echo "INFO: Waiting max $ACTIVEMQ_KILL_MAXSECONDS seconds for broker
to shutdown gracefully"}}
{{ while [ "$i" -le "$ACTIVEMQ_KILL_MAXSECONDS" ] && (checkRunning
"${ACTIVEMQ_PIDFILE}"); do}}
{{ if ! checkRunning "${ACTIVEMQ_PIDFILE}.stop"; then}}
{{ if ! invokeJar "$ACTIVEMQ_PIDFILE" "stop"; then}}
{{ break}}
{{ fi}}
{{ fi}}
{{ sleep 1}}
{{ i=$(( $i + 1 ))}}
{{ if ((i % 5 == 0 )); then}}
{{ echo "INFO: Waiting for broker to shutdown ..."}}
{{ fi}}
{{ done}}
{{ if checkRunning "$ACTIVEMQ_PIDFILE"; then}}
{{ echo "WARNING: Broker failed to shutdown gracefully"}}
{{ invoke_kill}}
{{ RET="$?"}}
{{ else}}
{{ echo "INFO: Broker has shutdown gracefully"}}
{{ RET="0"}}
{{ fi}}
{{ elif [ -f "$ACTIVEMQ_PIDFILE" ];then}}
{{ echo "ERROR: No or outdated process id in '$ACTIVEMQ_PIDFILE'"}}
{{ echo}}
{{ echo "INFO: Removing $ACTIVEMQ_PIDFILE"}}
{{ else}}
{{ echo "ActiveMQ not running"}}
{{ exit 0}}
{{ fi}}
{{ rm -f "$ACTIVEMQ_PIDFILE" >/dev/null 2>&1}}
{{ rm -f "${ACTIVEMQ_PIDFILE}.stop" >/dev/null 2>&1}}
{{ exit $RET}}
{{}}}
{quote} * Reworked the invoke_stop method.
** one time invokeJar "stop" . If that returns non-zero then invoke_kill
immediately, no point to carry $RET values around or go into the loop in such
case
** change INFO log from "Waiting at least ..." to "Waiting max ..."
** main loop checks for ACTIVEMQ_KILL_MAXSECONDS and as long as the broker pid
(ACTIVEMQ_PID ) is still running. We don't really have have to wait for
backgroun stop invocation (pid in file ${ACTIVEMQ_PID}.stop) to end. The
opposite: we need to worry when we're still waiting and this process has
already ended. In that case it seems better to relaunch invokeJar stop! I had a
case where I started and immediately attempted to stop. The old script would
see the activemq stop exiting very quickly as it could not connect to the
broker pid (as it's still in its startup phase!), and would then issue a kill.
Where if you reattempt activemq stop it'll succeed to connect to the broker in
a second attempt (once the broker fully started), and perform a much preferable
graceful shutdown.
** instead of logging '.' characters which get intermixed with the text
printed by activemq stop background process, it seems better to start logging
"INFO: Waiting for broker to shutdown ..." after at least 5 seconds and only
repeat it once every 5 seconds. $(( ... )) construct is great for modulo math,
if (( ... )) great for arithmethic evaluations.
** aligned the logs to mention more clearly when shutdown is 'graceful' (via
activemq stop )
{quote} 652c652
< rm -f "${ACTIVEMQ_PIDFILE}.stop" >/dev/null 2>&1
---
> rm -f "$ACTIVEMQ_DATA/stop.pid" >/dev/null 2>&1
{quote}
There doesn't seem to be any use of a stop.pid , I guess it could be a good
safe cleanup to remove ${ACTIVEMQ_PIDFILE}.stop
Feel that the whole script is somewhat rusty and could do with a major
makeover; understood that activemq 5.x is end of life and any change risks
breaking something on any of x different platforms...
(also doubtful about using exit in a function, wouldn't it be better to use
return in functions and exit at the top level for premature script exits?)
> activemq stop command fails with no or outdated process id
> ----------------------------------------------------------
>
> Key: AMQ-8471
> URL: https://issues.apache.org/jira/browse/AMQ-8471
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.16.2
> Reporter: shrihari
> Assignee: Jean-Baptiste Onofré
> Priority: Major
> Attachments: activemq, image-2022-02-07-10-21-19-913.png,
> image-2022-02-07-10-21-39-782.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In some AIX/Linux environments, the activemq stop command fails with the
> below error.
> bash-4.4# ./activemq stop
> INFO: Loading '/AMQ/message-broker/bin/env'
> INFO: Using java '/usr/java8_64/bin/java'
> ERROR: No or outdated process id in '/AMQ/message-broker//data/activemq.pid'
> INFO: Removing /AMQ/message-broker//data/activemq.pid
>
> The fix provided in https://issues.apache.org/jira/browse/AMQ-8425 doesn't
> work in such environments as the issue is not due to the user/terminal
> instance.
>
> Workaround:
> Some AIX/linux environments are highly sensitive to the *acute/backquot*
> *character `*
> Backquot character reference :
> [https://www.computerhope.com/jargon/b/backquot.htm]
> Update the file <AMQ_HOME>/message-broker/bin/activemq as below :
> Change the line by removing backquot character :
> RET="`ps -eo "pid,args" | grep "\s*$activemq_pid\s.java"`" to RET="ps -eo
> "pid,args" | grep "\s$activemq_pid\s.*java""
> This will be inside checkRunning() function.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)