[
https://issues.apache.org/jira/browse/AMQ-8471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606012#comment-17606012
]
Consultant Leon edited comment on AMQ-8471 at 9/17/22 1:23 AM:
---------------------------------------------------------------
I ran into the same problem today while upgrading ActiveMQ from 5.16.0 to
5.16.5 on good old Solaris 10 ...
At first I couldn't start ActiveMQ at all:
* the first line reads #!/bin/sh, which on Solaris means the script is executed
by the Bourne shell, not bash.
* then I got the error /WEB-INF/webconsole-embedded.xml FileNotFound -> it's
caused by $ACTIVEMQ_HOME pointing to a symlink rather than the physical
directory (this worked fine before; I think there's another issue open for it.
Btw, this is also an issue on Windows and on every Linux I've tried it on)
* and now I've found this issue: stopping the broker fails...
uname -a :
SunOS server1 5.10 Generic_150401-11 i86pc i386 i86pc
Several thoughts while investigating a workaround for myself:
====
Why log that the pid file is outdated while the process with that id is still
running? This could be made much more robust in my view.
===
local pidfile="${1}"
if [ -f $pidfile ]; then
$pidfile needs double quotes here in case there are spaces in the path
(the old checkRunning() code did that correctly).
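To illustrate the quoting point, here is a minimal sketch. The directory and variable names are made up for the example and are not taken from bin/activemq. With a space in the path, the unquoted test is split into several words and fails, while the quoted test sees a single path:

```shell
#!/bin/sh
# Hypothetical pid file in a directory whose name contains spaces.
demo_dir="${TMPDIR:-/tmp}/amq demo $$"
mkdir -p "$demo_dir"
pidfile="$demo_dir/activemq.pid"
echo 12345 > "$pidfile"

# Unquoted: the path is split at the spaces, so [ receives too many
# arguments and the test fails (its error message is silenced here).
if [ -f $pidfile ] 2>/dev/null; then unquoted=found; else unquoted=missing; fi

# Quoted: the whole path is passed to [ as one word.
if [ -f "$pidfile" ]; then quoted=found; else quoted=missing; fi

echo "unquoted: $unquoted, quoted: $quoted"
rm -rf "$demo_dir"
```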
====
Why not write a new method to check for the wrapper script (pidfile.stop) and
leave the original tested and proven method as is rather than combining the 2
needs and breaking good old proven methods?
===
Some further testing and I found that ps -eo "pid,args" | grep java returns:
orchestrate@kzone151> ~/3rdparty/apache-activemq-5.16.5 $ ps -eo "pid,args" | grep "java"
13143 /kenvng/java/jdk1.8.0_333/bin/java -Djava.util.logging.config.file=logging.prop
11457 /usr/java/bin/java -server -Xmx128m -XX:+UseParallelGC -XX:ParallelGCThreads=4
And as you see args gives the full absolute path to the java binary on solaris,
so grep'ing for a \sjava pattern fails (as there's no white space before 'java'
in this case).
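The mismatch can be reproduced without ps by feeding one of the sample lines above to grep. This is only a sketch: the patterns are illustrative, not the ones used in bin/activemq, and they use the POSIX class [[:space:]] instead of \s so the example does not depend on GNU grep:

```shell
#!/bin/sh
# Sample line copied from the ps output above (pid, then absolute java path).
line='13143 /kenvng/java/jdk1.8.0_333/bin/java -Djava.util.logging.config.file=logging.prop'

# Requires whitespace directly before "java": no match, because every
# "java" in this line is preceded by "/" or "D".
if printf '%s\n' "$line" | grep '[[:space:]]java' >/dev/null; then
    ws_match=yes
else
    ws_match=no
fi

# Allows non-whitespace characters (such as a directory prefix) between
# the pid column and "java": matches. Note that this is loose; it accepts
# any command whose first word contains "java".
if printf '%s\n' "$line" | grep '^[[:space:]]*13143[[:space:]][[:space:]]*[^[:space:]]*java' >/dev/null; then
    path_match=yes
else
    path_match=no
fi

echo "whitespace-before-java: $ws_match, path-tolerant: $path_match"
```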
===
When I use ps -eo "pid,fname" things work fine:
$ RET="`ps -eo "pid,fname" | grep "^\s*13143\s*java"`"
$ if [ -n "$RET" ]; then echo yes; else echo no; fi
yes
====
I thought I'd tackled the problem, but on Solaris the same code behaves
completely differently once it is placed in the script: the RET variable is
always "" (empty).
I can't find any solution for that.
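One possible factor, not verified on Solaris 10: \s is a GNU grep extension and not part of POSIX basic regular expressions, so a grep without that extension will not read \s as whitespace. Below is a minimal sketch of the same check written with the POSIX class [[:space:]]. The helper name is_java_pid is made up for illustration and does not exist in bin/activemq:

```shell
#!/bin/sh
# is_java_pid PID
# Reads "pid fname" lines on stdin (for example from ps -eo "pid,fname")
# and succeeds only if PID is listed with the name "java".
is_java_pid() {
    grep "^[[:space:]]*${1}[[:space:]][[:space:]]*java[[:space:]]*\$" >/dev/null
}
```

It would be called as ps -eo "pid,fname" | is_java_pid "$activemq_pid", using the exit status directly instead of capturing the output in RET.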
The only way I got checkRunning to work is by using the -p flag of ps and
skipping the pipe to grep... I think it could be a general cross-platform
solution as well:
checkRunning(){
    local pidfile="${1}"
    if [ -f "$pidfile" ]; then
        local activemq_pid="`cat "$pidfile"`"
        if [ -z "$activemq_pid" ]; then
            echo "ERROR: Pidfile '$pidfile' exists but contains no pid"
            return 2
        fi
        # "fname=" prints only the executable name, without a header line
        if [ "$(ps -p "$activemq_pid" -o fname=)" = "java" ]; then
            return 0
        else
            return 1
        fi
    else
        return 1
    fi
}
===
Now that this issue is tackled, I hit:
INFO: Waiting at least seconds for regular process termination of pid '15286' :
bin/activemq: line 617: [: : integer expression expected
It appears that no default value is set for $ACTIVEMQ_KILL_MAXSECONDS. Do I
have to set it explicitly before using the activemq script? The loop in
question:
while [ "$i" -le "$ACTIVEMQ_KILL_MAXSECONDS" ]; do
====
I've added the following at line 162 of the bin/activemq script:
# Max time to wait before killing the activemq process
if [ -z "$ACTIVEMQ_KILL_MAXSECONDS" ]; then
    ACTIVEMQ_KILL_MAXSECONDS=60
fi
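The same default can also be written with the shell's ${VAR:=default} parameter expansion, which assigns only when the variable is unset or empty. This is a sketch of the idiom, not the project's actual fix. The variables first and second exist only to show the effect:

```shell
#!/bin/sh
# Case 1: variable not set at all -> the default of 60 is assigned.
unset ACTIVEMQ_KILL_MAXSECONDS
: "${ACTIVEMQ_KILL_MAXSECONDS:=60}"
first="$ACTIVEMQ_KILL_MAXSECONDS"

# Case 2: variable already set (e.g. by a config file) -> value is kept.
ACTIVEMQ_KILL_MAXSECONDS=5
: "${ACTIVEMQ_KILL_MAXSECONDS:=60}"
second="$ACTIVEMQ_KILL_MAXSECONDS"

echo "unset -> $first, preset -> $second"
```

With a numeric value guaranteed, the [ "$i" -le "$ACTIVEMQ_KILL_MAXSECONDS" ] test no longer receives an empty operand.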
That solved that issue but...
====
When I stop now, I get a whole lot of output and the script instantly kills
the process without giving it any chance of a graceful shutdown; the JMX URL
cannot be resolved...?
$ bin/activemq stop
INFO: Using default configuration
Configurations are loaded in the following order: /etc/default/activemq
/server/home/user/.activemqrc
/server/home/user/3rdparty/apache-activemq-5.16.5/bin/env
INFO: Using java '/server/java/jdk1.8.0_333/bin/java'
INFO: Waiting at least 60 seconds for regular process termination of pid
'17798' :
Java Runtime: Oracle Corporation 1.8.0_333 /server/java/jdk1.8.0_333/jre
Heap sizes: current=2011136k free=1990082k max=27961344k
JVM args: -Djava.util.logging.config.file=logging.properties
-Djava.security.auth.login.config=/server/home/user/3rdparty/apache-activemq-5.16.5/conf/login.config
-Dactivemq.classpath=/server/home/user/3rdparty/apache-activemq-5.16.5/conf:/server/home/user/3rdparty/apache-activemq-5.16.5/../lib/:
-Dactivemq.home=/server/home/user/3rdparty/apache-activemq-5.16.5
-Dactivemq.base=/server/home/user/3rdparty/apache-activemq-5.16.5
-Dactivemq.conf=/server/home/user/3rdparty/apache-activemq-5.16.5/conf
-Dactivemq.data=/server/home/user/3rdparty/apache-activemq-5.16.5/data
Extensions classpath:
[/server/home/user/3rdparty/apache-activemq-5.16.5/lib,/server/home/user/3rdparty/apache-activemq-5.16.5/lib/camel,/server/home/user/3rdparty/apache-activemq-5.16.5/lib/optional,/server/home/user/3rdparty/apache-activemq-5.16.5/lib/web,/server/home/user/3rdparty/apache-activemq-5.16.5/lib/extra]
ACTIVEMQ_HOME: /server/home/user/3rdparty/apache-activemq-5.16.5
ACTIVEMQ_BASE: /server/home/user/3rdparty/apache-activemq-5.16.5
ACTIVEMQ_CONF: /server/home/user/3rdparty/apache-activemq-5.16.5/conf
ACTIVEMQ_DATA: /server/home/user/3rdparty/apache-activemq-5.16.5/data
Connecting to pid: 17798
.INFO: failed to resolve jmxUrl for pid:17798, using default JMX url
Connecting to JMX URL: service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi
.INFO: Broker not available at:
service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi
.
INFO: Regular shutdown not successful, sending SIGKILL to process
INFO: sending SIGKILL to pid '17798'
====
This tiny upgrade of 5.16.0 to 5.16.5 has turned into a major headache!
Any help appreciated!!
was (Author: consultantleon):
I ran into the same problem today while upgrading ActiveMQ from 5.16.0 to
5.16.5 on good old Solaris 10 ...
At first I couldn't start ActiveMQ at all:
* the first line reads #!/bin/sh, which on Solaris means the script is executed
by the Bourne shell, not bash.
* then I got the error /WEB-INF/webconsole-embedded.xml FileNotFound -> it's
caused by $ACTIVEMQ_HOME pointing to a symlink rather than the physical
directory (this worked fine before; I think there's another issue open for it.
Btw, this is also an issue on Windows and on every Linux I've tried it on)
* and now I've found this issue: stopping the broker fails...
uname -a :
SunOS server1 5.10 Generic_150401-11 i86pc i386 i86pc
Don't really understand who still uses ` backticks? $( ... ) seems much more
robust and elegant?
Why log that the pid file is outdated while the process with that id is still
running? This could be made much more robust in my view.
The following seems more robust and will probably work cross platform:
$ if [ "$(ps -p 13143 -o fname=)" = "java" ]; then echo kill the process; else
echo its not java; fi
kill the process
$ ps -p 13143 -o fname=
java
(the '=' after fname suppresses the header line, see
https://unix.stackexchange.com/questions/232708/disabling-column-names-in-ps-output)
---
local pidfile="${1}"
if [ -f $pidfile ]; then
I'd use double quotes around $pidfile in case there are spaces in the path (the
old checkRunning() code did that correctly).
====
Why not write a new method to check for the wrapper script (pidfile.stop) and
leave the original tested and proven method as is rather than combining the 2
needs and breaking good old proven methods?
===
Some further testing and I found that ps -eo "pid,args" | grep java returns:
orchestrate@kzone151> ~/3rdparty/apache-activemq-5.16.5 $ ps -eo "pid,args" | grep "java"
13143 /kenvng/java/jdk1.8.0_333/bin/java -Djava.util.logging.config.file=logging.prop
11457 /usr/java/bin/java -server -Xmx128m -XX:+UseParallelGC -XX:ParallelGCThreads=4
And as you see args gives the full absolute path to the java binary on solaris,
so grep'ing for a \sjava pattern fails (as there's no white space before 'java'
in this case).
===
When I use ps -eo "pid,fname" things work fine:
$ RET="`ps -eo "pid,fname" | grep "^\s*13143\s*java"`"
$ if [ -n "$RET" ]; then echo yes; else echo no; fi
yes
===
Instead of checking if there is some output of the grep you could check its
return status to be successful (and suppress any stdout or stderr):
$ if ps -eo "pid,fname" | grep "^\s*13143\s*java" > /dev/null 2>&1; then echo yes; else echo no; fi
yes
===
> activemq stop command fails with no or outdated process id
> ----------------------------------------------------------
>
> Key: AMQ-8471
> URL: https://issues.apache.org/jira/browse/AMQ-8471
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.16.2
> Reporter: shrihari
> Assignee: Jean-Baptiste Onofré
> Priority: Major
> Attachments: image-2022-02-07-10-21-19-913.png,
> image-2022-02-07-10-21-39-782.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In some AIX/Linux environments, the activemq stop command fails with the
> below error.
> bash-4.4# ./activemq stop
> INFO: Loading '/AMQ/message-broker/bin/env'
> INFO: Using java '/usr/java8_64/bin/java'
> ERROR: No or outdated process id in '/AMQ/message-broker//data/activemq.pid'
> INFO: Removing /AMQ/message-broker//data/activemq.pid
>
> The fix provided in https://issues.apache.org/jira/browse/AMQ-8425 doesn't
> work in such environments as the issue is not due to the user/terminal
> instance.
>
> Workaround:
> Some AIX/Linux environments are highly sensitive to the acute/backquote
> character `
> Backquot character reference :
> [https://www.computerhope.com/jargon/b/backquot.htm]
> Update the file <AMQ_HOME>/message-broker/bin/activemq as below :
> Change the line by removing backquot character :
> RET="`ps -eo "pid,args" | grep "\s*$activemq_pid\s.java"`" to RET="ps -eo
> "pid,args" | grep "\s$activemq_pid\s.*java""
> This will be inside checkRunning() function.