Pavel Moravec created QPID-7149:
-----------------------------------
Summary: [HA] active HA broker memory leak when ring queue
discards overflow messages
Key: QPID-7149
URL: https://issues.apache.org/jira/browse/QPID-7149
Project: Qpid
Issue Type: Bug
Components: C++ Broker
Environment: RHEL6
qpid trunk svn rev. 1735384
- issue also seen in very old releases (most probably since the initial
active-passive HA cluster implementation)
libstdc++-devel-4.4.7-4.el6.x86_64
gcc-c++-4.4.7-4.el6.x86_64
libgcc-4.4.7-4.el6.x86_64
libstdc++-4.4.7-4.el6.x86_64
gcc-4.4.7-4.el6.x86_64
Reporter: Pavel Moravec
There is a memory leak on the active HA broker, most probably triggered by
purging overflow messages from a ring queue. The basic scenario is to set up an
HA cluster, promote one broker to primary, and feed a ring queue with messages
indefinitely.
Detailed scenario:
1) Start brokers and promote one to primary:
start_broker() {
    port=$1
    shift
    rm -rf _${port}
    mkdir _${port}
    nohup qpidd --load-module=ha.so --port=$port \
        --log-to-file=qpidd.$port.log --data-dir=_${port} --auth=no --log-to-stderr=no \
        --ha-cluster=yes \
        --ha-brokers-url="$(hostname):5672,$(hostname):5673,$(hostname):5674" \
        --ha-replicate=all --acl-file=/root/qpidd.acl "$@" > /dev/null 2>&1 &
    sleep 1
}
killall qpidd qpid-receive 2> /dev/null
rm -f qpidd.*.log
start_broker 5672
sleep 1
qpid-ha promote -b $(hostname):5672 --cluster-manager
sleep 1
start_broker 5673
sleep 1
start_broker 5674
2) Create ring queues and send messages to them (one queue is enough; having
more should show the leak faster):
for i in $(seq 0 9); do
    qpid-config add queue FromKeyServer_$i --max-queue-size=10000 \
        --max-queue-count=10 --limit-policy=ring --argument=x-qpid-priorities=10
done
while true; do
    for j in $(seq 1 10); do
        for i in $(seq 1 10); do
            for k in $(seq 0 9); do
                qpid-send -a FromKeyServer_$k -m 100 \
                    --send-rate=50 --priority=$(($((RANDOM))%10)) &
            done
        done
        wait
        while [ $(qpid-stat -q | grep broker-replicator | sed "s/Y//g" |
                awk '{ print $2 }' | sort -n | tail -n1) != "0" ]; do
            sleep 1
        done
    done
    date
    ps aux | grep qpidd | grep "port=5672" | awk -F "--store-dir" '{ print $1 }'
done
(The "while [ $(qpid-stat -q | .." loop is there only to throttle message
enqueues so that the replication federation queues don't build up a big
backlog, which would interfere with observing memory consumption.)
3) Run those scripts and monitor memory consumption.
- without priority queues, and sending messages without priorities, the leak
is still evident, but much smaller
- valgrind (on some older versions that I tested more thoroughly before) detects
nothing (neither leaked memory nor memory still reachable at shutdown)
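
To make the memory growth in step 3 easier to compare between runs, the
broker's resident set size can be logged at a fixed interval. The following is
only a sketch, not part of the original reproducer: the helper names rss_kb and
sample_rss are hypothetical, and it assumes a ps that supports "-o rss= -p PID"
(as procps on RHEL6 does).

```shell
# Hypothetical helper: print the resident set size (in kB) of one process.
rss_kb() {
    ps -o rss= -p "$1" | tr -d ' '
}

# Hypothetical helper: print "<epoch seconds> <rss kB>" for a PID,
# taking <count> samples spaced <interval> seconds apart.
sample_rss() {
    local pid=$1 interval=$2 count=$3
    local n=0
    while [ "$n" -lt "$count" ]; do
        echo "$(date +%s) $(rss_kb "$pid")"
        n=$((n + 1))
        # Do not sleep after the final sample.
        if [ "$n" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, "sample_rss <pid of the primary qpidd> 60 120 > rss.5672.log"
would record two hours of samples at one-minute resolution, assuming the PID of
the broker on port 5672 has been looked up beforehand. A steadily increasing
second column while the queues stay at their ring limits would be consistent
with the leak described above.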