[ https://issues.apache.org/jira/browse/QPID-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Moravec updated QPID-7149:
--------------------------------
    Description: 
There is a memory leak on the active HA broker, most probably triggered by 
purging overflow messages from a ring queue. The basic scenario is to set up an 
HA cluster, promote one broker to primary and feed a ring queue with messages 
indefinitely.

Detailed scenario:

1) Start brokers and promote one to primary:

{noformat}
start_broker() {
        port=$1
        shift
        rm -rf _${port}
        mkdir _${port}
        nohup qpidd --load-module=ha.so --port=$port \
                --log-to-file=qpidd.$port.log --data-dir=_${port} --auth=no \
                --log-to-stderr=no --ha-cluster=yes \
                --ha-brokers-url="$(hostname):5672,$(hostname):5673,$(hostname):5674" \
                --ha-replicate=all --acl-file=/root/qpidd.acl "$@" > /dev/null 2>&1 &
        sleep 1
}


killall qpidd qpid-receive 2> /dev/null
rm -f qpidd.*.log
start_broker 5672
sleep 1
qpid-ha promote -b $(hostname):5672 --cluster-manager
sleep 1
start_broker 5673
sleep 1
start_broker 5674
{noformat}

2) Create ring queues and send messages to them (one queue is enough; 
having more should show the leak faster):

{noformat}
for i in $(seq 0 9); do
        qpid-config add queue FromKeyServer_$i --max-queue-size=10000 \
                --max-queue-count=10 --limit-policy=ring --argument=x-qpid-priorities=10
done

while true; do
        for j in $(seq 1 10); do
                for i in $(seq 1 10); do
                        for k in $(seq 0 9); do
                                qpid-send -a FromKeyServer_$k -m 100 \
                                        --send-rate=50 --priority=$((RANDOM % 10)) &
                        done
                done
                wait
                while [ $(qpid-stat -q | grep broker-replicator | sed "s/Y//g" \
                        | awk '{ print $2 }' | sort -n | tail -n1) != "0" ]; do
                        sleep 1
                done
        done
        date
        ps aux | grep qpidd | grep "port=5672" \
                | awk -F "--store-dir" '{ print $1 }'
done
{noformat}

(The "while [ $(qpid-stat -q | .." loop is there just to slow down the message 
enqueues so that the replication federation queues don't build up a big 
backlog, which would interfere with observing memory consumption.)


3) Run those scripts and monitor memory consumption.

- without priority queues, and sending messages without priorities, the leak 
is evident as well: sometimes smaller, sometimes the same
- valgrind (on some older versions that I tested more thoroughly earlier) 
detects nothing (neither leaked memory nor memory still reachable at shutdown)
- the same leak is evident even with --ha-replicate=none
- the number of backup brokers does not affect the memory leak
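
To make the observation in step 3 repeatable, the primary broker's resident set size (RSS) can be sampled at a fixed interval and logged with a timestamp. The following is only a sketch and not part of the scripts I ran: the helper names rss_kb, sample_rss and rss_growth are made up for illustration, and it assumes a procps-style "ps -o rss=" that reports RSS in KiB.

```shell
# Hypothetical helpers (not from the original report) for logging broker RSS.

# Print the resident set size of a PID in KiB; fail if the process is gone.
rss_kb() {
        local rss
        rss=$(ps -o rss= -p "$1") || return 1
        echo $((rss))   # arithmetic expansion strips the column padding
}

# Sample a PID's RSS COUNT times, INTERVAL seconds apart,
# printing "<epoch seconds> <rss KiB>" for each sample.
sample_rss() {
        local pid="$1" count="$2" interval="$3" i=0 rss
        while [ "$i" -lt "$count" ]; do
                rss=$(rss_kb "$pid") || return 1
                echo "$(date +%s) $rss"
                i=$((i + 1))
                # no need to sleep after the last sample
                if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
        done
}

# Summarize a sample log on stdin: first RSS, last RSS and growth in KiB.
rss_growth() {
        awk 'NR == 1 { first = $2 } { last = $2 }
             END { if (NR) print first, last, last - first }'
}
```

For example, "sample_rss "$(pgrep -f 'qpidd.*--port=5672' | head -n1)" 60 60 > rss.5672.log" would take one sample per minute for an hour (the interval, count and file name are arbitrary), and "rss_growth < rss.5672.log" then prints the first value, the last value and the difference. Steady growth while the ring queues stay at their size limit is what points to the leak.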



> [HA] active HA broker memory leak when ring queue discards overflow messages
> ----------------------------------------------------------------------------
>
>                 Key: QPID-7149
>                 URL: https://issues.apache.org/jira/browse/QPID-7149
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker
>         Environment: RHEL6
> qpid trunk svn rev. 1735384
> - issue seen in very old releases (since active-passive HA cluster initial 
> implementation, most probably)
> libstdc++-devel-4.4.7-4.el6.x86_64
> gcc-c++-4.4.7-4.el6.x86_64
> libgcc-4.4.7-4.el6.x86_64
> libstdc++-4.4.7-4.el6.x86_64
> gcc-4.4.7-4.el6.x86_64
>            Reporter: Pavel Moravec
>


