Oren created AMQCPP-760:
---------------------------

             Summary: ActiveMQ-CPP 3.9.5 Failover Timeout Issue with Advisory 
Topic in PROD
                 Key: AMQCPP-760
                 URL: https://issues.apache.org/jira/browse/AMQCPP-760
             Project: ActiveMQ C++ Client
          Issue Type: Bug
          Components: Transports
         Environment: * {*}ActiveMQ-CPP Version{*}: 3.9.5
 * {*}ActiveMQ Broker Version{*}: 5.17.3
 * {*}Broker Hosts{*}: cca-prdappqua01.icc.corp:20801, 
cca-prdappqua02.icc.corp:20801
 * {*}Client Host{*}: cca-prdappbta05.icc.corp
 * {*}OS{*}: RHEL 7.9
 * {*}AMQ_URL{*}: 
failover:([tcp://cca-prdappqua01.icc.corp:20801,tcp://cca-prdappqua02.icc.corp:20801])?maxReconnectAttempts=10&initialReconnectDelay=1000&timeout=30000&randomize=false
 * {*}Demo Application{*}: C++ program (demo_sys_amq_modules.exe) subscribing 
to ActiveMQ.Advisory.Consumer.Queue.101 with a 30-second receive timeout. 
(attached)

 
            Reporter: Oren
         Attachments: activemq.log, activemq.xml, demo_sys_amq_modules.cpp

h2. Summary

We are experiencing a timeout issue in our PROD environment when using the 
ActiveMQ-CPP 3.9.5 client with a {{failover:}} URI, specifically when 
subscribing to the advisory topic {{{}ActiveMQ.Advisory.Consumer.Queue.101{}}}. 
The issue persists despite upgrading from 3.9.3 (where AMQCPP-610 was 
suspected) and correcting a typo in the {{{}AMQ_URL{}}}. The same code works in 
non-PROD environments (TEST/BETA) and in PROD with a direct {{tcp://}} URI.
h2. Issue Description

The demo times out in PROD when using the failover: protocol to receive 
advisory messages for Queue.101. Key observations:
 * Works in TEST/BETA with identical broker settings and failover: URI 
(maxReconnectAttempts=10).

^*[BETA l_onissa@cca-betappbta01 cca_domain]$ 
AMQ_URL="failover:([tcp://cca-betappqua01.icc.corp:20801])"*^

^*[BETA l_onissa@cca-betappbta01 cca_domain]$ demo_sys_amq_modules.exe 101*^

^ActiveMQ-CPP initialized. Connecting to 
failover:([tcp://cca-betappqua01.icc.corp:20801])^

^Waiting for advisory messages on: ActiveMQ.Advisory.Consumer.Queue.101 ...^

^Received advisory message of type: Advisory^
 * Works in PROD with direct [tcp://cca-prdappqua01.icc.corp:20801].

^*[PROD l_onissa@cca-prdappbta05 cca_domain]$ 
AMQ_URL="[tcp://cca-prdappqua01.icc.corp:20801]"*^

^*[PROD l_onissa@cca-prdappbta05 cca_domain]$ demo_sys_amq_modules.exe 101*^

^ActiveMQ-CPP initialized. Connecting to [tcp://cca-prdappqua01.icc.corp:20801]^

^Waiting for advisory messages on: ActiveMQ.Advisory.Consumer.Queue.101 ...^

^Received advisory message of type: Advisory^
 * Fails in PROD with failover: even after upgrading to 3.9.5 (which includes 
the fix for [AMQCPP-610]) and correcting a typo in the original AMQ_URL 
(the option was misspelled as maxReconnectAttemps=0, so the client fell back 
to the default, maxReconnectAttempts=-1).

^*[PROD l_onissa@cca-prdappbta05 cca_domain]$ 
AMQ_URL="failover:([tcp://cca-prdappqua01.icc.corp:20801])"*^

^*[PROD l_onissa@cca-prdappbta05 cca_domain]$ demo_sys_amq_modules.exe 101*^

^ActiveMQ-CPP initialized. Connecting to 
failover:([tcp://cca-prdappqua01.icc.corp:20801])^

^Waiting for advisory messages on: ActiveMQ.Advisory.Consumer.Queue.101 ...^

^No advisory message received (timeout).^
 * Java-based clients work reliably in PROD with failover:.
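
A misspelled option such as maxReconnectAttemps, described above, could be caught before connecting. The following is a minimal sketch, not part of the attached demo: the helper name unknownOptions is hypothetical, and the list of known option names is an illustrative, non-exhaustive subset that should be checked against the failover options documented for the client version in use.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Illustrative, non-exhaustive subset of failover option names (assumption:
// extend this to match the options documented for your client version).
static const std::set<std::string>& knownFailoverOptions() {
    static const std::set<std::string> names = {
        "initialReconnectDelay", "maxReconnectDelay", "useExponentialBackOff",
        "maxReconnectAttempts", "startupMaxReconnectAttempts",
        "randomize", "backup", "backupPoolSize", "timeout"};
    return names;
}

// Returns the option keys in the outer query string of a failover URI that
// are not in the known set, in the order they appear.
std::vector<std::string> unknownOptions(const std::string& uri) {
    std::vector<std::string> unknown;
    // Outer options follow the '?' after the composite's closing parenthesis,
    // so per-broker options inside the parentheses are skipped.
    const std::string::size_type close = uri.rfind(')');
    const std::string::size_type q =
        uri.find('?', close == std::string::npos ? 0 : close);
    if (q == std::string::npos) return unknown;

    const std::string query = uri.substr(q + 1);
    std::string::size_type start = 0;
    while (start <= query.size()) {
        std::string::size_type end = query.find('&', start);
        if (end == std::string::npos) end = query.size();
        const std::string pair = query.substr(start, end - start);
        const std::string key = pair.substr(0, pair.find('='));
        if (!key.empty() && knownFailoverOptions().count(key) == 0) {
            unknown.push_back(key);
        }
        start = end + 1;
    }
    return unknown;
}
```

Calling unknownOptions on the original AMQ_URL would report maxReconnectAttemps, while the corrected URL with maxReconnectAttempts=10 yields an empty list.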

h2. Relevant Broker Log Entries

From activemq.log on cca-prdappqua01 (ActiveMQ 5.17.3); the full file is 
attached:

^2025-05-19 21:21:23,839 | WARN  | TopicSubscription: 
consumer=ID:cca-prdappbta07.icc.corp-14259-1747678799256-0:0:-1:1, ... 
dispatched=1000, delivered=0, matched=1001, ... has twice its prefetch limit 
pending, without an ack; it appears to be slow: [tcp://10.222.12.83:31247]^

^2025-05-19 17:31:22,146 | WARN  | Transport Connection to: 
[tcp://10.222.12.74:25787] failed: Broken pipe (Write failed)^

^2025-05-24 20:25:15,567 | WARN  | Transport Connection to: 
[tcp://10.222.12.84:33681] failed: Cannot send, channel has already failed^
 * Slow topic consumers on cca-prdappbta07, cca-prdappbta02, cca-prdappbta08 
suggest broker resource contention.
 * Failed connections indicate potential network instability affecting failover 
clients.
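
To gauge how often each kind of warning occurs, the attached log could be tallied with a small helper. This is a rough sketch under assumptions: the function name classifyWarnings is hypothetical, the marker strings are copied from the excerpts above, and each log entry is assumed to occupy a single line in the real file (the excerpts here are wrapped by the mail client).

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Tally of WARN entries by the two categories seen in the excerpts.
struct WarnCounts {
    int slowConsumers;      // "TopicSubscription: ... it appears to be slow"
    int transportFailures;  // "Transport Connection to: ... failed: ..."
    int otherWarnings;      // any other WARN entry
};

// Reads a broker log line by line and counts WARN entries per category.
// Non-WARN lines are ignored.
WarnCounts classifyWarnings(std::istream& log) {
    WarnCounts counts = {0, 0, 0};
    std::string line;
    while (std::getline(log, line)) {
        if (line.find("| WARN") == std::string::npos) continue;
        if (line.find("it appears to be slow") != std::string::npos) {
            ++counts.slowConsumers;
        } else if (line.find("Transport Connection to:") != std::string::npos &&
                   line.find("failed:") != std::string::npos) {
            ++counts.transportFailures;
        } else {
            ++counts.otherWarnings;
        }
    }
    return counts;
}
```

Passing a std::ifstream opened on activemq.log gives the three counts; comparing them across time windows may help show whether the slow-consumer warnings coincide with the failover timeouts.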

h2. Suspected Causes
 # {*}Broker Overload{*}: Slow consumers may delay advisory message delivery, 
impacting failover clients.
 # {*}Network Issues{*}: Failed connections suggest network instability, 
disrupting failover retries or subscriptions.
 # {*}Advisory Message Absence{*}: Possible lack of consumer activity on 
Queue.101 in PROD.
 # {*}Residual Effects{*}: Earlier 3.9.3 clients with maxReconnectAttemps=0 
(defaulting to -1) may have stressed the broker, affecting current 3.9.5 
clients.
 # {*}Failover Transport{*}: Possible edge case in 3.9.5 failover handling for 
advisory topics.

h2. Steps Taken
 * Upgraded from ActiveMQ-CPP 3.9.3 to 3.9.5 to address AMQCPP-610.
 * Corrected AMQ_URL typo (maxReconnectAttemps to maxReconnectAttempts=10).
 * Tested with direct tcp:// URI (works) and failover: (times out).
 * Verified broker connectivity via telnet (successful).
 * Added consumer for Queue.101 to trigger advisory messages (no success yet).
 * Enabled debug logging in the demo (logs pending).

h2. Request for Assistance
 * Is there a known issue in ActiveMQ-CPP 3.9.5 with failover and advisory 
topic subscriptions under high broker load or network instability?
 * Could slow consumers (as seen in the log) affect failover client 
subscriptions? Any recommended configurations to mitigate this?
 * Are there additional failover parameters or patches in 3.9.5 to improve 
stability for advisory topics?
 * Suggestions for diagnosing broker-side issues (e.g., clearing stale 
connections without restart)?

h2. Additional Information
 * {*}Broker Config{*}: advisorySupport is true (the default for 5.17.3; see 
attached activemq.xml).
 * {*}Compile Command{*}:

^g++ -O2 -finline-functions -D_XOPEN_SOURCE=600 -m64 -fPIC -Wall \
-L/usr/lib64 -lactivemq-cpp \
-isystem /usr/include/activemq-cpp-3.9.5 \
-std=c++11 \
src/demo_sys_amq_modules.cpp -o target/bin/demo_sys_amq_modules.exe^
 * {*}Next Steps Planned{*}: Test with non-advisory topic, increase receive 
timeout to 60s, check activemq.xml for advisory settings, and consider 
upgrading to 3.10.0.
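
For the planned 60-second test, it may help to split the wait into shorter slices so the demo can log progress and report how long it actually waited. The sketch below is an assumption-laden illustration, not the attached demo's code: waitForMessage and WaitResult are hypothetical names, and the receive callable stands in for a wrapper around cms::MessageConsumer::receive(int) that stores the message and returns true when one arrives.

```cpp
#include <cassert>
#include <chrono>

// Outcome of a sliced wait: whether a message arrived, how long the wait
// took, and how many receive attempts were made.
struct WaitResult {
    bool received;
    long long elapsedMs;
    int polls;
};

// Calls receive(sliceTimeoutMs) repeatedly until it returns true or totalMs
// has elapsed. The callable is a placeholder for a timed receive on a real
// consumer (assumption: it blocks for at most the given number of ms).
template <typename ReceiveFn>
WaitResult waitForMessage(ReceiveFn receive, int totalMs, int sliceMs) {
    assert(totalMs >= 0 && sliceMs > 0);
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    WaitResult result = {false, 0, 0};
    for (;;) {
        const long long elapsed =
            std::chrono::duration_cast<std::chrono::milliseconds>(
                Clock::now() - start).count();
        result.elapsedMs = elapsed;
        const long long remaining = totalMs - elapsed;
        if (remaining <= 0) return result;  // overall deadline reached
        // Never wait past the overall deadline in the final slice.
        const int wait =
            remaining < sliceMs ? static_cast<int>(remaining) : sliceMs;
        ++result.polls;
        if (receive(wait)) {
            result.received = true;
            result.elapsedMs =
                std::chrono::duration_cast<std::chrono::milliseconds>(
                    Clock::now() - start).count();
            return result;
        }
    }
}
```

With totalMs=60000 and sliceMs=5000, the demo could print a line after each slice, which would show whether the consumer is waiting the full period or returning early after a transport event.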

Please advise on next steps or known issues. Thank you for your support!

Best regards,
Oren



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


