[
https://issues.apache.org/jira/browse/AMQ-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Hofer updated AMQ-6197:
------------------------------
Description:
In order to improve CPU usage in a test setup of a network of brokers consisting of 3+ brokers, we use the following broker configuration:
{code:xml}
<broker useJmx="${activemq.expose.jmx}" persistent="false"
        brokerName="${activemq.brokerName}"
        xmlns="http://activemq.apache.org/schema/core">
  <sslContext>
    <amq:sslContext keyStore="${activemq.broker.keyStore}"
                    keyStorePassword="${activemq.broker.keyStorePassword}"
                    trustStore="${activemq.broker.trustStore}"
                    trustStorePassword="${activemq.broker.trustStorePassword}" />
  </sslContext>
  <systemUsage>
    <systemUsage>
      <memoryUsage>
        <memoryUsage limit="${activemq.memoryUsage}" />
      </memoryUsage>
      <tempUsage>
        <tempUsage limit="${activemq.tempUsage}" />
      </tempUsage>
    </systemUsage>
  </systemUsage>
  <destinationPolicy>
    <policyMap>
      <policyEntries>
        <policyEntry queue=">" enableAudit="false">
          <networkBridgeFilterFactory>
            <conditionalNetworkBridgeFilterFactory replayWhenNoConsumers="true" />
          </networkBridgeFilterFactory>
        </policyEntry>
      </policyEntries>
    </policyMap>
  </destinationPolicy>
  <networkConnectors>
    <networkConnector name="queues"
                      uri="static:(${activemq.otherBrokers})"
                      networkTTL="2" dynamicOnly="true"
                      decreaseNetworkConsumerPriority="true"
                      conduitSubscriptions="false">
      <excludedDestinations>
        <topic physicalName=">" />
      </excludedDestinations>
    </networkConnector>
    <networkConnector name="topics"
                      uri="static:(${activemq.otherBrokers})"
                      networkTTL="1" dynamicOnly="true"
                      decreaseNetworkConsumerPriority="true"
                      conduitSubscriptions="true">
      <excludedDestinations>
        <queue physicalName=">" />
      </excludedDestinations>
    </networkConnector>
  </networkConnectors>
  <transportConnectors>
    <transportConnector
        uri="${activemq.protocol}${activemq.host}:${activemq.tcp.port}?needClientAuth=true"
        updateClusterClients="true" rebalanceClusterClients="true" />
    <transportConnector
        uri="${activemq.websocket.protocol}${activemq.websocket.host}:${activemq.websocket.port}?needClientAuth=true"
        updateClusterClients="true" rebalanceClusterClients="true" />
  </transportConnectors>
</broker>
{code}
We changed the activemq.protocol placeholder from the original ssl:// to nio+ssl:// and immediately observed some CPU improvement (note that the same applies to tcp:// and nio://). However, after a new deployment we started to encounter strange behavior: some producers would either get timeouts on their request-reply messages or an "unknown destination" exception once the reply was sent on a temporary queue. The issue only occurred when the producer and the consumer were connected to different brokers in the network.
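For reference, this is the request-reply pattern the affected producers use, as a minimal sketch over the standard JMS API; the broker URL and the queue name "service.requests" are made-up placeholders for illustration, not our actual values:
{code:java}
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class RequestReplySketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical client URL; in our setup the scheme comes from the
        // ${activemq.protocol} placeholder (ssl:// vs. nio+ssl://).
        ConnectionFactory factory = new ActiveMQConnectionFactory("nio+ssl://broker1:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // The reply travels over a temporary queue owned by this connection.
        TemporaryQueue replyQueue = session.createTemporaryQueue();
        MessageConsumer replyConsumer = session.createConsumer(replyQueue);

        MessageProducer producer = session.createProducer(session.createQueue("service.requests"));
        TextMessage request = session.createTextMessage("ping");
        request.setJMSReplyTo(replyQueue);
        producer.send(request);

        // When the consumer sits on another broker and one of the two bridges
        // is missing, this receive either times out or the consumer side fails
        // with an unknown-destination exception for the temp queue.
        Message reply = replyConsumer.receive(5000);
        System.out.println(reply == null ? "request timed out" : "reply received");
        connection.close();
    }
}
{code}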
After some testing, we ultimately found that after a restart brokers would often not start both network bridges (one for queues and one for topics) but only one of them. For example, in a 3-broker setup each broker usually had 4 network bridges active, 2 for each of the other brokers. During some restarts, however, we would see any number of active bridges between 2 and 4, and no matter how long we waited the 2nd bridge was never started. The logs also showed no output whatsoever: as long as 1 broker was shut down, the other two would log 'connection refused'; once it started again, they would log either 1 or 2 'successfully reconnected' messages.
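We counted the bridges via JMX. A minimal sketch of that check, assuming the default JMX object-name layout of ActiveMQ 5.x and a hypothetical JMX host/port:
{code:java}
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BridgeCountSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical JMX endpoint; adjust host/port to the broker under test.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker1:1099/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
            // ActiveMQ 5.x registers one MBean per active network bridge under
            // its owning network connector ("queues" / "topics" in our config).
            ObjectName pattern = new ObjectName(
                    "org.apache.activemq:type=Broker,brokerName=*,"
                    + "connector=networkConnectors,networkConnectorName=*,networkBridge=*");
            Set<ObjectName> bridges = mbsc.queryNames(pattern, null);
            // A healthy broker in our 3-broker setup should show 4 bridges.
            System.out.println("Active network bridges: " + bridges.size());
            for (ObjectName bridge : bridges) {
                System.out.println(bridge);
            }
        } finally {
            jmxc.close();
        }
    }
}
{code}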
As soon as we switched the transport connector back to the ssl:// protocol, the issue was gone for good: no matter how many restarts, all 4 network bridges would always be started on each broker. Switching back to nio://, the problem reappeared right away.
For now we are evaluating whether it is worth starting an additional TransportConnector running nio:// just for producers and consumers, while the network bridges use the tcp:// connector (a sketch of this layout follows below). The documentation regarding NetworkConnectors generally uses tcp:// or multicast:// for the TransportConnector the bridges attach to, so we are not entirely sure whether nio:// is even supposed to work in this case or whether this is indeed a bug somewhere.
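As an illustration only, a programmatic sketch of that split-connector layout; our real setup uses the XML configuration above with placeholders, and the host names and ports here are made up:
{code:java}
import org.apache.activemq.broker.BrokerService;

public class SplitConnectorSketch {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("broker1");
        broker.setPersistent(false);
        // tcp:// connector reserved for the network bridges from the other brokers.
        broker.addConnector("tcp://0.0.0.0:61616");
        // Separate nio:// connector for producers and consumers only.
        broker.addConnector("nio://0.0.0.0:61617");
        // Peers are addressed via their tcp:// connector (hypothetical hosts).
        broker.addNetworkConnector("static:(tcp://broker2:61616,tcp://broker3:61616)");
        broker.start();
        broker.waitUntilStopped();
    }
}
{code}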
> Problem using 2 or more NetworkConnectors in a single broker with NIO
> TransportConnectors
> -----------------------------------------------------------------------------------------
>
> Key: AMQ-6197
> URL: https://issues.apache.org/jira/browse/AMQ-6197
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.12.1
> Environment: RHEL 6.6, java-openjdk-1.7.0 u95
> Reporter: Daniel Hofer
> Labels: networkConnector, networkbridge, nio
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)