Looking for some advice on an ActiveMQ setup that provides stability and redundancy and can handle tens of thousands of clients. This is for an MCollective deployment and will be used exclusively to execute remote upgrades and maintenance of systems.
We initially set up a network of brokers with two MQ servers per data center in a hub-and-spoke topology. The systems are virtual and initially had 4 CPUs / 16 GB RAM with KahaDB. They were configured with the following parameters:

    <broker xmlns="http://activemq.apache.org/schema/core"
            useJmx="true"
            brokerName="hostname"
            dataDirectory="${activemq.base}/data"
            networkConnectorStartAsync="true"
            schedulePeriodForDestinationPurge="60000">

      <destinationPolicy>
        <policyMap>
          <policyEntries>
            <policyEntry topic=">" producerFlowControl="false">
              <pendingSubscriberPolicy>
                <vmCursor />
              </pendingSubscriberPolicy>
            </policyEntry>
            <policyEntry queue=">" producerFlowControl="false" />
            <policyEntry queue="*.reply.>" gcInactiveDestinations="true"
                         inactiveTimoutBeforeGC="300000" />
          </policyEntries>
        </policyMap>
      </destinationPolicy>

There are many connections like this:

    <networkConnector name="connection_to_remotehostname"
                      uri="static:(tcp://remotehostname.com:61616)"
                      dynamicOnly="true"
                      userName="replication"
                      password="replication"
                      duplex="true"
                      networkTTL="2"
                      decreaseNetworkConsumerPriority="true" />

System usage limits:

    <systemUsage>
      <systemUsage>
        <memoryUsage>
          <memoryUsage limit="8 gb"/>
        </memoryUsage>
        <storeUsage>
          <storeUsage limit="1 gb"/>
        </storeUsage>
        <tempUsage>
          <tempUsage limit="100 mb"/>
        </tempUsage>
      </systemUsage>
    </systemUsage>

Start script memory configuration:

    ACTIVEMQ_OPTS_MEMORY="-Xms256m -Xmx16384m -Dorg.apache.activemq.UseDedicatedTaskRunner=false"

We found that it ran fine for about 8 hours and then started to drop connections once it reached about 50% RAM utilization. Debug logging showed lots of message expiry (hostnames replaced with "hostname" and "host"):

    [t201] Scheduler] DEBUG Queue - queue://host_mcollective.reply.hostname_25106 expiring messages ..
    2015-03-13 07:37:02,023 [t201] Scheduler] DEBUG Queue - queue://host_mcollective.reply.hostname_25106 expiring messages done.
    2015-03-13 07:37:02,033 [t201] Scheduler] DEBUG Queue - queue://host_mcollective.reply.hostname_25434 expiring messages ..
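For reference, the KahaDB store mentioned above is normally configured with a persistenceAdapter element along these lines. This is only a sketch: the directory and journal size are assumptions, since the posted config omits the adapter entirely.

```xml
<!-- Sketch of a KahaDB persistence adapter; directory and
     journalMaxFileLength are assumed values, not from the original config. -->
<persistenceAdapter>
  <kahaDB directory="${activemq.base}/data/kahadb"
          journalMaxFileLength="32mb"/>
</persistenceAdapter>
```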
    2015-03-13 07:37:02,033 [t201] Scheduler] DEBUG Queue - queue://host_mcollective.reply.hostname_25434 expiring messages done.

We have now tried to switch to a master/slave topology with LevelDB and shared NFS at each data center. Network connectors are defined as master/slave from the hub node:

    <networkConnector name="remote site #1"
                      uri="masterslave:(tcp://host1.domain.com:61616,tcp://host2.domain.com:61616)"
                      duplex="true" >
    </networkConnector>

The memory footprint was reduced:

    <systemUsage>
      <systemUsage>
        <memoryUsage>
          <memoryUsage limit="8 gb"/>
        </memoryUsage>
        <storeUsage>
          <storeUsage limit="1 gb"/>
        </storeUsage>
        <tempUsage>
          <tempUsage limit="100 mb"/>
        </tempUsage>
      </systemUsage>
    </systemUsage>

Both setups used NIO and all of the tuning mentioned on this site: http://www.javabeat.net/deploying-activemq-for-large-numbers-of-concurrent-applications/

Any help on the best way to create a functioning system that scales and can handle possibly slow links would be much appreciated. Real-world deployment examples would be preferred.

--
View this message in context: http://activemq.2283324.n4.nabble.com/Network-of-brokers-with-multiple-worldwide-data-centers-tp4693158.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
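For context on the shared-NFS master/slave setup described above: in a shared-file-system pair, both brokers point their persistence adapter at the same NFS-mounted directory, and whichever broker acquires the store lock becomes master while the other blocks as slave. A minimal sketch (the mount path is an assumption, not taken from the posted config):

```xml
<!-- Sketch: identical on both brokers of the pair; the directory is an
     assumed NFS mount point shared by both hosts. -->
<persistenceAdapter>
  <levelDB directory="/mnt/nfs/activemq/leveldb"/>
</persistenceAdapter>
```

Clients would then use a failover URI such as failover:(tcp://host1.domain.com:61616,tcp://host2.domain.com:61616) so they reconnect to whichever broker holds the lock.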