Hi all, I'd like to post some symptoms we're seeing with our ActiveMQ publishing broker and maybe get some ideas on what we need to tweak.
We're using ActiveMQ to push publishing messages to a set of pre-prod and prod consumers. During peak publishing, ActiveMQ becomes unresponsive (web UI timeout, JMX timeout, process hangs). On the system side, we're seeing swap usage approach 50% of the host's physical memory (4GB out of 8GB).

In the logs there are three types of warnings:

Slow KahaDB access:

    2013-01-17 23:49:58,616 | INFO | Slow KahaDB access: Journal append took: 0 ms, Index update took 1991 ms
    2013-01-17 23:50:26,649 | INFO | Slow KahaDB access: Journal read took: 1818 ms

Timeout of Zookeeper connection:

    2013-01-18 03:24:25,195 | INFO | Client session timed out, have not heard from server in 14251ms for sessionid 0x23c46ac51160442, closing socket connection and attempting reconnect

Timeout of consumer connections:

    2013-01-18 04:23:58,674 | WARN | Transport Connection to: tcp://10.129.18.33:48055 failed: org.apache.activemq.transport.InactivityIOException: Channel was inactive for too (>30000) long: tcp://10.129.18.33:48055
    2013-01-18 04:23:58,673 | WARN | Transport Connection to: tcp://10.14.26.60:48237 failed: org.apache.activemq.transport.InactivityIOException: Channel was inactive for too (>30000) long: tcp://10.14.26.60:48237
    2013-01-18 04:23:58,673 | WARN | Transport Connection to: tcp://10.14.26.61:56382 failed: org.apache.activemq.transport.InactivityIOException: Channel was inactive for too (>30000) long: tcp://10.14.26.61:56382

We're running ActiveMQ 5.7.0 on RHEL 5.7 with Java 1.6.0_34. The host has 8GB of RAM and has /data and /logs NFS-mounted. There are about 180 queues on the broker, and a publishing event usually goes out to about 3-10 queues.

We've already tested the underlying infrastructure (host, network, etc.) and tweaked the ActiveMQ settings to the best of our knowledge, yet we continue to see these issues. Recently, we enabled PFC (producer flow control), which helped reduce the number of occurrences.
Here's the startup command:

    /usr/java/default/bin/java \
        -DuseProfiler=true \
        -javaagent:/usr/local/appdynamics-agent-3.5.5/javaagent.jar \
        -Dappdynamics.agent.nodeName=hostname.fqdn \
        -Dappdynamics.agent.logs.dir=/logs/appdynamics/ \
        -Xmx4096M -Xms4096M -XX:MaxPermSize=128m \
        -Dorg.apache.activemq.UseDedicatedTaskRunner=false \
        -Dcom.sun.management.jmxremote \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false \
        -Dorg.apache.activemq.store.kahadb.LOG_SLOW_ACCESS_TIME=1500 \
        -Djava.util.logging.config.file=logging.properties \
        -Dactivemq.classpath=/apps/apache-activemq/conf \
        -Dactivemq.conf=/apps/apache-activemq/conf \
        -jar /apps/apache-activemq/bin/run.jar start

Here are links to some AppDynamics graphs (publishing activity started at 8:30am), as well as our ActiveMQ config and the complete console output log:

    https://dl.dropbox.com/u/2739192/amq/activemq.xml
    https://dl.dropbox.com/u/2739192/amq/broker.log
    https://dl.dropbox.com/u/2739192/amq/io.png
    https://dl.dropbox.com/u/2739192/amq/net_io.png
    https://dl.dropbox.com/u/2739192/amq/jmx.png
    https://dl.dropbox.com/u/2739192/amq/jvm.png
    https://dl.dropbox.com/u/2739192/amq/mem.png

Here's the relevant portion of the ActiveMQ config, for convenience:

    <broker xmlns="http://activemq.apache.org/schema/core"
            brokerName="broker-publishing-legacy"
            useJmx="true"
            dataDirectory="/data/activemq/">

      <destinationInterceptors>
        <bean xmlns="http://www.springframework.org/schema/beans"
              id="QueueDestinationInterceptor"
              class="com.abcde.eps.interceptor.QueueDestinationInterceptor">
        </bean>
        <virtualDestinationInterceptor>
          <virtualDestinations>
            <virtualTopic name="VirtualTopic.>" prefix="Consumer.*." />
          </virtualDestinations>
        </virtualDestinationInterceptor>
      </destinationInterceptors>

      <destinationPolicy>
        <policyMap>
          <policyEntries>
            <policyEntry topic=">" memoryLimit="16 mb" producerFlowControl="true">
            </policyEntry>
            <policyEntry queue=">" memoryLimit="16 mb" optimizedDispatch="true" producerFlowControl="true">
            </policyEntry>
          </policyEntries>
        </policyMap>
      </destinationPolicy>

      <managementContext>
        <managementContext connectorPort="1101" rmiServerPort="1100" jmxDomainName="org.apache.activemq" />
      </managementContext>

      <persistenceAdapter>
        <kahaDB directory="/data/activemq/"
                enableIndexWriteAsync="true"
                enableJournalDiskSyncs="false"
                journalMaxFileLength="256mb" />
      </persistenceAdapter>

      <systemUsage>
        <systemUsage>
          <memoryUsage>
            <memoryUsage limit="512 mb"/>
          </memoryUsage>
          <storeUsage>
            <storeUsage limit="200 gb"/>
          </storeUsage>
          <tempUsage>
            <tempUsage limit="1 gb"/>
          </tempUsage>
        </systemUsage>
      </systemUsage>

      <transportConnectors>
        <transportConnector name="nio" uri="nio://0.0.0.0:1102?transport.closeAsync=false" />
      </transportConnectors>

    </broker>

Any suggestions would be appreciated.

-- 
Best regards,
Dmitriy V.