Hi Roger,
Thanks for your explanation, but actually, our setup runs just fine.
Well, up until the moment I switched to stomp+nio rather than just stomp.
I was, however, under the assumption that nio would have increased
scalability, along the lines of your proxy example, rather than reduced it
and changed things from already highly performant to unusably slow.
In our case we had to increase the ulimit to some high number, remove
the sizing-limits and turn off async disconnects. But after that, we
didn't see any problem in terms of load. It easily stores 500k messages
if our consumer happens to be down, and doesn't go far above 3-5%
cpu-usage during normal load.
Best regards,
Arjen
On 20-1-2010 23:06 Roger Hoover wrote:
Hi Arjen,
We had a similar setup with lots of PHP processes as producers making STOMP
connections, enqueuing a message, and disconnecting and experienced many
problems including max file descriptor errors and other broker issues.
We've seen much better stability and throughput by introducing a proxy so
that the PHP producers do an HTTP POST to the proxy instead of connecting
directly to AMQ with STOMP. The proxy caches the message temporarily on the
local filesystem and a daemon with a single, long-lived STOMP connection picks up
the files and enqueues them.
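
A rough sketch of the spool-and-forward idea described above, not the actual proxy code. The function names, the ".msg" suffix and the `send` callable (a stand-in for "enqueue over the one long-lived STOMP connection") are all assumptions made for illustration.

```python
import os
import tempfile
import time
import uuid


def spool_message(spool_dir, body):
    """Proxy side: write one message into the spool directory atomically."""
    os.makedirs(spool_dir, exist_ok=True)
    # Write under a temporary name first so the daemon never reads a partial file.
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, prefix=".tmp-")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(body)
    # Timestamp prefix keeps files in rough arrival order; uuid avoids collisions.
    name = f"{time.time_ns():020d}-{uuid.uuid4().hex}.msg"
    final_path = os.path.join(spool_dir, name)
    os.replace(tmp_path, final_path)  # atomic when both paths share a filesystem
    return final_path


def drain_spool(spool_dir, send):
    """Daemon side: forward every spooled message, then delete its file.

    `send` is a hypothetical callable that enqueues one message body and
    raises on failure, so a failed message stays on disk for the next pass.
    """
    forwarded = 0
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".msg"):
            continue  # skip temporary files that are still being written
        path = os.path.join(spool_dir, name)
        with open(path, encoding="utf-8") as handle:
            send(handle.read())
        os.remove(path)  # only reached after a successful send
        forwarded += 1
    return forwarded
```

The producers then only ever touch the proxy, and the broker sees a single connection no matter how many web processes are posting.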
We use nginx + fastcgi for the proxy. I see this as a nice separation of
concerns. IMO, AMQ is not optimized for handling lots of short-lived
connections but nginx is (or any other async multiplexer) and can easily
handle thousands of concurrent connections per process. So, using this
setup allows nginx to do what it does well (very efficiently multiplex
connections) and AMQ to do what it does well (pass messages between
producers and consumers with long-lived connections).
HTH,
Roger
On Wed, Jan 20, 2010 at 1:08 PM, Arjen van der Meijden<
[email protected]> wrote:
Hi List,
I just restarted an activemq 5.3 instance which was causing 1000% cpu (the
machine has 8 cores, so all were fully busy) and a load of 1035.
I'm pretty confident that this load/cpu-usage was caused by using stomp+nio
rather than 'stomp' (whether it was just nio or not, I don't know).
Our usage-pattern is that we have a few queues with a single consumer each.
The producers are php-web-processes that make a connection using stomp, send
a few messages to some of the queues and disconnect again (which is why we
have the transport.closeAsync=false). The heaviest two queues receive about
30-50 messages/second during peak hours (and thus, ActiveMQ receives about
30-50 connections/second).
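
To make the per-request traffic concrete, here is a minimal sketch of the frames such a short-lived producer puts on the wire, assuming plain STOMP 1.0 framing (command line, headers, blank line, body, NUL byte). The helper names and the queue names in the usage lines are made up for illustration; this is not our PHP client, and a real client would also wait for the broker's CONNECTED frame before sending.

```python
def build_frame(command, headers=None, body=""):
    """Encode one STOMP 1.0 frame: command, headers, blank line, body, NUL."""
    lines = [command]
    for key, value in (headers or {}).items():
        lines.append(f"{key}:{value}")
    return ("\n".join(lines) + "\n\n" + body + "\x00").encode("utf-8")


def producer_session(messages, login="", passcode=""):
    """Return the frames one connect / send-a-few / disconnect cycle writes.

    `messages` is a list of (destination, body) pairs.
    """
    frames = [build_frame("CONNECT", {"login": login, "passcode": passcode})]
    for destination, body in messages:
        frames.append(build_frame("SEND", {"destination": destination}, body))
    frames.append(build_frame("DISCONNECT"))
    return frames


# Hypothetical queue names, two messages per web request.
frames = producer_session([
    ("/queue/example.a", "hello"),
    ("/queue/example.b", "world"),
])
```

Every web request repeats this whole cycle, including a fresh TCP connection, so the broker's connection handling is exercised about as often as its message handling.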
We also have a few topics that get messages quite infrequently, from
similar stomp-producers, but those are consumed with java-based
applications.
The load/cpu-usage apparently has gradually increased, as can be seen in
these two graphs:
Cpu-usage
http://tweakers.net/stats/?Action=Generator&Mode=Serverstats&Time=1263919627&Dagen=1&StatsServer=Argus&colServers=CPUUsage
Load average
http://tweakers.net/stats/?Action=Generator&Mode=Serverstats&Time=1263919627&Dagen=1&StatsServer=Argus&colServers=LoadAvg
The load increased steadily over just a few hours, up until the
moment where it took the php-processes more than 13 seconds to actually
connect, send two messages and disconnect (rather than a few milliseconds).
Our activemq.xml is attached, although I already replaced stomp+nio with
normal stomp. Our ACTIVEMQ_OPTS is "-Xms2048M -Xmx2048M -XX:+UseParallelOldGC
-server -Dorg.apache.activemq.UseDedicatedTaskRunner=true" (i.e. the default
with adjusted memory and GC).
The increasing load hasn't appeared yet with the adjusted config that uses
non-nio connections.
Unfortunately, this is a production server so I can't run tests very
easily. Nor do I have a problem with not running nio, so this is mainly a
mail to let you know of this problem, rather than me having a big issue with
activemq right now :)
Best regards,
Arjen