Hi,
The issue I'm having occurs when a client producer sends a message in
response to user interaction. The message causes a screen to pop up on
another workstation. Usually the pop-up is instantaneous, but sometimes
it takes up to 2 minutes for the message to reach the other workstation.
The message is a JMS text message containing 9 characters, so it is
fairly small. We have tried tuning worker-threads, thinking it was an
availability issue. This single message is more important than all the
other traffic our qpid broker is handling. Is there a way to give
priority to one queue over another? The broker handles a large amount of
traffic, but I'm not sure how it is designed to behave when there are
many more sessions/queues than worker-threads. Does a thread send all
messages to a consumer before moving on to the next queue? Or is the
only way to ensure availability to further increase worker-threads? I've
had the threads as high as 100, but the load on the system made the
problem worse. Our setup is below.
We are using version 0.8 of the C++ broker and the Java client. The
broker has roughly 100 queues. Each queue has at least two consumers,
one each from separate servers in a cluster. We also have 20 clients
listening to 4 topics and 5 clients listening to 1 queue (the important
one mentioned above). So in general our broker has roughly 300 sessions
open at any given time. Almost all of the queues are durable. The
topics are not durable, nor are the subscribers. All but one of the
clients in this scenario are Java clients; the other is a C client. The
servers also use the Java client. The following is the connection URL
used by most of the clients (it's embedded in Spring XML, thus the
escaped &):
amqp://guest:guest@/program?brokerlist='tcp://${broker.addr}?retries='0'&tcp_nodelay='true'&connecttimeout='5000''&maxprefetch='0'&sync_publish='all'&failover='nofailover'
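
For anyone trying to reproduce this, here is a minimal sketch of how a
URL like the one above can be assembled and then escaped for use inside
an XML file. The class and method names (ConnectionUrlSketch,
joinOptions, buildUrl, escapeForXml) are made up for illustration and
are not part of the qpid client; only the option names and values are
taken from our URL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the qpid client API.
class ConnectionUrlSketch {

    // Joins options as key='value' pairs separated by '&'.
    static String joinOptions(Map<String, String> options) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : options.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append("='").append(e.getValue()).append('\'');
        }
        return sb.toString();
    }

    // Builds the URL with the same options we use; brokerAddr is host:port
    // (or a placeholder such as ${broker.addr} resolved elsewhere).
    static String buildUrl(String brokerAddr) {
        // Options that belong to the single broker entry in brokerlist.
        Map<String, String> brokerOpts = new LinkedHashMap<>();
        brokerOpts.put("retries", "0");
        brokerOpts.put("tcp_nodelay", "true");
        brokerOpts.put("connecttimeout", "5000");

        // Options that apply to the connection as a whole.
        Map<String, String> urlOpts = new LinkedHashMap<>();
        urlOpts.put("maxprefetch", "0");
        urlOpts.put("sync_publish", "all");
        urlOpts.put("failover", "nofailover");

        return "amqp://guest:guest@/program?brokerlist='tcp://" + brokerAddr
                + "?" + joinOptions(brokerOpts) + "'&" + joinOptions(urlOpts);
    }

    // A literal '&' must be written as '&amp;' inside an XML document.
    static String escapeForXml(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String url = buildUrl("${broker.addr}");
        System.out.println(url);
        System.out.println(escapeForXml(url));
    }
}
```

The first line printed is the plain URL as the client sees it; the
second is the form that would sit in the XML file.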
I only recently turned on tcp_nodelay and sync_publish, thinking that
perhaps the message was occasionally getting stuck. These are the
settings from our conf file for the broker:
auth=no
worker-threads=50
data-dir=/somepath/qpid/data
store-dir=/somepath/qpid/messageStore
pid-dir=/somepath/qpid/var/lock
num-jfiles=16
jfile-size-pgs=24
tcp-nodelay=true
Many of the queues are sized larger than the default through a queue
creator script. The sizes range up to a max file count of 32 and file
size of 48. The server running qpid is an 8-CPU system with 2 GB of
memory; some of the offices have a 16-CPU system with 8 GB of memory.
The server size makes no difference to the errors.
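
To put the journal settings in perspective, here is a rough sketch of
the per-queue journal size they imply. It assumes that the file size is
counted in 64 KiB pages (which is how I read the store option help) and
that the queue creator script's file count and file size use the same
units as num-jfiles and jfile-size-pgs. JournalSizeSketch and
journalBytes are just illustrative names.

```java
// Hypothetical helper for back-of-the-envelope journal sizing.
class JournalSizeSketch {

    // Assumption: one journal page is 64 KiB.
    static final long PAGE_BYTES = 64L * 1024;

    // Total journal size in bytes for one durable queue.
    static long journalBytes(int numJfiles, int jfileSizePgs) {
        return (long) numJfiles * jfileSizePgs * PAGE_BYTES;
    }

    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Values from our conf file: num-jfiles=16, jfile-size-pgs=24.
        System.out.println("conf default: " + toMiB(journalBytes(16, 24)) + " MiB");
        // Largest queue created by our script: file count 32, file size 48.
        System.out.println("largest queue: " + toMiB(journalBytes(32, 48)) + " MiB");
    }
}
```

Under those assumptions the conf file settings work out to 24 MiB of
journal per durable queue, and the largest queues to 96 MiB.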
Part of the reason we suspected availability was that the clients kept
timing out on heartbeat, so we disabled the heartbeat. We also
occasionally see the following on the client:
INFO 2011-06-06 17:47:42,501 [IoReceiver - somemachine/someip:5672] JmsPooledSession: EDEX: DEFAULT - Failed to close session
org.apache.qpid.transport.SessionException: timed out waiting for sync: complete = 30115, point = 30116
    at org.apache.qpid.transport.Session.sync(Session.java:744)
    at org.apache.qpid.transport.Session.sync(Session.java:713)
    at org.apache.qpid.client.AMQSession_0_10.sendClose(AMQSession_0_10.java:427)
    at org.apache.qpid.client.AMQSession.close(AMQSession.java:700)
    at org.apache.qpid.client.AMQSession.close(AMQSession.java:666)
    at org.apache.qpid.client.AMQSession.close(AMQSession.java:525)
    at somepackage.jms.JmsPooledSession.closeInternal(JmsPooledSession.java:164)
    at somepackage.jms.JmsPooledConnection.disconnect(JmsPooledConnection.java:152)
    at somepackage.jms.JmsPooledConnection.onException(JmsPooledConnection.java:127)
    at org.apache.qpid.client.AMQConnectionDelegate_0_10.closed(AMQConnectionDelegate_0_10.java:270)
    at org.apache.qpid.transport.Connection.closed(Connection.java:529)
    at org.apache.qpid.transport.network.Assembler.closed(Assembler.java:113)
    at org.apache.qpid.transport.network.InputHandler.closed(InputHandler.java:202)
    at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:150)
    at java.lang.Thread.run(Thread.java:619)
The gap between complete and point used to be much larger before adding
the sync_publish setting. There are no errors in the qpid broker log.
The only thing in the log is along the lines of the following 2 messages:
qpidd[19149]: 2011-06-08 11:50:03 warning ManagementAgent::periodicProcessing task overran 1 times by 6ms (taking 5098421ns) on average.
qpidd[19149]: 2011-06-08 11:50:16 warning task overran 3 times by 2ms (taking 27955ns) on average.
Thanks,
Richard Peter