Sounds like a big problem. Could you please raise a JIRA and attach the necessary files and instructions to reproduce the problem?
Thanks,

On 9/19/06, robin.byrne <[EMAIL PROTECTED]> wrote:
We've encountered a problem with slow endpoints leading to a deadlock situation around the BoundedLinkedQueue. We've reproduced the problem with Seda, JMS and JCA flows. The situation is similar to that described for SM-512/SM-521 in the Jira, although we don't think sendSync is necessarily involved.

Our ServiceMix configuration started with a JMS endpoint receiving messages from a topic. The messages were sent to a pipeline which invoked an XSL transform and forwarded the result to an HTTPComponent. The HTTPComponent sent the transformed messages to a SOAP service.

We noticed that rapidly sending a large number of messages to the JMS topic would lock up ServiceMix. That is:
1) no more messages were sent to the SOAP service
2) more messages would simply create more blocked threads
3) ServiceMix never recovered.

We suspected that the slowness of the HTTP interaction was triggering the problem. To confirm this we replaced the HTTPComponent with a TraceComponent, modified to wait a configurable number of milliseconds before sending the done message. We were able to reliably recreate the problem.

We found that, in the blocked state, all of the worker threads were waiting to put MEs (message exchanges) into the BoundedLinkedQueue. Many of the threads were blocked trying to put 'done' messages into the queue.

We found that, as long as the queue for the TraceComponent remained below capacity, the system continued to work. However, when the queue for that component hit capacity (100 MEs) the Trace output stopped. Even if the inbound flow of messages stopped at that point, and all of the other queues were empty, the queue for the TraceComponent retained those 100 MEs. After that, the next 100 inbound messages simply filled up the prior queue (for the XSLT) until it hit capacity. And so it went until all of the queues were full. Once all of the queues were full, each new message created a thread that blocked trying to put the new ME into the first queue.
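The cyclic wait described above can be illustrated with a small standalone sketch. It is not ServiceMix code: it uses java.util.concurrent.LinkedBlockingQueue as a stand-in for the BoundedLinkedQueue, and the class name, method names and queue names are hypothetical. One worker holds an exchange for the trace queue while the other holds a 'done' message for the XSLT queue, and nothing drains either queue. A timed offer() is used in place of a blocking put() only so that the sketch terminates and can report what happened.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedQueueDeadlockSketch {

    /**
     * Pre-fills two bounded queues with 'fill' exchanges each, then starts one
     * worker per component. Each worker holds one exchange and tries to hand it
     * to the other component's queue. Returns how many workers gave up because
     * the target queue stayed full for the whole wait.
     */
    static int countStuckWorkers(int capacity, int fill, long waitMillis) {
        BlockingQueue<String> xsltQueue = new LinkedBlockingQueue<>(capacity);
        BlockingQueue<String> traceQueue = new LinkedBlockingQueue<>(capacity);
        for (int i = 0; i < fill; i++) {
            xsltQueue.add("pending-xslt-" + i);
            traceQueue.add("pending-trace-" + i);
        }

        AtomicInteger stuck = new AtomicInteger();
        // The XSLT worker forwards a transformed exchange to the trace component.
        Thread xsltWorker = new Thread(
                () -> deliver(traceQueue, "transformed", waitMillis, stuck));
        // The trace worker sends a 'done' status back to the XSLT component.
        Thread traceWorker = new Thread(
                () -> deliver(xsltQueue, "done", waitMillis, stuck));
        xsltWorker.start();
        traceWorker.start();
        try {
            xsltWorker.join();
            traceWorker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return stuck.get();
    }

    private static void deliver(BlockingQueue<String> target, String exchange,
                                long waitMillis, AtomicInteger stuck) {
        try {
            // A plain put() would block here indefinitely when the queue is full.
            if (!target.offer(exchange, waitMillis, TimeUnit.MILLISECONDS)) {
                stuck.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("full queues, stuck workers: " + countStuckWorkers(3, 3, 100));
        System.out.println("one free slot, stuck workers: " + countStuckWorkers(3, 2, 100));
    }
}
```

With both queues at capacity, each worker waits on space that only the other worker could free, so both time out. With a single free slot per queue, both deliveries go through, which matches the observation that the system kept working while the queue stayed below capacity.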

There seems to be a problem here beyond something that can be handled by throttling configuration. The ESB should never lock up like this - or even be at risk of locking up like this if the throttling doesn't prevent a message backlog.

Thanks,
Robin Byrne

--
View this message in context: http://www.nabble.com/Deadlock-on-BounderLinkedQueue-tf2300852.html#a6394549
Sent from the ServiceMix - Dev mailing list archive at Nabble.com.

-- Cheers, Guillaume Nodet
