[
https://issues.apache.org/jira/browse/AMQ-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Christopher L. Shannon updated AMQ-8287:
----------------------------------------
Description:
I ran into a deadlock caused by the fix for AMQ-8169 when using the Stomp
NIOSSLTransport (it could probably happen with other NIO SSL transports too).
The newly added synchronized on serviceRead() caused a deadlock between the
transport and the TransportConnection. One thread acquired the lock on the
TransportConnection and was blocked entering serviceRead(), waiting to acquire
the NIOSSLTransport lock. Another thread was already inside serviceRead(), so
it held the NIOSSLTransport lock, and was then waiting for the
TransportConnection lock.
The main issue is that processCommand(plain) ends up being protected by the
lock, and since there are multiple brokers/filters running we run into a
deadlock (my current deadlock happened while processing a ConsumerInfo
command).
To fix this we simply need to narrow the lock, as it's too broad. The goal
was to protect against concurrent reads off the channel, which really means
the secureRead() method. So we can move the lock to secureRead() instead of
locking the entire serviceRead() call. That should fix the deadlock while
still solving the initial issue, which was demonstrated by the StompNIOSSL
failing before this fix. I will open a new Jira shortly and push a fix.
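The narrowing described above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the actual ActiveMQ code:
ConnectionStub and TransportStub are hypothetical stand-ins for
TransportConnection and NIOSSLTransport, and the "channel" is just an
in-memory queue. It shows the read being guarded while the command is
processed without the transport lock held.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class NarrowLockSketch {

    /** Hypothetical stand-in for TransportConnection; it has its own monitor. */
    static class ConnectionStub {
        final List<String> processed = new ArrayList<>();
        final List<Boolean> transportLockHeld = new ArrayList<>();

        synchronized void process(String command, Object transport) {
            // Record whether the calling thread still holds the transport lock.
            transportLockHeld.add(Thread.holdsLock(transport));
            processed.add(command);
        }
    }

    /** Hypothetical stand-in for NIOSSLTransport; the queue plays the channel. */
    static class TransportStub {
        private final Queue<String> channel = new ArrayDeque<>();
        private final ConnectionStub connection;

        TransportStub(ConnectionStub connection, List<String> incoming) {
            this.connection = connection;
            channel.addAll(incoming);
        }

        // Only the channel read is guarded, so two threads cannot read at once.
        private synchronized String secureRead() {
            return channel.poll(); // null when nothing is left to read
        }

        // Deliberately NOT synchronized: the transport lock is released again
        // before the command is handed to the connection.
        void serviceRead() {
            String plain;
            while ((plain = secureRead()) != null) {
                connection.process(plain, this);
            }
        }
    }

    public static void main(String[] args) {
        ConnectionStub connection = new ConnectionStub();
        TransportStub transport =
                new TransportStub(connection, List.of("CONNECT", "SUBSCRIBE"));
        transport.serviceRead();
        System.out.println(connection.processed);         // [CONNECT, SUBSCRIBE]
        System.out.println(connection.transportLockHeld); // [false, false]
    }
}
```

Because the thread inside serviceRead() no longer holds the transport lock
while it waits for the connection lock, the two locks are never held in
opposite orders at the same time, which is what the deadlock required.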
> Deadlock caused by synchronized on serviceRead() in NIOSSLTransport
> -------------------------------------------------------------------
>
> Key: AMQ-8287
> URL: https://issues.apache.org/jira/browse/AMQ-8287
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.15.15, 5.16.2
> Reporter: Christopher L. Shannon
> Assignee: Christopher L. Shannon
> Priority: Major
> Fix For: 5.17.0, 5.16.3