[ 
https://issues.apache.org/jira/browse/ARTEMIS-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848227#comment-17848227
 ] 

ASF subversion and git services commented on ARTEMIS-3622:
----------------------------------------------------------

Commit 3c058e98f1e332987aff026e4026758df6ca2785 in activemq-artemis's branch 
refs/heads/main from Justin Bertram
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=3c058e98f1 ]

ARTEMIS-3622 MQTT can deadlock on client cxn/discxn

This commit fixes the deadlock described on ARTEMIS-3622 by moving the
synchronization "up" a level from the MQTTSession to the
MQTTConnectionManager. It also eliminates the synchronization on the
MQTTSessionState in the MQTTConnectionManager because it's no longer
needed. This change should not only eliminate the deadlock, but improve
performance relatively as well.

There is no test associated with this commit as I wasn't able to
reproduce the deadlock with any kind of straight-forward test. There was
a test linked on the Jira, but it involved intrusive and fragile
scaffolding and wasn't ultimately tenable. That said, I did test this
fix with that test and it was successful. In any case, I think static
analysis should be sufficient here as the changes are pretty
straight-forward.


> MQTT can deadlock on client connection / disconnection
> ------------------------------------------------------
>
>                 Key: ARTEMIS-3622
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3622
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: MQTT
>    Affects Versions: 2.19.0
>         Environment: Using the latest java 17 and artemis 2.19 but looking at 
> the code, it should affect 2.20 as well.
>            Reporter: Marcelo Takeshi Fukushima
>            Assignee: Justin Bertram
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> It seems that the {{MQTTProtocolHandler}} and {{MQTTConnectionManager}} are 
> on a racing condition and can deadlock themselves on misbehaving clients. I'm 
> including the relevant stack trace (ignore thread 11 that is just waiting for 
> the lock).
> Looking at the relevant code, it seems that the clean-up thread (88 on the 
> {{{}MQTTFailureListener{}}}) starts cleaning up the session state and then 
> the session, but when {{MQTTSession.stop}} calls 
> {{{}MQTTSessionState.clear{}}}, the session state is no longer the same (a 
> racy connection has replaced the session state with a brand new under the 
> same client-id).
> I think the methods connect and disconnect on the {{MQTTConnectionManager}} 
> could be marked as synchronized as a whole, to prevent racy connects / 
> disconnects (but since I don't know all the ins and outs of the code, you 
> guys might have a better fix).
> {noformat}
> Found one Java-level deadlock:
> =============================
> "Thread-11 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)":
>   waiting to lock monitor 0x00007f6d003368c0 (object 0x000000045f29f240, a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState),
>   which is held by "Thread-24 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)"
> "Thread-24 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)":
>   waiting to lock monitor 0x00007f6d00336a80 (object 0x000000045f2a1068, a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession),
>   which is held by "Thread-88 
> (ActiveMQ-remoting-threads-ActiveMQServerImpl::name=0.0.0.0-212232499)"
> "Thread-88 
> (ActiveMQ-remoting-threads-ActiveMQServerImpl::name=0.0.0.0-212232499)":
>   waiting to lock monitor 0x00007f6d003368c0 (object 0x000000045f29f240, a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState),
>   which is held by "Thread-24 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)"
> Java stack information for the threads listed above:
> ===================================================
> "Thread-11 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)":
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnectionManager.disconnect(MQTTConnectionManager.java:150)
>   - waiting to lock <0x000000045f29f240> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTFailureListener.connectionFailed(MQTTFailureListener.java:37)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnection.fail(MQTTConnection.java:150)
>   at 
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread$2.run(RemotingServiceImpl.java:780)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$137/0x0000000800e01dc8.run(Unknown
>  Source)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.1/ThreadPoolExecutor.java:1136)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.1/ThreadPoolExecutor.java:635)
>   at 
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> "Thread-24 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)":
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession.start(MQTTSession.java:87)
>   - waiting to lock <0x000000045f2a1068> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnectionManager.connect(MQTTConnectionManager.java:111)
>   - locked <0x000000045f29f240> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler.handleConnect(MQTTProtocolHandler.java:185)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler.act(MQTTProtocolHandler.java:133)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler$$Lambda$382/0x0000000801034000.onMessage(Unknown
>  Source)
>   at org.apache.activemq.artemis.utils.actors.Actor.doTask(Actor.java:33)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$137/0x0000000800e01dc8.run(Unknown
>  Source)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.1/ThreadPoolExecutor.java:1136)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.1/ThreadPoolExecutor.java:635)
>   at 
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> "Thread-88 
> (ActiveMQ-remoting-threads-ActiveMQServerImpl::name=0.0.0.0-212232499)":
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState.clear(MQTTSessionState.java:59)
>   - waiting to lock <0x000000045f29f240> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession.clean(MQTTSession.java:200)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession.stop(MQTTSession.java:115)
>   - locked <0x000000045f2a1068> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnectionManager.disconnect(MQTTConnectionManager.java:155)
>   - locked <0x00000002223bc2d0> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTFailureListener.connectionFailed(MQTTFailureListener.java:37)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnection.fail(MQTTConnection.java:150)
>   at 
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl.issueFailure(RemotingServiceImpl.java:606)
>   at 
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl.connectionDestroyed(RemotingServiceImpl.java:587)
>   at 
> org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptor$Listener.connectionDestroyed(NettyAcceptor.java:951)
>   at 
> org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.lambda$channelInactive$0(ActiveMQChannelHandler.java:89)
>   at 
> org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler$$Lambda$419/0x0000000801103b48.run(Unknown
>  Source)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$137/0x0000000800e01dc8.run(Unknown
>  Source)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.1/ThreadPoolExecutor.java:1136)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.1/ThreadPoolExecutor.java:635)
>   at 
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> Found 1 deadlock.{noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to