[ 
https://issues.apache.org/jira/browse/ARTEMIS-3622?focusedWorklogId=917049&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-917049
 ]

ASF GitHub Bot logged work on ARTEMIS-3622:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Apr/24 16:11
            Start Date: 30/Apr/24 16:11
    Worklog Time Spent: 10m 
      Work Description: jbertram opened a new pull request, #4909:
URL: https://github.com/apache/activemq-artemis/pull/4909

   This commit fixes the deadlock described on ARTEMIS-3622 by moving the 
synchronization "up" a level from the MQTTSession to the MQTTConnectionManager. 
It also eliminates the synchronization on the MQTTSessionState in the 
MQTTConnectionManager because it's no longer needed. This change should not 
only eliminate the deadlock, but improve performance relatively as well.
   
   There is no test associated with this commit as I wasn't able to reproduce 
the deadlock with any kind of straight-forward test. There was a test linked on 
the Jira, but it involved intrusive and fragile scaffolding and wasn't 
ultimately tenable. That said, I did test this fix with that test and it was 
successful. In any case, I think static analysis should be sufficient here as 
the changes are pretty straight-forward.




Issue Time Tracking
-------------------

            Worklog Id:     (was: 917049)
    Remaining Estimate: 0h
            Time Spent: 10m

> MQTT can deadlock on client connection / disconnection
> ------------------------------------------------------
>
>                 Key: ARTEMIS-3622
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3622
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: MQTT
>    Affects Versions: 2.19.0
>         Environment: Using the latest java 17 and artemis 2.19 but looking at 
> the code, it should affect 2.20 as well.
>            Reporter: Marcelo Takeshi Fukushima
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that the {{MQTTProtocolHandler}} and {{MQTTConnectionManager}} are 
> on a racing condition and can deadlock themselves on misbehaving clients. I'm 
> including the relevant stack trace (ignore thread 11 that is just waiting for 
> the lock).
> Looking at the relevant code, it seems that the clean-up thread (88 on the 
> {{{}MQTTFailureListener{}}}) starts cleaning up the session state and then 
> the session, but when {{MQTTSession.stop}} calls 
> {{{}MQTTSessionState.clear{}}}, the session state is no longer the same (a 
> racy connection has replaced the session state with a brand new under the 
> same client-id).
> I think the methods connect and disconnect on the {{MQTTConnectionManager}} 
> could be marked as synchronized as a whole, to prevent racy connects / 
> disconnects (but since I don't know all the ins and outs of the code, you 
> guys might have a better fix).
> {noformat}
> Found one Java-level deadlock:
> =============================
> "Thread-11 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)":
>   waiting to lock monitor 0x00007f6d003368c0 (object 0x000000045f29f240, a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState),
>   which is held by "Thread-24 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)"
> "Thread-24 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)":
>   waiting to lock monitor 0x00007f6d00336a80 (object 0x000000045f2a1068, a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession),
>   which is held by "Thread-88 
> (ActiveMQ-remoting-threads-ActiveMQServerImpl::name=0.0.0.0-212232499)"
> "Thread-88 
> (ActiveMQ-remoting-threads-ActiveMQServerImpl::name=0.0.0.0-212232499)":
>   waiting to lock monitor 0x00007f6d003368c0 (object 0x000000045f29f240, a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState),
>   which is held by "Thread-24 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)"
> Java stack information for the threads listed above:
> ===================================================
> "Thread-11 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)":
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnectionManager.disconnect(MQTTConnectionManager.java:150)
>   - waiting to lock <0x000000045f29f240> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTFailureListener.connectionFailed(MQTTFailureListener.java:37)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnection.fail(MQTTConnection.java:150)
>   at 
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread$2.run(RemotingServiceImpl.java:780)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$137/0x0000000800e01dc8.run(Unknown
>  Source)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
>   at 
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> "Thread-24 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@640f11a1)":
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession.start(MQTTSession.java:87)
>   - waiting to lock <0x000000045f2a1068> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnectionManager.connect(MQTTConnectionManager.java:111)
>   - locked <0x000000045f29f240> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler.handleConnect(MQTTProtocolHandler.java:185)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler.act(MQTTProtocolHandler.java:133)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler$$Lambda$382/0x0000000801034000.onMessage(Unknown
>  Source)
>   at org.apache.activemq.artemis.utils.actors.Actor.doTask(Actor.java:33)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$137/0x0000000800e01dc8.run(Unknown
>  Source)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
>   at 
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> "Thread-88 
> (ActiveMQ-remoting-threads-ActiveMQServerImpl::name=0.0.0.0-212232499)":
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState.clear(MQTTSessionState.java:59)
>   - waiting to lock <0x000000045f29f240> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession.clean(MQTTSession.java:200)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession.stop(MQTTSession.java:115)
>   - locked <0x000000045f2a1068> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSession)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnectionManager.disconnect(MQTTConnectionManager.java:155)
>   - locked <0x00000002223bc2d0> (a 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTSessionState)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTFailureListener.connectionFailed(MQTTFailureListener.java:37)
>   at 
> org.apache.activemq.artemis.core.protocol.mqtt.MQTTConnection.fail(MQTTConnection.java:150)
>   at 
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl.issueFailure(RemotingServiceImpl.java:606)
>   at 
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl.connectionDestroyed(RemotingServiceImpl.java:587)
>   at 
> org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptor$Listener.connectionDestroyed(NettyAcceptor.java:951)
>   at 
> org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.lambda$channelInactive$0(ActiveMQChannelHandler.java:89)
>   at 
> org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler$$Lambda$419/0x0000000801103b48.run(Unknown
>  Source)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
>   at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65)
>   at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$137/0x0000000800e01dc8.run(Unknown
>  Source)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
>   at 
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> Found 1 deadlock.{noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to