[ 
https://issues.apache.org/jira/browse/CASSSIDECAR-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18046021#comment-18046021
 ] 

Sudipta Laha commented on CASSSIDECAR-390:
------------------------------------------

CircleCI link:
https://app.circleci.com/pipelines/github/sklaha/cassandra-sidecar/57/workflows/4d6d9d4d-d4b7-49ee-9cd9-a76eb7d91b41

> Deadlock during JMX reconnection in sidecar
> -------------------------------------------
>
>                 Key: CASSSIDECAR-390
>                 URL: https://issues.apache.org/jira/browse/CASSSIDECAR-390
>             Project: Sidecar for Apache Cassandra
>          Issue Type: Bug
>          Components: Rest API
>            Reporter: Sudipta Laha
>            Assignee: Sudipta Laha
>            Priority: Major
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The sidecar can deadlock during JMX reconnection. The deadlock occurs in the 
> ClientCommunicatorAdmin.restart method when all of the following happen:
>  
>  * The JMX connection is being re-established.
>  * A notification handler for connection status changes makes JMX calls 
> while the reconnection is still in progress.
>  * One of those JMX calls fails with an IOException.
>  
> The failed call re-enters restart on the same thread that is performing the 
> reconnection (see the first stack trace), so that thread waits for a 
> reconnection that only it can complete. As a result, every thread that 
> attempts to use the JMX connection blocks.
>  
> Here is a stack trace for the deadlocked thread:
>  
> {code:java}
> "JMX client heartbeat 5" #301 daemon prio=5 os_prio=0 cpu=516.04ms elapsed=414591.85s tid=0x00007f4f0473f030 nid=0x5bb1 in Object.wait()  [0x00007f4d543fd000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <no object reference available>
>         at java.lang.Object.wait(Object.java:328)
>         at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart(ClientCommunicatorAdmin.java:107)
>         - waiting to re-lock in wait() <0x00000006041e54e8> (a [I)
>         at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.gotIOException(ClientCommunicatorAdmin.java:59)
>         at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException(RMIConnector.java:1497)
>         at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:908)
>         at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:273)
>         at com.sun.proxy.$Proxy71.getTokens(Unknown Source)
>         at org.apache.cassandra.sidecar.cluster.CassandraAdapterDelegate.maybeGetTokens(CassandraAdapterDelegate.java:329)
>         at org.apache.cassandra.sidecar.cluster.CassandraAdapterDelegate.newNodeSettingsFromJmx(CassandraAdapterDelegate.java:305)
>         at org.apache.cassandra.sidecar.cluster.CassandraAdapterDelegate.jmxHealthCheck(CassandraAdapterDelegate.java:211)
>         at org.apache.cassandra.sidecar.cluster.CassandraAdapterDelegate$JmxNotificationListener.handleNotification(CassandraAdapterDelegate.java:579)
>         at org.apache.cassandra.sidecar.common.server.JmxClient.lambda$forwardNotification$0(JmxClient.java:242)
>         at org.apache.cassandra.sidecar.common.server.JmxClient$$Lambda$1978/0x0000000800edb440.accept(Unknown Source)
>         at java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4698)
>         at java.util.Collections$SetFromMap.forEach(Collections.java:5581)
>         at org.apache.cassandra.sidecar.common.server.JmxClient.forwardNotification(JmxClient.java:242)
>         at org.apache.cassandra.sidecar.common.server.JmxClient.handleNotification(JmxClient.java:235)
>         at javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:275)
>         at javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:352)
>         at javax.management.NotificationBroadcasterSupport$1.execute(NotificationBroadcasterSupport.java:337)
>         at javax.management.NotificationBroadcasterSupport.sendNotification(NotificationBroadcasterSupport.java:248)
>         at javax.management.remote.rmi.RMIConnector.sendNotification(RMIConnector.java:442)
>         at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.doStart(RMIConnector.java:1670)
>         at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart(ClientCommunicatorAdmin.java:132)
>         at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.gotIOException(ClientCommunicatorAdmin.java:59)
>         at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException(RMIConnector.java:1497)
>         at com.sun.jmx.remote.internal.ClientCommunicatorAdmin$Checker.run(ClientCommunicatorAdmin.java:204)
>         at java.lang.Thread.run(Thread.java:829)
> {code}
> Stack trace for a blocked thread:
>  
> {code:java}
> [vertx-blocked-thread-checker] io.vertx.core.impl.BlockedThreadChecker - Thread Thread[sidecar-internal-worker-pool-12,5,main] has been blocked for 44266107 ms, time limit is 300000 ms
> io.vertx.core.VertxException: Thread blocked
>         at java.lang.Object.wait(Native Method) ~[?:?]
>         at java.lang.Object.wait(Object.java:328) ~[?:?]
>         at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart(ClientCommunicatorAdmin.java:107) ~[?:?]
>         at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.gotIOException(ClientCommunicatorAdmin.java:59) ~[?:?]
>         at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException(RMIConnector.java:1497) ~[?:?]
>         at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1027) ~[?:?]
>         at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298) ~[?:?]
>         at com.sun.proxy.$Proxy89.importNewSSTables(Unknown Source) ~[?:?]
>         at org.apache.cassandra.sidecar.adapters.base.CassandraTableOperations.importNewSSTables(CassandraTableOperations.java:57) ~[adapters-base-1.0.0.111-aci-cassandra.jar:?]
>         at org.apache.cassandra.sidecar.utils.SSTableImporter.drainImportQueue(SSTableImporter.java:240) ~[cassandra-sidecar-1.0.0.111-aci-cassandra.jar:?]
>         at org.apache.cassandra.sidecar.utils.SSTableImporter.maybeDrainImportQueue(SSTableImporter.java:188) ~[cassandra-sidecar-1.0.0.111-aci-cassandra.jar:?]
>         at org.apache.cassandra.sidecar.utils.SSTableImporter.lambda$processPendingImports$1(SSTableImporter.java:170) ~[cassandra-sidecar-1.0.0.111-aci-cassandra.jar:?]
>         at org.apache.cassandra.sidecar.utils.SSTableImporter$$Lambda$1924/0x0000000800eac440.run(Unknown Source) ~[?:?]
>         at org.apache.cassandra.sidecar.concurrent.TaskExecutorPool.lambda$runBlocking$4(TaskExecutorPool.java:198) ~[cassandra-sidecar-1.0.0.111-aci-cassandra.jar:?]
>         at org.apache.cassandra.sidecar.concurrent.TaskExecutorPool$$Lambda$627/0x0000000800a57040.call(Unknown Source) ~[?:?]
>         at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$0(ContextImpl.java:178) ~[vertx-core-4.5.7.jar:4.5.7]
>         at io.vertx.core.impl.ContextImpl$$Lambda$465/0x0000000800978840.handle(Unknown Source) ~[?:?]
>         at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:279) ~[vertx-core-4.5.7.jar:4.5.7]
>         at io.vertx.core.impl.ContextImpl.lambda$internalExecuteBlocking$2(ContextImpl.java:210) ~[vertx-core-4.5.7.jar:4.5.7]
>         at io.vertx.core.impl.ContextImpl$$Lambda$466/0x0000000800979040.run(Unknown Source) ~[?:?]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
>         at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
>         at java.lang.Thread.run(Thread.java:829) ~[?:?]
> {code}
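
For readers following the first stack trace, the re-entrant wait can be sketched in isolation. The code below is a minimal model, not the JDK's ClientCommunicatorAdmin and not the patch for this ticket: ReconnectSketch, restart, failedCall and the two scenario methods are hypothetical names, and the timed wait is an assumption added only so the demo terminates (the trace shows an untimed Object.wait()). It contrasts a listener that makes a failing call inline with one that hands the call to another thread, which is one possible way to avoid re-entering restart on the reconnecting thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a connector that notifies listeners while it is still reconnecting. */
final class ReconnectSketch {
    private final Object lock = new Object();
    private boolean reconnecting = false; // guarded by lock

    /**
     * Models a restart: wait for any restart in progress, mark reconnecting,
     * run the listener, then clear the flag. Returns false if the caller gave
     * up waiting (assumption: a timeout, so that this demo can finish).
     */
    boolean restart(Runnable listener, long maxWaitMillis) throws InterruptedException {
        synchronized (lock) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
            while (reconnecting) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    return false; // an untimed wait would block here forever
                }
                lock.wait(remainingMs);
            }
            reconnecting = true;
        }
        try {
            listener.run(); // notification is delivered before the flag is cleared
        } finally {
            synchronized (lock) {
                reconnecting = false;
                lock.notifyAll();
            }
        }
        return true;
    }

    /** Models a JMX call that fails with an IOException and therefore triggers restart. */
    boolean failedCall(long maxWaitMillis) {
        try {
            return restart(() -> { }, maxWaitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Listener makes the failing call on the reconnecting thread itself. */
    static boolean inlineListenerCompletes() throws InterruptedException {
        ReconnectSketch connector = new ReconnectSketch();
        boolean[] nested = new boolean[1];
        connector.restart(() -> nested[0] = connector.failedCall(200), 200);
        return nested[0];
    }

    /** Listener hands the failing call to a worker thread and returns immediately. */
    static boolean offloadedListenerCompletes() throws Exception {
        ReconnectSketch connector = new ReconnectSketch();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            AtomicReference<Future<Boolean>> pending = new AtomicReference<>();
            Callable<Boolean> call = () -> connector.failedCall(5_000);
            connector.restart(() -> pending.set(worker.submit(call)), 200);
            return pending.get().get(10, TimeUnit.SECONDS);
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("inline listener completed nested restart: " + inlineListenerCompletes());
        System.out.println("offloaded listener completed nested restart: " + offloadedListenerCompletes());
    }
}
```

In the inline case the nested restart can only time out, because the thread that would clear the reconnecting flag is the one waiting on it. In the offloaded case the outer restart finishes, clears the flag, and the worker's restart then proceeds.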


