[ 
https://issues.apache.org/jira/browse/GEODE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Smith updated GEODE-7055:
-----------------------------
    Description: 
If an error/exception occurs on the P2P message thread that requires a 
FailureReply to be sent, but the StartupResponse message has not yet been 
received (on the P2P message thread), the failure reply will DEADLOCK on the call to
org.apache.geode.distributed.internal.ClusterDistributionManager.waitUntilReadyToSendMsgs,
because the StartupOperation is already in a waitForReplies() for the StartupResponse.
Below is an example of an exception triggering the deadlock:
 
{code:java}
[fatal 2019/08/05 22:47:06.462 UTC <P2P message reader for 10.0.8.10(cacheserver-28663bad-c0b0-41f7-b723-5a2425fa54ff:1)<v5>:56152(version:GEODE 1.9.0) shared unordered uid=63 port=49194> tid=0x25] Error deserializing message
java.lang.ClassNotFoundException: org.apache.geode.modules.util.BootstrappingFunction
        at org.apache.geode.internal.ClassPathLoader.forName(ClassPathLoader.java:180)
        at org.apache.geode.internal.InternalDataSerializer.getCachedClass(InternalDataSerializer.java:3274)
        at org.apache.geode.DataSerializer.readClass(DataSerializer.java:264)
        at org.apache.geode.internal.InternalDataSerializer.readDataSerializable(InternalDataSerializer.java:2398)
        at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2673)
        at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2968)
        at org.apache.geode.internal.cache.MemberFunctionStreamingMessage.fromData(MemberFunctionStreamingMessage.java:277)
        at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2372)
        at org.apache.geode.internal.DSFIDFactory.create(DSFIDFactory.java:997)
        at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2516)
        at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2528)
        at org.apache.geode.internal.tcp.Connection.readMessage(Connection.java:3111)
        at org.apache.geode.internal.tcp.Connection.processInputBuffer(Connection.java:2920)
        at org.apache.geode.internal.tcp.Connection.readMessages(Connection.java:1745)
        at org.apache.geode.internal.tcp.Connection.run(Connection.java:1577)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

"P2P message reader for 10.0.8.10(cacheserver-28663bad-c0b0-41f7-b723-5a2425fa54ff:1)<v5>:56152(version:GEODE 1.9.0) shared unordered uid=63 port=49194" #37 daemon prio=10 os_prio=0 tid=0x00007f4a108bb800 nid=0x2a in Object.wait() [0x00007f4a0dca7000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006d39c4538> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:502)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.waitUntilReadyToSendMsgs(ClusterDistributionManager.java:1212)
        - locked <0x00000006d39c4538> (a java.lang.Object)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2816)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1528)
        at org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:113)
        at org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:86)
        at org.apache.geode.internal.tcp.Connection.sendFailureReply(Connection.java:1954)
        at org.apache.geode.internal.tcp.Connection.readMessage(Connection.java:3162)
        at org.apache.geode.internal.tcp.Connection.processInputBuffer(Connection.java:2920)
        at org.apache.geode.internal.tcp.Connection.readMessages(Connection.java:1745)
        at org.apache.geode.internal.tcp.Connection.run(Connection.java:1577)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{code}
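
The blocking pattern in the thread dump can be reproduced in isolation. The following is only a minimal sketch, not Geode code: the class name StartupDeadlockSketch, the method failureReplySent, and the message strings are hypothetical, and it assumes the StartupResponse would be delivered by the same reader thread that is trying to send the failure reply. A bounded wait stands in for the unbounded Object.wait() in waitUntilReadyToSendMsgs so that the demo terminates.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class StartupDeadlockSketch {

    /**
     * Simulates one reader thread draining one connection.
     * Returns true if the failure reply was sent.
     * gateOnStartup = true mimics a send path that waits until startup has
     * completed; false is only a contrast case, not Geode's actual fix.
     */
    public static boolean failureReplySent(boolean gateOnStartup, long timeoutMillis) {
        // Opened only when the (simulated) StartupResponse has been processed.
        CountDownLatch readyToSend = new CountDownLatch(1);
        BlockingQueue<String> inbound = new LinkedBlockingQueue<>();
        inbound.add("UNDESERIALIZABLE");  // arrives first and fails to deserialize
        inbound.add("STARTUP_RESPONSE");  // queued behind it on the same connection
        AtomicBoolean sent = new AtomicBoolean(false);

        Thread reader = new Thread(() -> {
            try {
                while (!inbound.isEmpty()) {
                    String msg = inbound.take();
                    if (msg.equals("STARTUP_RESPONSE")) {
                        readyToSend.countDown();
                    } else {
                        // Error path: a failure reply must go out.
                        if (gateOnStartup
                                && !readyToSend.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                            // With an unbounded wait this thread would hang here forever:
                            // only this same thread could have opened the latch.
                            return;
                        }
                        sent.set(true);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "P2P message reader (sketch)");

        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sent.get();
    }

    public static void main(String[] args) {
        System.out.println("gated on startup, reply sent: " + failureReplySent(true, 200));
        System.out.println("not gated, reply sent:        " + failureReplySent(false, 200));
    }
}
```

In the gated case the reader never reaches the queued STARTUP_RESPONSE, so the latch is never opened and the reply is never sent, which matches the WAITING state shown in the dump above.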
 



> Deadlock with StartupMessages if P2P error requiring a sendFailureReply 
> ------------------------------------------------------------------------
>
>                 Key: GEODE-7055
>                 URL: https://issues.apache.org/jira/browse/GEODE-7055
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Ernest Burghardt
>            Priority: Major
>
> If an error/exception occurs on the P2P message thread that requires a 
> FailureReply to be sent, but the StartupResponse message has not yet been 
> received (on the P2P message thread), the failure reply will DEADLOCK on the call to
> org.apache.geode.distributed.internal.ClusterDistributionManager.waitUntilReadyToSendMsgs,
> because the StartupOperation is already in a waitForReplies() for the 
> StartupResponse.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
