[ 
https://issues.apache.org/jira/browse/CASSANDRA-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Dusbabek updated CASSANDRA-778:
------------------------------------

    Attachment:     (was: 0001-fix-deadlock.patch)

> Gossiper thread deadlock
> ------------------------
>
>                 Key: CASSANDRA-778
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-778
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.6
>            Reporter: Gary Dusbabek
>            Assignee: Gary Dusbabek
>             Fix For: 0.6
>
>         Attachments: 0001-fix-deadlock.patch
>
>
> Found this while attempting to bootstrap a node with more than a trivial 
> amount of data:
> Found one Java-level deadlock:
> =============================
> "GMFD:1":
>   waiting to lock monitor 0x0000000100861d60 (object 0x00000001066a7ed8, a org.apache.cassandra.service.StorageService),
>   which is held by "main"
> "main":
>   waiting to lock monitor 0x0000000100860710 (object 0x0000000106c7c968, a org.apache.cassandra.gms.Gossiper),
>   which is held by "GMFD:1"
> Java stack information for the threads listed above:
> ===================================================
> "GMFD:1":
>       at org.apache.cassandra.service.StorageService.getReplicationStrategy(StorageService.java:226)
>       - waiting to lock <0x00000001066a7ed8> (a org.apache.cassandra.service.StorageService)
>       at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:634)
>       at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:502)
>       at org.apache.cassandra.service.StorageService.onChange(StorageService.java:445)
>       at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:812)
>       at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:607)
>       at org.apache.cassandra.gms.Gossiper.handleNewJoin(Gossiper.java:582)
>       at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:649)
>       - locked <0x0000000106c7c968> (a org.apache.cassandra.gms.Gossiper)
>       at org.apache.cassandra.gms.Gossiper$GossipDigestAck2VerbHandler.doVerb(Gossiper.java:1061)
>       at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:40)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:637)
> "main":
>       at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:861)
>       - waiting to lock <0x0000000106c7c968> (a org.apache.cassandra.gms.Gossiper)
>       at org.apache.cassandra.service.StorageService.startBootstrap(StorageService.java:347)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:318)
>       - locked <0x00000001066a7ed8> (a org.apache.cassandra.service.StorageService)
>       at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
>       at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:174)
> Found 1 deadlock.
> The main thread acquires the StorageService (SS) lock in initServer and does
> not release it before attempting to acquire the Gossiper lock in
> addLocalApplicationState.  Meanwhile, the gossip stage (GMFD:1) acquires the
> Gossiper lock in applyStateLocally and then attempts to acquire the SS lock
> in getReplicationStrategy.
> One solution is finer-grained locking on the contended resource in SS (the
> map of replication strategies); another is to move that collection to a
> different class (DD, maybe?).  This was introduced in CASSANDRA-620.

