[
https://issues.apache.org/jira/browse/CASSANDRA-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Bailey updated CASSANDRA-6373:
-----------------------------------
Attachment: jstack.txt
Attached the output of jstack.
> describe_ring hangs with hsha thrift server
> -------------------------------------------
>
> Key: CASSANDRA-6373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6373
> Project: Cassandra
> Issue Type: Bug
> Reporter: Nick Bailey
> Assignee: Pavel Yaskevich
> Fix For: 2.0.4
>
> Attachments: describe_ring_failure.patch, jstack.txt
>
>
> There is a strange bug with the Thrift hsha server in 2.0 (where we switched
> to the LMAX Disruptor server).
> The first call to describe_ring on a given connection hangs indefinitely when
> the client is not connecting from localhost (or at least when the client does
> not appear to be on the same host). Additionally, the cluster must be using
> vnodes. When connecting from localhost, the first call works as expected, and
> in either case subsequent calls on the same connection work as expected.
> According to git bisect, the bad commit is the switch to the LMAX Disruptor
> server:
> https://github.com/apache/cassandra/commit/98eec0a223251ecd8fec7ecc9e46b05497d631c6
> I've attached the patch I used to reproduce the error in the unit tests. The
> command to reproduce is:
> {noformat}
> PYTHONPATH=test nosetests --tests=system.test_thrift_server:TestMutations.test_describe_ring
> {noformat}
> I reproduced this on a single EC2 machine by having the server bind to the
> private IP and the client connect to the public IP (so the client appears to
> be non-local). I've also reproduced it with two different VMs.
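For anyone who wants to check the symptom without blocking their own script,
the following is a minimal sketch of a hang probe. It is not taken from the
attached patch. The names call_with_timeout and FakeHangingClient are
hypothetical, and the fake client only mimics the reported behaviour (first
describe_ring on a connection blocks, later calls return); it does not talk to
a real server. Against a real Thrift client, a socket-level timeout on the
transport is probably the more robust choice.

```python
import threading


def call_with_timeout(fn, timeout_s):
    """Run fn() in a daemon thread and wait at most timeout_s seconds.

    Returns (finished, value). finished is False if fn() is still
    running when the timeout expires, which is how a hang shows up.
    """
    result = {}

    def target():
        result["value"] = fn()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)
    return (not worker.is_alive(), result.get("value"))


class FakeHangingClient:
    """Hypothetical stand-in for a Thrift client (assumption, not real API).

    It reproduces only the reported symptom: the first describe_ring
    call never returns, subsequent calls return normally.
    """

    def __init__(self):
        self._calls = 0
        self._never_set = threading.Event()

    def describe_ring(self, keyspace):
        self._calls += 1
        if self._calls == 1:
            # Block forever to imitate the hung first call.
            self._never_set.wait()
        return ["token-range-for-" + keyspace]


if __name__ == "__main__":
    client = FakeHangingClient()
    for attempt in (1, 2):
        finished, value = call_with_timeout(
            lambda: client.describe_ring("Keyspace1"), 0.5
        )
        status = "returned" if finished else "hung"
        print("call %d: %s %r" % (attempt, status, value))
```

To probe a real node, the lambda would be replaced with a call on an actual
client connected from a non-local address, with vnodes enabled on the cluster.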