From: Mike Torra [mailto:mto...@salesforce.com]
Sent: 13 February 2018 15:10
To: user@cassandra.apache.org
Subject: Re: node restart causes application latency
Then could it be that calling `nodetool drain` after calling `nodetool
disablegossip` is what causes the problem?
On Mon, Feb 12, 2018 at 6:12 PM, kurt greaves wrote:
Actually, it's not really clear to me why disablebinary and thrift are
necessary prior to drain, because they happen in the same order during
drain anyway. It also really doesn't make sense that disabling gossip after
drain would make a difference here, because it should already be stopped.

Drain will take care of stopping gossip, and it does a few tasks before
stopping gossip (it stops the batchlog, hints, auth, cache saver and a few
other things). I'm not sure why this causes a side effect when you restart
the node, but there should be no need to issue a disablegossip anyway - just
leave that out.
Interestingly, it seems that changing the order of steps I take during the
node restart resolves the problem. Instead of:

`nodetool disablebinary && nodetool disablethrift && *nodetool disablegossip* && nodetool drain && sudo service cassandra restart`

if I do:

`nodetool disablebinary && …`

Any other ideas? If I simply stop the node, there is no latency problem,
but once I start the node the problem appears. This happens consistently
for all nodes in the cluster.
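For reference, a sketch of a restart sequence without the explicit
disablegossip, following the advice elsewhere in this thread that drain
stops gossip itself. The exact reordered command is truncated above, so
this is an assumption about the sequence, not a record of what was run:

```sh
#!/bin/sh
set -e  # abort if any step fails

nodetool disablebinary   # stop accepting native-protocol (CQL) clients
nodetool disablethrift   # stop accepting thrift clients

# drain flushes memtables and itself stops batchlog, hints, gossip, etc.,
# so no separate `nodetool disablegossip` is issued here
nodetool drain

sudo service cassandra restart
```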
On Wed, Feb 7, 2018 at 11:36 AM, Mike Torra wrote:
No, I am not
On Wed, Feb 7, 2018 at 11:35 AM, Jeff Jirsa wrote:
Are you using internode ssl?
--
Jeff Jirsa
On Feb 7, 2018, at 8:24 AM, Mike Torra wrote:
Thanks for the feedback guys. That example data model was indeed
abbreviated - the real queries have the partition key in them. I am using
RF 3 on the keyspace, so I don't think a node being down would mean the key
I'm looking for would be unavailable. The load balancing policy of the
driver seems …
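Since the application is nodejs, the driver load balancing policy referred
to here would be configured on the Client. A sketch assuming the DataStax
cassandra-driver npm package; the contact points and datacenter name are
placeholders, not values taken from this thread:

```javascript
// Sketch: assumes the DataStax nodejs cassandra-driver package.
// Contact points and 'us-east' are placeholder values.
const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  contactPoints: ['10.0.0.1'],
  localDataCenter: 'us-east',
  policies: {
    // Token-aware routing over DC-aware round robin: requests are sent to
    // local replicas that own the key, so a node that is down or draining
    // is less likely to be picked as coordinator.
    loadBalancing: new cassandra.policies.loadBalancing.TokenAwarePolicy(
      new cassandra.policies.loadBalancing.DCAwareRoundRobinPolicy()
    )
  }
});
```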
Unless you abbreviated, your data model is questionable (SELECT without any
equality in the WHERE clause on the partition key will always cause a range
scan, which is super inefficient). Since you're doing LOCAL_ONE and a range
scan, timeouts sorta make sense - the owner of at least one range …
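To make the range-scan point concrete with a hypothetical table (the
poster's real schema isn't shown in the thread):

```cql
-- Hypothetical table for illustration; not the poster's actual schema
CREATE TABLE events (
    device_id text,
    ts        timestamp,
    payload   text,
    PRIMARY KEY (device_id, ts)   -- device_id is the partition key
);

-- Partition-key equality: hits a single replica set, efficient
SELECT payload FROM events WHERE device_id = 'abc123';

-- No partition-key equality: a scan over every token range in the cluster;
-- Cassandra rejects this unless ALLOW FILTERING is added
SELECT payload FROM events WHERE ts > '2018-02-01' ALLOW FILTERING;
```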
On 02/06/2018 12:58 PM, Mike Torra wrote:
>
> I restart a node like this:
>
> nodetool disablethrift && nodetool disablegossip && nodetool drain
> sudo service cassandra restart
Just a guess here - are you really only using thrift?
(i.e. `nodetool disablebinary`)
> When I do that, I very often …
Hi -

I am running a 29 node cluster spread over 4 DC's in EC2, using C* 3.11.1
on Ubuntu. Occasionally I have the need to restart nodes in the cluster,
but every time I do, I see errors and application (nodejs) timeouts.

I restart a node like this:

nodetool disablethrift && nodetool disablegossip && nodetool drain
sudo service cassandra restart