Re: Cassandra 3.0.18 went OOM several hours after joining a cluster

2019-11-06 Thread Reid Pinchback
Just food for thought.

Elevated read requests won’t result in escalating pending compactions, except 
in the corner case where the reads trigger additional write work, like for a 
repair or lurking tombstones deemed droppable.  For a sustained growth in 
pending compactions, that’s not looking like random tripping over corner cases. 
  All an elevated read request rate would do, if it weren’t for an increasing 
number of sstables, is cause you to churn the chunk cache. Reads would be 
slower due to the cache misses but the memory footprint wouldn’t be that 
different.


From: "Steinmaurer, Thomas" 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, November 6, 2019 at 2:43 PM
To: "user@cassandra.apache.org" 
Subject: RE: Cassandra 3.0.18 went OOM several hours after joining a cluster

Message from External Sender
Reid,

Thanks for the thoughts.

I agree with your last comment and I’m pretty convinced that the increasing 
number of SSTables is causing the issue, although I’m not sure whether 
compaction, read requests (after the node flipped from UJ to UN), or both are 
responsible. I tend more towards client read requests touching a high number 
of SSTables, which basically results in ~2 MB on-heap usage per 
BigTableReader instance, with ~5K such object instances on the heap.
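
(As a rough sanity check on those numbers, assuming they hold: 5,000 BigTableReader 
instances × ~2 MB each ≈ 10 GB retained on heap, which on its own would exhaust a 
typical 8 GB Cassandra heap.)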

The big question for us is why this starts to pop up with Cassandra 3.0 when we 
never saw it with 2.1 in more than 3 years of production usage.

To avoid double work, I will try to continue providing additional information / 
thoughts on the Cassandra ticket.

Regards,
Thomas

From: Reid Pinchback 
Sent: Wednesday, November 6, 2019 18:28
To: user@cassandra.apache.org
Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster

The other thing that comes to mind is that the increase in pending compactions 
suggests back pressure on compaction activity.  GC is only one possible source 
of that.  Between your throughput setting and how your disk I/O is set up, 
maybe that’s throttling you to a rate where the rate of added reasons for 
compactions > the rate of compactions completed.

In fact, the more that I think about it, I wonder about that a lot.

If you can’t keep up with compactions, then operations have to span more and 
more SSTables over time.  You’ll keep holding on to what you read, as you read 
more of them, until eventually…pop.
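
A quick way to check whether that is happening is to watch the pending compaction 
count over time. A minimal sketch (not from the ticket), in Python, assuming 
nodetool is on the PATH and that nodetool compactionstats prints a "pending tasks: N" 
line (output format varies a bit by version):

import re
import subprocess
import time

def pending_compactions():
    # Shell out to nodetool and pull the "pending tasks" count from the text output.
    out = subprocess.run(["nodetool", "compactionstats"],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"pending tasks:\s*(\d+)", out)
    return int(m.group(1)) if m else None

prev = None
while True:
    cur = pending_compactions()
    delta = None if prev is None or cur is None else cur - prev
    print(f"{time.strftime('%H:%M:%S')} pending={cur} delta={delta}")
    prev = cur
    time.sleep(60)  # one sample per minute; a persistently positive delta means compaction is falling behind

If the delta stays positive for long stretches, the compaction throughput setting 
(compaction_throughput_mb_per_sec, adjustable at runtime via nodetool 
setcompactionthroughput) and disk I/O are the first places to look.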


From: Reid Pinchback <rpinchb...@tripadvisor.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, November 6, 2019 at 12:11 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster

Message from External Sender
My first thought was that you were running into the merkle tree depth problem, 
but the details on the ticket don’t seem to confirm that.

It does look like eden is too small.   C* lives in Java’s GC pain point, a lot 
of medium-lifetime objects.  If you haven’t already done so, you’ll want to 
configure as many things to be off-heap as you can, but I’d definitely look at 
improving the ratio of eden to old gen, and see if you can get the young gen GC 
activity to be more successful at sweeping away the medium-lived objects.

All that really comes to mind is if you’re getting to a point where GC isn’t 
coping.  That can be hard to sometimes spot on metrics with coarse granularity. 
 Per-second metrics might show CPU cores getting pegged.

I’m not sure that GC tuning eliminates this problem, but if it isn’t being 
caused by that, GC tuning may at least improve the visibility of the underlying 
problem.
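
For the per-second granularity point, a minimal sketch of per-core sampling, 
assuming the third-party psutil package is installed (pip install psutil); this is 
purely illustrative, nothing here comes from the ticket:

import time
import psutil

while True:
    # cpu_percent(interval=1.0) blocks for ~1 s and returns utilisation per core.
    per_core = psutil.cpu_percent(interval=1.0, percpu=True)
    pegged = [i for i, pct in enumerate(per_core) if pct > 95.0]
    print(f"{time.strftime('%H:%M:%S')} cores over 95%: {pegged} all: {per_core}")

A core or two pinned near 100% in lockstep with GC pauses is exactly the kind of 
signal a coarse, minute-level average will hide.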

From: "Steinmaurer, Thomas" 
mailto:thomas.steinmau...@dynatrace.com>>
Reply-To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Date: Wednesday, November 6, 2019 at 11:27 AM
To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Subject: Cassandra 3.0.18 went OOM several hours after joining a cluster

Message from External Sender
Hello,

after moving from 2.1.18 to 3.0.18, we are facing OOM situations after several 
hours a node has successfully joined a cluster (via auto-bootstrap).

I have created the following ticket trying to describe the situation, including 
hprof / MAT screens: https://issues.apache.org/jira/browse/CASSANDRA-15400

Re: AWS instance stop and start with EBS

2019-11-06 Thread Rahul Reddy
Thanks Daemeon,

I will do that and post the results.
I found a JIRA in open state with a similar issue:
https://issues.apache.org/jira/browse/CASSANDRA-13984

On Wed, Nov 6, 2019 at 1:49 PM daemeon reiydelle  wrote:

> No connection timeouts? No tcp level retries? I am sorry truly sorry but
> you have exceeded my capability. I have never seen a java.io timeout with
> out either a session half open failure (no response) or multiple retries.
>
> I am out of my depth, so please feel free to ignore but, did you see the
> packets that are making the initial connection (which must have timed out)?
> Out of curiosity, a netstat -arn must be showing bad packets, timeouts,
> etc. To see progress, create a simple shell script that dumps date, dumps
> netstat, sleeps 100 seconds, repeated. During that window stop, wait 10
> seconds, restart the remove node.
>
> <==>
> Made weak by time and fate, but strong in will,
> To strive, to seek, to find, and not to yield.
> Ulysses - A. Lord Tennyson
>
> *Daemeon C.M. Reiydelle*
>
> *email: daeme...@gmail.com *
> *San Francisco 1.415.501.0198/Skype daemeon.c.m.reiydelle*
>
>
>
> On Wed, Nov 6, 2019 at 9:11 AM Rahul Reddy 
> wrote:
>
>> Thank you.
>>
>> I have stopped instance in east. i see that all other instances can
>> gossip to that instance and only one instance in west having issues
>> gossiping to that node.  when i enable debug mode i see below on the west
>> node
>>
>> i see bellow messages from 16:32 to 16:47
>> DEBUG [RMI TCP Connection(272)-127.0.0.1] 2019-11-06 16:44:50,
>> 417 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a
>> response from everybody:
>> 424 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a
>> response from everybody:
>>
>> later i see timeout
>>
>> DEBUG [MessagingService-Outgoing-/eastip-Gossip] 2019-11-06 16:47:04,831
>> OutboundTcpConnection.java:350 - Error writing to /eastip
>> java.io.IOException: Connection timed out
>>
>> then  INFO  [GossipStage:1] 2019-11-06 16:47:05,792 StorageService.j
>> ava:2289 - Node /eastip state jump to NORMAL
>>
>> DEBUG [GossipStage:1] 2019-11-06 16:47:06,244 MigrationManager
>> .java:99 - Not pulling schema from /eastip, because sche
>> ma versions match: local/real=cdbb639b-1675-31b3-8a0d-84aca18e
>> 86bf, local/compatible=49bf1daa-d585-38e0-a72b-b36ce82da9cb, r
>> emote=cdbb639b-1675-31b3-8a0d-84aca18e86bf
>>
>> i tried running some tcpdump during that time i dont see any packet loss
>> during that time.  still unsure why east instance which was stopped and
>> started unreachable to west node almost for 15 minutes.
>>
>>
>> On Tue, Nov 5, 2019 at 10:14 PM daemeon reiydelle 
>> wrote:
>>
>>> 10 minutes is 600 seconds, and there are several timeouts that are set
>>> to that, including the data center timeout as I recall.
>>>
>>> You may be forced to tcpdump the interface(s) to see where the chatter
>>> is. Out of curiosity, when you restart the node, have you snapped the jvm's
>>> memory to see if e.g. heap is even in use?
>>>
>>>
>>> On Tue, Nov 5, 2019 at 7:03 PM Rahul Reddy 
>>> wrote:
>>>
 Thanks Ben,
 Before stoping the ec2 I did run nodetool drain .so i ruled it out and
 system.log also doesn't show commitlogs being applied.





 On Tue, Nov 5, 2019, 7:51 PM Ben Slater 
 wrote:

> The logs between first start and handshaking should give you a
> clue but my first guess would be replaying commit logs.
>
> Cheers
> Ben
>
> ---
>
>
> *Ben Slater**Chief Product Officer*
>
> 
>
> 
> 
> 
>
> Read our latest technical blog posts here
> .
>
> This email has been sent on behalf of Instaclustr Pty. Limited
> (Australia) and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not 
> copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>
>
> On Wed, 6 Nov 2019 at 04:36, Rahul Reddy 
> wrote:
>
>> I can reproduce the issue.
>>
>> I did drain Cassandra node then stop and started Cassandra instance .
>> Cassandra instance comes up but other nodes will be in DN state around 10
>> minutes.
>>
>> I don't see error in the systemlog
>>
>> DN  xx.xx.xx.59   420.85 MiB  256  48.2% id  2
>> UN  xx.xx.xx.30   432.14 MiB  256  50.0% id  0
>> UN  xx.xx.xx.79   447.33 MiB  256  51.1% id  4
>> DN  xx.xx.xx.144  452.59 MiB  256  51.6% id  1
>> DN  xx.xx.xx.19   431.7 MiB  256  50.1% 

RE: Cassandra 3.0.18 went OOM several hours after joining a cluster

2019-11-06 Thread Steinmaurer, Thomas
Reid,

Thanks for the thoughts.

I agree with your last comment and I’m pretty convinced that the increasing 
number of SSTables is causing the issue, although I’m not sure whether 
compaction, read requests (after the node flipped from UJ to UN), or both are 
responsible. I tend more towards client read requests touching a high number 
of SSTables, which basically results in ~2 MB on-heap usage per 
BigTableReader instance, with ~5K such object instances on the heap.

The big question for us is why this starts to pop up with Cassandra 3.0 when we 
never saw it with 2.1 in more than 3 years of production usage.

To avoid double work, I will try to continue providing additional information / 
thoughts on the Cassandra ticket.

Regards,
Thomas

From: Reid Pinchback 
Sent: Wednesday, November 6, 2019 18:28
To: user@cassandra.apache.org
Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster

The other thing that comes to mind is that the increase in pending compactions 
suggests back pressure on compaction activity.  GC is only one possible source 
of that.  Between your throughput setting and how your disk I/O is set up, 
maybe that’s throttling you to a rate where the rate of added reasons for 
compactions > the rate of compactions completed.

In fact, the more that I think about it, I wonder about that a lot.

If you can’t keep up with compactions, then operations have to span more and 
more SSTables over time.  You’ll keep holding on to what you read, as you read 
more of them, until eventually…pop.


From: Reid Pinchback <rpinchb...@tripadvisor.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, November 6, 2019 at 12:11 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster

Message from External Sender
My first thought was that you were running into the merkle tree depth problem, 
but the details on the ticket don’t seem to confirm that.

It does look like eden is too small.   C* lives in Java’s GC pain point, a lot 
of medium-lifetime objects.  If you haven’t already done so, you’ll want to 
configure as many things to be off-heap as you can, but I’d definitely look at 
improving the ratio of eden to old gen, and see if you can get the young gen GC 
activity to be more successful at sweeping away the medium-lived objects.

All that really comes to mind is if you’re getting to a point where GC isn’t 
coping.  That can be hard to sometimes spot on metrics with coarse granularity. 
 Per-second metrics might show CPU cores getting pegged.

I’m not sure that GC tuning eliminates this problem, but if it isn’t being 
caused by that, GC tuning may at least improve the visibility of the underlying 
problem.

From: "Steinmaurer, Thomas" 
mailto:thomas.steinmau...@dynatrace.com>>
Reply-To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Date: Wednesday, November 6, 2019 at 11:27 AM
To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Subject: Cassandra 3.0.18 went OOM several hours after joining a cluster

Message from External Sender
Hello,

after moving from 2.1.18 to 3.0.18, we are facing OOM situations after several 
hours a node has successfully joined a cluster (via auto-bootstrap).

I have created the following ticket trying to describe the situation, including 
hprof / MAT screens: 
https://issues.apache.org/jira/browse/CASSANDRA-15400

Would be great if someone could have a look.

Thanks a lot.

Thomas
The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313

Re: AWS instance stop and start with EBS

2019-11-06 Thread daemeon reiydelle
No connection timeouts? No TCP-level retries? I am sorry, truly sorry, but
you have exceeded my capability. I have never seen a java.io timeout without
either a session half-open failure (no response) or multiple retries.

I am out of my depth, so please feel free to ignore, but did you see the
packets that are making the initial connection (which must have timed out)?
Out of curiosity, a netstat -arn should be showing bad packets, timeouts,
etc. To see progress, create a simple shell script that dumps the date, dumps
netstat, sleeps 100 seconds, and repeats. During that window, stop, wait 10
seconds, and restart the remote node.
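
A minimal sketch of that polling loop (in Python rather than shell; adapt as 
needed), assuming netstat is installed and reusing the -arn flags suggested above:

import subprocess
import time

with open("netstat_watch.log", "a") as log:
    while True:
        # Timestamp each sample so it can be lined up with the Cassandra logs later.
        log.write(time.strftime("=== %Y-%m-%d %H:%M:%S ===\n"))
        out = subprocess.run(["netstat", "-arn"], capture_output=True, text=True).stdout
        log.write(out + "\n")
        log.flush()
        time.sleep(100)  # the 100-second window suggested above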

<==>
Made weak by time and fate, but strong in will,
To strive, to seek, to find, and not to yield.
Ulysses - A. Lord Tennyson

*Daemeon C.M. Reiydelle*

*email: daeme...@gmail.com *
*San Francisco 1.415.501.0198/Skype daemeon.c.m.reiydelle*



On Wed, Nov 6, 2019 at 9:11 AM Rahul Reddy  wrote:

> Thank you.
>
> I have stopped instance in east. i see that all other instances can gossip
> to that instance and only one instance in west having issues gossiping to
> that node.  when i enable debug mode i see below on the west node
>
> i see bellow messages from 16:32 to 16:47
> DEBUG [RMI TCP Connection(272)-127.0.0.1] 2019-11-06 16:44:50,
> 417 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response
> from everybody:
> 424 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response
> from everybody:
>
> later i see timeout
>
> DEBUG [MessagingService-Outgoing-/eastip-Gossip] 2019-11-06 16:47:04,831
> OutboundTcpConnection.java:350 - Error writing to /eastip
> java.io.IOException: Connection timed out
>
> then  INFO  [GossipStage:1] 2019-11-06 16:47:05,792 StorageService.j
> ava:2289 - Node /eastip state jump to NORMAL
>
> DEBUG [GossipStage:1] 2019-11-06 16:47:06,244 MigrationManager
> .java:99 - Not pulling schema from /eastip, because sche
> ma versions match: local/real=cdbb639b-1675-31b3-8a0d-84aca18e
> 86bf, local/compatible=49bf1daa-d585-38e0-a72b-b36ce82da9cb, r
> emote=cdbb639b-1675-31b3-8a0d-84aca18e86bf
>
> i tried running some tcpdump during that time i dont see any packet loss
> during that time.  still unsure why east instance which was stopped and
> started unreachable to west node almost for 15 minutes.
>
>
> On Tue, Nov 5, 2019 at 10:14 PM daemeon reiydelle 
> wrote:
>
>> 10 minutes is 600 seconds, and there are several timeouts that are set to
>> that, including the data center timeout as I recall.
>>
>> You may be forced to tcpdump the interface(s) to see where the chatter
>> is. Out of curiosity, when you restart the node, have you snapped the jvm's
>> memory to see if e.g. heap is even in use?
>>
>>
>> On Tue, Nov 5, 2019 at 7:03 PM Rahul Reddy 
>> wrote:
>>
>>> Thanks Ben,
>>> Before stoping the ec2 I did run nodetool drain .so i ruled it out and
>>> system.log also doesn't show commitlogs being applied.
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Nov 5, 2019, 7:51 PM Ben Slater 
>>> wrote:
>>>
 The logs between first start and handshaking should give you a clue but
 my first guess would be replaying commit logs.

 Cheers
 Ben

 ---


 *Ben Slater**Chief Product Officer*

 

 
 
 

 Read our latest technical blog posts here
 .



 On Wed, 6 Nov 2019 at 04:36, Rahul Reddy 
 wrote:

> I can reproduce the issue.
>
> I did drain Cassandra node then stop and started Cassandra instance .
> Cassandra instance comes up but other nodes will be in DN state around 10
> minutes.
>
> I don't see error in the systemlog
>
> DN  xx.xx.xx.59   420.85 MiB  256  48.2% id  2
> UN  xx.xx.xx.30   432.14 MiB  256  50.0% id  0
> UN  xx.xx.xx.79   447.33 MiB  256  51.1% id  4
> DN  xx.xx.xx.144  452.59 MiB  256  51.6% id  1
> DN  xx.xx.xx.19   431.7 MiB  256  50.1% id  5
> UN  xx.xx.xx.6421.79 MiB  256  48.9%
>
> when i do nodetool status 3 nodes still showing down. and i dont see
> errors in system.log
>
> and after 10 mins it shows the other node is up as well.
>
>
> INFO  [HANDSHAKE-/10.72.100.156] 2019-11-05 15:05:09,133
> OutboundTcpConnection.java:561 - Handshaking 

Re: AWS instance stop and start with EBS

2019-11-06 Thread Rahul Reddy
And this is on the node which was not stopped; it was active and didn't have
issues with TCP before. Only after the east node was stopped and started did it
start seeing errors.  Please let me know if anything else needs to be checked.

On Wed, Nov 6, 2019, 12:18 PM Reid Pinchback 
wrote:

> Almost 15 minutes, that sounds suspiciously like blocking on a default TCP
> socket timeout.
>
>
>
> *From: *Rahul Reddy 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Wednesday, November 6, 2019 at 12:12 PM
> *To: *"user@cassandra.apache.org" 
> *Subject: *Re: AWS instance stop and start with EBS
>
>
>
> *Message from External Sender*
>
> Thank you.
>
> I have stopped instance in east. i see that all other instances can gossip
> to that instance and only one instance in west having issues gossiping to
> that node.  when i enable debug mode i see below on the west node
>
>
>
> i see bellow messages from 16:32 to 16:47
>
> DEBUG [RMI TCP Connection(272)-127.0.0.1] 2019-11-06 16:44:50,
> 417 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response
> from everybody:
>
> 424 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response
> from everybody:
>
>
>
> later i see timeout
>
> DEBUG [MessagingService-Outgoing-/eastip-Gossip] 2019-11-06 16:47:04,831
> OutboundTcpConnection.java:350 - Error writing to /eastip
> java.io.IOException: Connection timed out
>
>
>
> then  INFO  [GossipStage:1] 2019-11-06 16:47:05,792 StorageService.j
>
> ava:2289 - Node /eastip state jump to NORMAL
>
>
>
> DEBUG [GossipStage:1] 2019-11-06 16:47:06,244 MigrationManager
> .java:99 - Not pulling schema from /eastip, because sche
> ma versions match: local/real=cdbb639b-1675-31b3-8a0d-84aca18e
> 86bf, local/compatible=49bf1daa-d585-38e0-a72b-b36ce82da9cb, r
> emote=cdbb639b-1675-31b3-8a0d-84aca18e86bf
>
>
> i tried running some tcpdump during that time i dont see any packet loss
> during that time.  still unsure why east instance which was stopped and
> started unreachable to west node almost for 15 minutes.
>
>
>
>
>
> On Tue, Nov 5, 2019 at 10:14 PM daemeon reiydelle 
> wrote:
>
> 10 minutes is 600 seconds, and there are several timeouts that are set to
> that, including the data center timeout as I recall.
>
>
>
> You may be forced to tcpdump the interface(s) to see where the chatter is.
> Out of curiosity, when you restart the node, have you snapped the jvm's
> memory to see if e.g. heap is even in use?
>
>
>
>
>
> On Tue, Nov 5, 2019 at 7:03 PM Rahul Reddy 
> wrote:
>
> Thanks Ben,
>
> Before stoping the ec2 I did run nodetool drain .so i ruled it out and
> system.log also doesn't show commitlogs being applied.
>
>
>
>
>
>
>
>
>
> On Tue, Nov 5, 2019, 7:51 PM Ben Slater 
> wrote:
>
> The logs between first start and handshaking should give you a clue but my
> first guess would be replaying commit logs.
>
>
>
> Cheers
>
> Ben
>
> ---
>
> *Ben Slater*
> *Chief Product Officer*
>
>
> Read our latest technical blog posts here
> 
> .
>
>
>
>
>
>
> On Wed, 6 Nov 2019 at 04:36, Rahul Reddy  wrote:
>
> I can reproduce the issue.
>
>
>
> I did drain Cassandra node then 

Re: Cassandra 3.0.18 went OOM several hours after joining a cluster

2019-11-06 Thread Reid Pinchback
The other thing that comes to mind is that the increase in pending compactions 
suggests back pressure on compaction activity.  GC is only one possible source 
of that.  Between your throughput setting and how your disk I/O is set up, 
maybe that’s throttling you to a rate where the rate of added reasons for 
compactions > the rate of compactions completed.

In fact, the more that I think about it, I wonder about that a lot.

If you can’t keep up with compactions, then operations have to span more and 
more SSTables over time.  You’ll keep holding on to what you read, as you read 
more of them, until eventually…pop.


From: Reid Pinchback 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, November 6, 2019 at 12:11 PM
To: "user@cassandra.apache.org" 
Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster

Message from External Sender
My first thought was that you were running into the merkle tree depth problem, 
but the details on the ticket don’t seem to confirm that.

It does look like eden is too small.   C* lives in Java’s GC pain point, a lot 
of medium-lifetime objects.  If you haven’t already done so, you’ll want to 
configure as many things to be off-heap as you can, but I’d definitely look at 
improving the ratio of eden to old gen, and see if you can get the young gen GC 
activity to be more successful at sweeping away the medium-lived objects.

All that really comes to mind is if you’re getting to a point where GC isn’t 
coping.  That can be hard to sometimes spot on metrics with coarse granularity. 
 Per-second metrics might show CPU cores getting pegged.

I’m not sure that GC tuning eliminates this problem, but if it isn’t being 
caused by that, GC tuning may at least improve the visibility of the underlying 
problem.

From: "Steinmaurer, Thomas" 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, November 6, 2019 at 11:27 AM
To: "user@cassandra.apache.org" 
Subject: Cassandra 3.0.18 went OOM several hours after joining a cluster

Message from External Sender
Hello,

after moving from 2.1.18 to 3.0.18, we are facing OOM situations after several 
hours a node has successfully joined a cluster (via auto-bootstrap).

I have created the following ticket trying to describe the situation, including 
hprof / MAT screens: 
https://issues.apache.org/jira/browse/CASSANDRA-15400

Would be great if someone could have a look.

Thanks a lot.

Thomas


Re: AWS instance stop and start with EBS

2019-11-06 Thread Reid Pinchback
Almost 15 minutes, that sounds suspiciously like blocking on a default TCP 
socket timeout.
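
That would line up with the usual Linux defaults (an assumption about the OS here, 
not something stated in the thread): with net.ipv4.tcp_retries2 = 15, a minimum RTO 
of roughly 200 ms, and the RTO capped at 120 s, an established connection with 
unacknowledged data is only torn down after about 15 minutes. A back-of-the-envelope 
sketch:

# Approximate time before the kernel gives up on an established connection,
# assuming tcp_retries2 = 15, initial RTO ~200 ms, RTO doubling and capped at 120 s.
rto, total = 0.2, 0.0
for _ in range(15 + 1):        # 15 retransmissions plus the final wait
    total += min(rto, 120.0)
    rto *= 2
print(f"~{total:.0f} s (~{total / 60:.1f} min)")  # roughly 925 s, i.e. just over 15 minutes

That is close to the window being observed; reducing tcp_retries2 or tightening the 
TCP keepalive settings are the usual knobs if dead peers need to be detected sooner.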

From: Rahul Reddy 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, November 6, 2019 at 12:12 PM
To: "user@cassandra.apache.org" 
Subject: Re: AWS instance stop and start with EBS

Message from External Sender
Thank you.
I have stopped instance in east. i see that all other instances can gossip to 
that instance and only one instance in west having issues gossiping to that 
node.  when i enable debug mode i see below on the west node

i see bellow messages from 16:32 to 16:47
DEBUG [RMI TCP Connection(272)-127.0.0.1] 2019-11-06 16:44:50,
417 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response from 
everybody:
424 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response from 
everybody:

later i see timeout
DEBUG [MessagingService-Outgoing-/eastip-Gossip] 2019-11-06 16:47:04,831 
OutboundTcpConnection.java:350 - Error writing to /eastip
java.io.IOException: Connection timed out

then  INFO  [GossipStage:1] 2019-11-06 16:47:05,792 StorageService.j
ava:2289 - Node /eastip state jump to NORMAL

DEBUG [GossipStage:1] 2019-11-06 16:47:06,244 MigrationManager
.java:99 - Not pulling schema from /eastip, because sche
ma versions match: local/real=cdbb639b-1675-31b3-8a0d-84aca18e
86bf, local/compatible=49bf1daa-d585-38e0-a72b-b36ce82da9cb, r
emote=cdbb639b-1675-31b3-8a0d-84aca18e86bf

i tried running some tcpdump during that time i dont see any packet loss during 
that time.  still unsure why east instance which was stopped and started 
unreachable to west node almost for 15 minutes.


On Tue, Nov 5, 2019 at 10:14 PM daemeon reiydelle <daeme...@gmail.com> wrote:
10 minutes is 600 seconds, and there are several timeouts that are set to that, 
including the data center timeout as I recall.

You may be forced to tcpdump the interface(s) to see where the chatter is. Out 
of curiosity, when you restart the node, have you snapped the jvm's memory to 
see if e.g. heap is even in use?


On Tue, Nov 5, 2019 at 7:03 PM Rahul Reddy <rahulreddy1...@gmail.com> wrote:
Thanks Ben,
Before stoping the ec2 I did run nodetool drain .so i ruled it out and 
system.log also doesn't show commitlogs being applied.




On Tue, Nov 5, 2019, 7:51 PM Ben Slater <ben.sla...@instaclustr.com> wrote:
The logs between first start and handshaking should give you a clue but my 
first guess would be replaying commit logs.

Cheers
Ben

---

Ben Slater
Chief Product Officer


Read our latest technical blog posts 
here.



On Wed, 6 Nov 2019 at 04:36, Rahul Reddy <rahulreddy1...@gmail.com> wrote:
I can reproduce the issue.

I did drain Cassandra node then stop and started Cassandra instance . Cassandra 
instance comes up but other nodes will be in DN state around 10 minutes.

I don't see error in the systemlog

DN  xx.xx.xx.59   420.85 MiB  256  48.2% id  2
UN  xx.xx.xx.30   432.14 MiB  256  50.0% id  0
UN  xx.xx.xx.79   447.33 MiB  256  51.1% id  4
DN  xx.xx.xx.144  452.59 MiB  256  51.6%   

Re: Cassandra 3.0.18 went OOM several hours after joining a cluster

2019-11-06 Thread Reid Pinchback
My first thought was that you were running into the merkle tree depth problem, 
but the details on the ticket don’t seem to confirm that.

It does look like eden is too small.   C* lives in Java’s GC pain point, a lot 
of medium-lifetime objects.  If you haven’t already done so, you’ll want to 
configure as many things to be off-heap as you can, but I’d definitely look at 
improving the ratio of eden to old gen, and see if you can get the young gen GC 
activity to be more successful at sweeping away the medium-lived objects.

All that really comes to mind is if you’re getting to a point where GC isn’t 
coping.  That can be hard to sometimes spot on metrics with coarse granularity. 
 Per-second metrics might show CPU cores getting pegged.

I’m not sure that GC tuning eliminates this problem, but if it isn’t being 
caused by that, GC tuning may at least improve the visibility of the underlying 
problem.

From: "Steinmaurer, Thomas" 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, November 6, 2019 at 11:27 AM
To: "user@cassandra.apache.org" 
Subject: Cassandra 3.0.18 went OOM several hours after joining a cluster

Message from External Sender
Hello,

after moving from 2.1.18 to 3.0.18, we are facing OOM situations after several 
hours a node has successfully joined a cluster (via auto-bootstrap).

I have created the following ticket trying to describe the situation, including 
hprof / MAT screens: 
https://issues.apache.org/jira/browse/CASSANDRA-15400

Would be great if someone could have a look.

Thanks a lot.

Thomas


Re: AWS instance stop and start with EBS

2019-11-06 Thread Rahul Reddy
Thank you.

I have stopped the instance in east. I see that all other instances can gossip
to that instance and only one instance in west is having issues gossiping to
that node.  When I enable debug mode I see the below on the west node:

I see the below messages from 16:32 to 16:47:
DEBUG [RMI TCP Connection(272)-127.0.0.1] 2019-11-06 16:44:50,
417 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response
from everybody:
424 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response
from everybody:

Later I see a timeout:

DEBUG [MessagingService-Outgoing-/eastip-Gossip] 2019-11-06 16:47:04,831
OutboundTcpConnection.java:350 - Error writing to /eastip
java.io.IOException: Connection timed out

then  INFO  [GossipStage:1] 2019-11-06 16:47:05,792 StorageService.j
ava:2289 - Node /eastip state jump to NORMAL

DEBUG [GossipStage:1] 2019-11-06 16:47:06,244 MigrationManager
.java:99 - Not pulling schema from /eastip, because sche
ma versions match: local/real=cdbb639b-1675-31b3-8a0d-84aca18e
86bf, local/compatible=49bf1daa-d585-38e0-a72b-b36ce82da9cb, r
emote=cdbb639b-1675-31b3-8a0d-84aca18e86bf

I tried running some tcpdump during that time and I don't see any packet loss.
I'm still unsure why the east instance, which was stopped and started, was
unreachable from the west node for almost 15 minutes.


On Tue, Nov 5, 2019 at 10:14 PM daemeon reiydelle 
wrote:

> 10 minutes is 600 seconds, and there are several timeouts that are set to
> that, including the data center timeout as I recall.
>
> You may be forced to tcpdump the interface(s) to see where the chatter is.
> Out of curiosity, when you restart the node, have you snapped the jvm's
> memory to see if e.g. heap is even in use?
>
>
> On Tue, Nov 5, 2019 at 7:03 PM Rahul Reddy 
> wrote:
>
>> Thanks Ben,
>> Before stoping the ec2 I did run nodetool drain .so i ruled it out and
>> system.log also doesn't show commitlogs being applied.
>>
>>
>>
>>
>>
>> On Tue, Nov 5, 2019, 7:51 PM Ben Slater 
>> wrote:
>>
>>> The logs between first start and handshaking should give you a clue but
>>> my first guess would be replaying commit logs.
>>>
>>> Cheers
>>> Ben
>>>
>>> ---
>>>
>>>
>>> *Ben Slater**Chief Product Officer*
>>>
>>> 
>>>
>>> 
>>> 
>>> 
>>>
>>> Read our latest technical blog posts here
>>> .
>>>
>>>
>>>
>>> On Wed, 6 Nov 2019 at 04:36, Rahul Reddy 
>>> wrote:
>>>
 I can reproduce the issue.

 I did drain Cassandra node then stop and started Cassandra instance .
 Cassandra instance comes up but other nodes will be in DN state around 10
 minutes.

 I don't see error in the systemlog

 DN  xx.xx.xx.59   420.85 MiB  256  48.2% id  2
 UN  xx.xx.xx.30   432.14 MiB  256  50.0% id  0
 UN  xx.xx.xx.79   447.33 MiB  256  51.1% id  4
 DN  xx.xx.xx.144  452.59 MiB  256  51.6% id  1
 DN  xx.xx.xx.19   431.7 MiB  256  50.1% id  5
 UN  xx.xx.xx.6421.79 MiB  256  48.9%

 When I do nodetool status, 3 nodes are still showing down, and I don't see
 errors in system.log.

 And after 10 minutes it shows the other node is up as well.


 INFO  [HANDSHAKE-/10.72.100.156] 2019-11-05 15:05:09,133
 OutboundTcpConnection.java:561 - Handshaking version with /stopandstarted
 node
 INFO  [RequestResponseStage-7] 2019-11-05 15:16:27,166
 Gossiper.java:1019 - InetAddress /nodewhichitwasshowing down is now UP

 What is causing the delay of 10 minutes before it can say that the node is
 reachable?

 On Wed, Oct 30, 2019, 8:37 AM Rahul Reddy 
 wrote:

> And also, an AWS EC2 stop and start comes back as a new instance with the
> same IP, and all our file systems are on EBS and mounted fine. Does the new
> instance coming up with the same IP cause any gossip issues?
>
> On Tue, Oct 29, 2019, 6:16 PM Rahul Reddy 
> wrote:
>
>> Thanks Alex. We have 6 nodes in each DC with RF=3 and CL LOCAL_QUORUM,
>> and we stopped and started only one instance at a time. Though
>> nodetool status says all nodes are UN and system.log says Cassandra started
>> and is listening, the JMX exporter shows the instance stayed down longer. How
>> do we determine what made Cassandra unavailable when the log says it
>> started and is listening?
>>
>> On Tue, 

Medusa : a new OSS backup/restore tool for Apache Cassandra

2019-11-06 Thread Alexander Dejanovski
Hi folks,

I'm happy to announce that Spotify and TLP have been collaborating to
create and open source a new backup and restore tool for Apache Cassandra :
https://github.com/spotify/cassandra-medusa
It is released under the Apache 2.0 license.

It can perform full and differential backups, in-place restores (same
cluster), and remote restores (to a different cluster), whether or not the
topologies match.
More details in our latest blog post :
https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html

Hope you'll enjoy using it,

-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Cassandra 3.0.18 went OOM several hours after joining a cluster

2019-11-06 Thread Steinmaurer, Thomas
Hello,

after moving from 2.1.18 to 3.0.18, we are facing OOM situations after several 
hours a node has successfully joined a cluster (via auto-bootstrap).

I have created the following ticket trying to describe the situation, including 
hprof / MAT screens: https://issues.apache.org/jira/browse/CASSANDRA-15400

Would be great if someone could have a look.

Thanks a lot.

Thomas
The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313