;user@cassandra.apache.org"
Cc: Reid Pinchback
Subject: Re: OOM only on one datacenter nodes
Message from External Sender
We are using the JRE and not the JDK, hence we are not able to take a heap dump.
On Sun, 5 Apr 2020 at 19:21, Jeff Jirsa <jji...@gmail.com> wrote:
Set the jvm flags to
e memory.
As the problem is only happening in DC2, then there has to be a thing that is
true in DC2 that isn’t true in DC1. A difference in hardware, a difference in
O/S version, a difference in networking config or physical infrastructure, a
difference in client-triggered activity, or a difference in how repairs are
handled. Somewhere, there is a difference. I’d start with focusing on that.
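The comparison suggested above can be sketched as a simple diff of per-node settings. This is a minimal Python sketch, assuming each node's settings have already been collected into a flat dictionary by some other means; the function name, keys, and values are hypothetical and are not taken from this cluster.

```python
def diff_settings(dc1, dc2):
    """Return {key: (dc1_value, dc2_value)} for every key whose value differs.

    A key that is missing on one side is reported with None for that side.
    """
    differences = {}
    for key in sorted(set(dc1) | set(dc2)):
        left, right = dc1.get(key), dc2.get(key)
        if left != right:
            differences[key] = (left, right)
    return differences


# Hypothetical example values, for illustration only.
dc1_node = {"os_version": "7.6", "heap_size": "31G", "repair_schedule": "weekly"}
dc2_node = {"os_version": "7.4", "heap_size": "31G"}

for key, (left, right) in diff_settings(dc1_node, dc2_node).items():
    print(f"{key}: DC1={left!r} DC2={right!r}")
```

Anything this reports is only a candidate for the difference; it still has to be checked against when the OOMs started.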
From: Erick Ramirez
Reply-To: "user@cassandra.apache.org"
Date: Saturday, April 4, 2020 at 8:28 PM
To: "user@cassandra.apache.org"
Subject: Re: OOM only on one datacenter nodes
With a lack of heapdump for you to analyse, my hypothesis is that your DC2
nodes are taking on traffic (from some client somewhere) but you're just
not aware of it. The hints replay is just a side-effect of the nodes
getting overloaded.
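One way to look for unexpected client connections on a DC2 node is sketched below. It is not taken from this thread. It assumes you have captured the text output of `ss -tn` on the node and that clients connect on Cassandra's default native transport port, 9042 (adjust `port` if `native_transport_port` was changed). The function name and the sample output are made up for illustration.

```python
from collections import Counter


def count_native_clients(ss_output, port=9042):
    """Count established TCP connections per peer IP to the given local port."""
    peers = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue  # skips the header, blank lines and non-established sockets
        # Columns 4 and 5 of `ss -tn` are local and peer "address:port".
        _, _, local_port = fields[3].rpartition(":")
        peer_host, _, _ = fields[4].rpartition(":")
        if local_port == str(port):
            peers[peer_host] += 1
    return peers


# Made-up sample output, for illustration only.
sample = """\
State  Recv-Q Send-Q Local Address:Port  Peer Address:Port
ESTAB  0      0      10.0.2.11:9042      10.0.9.50:51234
ESTAB  0      0      10.0.2.11:9042      10.0.9.50:51240
ESTAB  0      0      10.0.2.11:7000      10.0.1.12:40112
"""

for peer, connections in count_native_clients(sample).items():
    print(peer, connections)
```

A peer that shows up here and is not a known monitoring or admin host would be worth tracing back to its application.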
To rule out my hypothesis in the first instance, my
Hi,
We have two datacenters with 5 nodes each and a replication factor of 3.
DC1 takes the live traffic; DC2 is just for disaster recovery and receives
no direct traffic.
We are using machines with 24 CPUs and 128 GB RAM.
For DC1, where we have live traffic, we don't see any issue; however, for
DC2