OOM only on nodes in one datacenter

Surbhi Gupta Sat, 04 Apr 2020 12:55:17 -0700

Hi,

We have two datacenters with 5 nodes each and a replication factor of 3.
DC1 serves our live traffic; DC2 is only for disaster recovery and receives
no direct traffic.
We are using machines with 24 CPUs and 128GB RAM.
On DC1, where we have live traffic, we don't see any issues. On DC2, where
we don't have live traffic, we see lots of OOM (Out of Memory) errors and
nodes go down (only DC2 nodes).


We were using a 16GB heap with G1GC in both DC1 and DC2.
Because DC2 nodes were hitting OOM, we increased the heap there from 16GB
to 24GB and then to 32GB, but DC2 nodes still go down with OOM, though not
as frequently as they did when the heap was 16GB.
DC1 nodes are still on a 16GB heap and none of them go down.

We are on open source Cassandra 3.11.0.
We use materialized views.
We see lots of hints pending on DC2 nodes, and hint replay is very slow
on DC2 nodes compared to DC1 nodes.
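
A minimal sketch for quantifying the pending hints on a node, so the
backlog on DC2 and DC1 can be compared with numbers. It assumes hints are
stored as *.hints files in the directory set by hints_directory in
cassandra.yaml; the path /var/lib/cassandra/hints is an assumed default
and hint_backlog is a hypothetical helper name, not part of Cassandra.

```python
from pathlib import Path


def hint_backlog(hints_dir):
    """Return (file_count, total_bytes) for *.hints files in hints_dir.

    Assumption: pending hints live as *.hints files directly in this
    directory; other files (e.g. checksums) are ignored.
    """
    files = [p for p in Path(hints_dir).glob("*.hints") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Example call (path is an assumption, adjust to your hints_directory):
# count, size = hint_backlog("/var/lib/cassandra/hints")
# print(f"{count} hint files, {size / 1024**2:.1f} MiB pending")
```

Running this on one node per DC at intervals would show whether the DC2
backlog is growing or draining.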

Other than the heap sizes mentioned above, all the configs are the same on
all nodes in the cluster.
We are running a JRE and can't collect a heap dump.

Any idea what the cause could be?

Currently disk_access_mode is not set, hence it is auto in our env. Would
setting disk_access_mode to mmap_index_only help?

My question is "*Why do DC2 nodes OOM while DC1 nodes don't?*"

Thanks
Surbhi
