Hi,

We have two datacenters with 5 nodes each and a replication factor of 3. DC1 serves all live traffic; DC2 is only for disaster recovery and receives no direct traffic. The machines have 24 CPUs and 128 GB of RAM. On DC1, which serves the live traffic, we don't see any issues. On DC2, which has no live traffic, we see lots of OOM (Out of Memory) errors and nodes go down (only DC2 nodes).
We were using a 16 GB heap with G1GC in both DC1 and DC2. Because the DC2 nodes were hitting OOM, we increased the heap from 16 GB to 24 GB and then to 32 GB, but the DC2 nodes still go down with OOM, though not as frequently as they did with a 16 GB heap. The DC1 nodes are still on a 16 GB heap and none of them go down.

We are on open source 3.11.0 and we use materialized views. We see lots of pending hints on the DC2 nodes, and hint replay is very slow on DC2 nodes compared to DC1 nodes. Other than the heap sizes mentioned above, all configs are the same on all nodes in the cluster. We are running a JRE and can't collect a heap dump.

Any idea what the cause could be?

Currently disk_access_mode is not set, so it is auto in our environment. Would setting disk_access_mode to mmap_index_only help?

My question is: "*Why do DC2 nodes OOM when DC1 nodes don't?*"

Thanks
Surbhi