Hi,

We have a Luminous cluster that was recently upgraded from Hammer --> Jewel -->
Luminous 12.2.8. Since the upgrade, a few nodes have been running out of
memory and dying, and the logs show the OOM killer. We did not have this
issue before the upgrade. The only difference we can find is that the
unaffected nodes are R730xd and the ones with the memory leak are R740xd.
The hardware vendor doesn't see anything wrong with the hardware.
From the Ceph end we are not seeing any issues with running the cluster;
the only issue is the memory leak. Right now we are proactively rebooting
the nodes in a timely manner to avoid crashes. On one R740xd node we set
the weight of all the OSDs to 0.0, and there is no memory leak there. Any
pointers to fix the issue would be helpful.

Thanks,
*Pardhiv Karri*
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com