[ https://issues.apache.org/jira/browse/HDDS-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Krishna Kumar Asawa resolved HDDS-6345.
---------------------------------------
Resolution: Information Provided
As per the Slack discussion linked above, the issue is resolved with proper config/usage.
> OM always runs OOM in Kubernetes
> ---------------------------------
>
> Key: HDDS-6345
> URL: https://issues.apache.org/jira/browse/HDDS-6345
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Shawn
> Assignee: Ritesh Shukla
> Priority: Major
>
> I deployed Ozone 1.21 to Kubernetes with security enabled and with OM HA and
> SCM HA. However, one of the OMs always gets restarted by Kubernetes because of
> OOM. Even after I assigned 300GB of memory, the OM still keeps restarting
> due to OOM.
>
> After analysis, we found the OOM was caused by RocksDB. When the OM gets
> restarted, it first tries to open RocksDB. During this time, RocksDB
> tries to do compaction, which eventually runs out of memory. So there are
> three questions:
>
> 1. Why did the OM get into this state?
> 2. Why does RocksDB need so much memory to do the compaction?
> 3. How can this be resolved?
>
> Some info that may be useful: we deployed OM HA directly, rather than
> migrating from a single OM to HA OM. The OM that has issues is a follower,
> not the leader. The underlying PVC we are using is SSD-backed. Our traffic
> is mostly large objects, with sizes in the hundreds of GBs.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)