[jira] [Resolved] (HDDS-6345) OM always runs OOM in Kubernetes

Krishna Kumar Asawa (Jira) Fri, 04 Aug 2023 00:02:56 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Krishna Kumar Asawa resolved HDDS-6345.
---------------------------------------
    Resolution: Information Provided

As per the Slack discussion linked above, the issue is resolved with proper config/usage.

> OM always runs OOM in Kubernetes 
> ---------------------------------
>
>                 Key: HDDS-6345
>                 URL: https://issues.apache.org/jira/browse/HDDS-6345
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Shawn
>            Assignee: Ritesh Shukla
>            Priority: Major
>
> I deployed Ozone 1.21 to Kubernetes with security enabled and with OM HA and 
> SCM HA. However, one of the OMs always gets restarted by Kubernetes because of 
> OOM. Even after I assigned 300GB of memory, the OM still keeps restarting 
> due to OOM.
>  
> After analysis, we found the OOM was caused by RocksDB. When the OM gets 
> restarted, it first tries to open RocksDB. During this time, RocksDB 
> tries to do compaction, which eventually runs out of memory. So there are 
> three questions:
>  
> 1. Why did the OM get into this state?
> 2. Why does RocksDB need so much memory to do the compaction?
> 3. How can this be resolved?
> Some info that may be useful: we deployed OM HA directly rather than 
> migrating from a single OM to HA OM. The OM that has issues is a follower, 
> not a leader. The underlying PVC we are using is backed by SSD. Our traffic 
> is mostly large objects, hundreds of GBs in size.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
