Hi,
> On Jan 12, 2024, at 12:01, Frédéric Nass
> wrote:
>
> Hard to tell for sure since this bug hit different major versions of the
> kernel, at least RHEL's from what I know.
In which RH kernel release was this issue fixed?
Thanks,
k
___
Samuel,
Hard to tell for sure since this bug hit different major versions of the
kernel, at least RHEL's from what I know. The only way to tell is to check for
num_cgroups in /proc/cgroups:
$ cat /proc/cgroups | grep -e subsys -e blkio | column -t
#subsys_name  hierarchy  num_cgroups  enabled
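If you only care about the count itself, a small helper like this (a sketch; it assumes the standard /proc/cgroups column order of subsys_name, hierarchy, num_cgroups, enabled) prints just the num_cgroups value for blkio:

```shell
# Print only the num_cgroups column for the blkio controller from a
# cgroups table ($1, defaulting to the live /proc/cgroups).
# Columns in /proc/cgroups: subsys_name  hierarchy  num_cgroups  enabled.
blkio_cgroups() {
    awk '$1 == "blkio" { print $3 }' "${1:-/proc/cgroups}"
}
```

Call `blkio_cgroups` on the live system, or pass it a saved copy of the file to compare counts across reboots.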
Dear Frederic,
Thanks a lot for the suggestions. We are using the vanilla Linux 4.19 LTS
version. Do you think we may be suffering from the same bug?
best regards,
Samuel
huxia...@horebdata.cn
From: Frédéric Nass
Date: 2024-01-12 09:19
To: huxiaoyu
CC: ceph-users
Subject: Re: [ceph-users]
Hello,
We've had a similar situation recently where OSDs would use way more memory
than osd_memory_target and get OOM killed by the kernel.
It was due to a kernel bug related to cgroups [1].
If num_cgroups below keeps increasing then you may hit this bug.
$ cat /proc/cgroups | grep -e subsys -e blkio | column -t
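A single reading won't show the trend, so something like the sketch below (sample count and interval are arbitrary choices here) can log timestamped samples; if the count only ever grows, cgroups are leaking and this bug is a suspect:

```shell
# Take $1 timestamped samples of the blkio num_cgroups count, $2 seconds
# apart (e.g. sample_cgroups 144 600 for a day at 10-minute intervals).
sample_cgroups() {
    for _ in $(seq "$1"); do
        count=$(awk '$1 == "blkio" { print $3 }' /proc/cgroups)
        echo "$(date -Is) ${count:-n/a}"
        sleep "$2"
    done
}
```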
On Wed, Jan 10, 2024 at 19:20 huxia...@horebdata.cn
wrote:
> Dear Ceph folks,
>
> I am responsible for two Ceph clusters, running Nautilus 14.2.22,
> one with replication 3 and the other with EC 4+2. After around 400 days
> running quietly and smoothly, recently the two clusters
Hi Samuel,
It can be a few things. A good place to start is to dump_mempools of one of
those bloated OSDs:
`ceph daemon osd.123 dump_mempools`
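To make the dump easier to read, you can sort the pools by size; this sketch assumes the Nautilus-era JSON layout `{"mempool":{"by_pool":{<pool>:{"items":N,"bytes":N}}}}`:

```shell
# Rank mempools by bytes, largest first; a leaking pool (osd_pglog,
# buffer_anon, ...) usually stands out at the top.
rank_mempools() {
    python3 -c "
import json, sys
pools = json.load(sys.stdin)['mempool']['by_pool']
for name, stats in sorted(pools.items(), key=lambda kv: -kv[1]['bytes']):
    print('%15d %12d %s' % (stats['bytes'], stats['items'], name))
"
}
# Usage: ceph daemon osd.123 dump_mempools | rank_mempools
```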
Cheers, Dan
--
Dan van der Ster
CTO
Clyso GmbH
p: +49 89 215252722 | a: Vancouver, Canada
w: https://clyso.com | e: dan.vanders...@clyso.com