[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-16 Thread Patrick Donnelly
Thank you, that's helpful. I have created a ticket with my findings so far: https://tracker.ceph.com/issues/57152 Please follow there for updates. On Mon, Aug 15, 2022 at 4:12 PM Daniel Williams wrote: > > ceph-post-file: a9802e30-0096-410e-b5c0-f2e6d83acfd6 > > On Tue, Aug 16, 2022 at 3:13 AM
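The devicehealth data lives in an SQLite database that the mgr accesses through libcephsqlite. When corruption is suspected, a first sanity check on an exported copy of the database is SQLite's built-in integrity check. A minimal sketch in plain Python (the file path is hypothetical, standing in for a database dumped out of the cluster):

```python
import sqlite3

def check_integrity(db_path: str) -> list[str]:
    """Run SQLite's integrity check; returns ['ok'] if the database is clean,
    otherwise a list of corruption findings."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check;").fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()

# Hypothetical path to an exported copy of the devicehealth database:
# print(check_integrity("/tmp/devicehealth-dump.db"))
```

Anything other than a single `ok` row indicates structural damage worth attaching to the tracker ticket.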

[ceph-users] RBD images Prometheus metrics : not all pools/images reported

2022-08-16 Thread Gilles Mocellin
Hello Cephers, I'm trying to diagnose who's doing what on our cluster, which suffers from SLOW_OPS and high-latency periods since Pacific, and I can't see all pools/images in the RBD stats. I had activated RBD image stats while running Octopus; now it seems we only need to define
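One way to check which pools/images actually make it into the mgr's Prometheus output is to scrape the exposition text and collect the (pool, image) label pairs on the rbd series. A minimal sketch; the `ceph_rbd_` metric prefix and the `pool`/`image` label names are assumptions about the Prometheus mgr module's output, and the sample text here is made up:

```python
import re

# Matches series such as: ceph_rbd_write_ops{pool="p",namespace="",image="i"} 42.0
RBD_SERIES = re.compile(r'^ceph_rbd_\w+\{[^}]*pool="([^"]*)"[^}]*image="([^"]*)"')

def rbd_pool_images(exposition_text: str) -> set[tuple[str, str]]:
    """Collect the (pool, image) pairs seen on ceph_rbd_* series."""
    pairs = set()
    for line in exposition_text.splitlines():
        m = RBD_SERIES.match(line)
        if m:
            pairs.add((m.group(1), m.group(2)))
    return pairs

sample = '''\
ceph_rbd_write_ops{pool="vms",namespace="",image="vm-101-disk-0"} 42.0
ceph_rbd_read_bytes{pool="vms",namespace="",image="vm-101-disk-0"} 1024.0
ceph_rbd_write_ops{pool="backups",namespace="",image="nightly"} 7.0
'''
print(sorted(rbd_pool_images(sample)))
```

Diffing that set against the output of `rbd ls` per pool would show exactly which images are missing from the stats.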

[ceph-users] Announcing go-ceph v0.17.0

2022-08-16 Thread Sven Anderson
We are happy to announce another release of the go-ceph API library. This is a regular release following our every-two-months release cadence. https://github.com/ceph/go-ceph/releases/tag/v0.17.0 Changes include additions to the rados and rgw packages. More details are available at the link

[ceph-users] Re: ceph kernel client RIP when quota exceeded

2022-08-16 Thread Xiubo Li
Hi Andrej, The upstream kernel has one commit: commit 0078ea3b0566e3da09ae8e1e4fbfd708702f2876 Author: Jeff Layton Date:   Tue Nov 9 09:54:49 2021 -0500     ceph: don't check for quotas on MDS stray dirs     玮文 胡 reported seeing the WARN_RATELIMIT pop when writing to an     inode that had

[ceph-users] ceph kernel client RIP when quota exceeded

2022-08-16 Thread Andrej Filipcic
Hi, we experienced massive node failures when a user with an exceeded CephFS quota submitted many jobs to a Slurm cluster; home is on CephFS. The nodes still work for some time, but they eventually freeze due to too many stuck CPUs. Is this a kernel ceph client bug? Running on 5.10.123, ceph

[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-16 Thread Chris Smart
On Tue, 2022-08-16 at 10:52 +, Frank Schilder wrote: > Hi Chris, > > I would strongly advise not to use multi-MDS with 5000 clients on > luminous. I enabled it on mimic with ca. 1750 clients and it was > extremely dependent on luck whether it converged to a stable distribution > of dirfrags or

[ceph-users] How to verify the use of wire encryption?

2022-08-16 Thread Martin Traxl
Hi, I am running a Ceph 16.2.9 cluster with wire encryption. From my ceph.conf: ms client mode = secure; ms cluster mode = secure; ms mon client mode = secure; ms mon cluster mode = secure; ms mon service mode = secure; ms service mode = secure. My cluster is running both

[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-16 Thread Chris Smart
On Tue, 2022-08-16 at 07:50 +, Eugen Block wrote: > Hi, > > > However, the ceph-mds process is pretty much constantly over 100% > > CPU > > and often over 200%. Given it's a single process, right? It makes > > me > > think that some operations are too slow or some task is pegging the > > CPU

[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-16 Thread Eugen Block
Hi, > However, the ceph-mds process is pretty much constantly over 100% CPU > and often over 200%. Given it's a single process, right? It makes me > think that some operations are too slow or some task is pegging the CPU > at 100%. You might want to look into multi-active MDS, especially with 5000

[ceph-users] Re: CephFS performance degradation in root directory

2022-08-16 Thread Robert Sander
Am 16.08.22 um 08:43 schrieb Gregory Farnum: I was wondering if it had something to do with quota enforcement. The other possibility that occurs to me is if other clients are monitoring the system, or an admin pane (e.g. the dashboard) is displaying per-volume or per-client stats, they may be

[ceph-users] Ceph Days Dublin CFP ends today

2022-08-16 Thread Mike Perez
Hi everyone, Ceph Days are returning, and we're starting with Dublin on September 13th! If you're attending Open Source Summit EU, consider adding this event to your week! A full-day event dedicated to sharing Ceph's transformative power and fostering the vibrant Ceph community, Ceph Day Dublin

[ceph-users] Re: CephFS performance degradation in root directory

2022-08-16 Thread Gregory Farnum
I was wondering if it had something to do with quota enforcement. The other possibility that occurs to me is if other clients are monitoring the system, or an admin pane (e.g. the dashboard) is displaying per-volume or per-client stats, they may be poking at the mountpoint and interrupting exclusive