[ceph-users] Re: Monitoring slow ops

2022-02-08 Thread Konstantin Shalygin
Hi, > On 9 Feb 2022, at 09:03, Benoît Knecht wrote: > > I don't remember in which Ceph release it was introduced, but on Pacific > there's a metric called `ceph_healthcheck_slow_ops`. At least in Nautilus this metric exists. k
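A minimal sketch of checking for the metric, assuming the ceph-mgr prometheus module is enabled on its default port 9283; <mgr-host> is a placeholder for one of your mgr hosts:

    # Confirm the exporter actually exposes the health-check metric on this release
    curl -s http://<mgr-host>:9283/metrics | grep ceph_healthcheck_slow_ops
    # An alert can then be driven off the expression: ceph_healthcheck_slow_ops > 0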

[ceph-users] Monitoring slow ops

2022-02-08 Thread Trey Palmer
Hi all, We have found that RGW access problems on our clusters almost always coincide with slow ops in "ceph -s". Is there any good way to monitor and alert on slow ops from prometheus metrics? We are running Nautilus but I'd be interested in any changes that might help in newer versions, as
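Independent of Prometheus, slow ops can also be inspected on the reporting daemons by hand; a rough sketch, where osd.12 is only a placeholder id:

    # Which daemons are currently reporting slow ops
    ceph health detail
    # Run on the host carrying that OSD, via the admin socket
    ceph daemon osd.12 dump_ops_in_flight
    # If dump_historic_slow_ops is not available on your build, dump_historic_ops also helps
    ceph daemon osd.12 dump_historic_slow_ops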

[ceph-users] R release naming

2022-02-08 Thread Josh Durgin
Hi folks, As we near the end of the Quincy cycle, it's time to choose a name for the next release. This etherpad began a while ago, so there are some votes already, however we wanted to open it up for anyone who hasn't voted yet. Add your +1 to the name you prefer here, or add a new option:

[ceph-users] Re: RGW automation encryption - still testing only?

2022-02-08 Thread Casey Bodley
On Tue, Feb 8, 2022 at 11:55 AM Stefan Schueffler wrote: > > Hi Casey, > > great news to hear about "SSE-S3 almost implemented" :-) > > One question about that - will the implementation have one key per bucket, or > one key per individual object? > > Amazon (as per the public available docs) is
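For context, from the client side SSE-S3 is only a request header; how the server maps keys to buckets or objects is invisible to the client. A sketch using the AWS CLI, where the endpoint, bucket and key are made-up placeholders and acceptance of the header depends on the RGW SSE-S3 work discussed in this thread:

    aws s3api put-object --endpoint-url http://rgw.example.net \
        --bucket mybucket --key hello.txt --body hello.txt \
        --server-side-encryption AES256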

[ceph-users] Re: cephfs: [ERR] loaded dup inode

2022-02-08 Thread Gregory Farnum
On Tue, Feb 8, 2022 at 7:30 AM Dan van der Ster wrote: > > On Tue, Feb 8, 2022 at 1:04 PM Frank Schilder wrote: > > The reason for this seemingly strange behaviour was an old static snapshot > > taken in an entirely different directory. Apparently, ceph fs snapshots are > > not local to an FS

[ceph-users] Re: RGW automation encryption - still testing only?

2022-02-08 Thread David Orman
Totally understand, I'm not really a fan of service-managed encryption keys as a general rule vs. client-managed. I just thought I'd probe about capabilities considered stable before embarking on our own work. SSE-S3 would be a reasonable middle-ground. I appreciate the PR link, that's very

[ceph-users] Re: RGW automation encryption - still testing only?

2022-02-08 Thread Casey Bodley
On Tue, Feb 8, 2022 at 11:11 AM Casey Bodley wrote: > > hi David, > > that method of encryption based on rgw_crypt_default_encryption_key > will never be officially supported. to expand on why: rgw_crypt_default_encryption_key requires the key material to be stored insecurely in ceph's config,
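For reference, this is the testing-only mechanism under discussion, sketched with a placeholder key: the base64-encoded 256-bit key sits in plain text in the cluster configuration, readable by anyone who can read that config, which is exactly the problem described above.

    ceph config set client.rgw rgw_crypt_default_encryption_key <base64-encoded-256-bit-key>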

[ceph-users] Re: RGW automation encryption - still testing only?

2022-02-08 Thread Casey Bodley
hi David, that method of encryption based on rgw_crypt_default_encryption_key will never be officially supported. however, support for SSE-S3 encryption [1] is nearly complete in [2] (cc Marcus), and we hope to include that in the quincy release - and if not, we'll backport it to quincy in an

[ceph-users] RGW automation encryption - still testing only?

2022-02-08 Thread David Orman
Is RGW encryption for all objects at rest still testing only, and if not, which version is it considered stable in? https://docs.ceph.com/en/latest/radosgw/encryption/#automatic-encryption-for-testing-only David

[ceph-users] Re: Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools (bug 53663)

2022-02-08 Thread Christian Rohmann
Hey there again, there is now a question from Neha Ojha in https://tracker.ceph.com/issues/53663 about providing OSD debug logs for a manual deep-scrub on (inconsistent) PGs. I already provided the logs of two of those deep-scrubs via ceph-post-file. But since data inconsistencies are
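A rough sketch of how such logs are typically gathered; the pg id 6.1a and osd.42 are placeholders, and the debug levels are a common choice rather than anything prescribed in the tracker:

    ceph config set osd.42 debug_osd 20
    ceph config set osd.42 debug_ms 1
    ceph pg deep-scrub 6.1a
    # ...wait for the scrub to finish, then drop the overrides and upload the log
    ceph config rm osd.42 debug_osd
    ceph config rm osd.42 debug_ms
    ceph-post-file /var/log/ceph/ceph-osd.42.log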

[ceph-users] Re: cephfs: [ERR] loaded dup inode

2022-02-08 Thread Dan van der Ster
On Tue, Feb 8, 2022 at 1:04 PM Frank Schilder wrote: > The reason for this seemingly strange behaviour was an old static snapshot > taken in an entirely different directory. Apparently, ceph fs snapshots are > not local to an FS directory sub-tree but always global on the entire FS > despite
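A quick way to see which snapshots affect a given directory is to list its hidden .snap directory; snapshots taken further up the tree typically show up there as well, under a mangled name. Sketch only, with a placeholder mount path:

    ls /mnt/cephfs/some/dir/.snap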

[ceph-users] Re: ceph_assert(start >= coll_range_start && start < coll_range_end)

2022-02-08 Thread Manuel Lausch
Okay, I definitely need some help here. The crashing OSD moved with the PG, so the PG seems to have the issue. I moved (via upmaps) all 4 replicas to filestore OSDs. After this the error seems to be solved. No OSD crashed after this. A deep-scrub of the PG didn't throw any error. So I moved the
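For reference, moving a single PG's replicas by hand uses the upmap mechanism mentioned above; a sketch where the pg id 7.2f and the OSD ids are placeholders:

    # Each pair maps a replica of the PG away from the first OSD onto the second one
    ceph osd pg-upmap-items 7.2f 11 203 15 207
    # Verify the resulting placement
    ceph pg map 7.2f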

[ceph-users] mds crash loop - Server.cc: 7503: FAILED ceph_assert(in->first <= straydn->first)

2022-02-08 Thread Arnaud MARTEL
Hi all, We have had a cephfs cluster in production for about 2 months and, for the past 2-3 weeks, we have been regularly experiencing MDS crash loops (every 3-4 hours if there is some user activity). A temporary fix is to remove the MDSs in error (or unknown) state, stop samba & nfs-ganesha
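Not the poster's exact procedure (the message is truncated above), just a sketch of the usual commands for inspecting MDS state and handing a rank over to a standby; the filesystem name and rank are placeholders:

    ceph fs status
    ceph health detail
    # Fail rank 0 of filesystem "cephfs" so a standby MDS takes over
    ceph mds fail cephfs:0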