[ceph-users] Re: PG backfilled slow

2023-07-26 Thread Danny Webb
The Suse docs are pretty good for this: https://www.suse.com/support/kb/doc/?id=19693. Basically, up osd-max-backfills / osd-recovery-max-active; this will allow concurrent backfills to the same device. If you watch the OSD in Grafana you should be able to see the underlying device
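
For reference (not from the thread, values illustrative only), on a recent release these knobs can be bumped at runtime roughly like this:

  # allow more concurrent backfills per OSD and more active recovery ops
  ceph config set osd osd_max_backfills 4
  ceph config set osd osd_recovery_max_active 8
  # or push them straight into the running daemons
  ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'
  # on Quincy's mClock scheduler it may be simpler to switch the QoS profile instead
  ceph config set osd osd_mclock_profile high_recovery_ops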

[ceph-users] PG backfilled slow

2023-07-26 Thread Peter
Hi all, I need to replace some disks due to bad sectors. I have crushed these disks out and Ceph backfilled and migrated the data as I wanted. However, I can see these OSDs still have one or more PGs left after a day of waiting, and backfilling is really slow. Now there is only one PG backfilling at a time.
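
A hedged sketch of how one might check what is throttling the backfill (standard Ceph CLI, not quoted from the thread; <id> is a placeholder for an affected OSD):

  # overall recovery/backfill picture
  ceph -s
  # which PGs are still backfilling and on which OSDs
  ceph pg dump pgs_brief | grep -i backfill
  # current throttle settings on one of the affected OSDs
  ceph config show osd.<id> | grep -E 'osd_max_backfills|osd_recovery_max_active'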

[ceph-users] Re: cephbot - a Slack bot for Ceph has been added to the github.com/ceph project

2023-07-26 Thread Marc
> The instructions show how to set it up so that read-only operations can be performed from Slack for security purposes, but there are settings that could make it possible to lock down who can communicate with cephbot, which could make it relatively secure to run administrative tasks

[ceph-users] cephbot - a Slack bot for Ceph has been added to the github.com/ceph project

2023-07-26 Thread David Turner
cephbot [1] is a project that I've been working on and using for years now, and it has been added to the github.com/ceph project to increase visibility for other people who would like to implement slack-ops for their Ceph clusters. The instructions show how to set it up so that read-only
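
The cephbot instructions themselves aren't quoted here, but a read-only setup of this kind typically hinges on a CephX key with read-only caps, e.g. (illustrative; the client name is made up):

  # create a key that can only read cluster state, never mutate it
  ceph auth get-or-create client.cephbot mon 'allow r' mgr 'allow r' \
      -o /etc/ceph/ceph.client.cephbot.keyring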

[ceph-users] Ceph Leadership Team Meeting, 2023-07-26 Minutes

2023-07-26 Thread Casey Bodley
Welcome to Aviv Caro as new Ceph NVMe-oF lead. Reef status: * reef 18.1.3 built, gibba cluster upgraded, plan to publish this week * https://pad.ceph.com/p/reef_final_blockers all resolved except for bookworm builds https://tracker.ceph.com/issues/61845 * only blockers will merge to reef so the

[ceph-users] Re: MDS stuck in rejoin

2023-07-26 Thread Frank Schilder
Hi Xiubo. > ... I am more interested in the kclient side logs. Just want to know why that oldest request got stuck so long. I'm afraid I'm a bad admin in this case. I don't have logs from the host any more; I would have needed the output of dmesg and this is gone. In case it happens again I
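
For next time, the kernel-client side state can usually be captured with something like this (standard kernel client debug files, not from the thread; debugfs must be mounted and the commands run as root):

  # ring buffer with any ceph/libceph messages
  dmesg -T | grep -i ceph
  # in-flight MDS and OSD requests of the kernel client
  cat /sys/kernel/debug/ceph/*/mdsc
  cat /sys/kernel/debug/ceph/*/osdc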

[ceph-users] Re: RGWs offline after upgrade to Nautilus

2023-07-26 Thread Eugen Block
Hi, apparently my previous suggestions don't apply here (full OSDs or the max_pgs_per_osd limit). Did you also check the rgw client keyrings? Did you also upgrade the operating system? Maybe some apparmor stuff? Can you set debug to 30 to see if there's more to see? Anything in the mon or
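
Raising the RGW debug level could look roughly like this (generic commands; the exact client/instance name depends on the deployment and <instance> is a placeholder):

  # bump rgw logging, then watch the rgw log for the startup failure
  ceph config set client.rgw debug_rgw 30
  ceph config set client.rgw debug_ms 1
  # verify the rgw keyring and its caps
  ceph auth get client.rgw.<instance>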

[ceph-users] Re: 1 PG stucked in "active+undersized+degraded for long time

2023-07-26 Thread Eugen Block
I can provide some more details. These were the recovery steps taken so far; they started from here (I don't know the whole/exact story though): 70/868386704 objects unfound (0.000%); Reduced data availability: 8 pgs inactive, 8 pgs incomplete; Possible data damage: 1 pg recovery_unfound
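
Typical inspection commands for a state like this (not the exact steps from the thread; <pgid> is a placeholder for the affected PG):

  ceph health detail
  # per-PG recovery state, peering blockers and which OSDs are being probed
  ceph pg <pgid> query
  # list the unfound objects in that PG
  ceph pg <pgid> list_unfound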

[ceph-users] Re: ceph quincy repo update to debian bookworm...?

2023-07-26 Thread Eneko Lacunza
Hi, You may want to try the Proxmox Ceph repository: deb http://download.proxmox.com/debian/ceph-quincy bookworm no-subscription Cheers. On 22/7/23 at 0:55, Luke Hall wrote: Ditto this query. I can't recall if there's a separate list for
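
On a Debian bookworm host that would translate to something like the following sketch (key handling omitted; the Proxmox release key must be installed first, see their docs):

  echo 'deb http://download.proxmox.com/debian/ceph-quincy bookworm no-subscription' \
      > /etc/apt/sources.list.d/ceph.list
  apt update && apt install ceph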

[ceph-users] Ceph 17.2.6 alert-manager receives error 500 from inactive MGR

2023-07-26 Thread Robert Sander
Hi, we noticed a strange error message in the logfiles: the alert-manager deployed with cephadm receives an HTTP 500 error from the inactive MGR when trying to call the URI /api/prometheus_receiver: Jul 25 09:35:25 alert-manager conmon[2426]: level=error ts=2023-07-25T07:35:25.171Z
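
One way to confirm which MGR is currently active and which standbys Alertmanager might be hitting (standard commands, not from the thread; jq only if it is installed):

  # show active and standby managers
  ceph mgr stat
  ceph mgr dump | jq '.active_name, [.standbys[].name]'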

[ceph-users] Re: cephadm and kernel memory usage

2023-07-26 Thread Luis Domingues
That's the weird thing. Processes and user-space memory are the same on the good-memory and bad-memory machines. ceph-osd memory usage looks good on all machines, and the cache is more or less the same. When I do a ps, htop or any other process review, everything looks good and coherent between all machines, containers or

[ceph-users] Re: cephadm and kernel memory usage

2023-07-26 Thread Konstantin Shalygin
Without determining exactly which process (kernel or userspace) is eating memory, the ceph-users list can't tell what exactly is using the memory, because we don't see your display with your eyes. You should run these commands on good & bad hosts to see the real difference. This may be related to the kernel version, or
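
A minimal set of checks one could run on both a good and a bad host to split kernel from userspace usage (illustrative, not the exact commands from the thread; smem needs to be installed separately):

  free -h
  # kernel-side consumers: slab, page tables, kernel stacks
  grep -E 'Slab|SReclaimable|SUnreclaim|PageTables|KernelStack' /proc/meminfo
  slabtop -o | head -20
  # per-process userspace totals
  smem -t -k -r | head -20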