[ceph-users] Re: snaptrim number of objects

2023-08-21 Thread Angelo Höngens
On 21/08/2023 16:47, Manuel Lausch wrote: > Hello, > > on my testcluster I played a bit with ceph quincy (17.2.6). > I also see slow ops while deleting snapshots. With the previous major > (pacific) this wasn't an issue. > In my case this is related to the new mclock scheduler which is > defaulted

[ceph-users] Re: snaptrim number of objects

2023-08-21 Thread Angelo Hongens
On 21/08/2023 12:38, Frank Schilder wrote: Hi Angelo, was this cluster upgraded (major version upgrade) before these issues started? We observed this with certain major-version upgrade paths, and the only way to fix it was to re-deploy all OSDs step by step. You can try a RocksDB

[ceph-users] Re: Debian/bullseye build for reef

2023-08-21 Thread Matthew Darwin
For the last few upgrades we upgraded Ceph, then upgraded the O/S... it worked great... I was hoping we could do the same again this time. On 2023-08-21 12:18, Chris Palmer wrote: Ohhh.. so if I read that correctly we can't upgrade either debian or ceph until the dependency problem is

[ceph-users] Re: Debian/bullseye build for reef

2023-08-21 Thread Josh Durgin
Another option is moving to cephadm; then you can keep the same Ceph version and upgrade the distro independently. On Mon, Aug 21, 2023 at 9:19 AM Chris Palmer wrote: > Ohhh.. so if I read that correctly we can't upgrade either debian > or ceph until the dependency problem is resolved, and
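For readers considering that route: converting a package-based cluster to cephadm is done host by host with the adoption procedure. A minimal sketch, assuming placeholder host names and daemon IDs (check the cephadm adoption documentation before running this against a real cluster):

  cephadm adopt --style legacy --name mon.host1   # take over an existing monitor
  cephadm adopt --style legacy --name mgr.host1   # take over an existing manager
  cephadm adopt --style legacy --name osd.0       # repeat for every OSD on the host
  ceph orch ps                                    # verify the daemons are now orchestrator-managed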

[ceph-users] Global recovery event but HEALTH_OK

2023-08-21 Thread Alfredo Daniel Rezinovsky
There has been a lot of movement in my cluster: broken node, replacement, rebalancing. Now I'm stuck in the upgrade to 18.2.0 (mgr and mon upgraded) and the cluster is in "Global Recovery Event". The health is OK. I don't know how to search for the problem.
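One place to start looking (a hedged sketch; ceph progress and ceph progress clear come from the mgr progress module on recent releases, and a stale event can linger even when health is OK):

  ceph -s               # confirm health and the reported recovery event
  ceph progress         # list the events tracked by the mgr progress module
  ceph progress clear   # drop stale progress events if the cluster is otherwise healthy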

[ceph-users] Re: Debian/bullseye build for reef

2023-08-21 Thread Chris Palmer
Ohhh.. so if I read that correctly we can't upgrade either debian or ceph until the dependency problem is resolved, and even then we have to do both debian & ceph simultaneously. That's an uncomfortable situation in several ways... On 21/08/2023 15:25, Josh Durgin wrote: There was

[ceph-users] Re: Debian/bullseye build for reef

2023-08-21 Thread Josh Durgin
Once we discovered the compiler version problem we stopped targeting bullseye; the focus shifted to bookworm. If anyone would like to help maintain the Debian builds, or look into these issues, it would be welcome: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1030129

[ceph-users] Re: snaptrim number of objects

2023-08-21 Thread Manuel Lausch
Hello, on my testcluster I played a bit with ceph quincy (17.2.6). I also see slow ops while deleting snapshots. With the previous major (pacific) this wasn't an issue. In my case this is related to the new mclock scheduler, which is the default with quincy. With "ceph config set global osd_op_queue
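The command quoted above is cut off; a sketch of the change being described, i.e. reverting from mclock to the pre-Quincy wpq scheduler (the setting only takes effect after the OSDs are restarted):

  ceph config set global osd_op_queue wpq   # revert to the wpq scheduler
  systemctl restart ceph-osd.target         # on each OSD host, or restart OSDs one by one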

[ceph-users] Re: Debian/bullseye build for reef

2023-08-21 Thread Josh Durgin
There was difficulty building on bullseye due to the older version of GCC available: https://tracker.ceph.com/issues/61845 On Mon, Aug 21, 2023 at 3:01 AM Chris Palmer wrote: > I'd like to try reef, but we are on debian 11 (bullseye). > In the ceph repos, there is debian-quincy/bullseye and >

[ceph-users] Re: Decrepit ceph cluster performance

2023-08-21 Thread Zoltán Arnold Nagy
On 2023-08-14 05:34, Anthony D'Atri wrote: Check the interfaces to ensure they have the proper netmasks and default routes; I found some systems with the main interface configured as a /32, so it's top of mind lately. That is not necessarily a problem - it entirely depends on their setup. I
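A quick way to check for the misconfiguration Anthony describes (interface names will differ per system):

  ip -br addr show        # brief view: watch for a /32 on the main interface
  ip route show default   # confirm a default route is present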

[ceph-users] osd: why not use aio in read?

2023-08-21 Thread Xinying Song
Hi guys, I'm using ceph 14 on HDDs and observed noticeably high latency on pg.lock(). Further inspection shows the root cause seems to be the function pgbackend->objects_read_sync() called in PrimaryLogPG::do_read(), which holds the PG lock until the disk read finishes. My question is why not

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Boris Behrens
We're working on the migration to cephadm, but it requires some prerequisites that still need planning. root@host:~# cat /etc/ceph/ceph.conf ; ceph config dump [global] fsid = ... mon_host = [OLD_NETWORK::10], [OLD_NETWORK::11], [OLD_NETWORK::12] #public_network = OLD_NETWORK::/64,

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Eugen Block
Hi, I don't have those configs. The cluster is not maintained via cephadm / orchestrator. I just assumed that with Quincy it would already be managed by cephadm. So what does the ceph.conf currently look like on an OSD host (mask sensitive data)? Quoting Boris Behrens: Hey Eugen, I

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Boris Behrens
Hey Eugen, I don't have those configs. The cluster is not maintained via cephadm / orchestrator. The ceph.conf does not have IP addresses configured. A grep in /var/lib/ceph shows only binary matches on the mons. I've restarted the whole host, which also did not work. On Mon, 21 Aug 2023 at 13:18

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Eugen Block
Hi, there have been a couple of threads regarding network changes; simply restarting OSDs is not sufficient. I haven't had to do it myself yet, but did you run 'ceph orch reconfig osd' after adding the second public network, then restart them? I'm not sure if the orchestrator works as expected
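A sketch of the sequence Eugen is suggesting, which is only meaningful on a cephadm-managed cluster (the service name "osd" is an assumption):

  ceph config set global public_network "old_network/64, new_network/64"
  ceph orch reconfig osd    # have the orchestrator regenerate the OSD hosts' minimal ceph.conf
  ceph orch restart osd     # restart the OSDs so they can bind to the new network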

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Janne Johansson
On Mon, 21 Aug 2023 at 12:28, Boris Behrens wrote: > > Hi, > I need to migrate a storage cluster to a new network. > > I added the new network to the ceph config via: > ceph config set global public_network "old_network/64, new_network/64" > I've added a set of new mon daemons with IP addresses

[ceph-users] Re: snaptrim number of objects

2023-08-21 Thread Frank Schilder
Hi Angelo, was this cluster upgraded (major version upgrade) before these issues started? We observed this with certain major-version upgrade paths, and the only way to fix it was to re-deploy all OSDs step by step. You can try a RocksDB compaction first. If that doesn't help,
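The compaction Frank refers to can be triggered per OSD; a sketch with osd.0 as a placeholder (compacting everything at once adds load, so one OSD at a time is gentler):

  ceph tell osd.0 compact                                           # online RocksDB compaction of a single OSD
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact   # offline variant, run with the OSD stopped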

[ceph-users] [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Boris Behrens
Hi, I need to migrate a storage cluster to a new network. I added the new network to the ceph config via: ceph config set global public_network "old_network/64, new_network/64" I've added a set of new mon daemons with IP addresses in the new network and they are added to the quorum and seem to
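To verify what actually got applied and which addresses the OSDs registered with, something along these lines (hedged; output formats vary by release):

  ceph config get osd public_network   # confirm both networks are in the applied config
  ceph osd dump | grep '^osd\.'        # shows the address(es) each OSD is currently registered with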

[ceph-users] Debian/bullseye build for reef

2023-08-21 Thread Chris Palmer
I'd like to try reef, but we are on debian 11 (bullseye). In the ceph repos, there is debian-quincy/bullseye and debian-quincy/focal, but under reef there is only focal & jammy. Is there a reason why there is no reef/bullseye build? I had thought that the blocker only affected debian-bookworm
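For context, the download.ceph.com repositories follow a debian-<release>/<codename> layout, so the working quincy entry and the missing reef one look like this (a sketch of a sources.list entry, not a statement about planned builds):

  # /etc/apt/sources.list.d/ceph.list
  deb https://download.ceph.com/debian-quincy/ bullseye main   # exists today
  # deb https://download.ceph.com/debian-reef/ bullseye main   # no such build; reef currently ships only focal and jammy packages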

[ceph-users] Re: EC pool degrades when adding device-class to crush rule

2023-08-21 Thread Eugen Block
Hi, I tried to find an older thread that explained this quite well, maybe my google-fu has left me... Anyway, the docs [1] explain the "degraded" state of a PG: When a client writes an object to the primary OSD, the primary OSD is responsible for writing the replicas to the replica OSDs.
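To see which PGs are currently degraded while recovery proceeds (a small sketch; exact output varies by release):

  ceph health detail | grep -i degraded   # summary of degraded objects and PGs
  ceph pg ls degraded                     # list the PGs in the degraded state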

[ceph-users] Re: OSD delete vs destroy vs purge

2023-08-21 Thread Eugen Block
Yeah, that's basically it, also taking into account Anthony's response, of course. Quoting Nicola Mori: Thanks Eugen for the explanation. To summarize what I understood: - delete from GUI simply does a drain+destroy; - destroy will preserve the OSD id so that it will be used by the next
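For reference, the operations being summarized map roughly onto these commands (OSD id 12 is a placeholder; the GUI mapping is as described in the thread, not verified here):

  ceph orch osd rm 12                            # cephadm: drain the OSD, then remove it
  ceph osd destroy 12 --yes-i-really-mean-it     # keep the OSD id and CRUSH entry for reuse by a replacement device
  ceph osd purge 12 --yes-i-really-mean-it       # remove the OSD completely, including CRUSH entry and auth key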