[ceph-users] Re: Improving CephFS performance by always putting "default" data pool on SSDs?

2024-02-04 Thread Niklas Hambüchen
Is the answer that easy? Why does CephFS then not store this info on the metadata pool automatically? Why do I have to infer how to get better performance for replicated pools from information that is only discussed for EC pools?

[ceph-users] Improving CephFS performance by always putting "default" data pool on SSDs?

2024-02-04 Thread Niklas Hambüchen
https://docs.ceph.com/en/reef/cephfs/createfs/ says: > The data pool used to create the file system is the “default” data pool and the location for storing all inode backtrace information, which is used for hard link management and disaster recovery. For this reason, all CephFS inodes
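A minimal sketch of the layout this implies, assuming hypothetical names (ssd_rule, cephfs_meta, cephfs_default_ssd, cephfs_data_hdd, myfs) and that the SSDs carry the "ssd" device class:

    ceph osd crush rule create-replicated ssd_rule default host ssd
    ceph osd pool create cephfs_meta 32 replicated
    ceph osd pool create cephfs_default_ssd 32 replicated
    ceph osd pool set cephfs_meta crush_rule ssd_rule
    ceph osd pool set cephfs_default_ssd crush_rule ssd_rule
    ceph fs new myfs cephfs_meta cephfs_default_ssd       # this pool becomes the "default" data pool
    ceph osd pool create cephfs_data_hdd 512 replicated
    ceph fs add_data_pool myfs cephfs_data_hdd
    setfattr -n ceph.dir.layout.pool -v cephfs_data_hdd /mnt/myfs   # bulk file data goes to the HDD pool

With that layout the default data pool only ever holds the small backtrace objects, while file contents land on the HDD pool selected by the directory layout.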

[ceph-users] Re: Adding datacenter level to CRUSH tree causes rebalancing

2023-07-24 Thread Niklas Hambüchen
I can believe the month timeframe for a cluster with multiple large spinners behind each HBA. I’ve witnessed such personally. I do have the numbers for this: My original post showed "1167541260/1595506041 objects misplaced (73.177%)". During my last recovery with Ceph 16.2.7, the recovery

[ceph-users] Re: Adding datacenter level to CRUSH tree causes rebalancing

2023-07-20 Thread Niklas Hambüchen
Thank you both Michel and Christian. Looks like I will have to do the rebalancing eventually. From past experience with Ceph 16 the rebalance will likely take at least a month with my 500 M objects. It seems like a good idea to upgrade to Ceph 17 first as Michel suggests. Unless: I was

[ceph-users] Adding datacenter level to CRUSH tree causes rebalancing

2023-07-15 Thread Niklas Hambüchen
Hi Ceph users, I have a Ceph 16.2.7 cluster that so far has been replicated over the `host` failure domain. All `hosts` have been chosen to be in different `datacenter`s, so that was sufficient. Now I wish to add more hosts, including some in already-used data centers, so I'm planning to use
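The bucket moves in question would look roughly like this (dc1 and node1 are hypothetical names; re-parenting hosts under new datacenter buckets is what makes CRUSH remap data):

    ceph osd crush add-bucket dc1 datacenter
    ceph osd crush move dc1 root=default
    ceph osd crush move node1 datacenter=dc1
    ceph osd crush rule create-replicated repl_dc default datacenter

The tree change itself already remaps data (that is the rebalancing in the subject line); a rule like the last one would additionally be needed to actually place replicas across datacenters instead of hosts.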

[ceph-users] Re: 1 pg inconsistent and does not recover

2023-06-29 Thread Niklas Hambüchen
On 28/06/2023 21:26, Niklas Hambüchen wrote: I have increased the number of scrubs per OSD from 1 to 3 using `ceph config set osd osd_max_scrubs 3`. Now the problematic PG is scrubbing in `ceph pg ls`: active+clean+scrubbing+deep+inconsistent. This succeeded! The deep-scrub fixed the PG
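For reference, a way to double-check the setting and watch progress (a small sketch, not taken from the original mail):

    ceph config get osd osd_max_scrubs
    ceph pg ls | grep -E 'scrubbing|inconsistent'
    ceph config rm osd osd_max_scrubs      # revert to the default once the PG is clean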

[ceph-users] Re: 1 pg inconsistent and does not recover

2023-06-28 Thread Niklas Hambüchen
Frank, high likelihood that at least one OSD of any PG is part of a scrub at any time already. In that case, if a PG is not eligible for scrubbing because one of its OSDs has already max-scrubs (default=1) scrubs running, the reservation has no observable effect. This is a great hint. I

[ceph-users] Re: 1 pg inconsistent and does not recover

2023-06-28 Thread Niklas Hambüchen
Hi Frank, The response to that is not to try manual repair but to issue a deep-scrub. I am a bit confused, because in your script you do issue "ceph pg repair", not a scrub.
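For anyone following along, these are the two commands being contrasted (the PG id is just illustrative):

    ceph pg deep-scrub 2.87    # re-reads and compares all replicas, only reports inconsistencies
    ceph pg repair 2.87        # also a deep read of the PG, but additionally tries to fix what it finds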

[ceph-users] Re: 1 pg inconsistent and does not recover

2023-06-28 Thread Niklas Hambüchen
On 28/06/2023 05:24, Alexander E. Patrakov wrote: What you can do is try extracting the PG from the dead OSD disk. I believe this is not possible for me because the dead disk does not turn on at all.
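For completeness, the extraction Alexander refers to would look roughly like this, and only works while the disk still responds (paths and PG id are illustrative):

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
        --pgid 2.87 --op export --file /tmp/pg2.87.export    # run with the OSD daemon stopped

The resulting file can then be injected into a healthy OSD with --op import.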

[ceph-users] Re: 1 pg inconsistent and does not recover

2023-06-28 Thread Niklas Hambüchen
I, too, have the problem that `ceph pg deep-scrub` does not start the scrub, with Ceph 16.2.7.

    # ceph pg deep-scrub 2.87
    instructing pg 2.87 on osd.33 to deep-scrub

However, on the machine where that osd.33 is:

    # ceph daemon osd.33 dump_scrubs | jq . | head -n 13
    [
      {
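One way to check whether the reservation is stuck behind another scrub, as discussed elsewhere in this thread, is to look at every OSD in the PG's acting set, e.g.:

    ceph pg map 2.87                        # prints the up/acting OSD sets for the PG
    ceph daemon osd.<id> dump_scrubs | jq . # run on each acting OSD's host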

[ceph-users] Re: 1 pg inconsistent and does not recover

2023-06-27 Thread Niklas Hambüchen
Hi Alvaro, Can you post the entire Ceph status output? Pasting here since it is short:

  cluster:
    id:     d9000ec0-93c2-479f-bd5d-94ae9673e347
    health: HEALTH_ERR
            1 scrub errors
            Possible data damage: 1 pg inconsistent

  services:
    mon: 3

[ceph-users] 1 pg inconsistent and does not recover

2023-06-27 Thread Niklas Hambüchen
Hi, I have a 3x-replicated pool with Ceph 16.2.7. One HDD broke, its OSD "2" was automatically marked as "out", the disk was physically replaced by a new one, and that was added back in. Now `ceph health detail` continues to permanently show: [ERR] OSD_SCRUB_ERRORS: 1 scrub errors [ERR]
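The usual next step to see which object is actually affected (a sketch; the PG id 2.87 comes from later messages in this thread):

    ceph health detail
    rados list-inconsistent-obj 2.87 --format=json-pretty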

[ceph-users] Re: Deep-scrub much slower than HDD speed

2023-05-01 Thread Niklas Hambüchen
That one talks about resilvering, which is not the same as either ZFS scrubs or Ceph scrubs. The commit I linked is titled "Sequential scrub and resilvers". So ZFS scrubs are included.

[ceph-users] Re: Deep-scrub much slower than HDD speed

2023-05-01 Thread Niklas Hambüchen
Hi all, Scrubs only read data that exists in Ceph, not every sector of the drive, written or not. Thanks, this does explain it. I just discovered: ZFS had this problem in the past: *

[ceph-users] Re: Deep-scrub much slower than HDD speed

2023-05-01 Thread Niklas Hambüchen
Hi Marc, thanks for your numbers, this seems to confirm the suspicions. Oh I get it. Interesting. I think if you expand the cluster in the future with more disks, you will spread the load and have more IOPS, and this will disappear. This one I'm not sure about: If I expand the cluster 2x, I'll

[ceph-users] Re: Deep-scrub much slower than HDD speed

2023-04-26 Thread Niklas Hambüchen
The question you should ask yourself, why you want to change/investigate this? Because if scrubbing takes 10x longer due to thrashing seeks, my scrubs never finish in time (the default interval is 1 week). I end up with e.g. 267 pgs not deep-scrubbed in time. On a 38 TB cluster, if you scrub 8 MB/s on 10
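Rough numbers behind that, assuming roughly 10 spinning disks and taking 38 TB as the data a deep-scrub pass must read (both are assumptions, since the message is truncated here):

    38 TB / (10 disks * 8 MB/s)   = 3.8 TB per disk / 8 MB/s   ≈ 475,000 s ≈ 5.5 days
    38 TB / (10 disks * 100 MB/s) = 3.8 TB per disk / 100 MB/s ≈  38,000 s ≈ 10.5 hours

At the observed rate a full pass barely fits inside the default one-week deep-scrub interval, so any extra load or backlog pushes PGs past the deadline; at sequential disk speed it would finish in under half a day.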

[ceph-users] Re: Deep-scrub much slower than HDD speed

2023-04-26 Thread Niklas Hambüchen
Hi Marc, thanks for your reply. 100 MB/s is sequential, your scrubbing is random. afaik everything is random. Are there any docs that explain this, any code, or another definitive answer? Also, wouldn't it make sense for scrubbing to be able to read the disk linearly, at least to some

[ceph-users] Deep-scrub much slower than HDD speed

2023-04-25 Thread Niklas Hambüchen
I observed that on an otherwise idle cluster, scrubbing cannot fully utilise the speed of my HDDs. `iostat` shows only 8-10 MB/s per disk, instead of the ~100 MB/s most HDDs can easily deliver. Changing scrubbing settings does not help (see below). Environment: * 6
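Typical scrub-related knobs one would try in this situation look like the following (values are illustrative assumptions, not necessarily the settings from the original post):

    ceph config set osd osd_max_scrubs 2                 # concurrent scrub reservations per OSD
    ceph config set osd osd_scrub_sleep 0                # no pause between scrub chunks
    ceph config set osd osd_scrub_chunk_max 25           # objects read per scrub chunk
    ceph config set osd osd_deep_scrub_stride 1048576    # read size during deep-scrub, in bytes
    ceph config set osd osd_scrub_load_threshold 10      # allow scrubs even at higher load average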