Re: [ceph-users] SSD Bluestore Backfills Slow

2018-06-04 Thread Reed Dier
Appreciate the input. Wasn’t sure if ceph-volume was the one setting these bits of metadata or something else. Appreciate the help guys. Thanks, Reed

> The fix is in core Ceph (the OSD/BlueStore code), not ceph-volume. :)
> journal_rotational is still a thing in BlueStore; it represents the

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-06-04 Thread Alfredo Deza
On Mon, Jun 4, 2018 at 12:37 PM, Reed Dier wrote:
> Hi Caspar,
>
> David is correct, in that the issue I was having with SSD OSD’s having NVMe
> bluefs_db reporting as HDD creating an artificial throttle based on what
> David was mentioning, a prevention to keep spinning rust from thrashing. Not

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-06-04 Thread Gregory Farnum
On Mon, Jun 4, 2018 at 9:38 AM Reed Dier wrote:
> Copying Alfredo, as I’m not sure if something changed with respect to
> ceph-volume in 12.2.2 (when this originally happened) to 12.2.5 (I’m sure
> plenty did), because I recently had an NVMe drive fail on me unexpectedly
> (curse you Micron),

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-06-04 Thread Reed Dier
Hi Caspar,

David is correct: the issue I was having was SSD OSDs with an NVMe bluefs_db reporting as HDD, which created an artificial throttle based on what David was mentioning, a safeguard to keep spinning rust from thrashing. Not sure if the journal_rotational bit should be 1, but

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-06-04 Thread David Turner
I don't believe this really applies to you. The problem here was with SSD OSDs that were incorrectly labeled as HDD OSDs by Ceph. The fix was to inject a sleep setting of 0 for those OSDs to speed up recovery. The sleep is needed on HDDs to avoid thrashing, but the bug was that SSDs were
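
[Editor's note] For context, the sleep David describes corresponds to the osd_recovery_sleep_hdd option; a minimal sketch of injecting a value of 0 into running OSDs (the exact command is an assumption, not quoted from the thread):

    # zero the HDD recovery sleep on every running OSD
    ceph tell osd.* injectargs '--osd_recovery_sleep_hdd 0'

    # or target a single mislabeled OSD, e.g. osd.24
    ceph tell osd.24 injectargs '--osd_recovery_sleep_hdd 0'

injectargs only changes the running daemons; persisting the value across restarts would need ceph.conf (or the monitor config store on newer releases).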

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-06-04 Thread Caspar Smit
Hi Reed, "Changing/injecting osd_recovery_sleep_hdd into the running SSD OSD’s on bluestore opened the floodgates." What exactly did you change/inject here? We have a cluster with 10TB SATA HDD's which each have a 100GB SSD based block.db Looking at ceph osd metadata for each of those:
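
[Editor's note] For anyone wanting to run the same check, a sketch of pulling the rotational flags out of the OSD metadata (the OSD id is illustrative; exact field names can differ between releases):

    # rotational-related fields for a single OSD
    ceph osd metadata 0 | grep -i rotational

    # or loop over every OSD in the cluster
    for id in $(ceph osd ls); do
        echo "== osd.$id =="
        ceph osd metadata "$id" | grep -i rotational
    done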

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-27 Thread David Turner
I have 2 different configurations that are incorrectly showing as rotational for the OSDs. The [1] first is a server with disks behind controllers and an NVMe riser card. It has 2 different OSD types: one with the block on an HDD and the WAL on the NVMe, as well as a pure NVMe OSD. The hybrid OSD seems

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Oliver Freyermuth
On 26.02.2018 at 23:29, Gregory Farnum wrote:
>
> On Mon, Feb 26, 2018 at 2:23 PM Reed Dier wrote:
>
> Quick turn around,
>
> Changing/injecting osd_recovery_sleep_hdd into the running SSD OSD’s on
> bluestore opened the

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Gregory Farnum
On Mon, Feb 26, 2018 at 2:23 PM Reed Dier wrote:
> Quick turn around,
>
> Changing/injecting osd_recovery_sleep_hdd into the running SSD OSD’s on
> bluestore opened the floodgates.
>
Oh right, the OSD does not (think it can) have anything it can really do if you've got a

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Reed Dier
Quick turn around,

Changing/injecting osd_recovery_sleep_hdd into the running SSD OSD’s on bluestore opened the floodgates.

> pool objects-ssd id 20
> recovery io 1512 MB/s, 21547 objects/s
>
> pool fs-metadata-ssd id 16
> recovery io 0 B/s, 6494 keys/s, 271 objects/s
> client io 82325
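
[Editor's note] The per-pool recovery figures quoted above can be watched live while backfill runs; a sketch (the pool name is taken from the thread, the refresh interval is arbitrary, and the exact output format varies by release):

    # one-shot per-pool client and recovery I/O rates
    ceph osd pool stats

    # or limit it to a single pool, e.g. the objects-ssd pool from the thread
    ceph osd pool stats objects-ssd

    # or keep refreshing the overall cluster status
    watch -n 5 ceph -s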

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Gregory Farnum
On Mon, Feb 26, 2018 at 12:26 PM Reed Dier wrote:
> I will try to set the hybrid sleeps to 0 on the affected OSDs as an
> interim solution to getting the metadata configured correctly.
>
Yes, that's a good workaround as long as you don't have any actual hybrid OSDs (or
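
[Editor's note] A minimal sketch of that interim workaround, assuming osd_recovery_sleep_hybrid is the relevant option and that none of the targeted OSDs are genuinely hybrid (the thread does not show the exact command used):

    # zero the hybrid recovery sleep on all running OSDs
    ceph tell osd.* injectargs '--osd_recovery_sleep_hybrid 0'

    # or only on a specific mislabeled OSD, e.g. osd.24
    ceph tell osd.24 injectargs '--osd_recovery_sleep_hybrid 0'

If some OSDs really are hybrid (HDD data with an SSD/NVMe WAL or DB), limiting the injection to the affected OSD ids avoids removing the throttle where it is actually wanted.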

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Reed Dier
I will try to set the hybrid sleeps to 0 on the affected OSDs as an interim solution to getting the metadata configured correctly. For reference, here is the complete metadata for osd.24, bluestore SATA SSD with NVMe block.db.

> {
> "id": 24,
> "arch": "x86_64",
>
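
[Editor's note] To confirm what a running daemon has actually picked up for these sleep options, the admin socket can be queried on the OSD's host; a sketch (osd.24 is taken from the metadata above, and the option names assume a Luminous-era release):

    # list the recovery sleep values the daemon is currently using
    ceph daemon osd.24 config show | grep osd_recovery_sleep

    # or ask for a single option
    ceph daemon osd.24 config get osd_recovery_sleep_hybrid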

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Gregory Farnum
On Mon, Feb 26, 2018 at 11:21 AM Reed Dier wrote:
> The ‘good perf’ that I reported below was the result of beginning 5 new
> bluestore conversions which results in a leading edge of ‘good’
> performance, before trickling off.
>
> This performance lasted about 20 minutes,

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Reed Dier
The ‘good perf’ that I reported below was the result of beginning 5 new bluestore conversions, which resulted in a leading edge of ‘good’ performance before trickling off. This performance lasted about 20 minutes, during which it backfilled a small set of PGs off of non-bluestore OSDs. Current

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Gregory Farnum
On Mon, Feb 26, 2018 at 9:12 AM Reed Dier wrote:
> After my last round of backfills completed, I started 5 more bluestore
> conversions, which helped me recognize a very specific pattern of
> performance.
>
> pool objects-ssd id 20
> recovery io 757 MB/s, 10845 objects/s

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-26 Thread Reed Dier
After my last round of backfills completed, I started 5 more bluestore conversions, which helped me recognize a very specific pattern of performance.

> pool objects-ssd id 20
> recovery io 757 MB/s, 10845 objects/s
>
> pool fs-metadata-ssd id 16
> recovery io 0 B/s, 36265 keys/s, 1633

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-23 Thread David Turner
Here is a [1] link to a ML thread tracking some slow backfilling on bluestore. It came down to the backfill sleep setting for them. Maybe it will help.

[1] https://www.mail-archive.com/ceph-users@lists.ceph.com/msg40256.html

On Fri, Feb 23, 2018 at 10:46 AM Reed Dier

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-23 Thread Reed Dier
Probably unrelated, but I do keep seeing this odd negative objects degraded message on the fs-metadata pool:

> pool fs-metadata-ssd id 16
> -34/3 objects degraded (-1133.333%)
> recovery io 0 B/s, 89 keys/s, 2 objects/s
> client io 51289 B/s rd, 101 kB/s wr, 0 op/s rd, 0 op/s wr

Don’t

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-23 Thread Reed Dier
Below is ceph -s:

> cluster:
> id: {id}
> health: HEALTH_WARN
> noout flag(s) set
> 260610/1068004947 objects misplaced (0.024%)
> Degraded data redundancy: 23157232/1068004947 objects degraded
> (2.168%), 332 pgs unclean, 328 pgs degraded, 328

Re: [ceph-users] SSD Bluestore Backfills Slow

2018-02-22 Thread Gregory Farnum
What's the output of "ceph -s" while this is happening? Is there some identifiable difference between these two states, like you get a lot of throughput on the data pools but then metadata recovery is slower? Are you sure the recovery is actually going slower, or are the individual ops larger or