[ceph-users] Re: cephfs-top causes 16 mgr modules have recently crashed
Hello Jos. Thank you for the reply. I can upgrade to 17.2.7, but I wonder: can I upgrade only the MONs and MGRs for this issue, or do I need to upgrade every daemon type? Otherwise I need to wait a few weeks; I don't want to request a maintenance window during delivery time.

root@ud-01:~# ceph orch upgrade ls
{
    "image": "quay.io/ceph/ceph",
    "registry": "quay.io",
    "bare_image": "ceph/ceph",
    "versions": [
        "18.2.1",
        "18.2.0",
        "18.1.3",
        "18.1.2",
        "18.1.1",
        "18.1.0",
        "17.2.7",
        "17.2.6",
        "17.2.5",
        "17.2.4",
        "17.2.3",
        "17.2.2",
        "17.2.1",
        "17.2.0"
    ]
}

Best regards

Jos Collin wrote on Tue, 23 Jan 2024 at 07:42:

> Please have this fix: https://tracker.ceph.com/issues/59551. It's
> backported to quincy.
>
> On 23/01/24 03:11, Özkan Göksu wrote:
> > Hello
> >
> > When I run cephfs-top, it causes an mgr module crash. Can you please
> > tell me the reason?
> >
> > My environment:
> > Ceph version: 17.2.6
> > Operating System: Ubuntu 22.04.2 LTS
> > Kernel: Linux 5.15.0-84-generic
> >
> > I created the cephfs-top user with the following command:
> >
> > ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r' > /etc/ceph/ceph.client.fstop.keyring
> >
> > This is the crash report:
> >
> > root@ud-01:~# ceph crash info 2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801
> > {
> >     "backtrace": [
> >         "  File \"/usr/share/ceph/mgr/stats/module.py\", line 32, in notify\n    self.fs_perf_stats.notify_cmd(notify_id)",
> >         "  File \"/usr/share/ceph/mgr/stats/fs/perf_stats.py\", line 177, in notify_cmd\n    metric_features = int(metadata[CLIENT_METADATA_KEY][\"metric_spec\"][\"metric_flags\"][\"feature_bits\"], 16)",
> >         "ValueError: invalid literal for int() with base 16: '0x'"
> >     ],
> >     "ceph_version": "17.2.6",
> >     "crash_id": "2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801",
> >     "entity_name": "mgr.ud-01.qycnol",
> >     "mgr_module": "stats",
> >     "mgr_module_caller": "ActivePyModule::notify",
> >     "mgr_python_exception": "ValueError",
> >     "os_id": "centos",
> >     "os_name": "CentOS Stream",
> >     "os_version": "8",
> >     "os_version_id": "8",
> >     "process_name": "ceph-mgr",
> >     "stack_sig": "971ae170f1fff7f7bc0b7ae86d164b2b0136a8bd5ca7956166ea5161e51ad42c",
> >     "timestamp": "2024-01-22T21:25:59.313305Z",
> >     "utsname_hostname": "ud-01",
> >     "utsname_machine": "x86_64",
> >     "utsname_release": "5.15.0-84-generic",
> >     "utsname_sysname": "Linux",
> >     "utsname_version": "#93-Ubuntu SMP Tue Sep 5 17:16:10 UTC 2023"
> > }
> >
> > Best regards.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
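[Editor's note: cephadm in Quincy documents "staggered upgrades", which can limit an upgrade to specific daemon types; since this crash is in an mgr module, that would in principle let the mgr/mon daemons pick up the fix first. The commands below are a sketch only — verify that your release supports the `--daemon-types` flag before running, and they require a live cluster.]

```shell
# Staggered upgrade limited to the daemon types that carry the stats
# module fix (supported by cephadm staggered upgrades in Quincy;
# confirm flag support on your exact release first):
ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.7 --daemon-types mgr,mon

# Watch progress:
ceph orch upgrade status

# The remaining daemons (osd, mds, ...) can be upgraded later in a
# maintenance window with a plain 'ceph orch upgrade start'.
```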
[ceph-users] Re: cephfs-top causes 16 mgr modules have recently crashed
Please have this fix: https://tracker.ceph.com/issues/59551. It's backported to quincy.

On 23/01/24 03:11, Özkan Göksu wrote:
> Hello
>
> When I run cephfs-top, it causes an mgr module crash. Can you please
> tell me the reason?
>
> My environment:
> Ceph version: 17.2.6
> Operating System: Ubuntu 22.04.2 LTS
> Kernel: Linux 5.15.0-84-generic
>
> I created the cephfs-top user with the following command:
>
> ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r' > /etc/ceph/ceph.client.fstop.keyring
>
> This is the crash report:
>
> root@ud-01:~# ceph crash info 2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801
> {
>     "backtrace": [
>         "  File \"/usr/share/ceph/mgr/stats/module.py\", line 32, in notify\n    self.fs_perf_stats.notify_cmd(notify_id)",
>         "  File \"/usr/share/ceph/mgr/stats/fs/perf_stats.py\", line 177, in notify_cmd\n    metric_features = int(metadata[CLIENT_METADATA_KEY][\"metric_spec\"][\"metric_flags\"][\"feature_bits\"], 16)",
>         "ValueError: invalid literal for int() with base 16: '0x'"
>     ],
>     "ceph_version": "17.2.6",
>     "crash_id": "2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801",
>     "entity_name": "mgr.ud-01.qycnol",
>     "mgr_module": "stats",
>     "mgr_module_caller": "ActivePyModule::notify",
>     "mgr_python_exception": "ValueError",
>     "os_id": "centos",
>     "os_name": "CentOS Stream",
>     "os_version": "8",
>     "os_version_id": "8",
>     "process_name": "ceph-mgr",
>     "stack_sig": "971ae170f1fff7f7bc0b7ae86d164b2b0136a8bd5ca7956166ea5161e51ad42c",
>     "timestamp": "2024-01-22T21:25:59.313305Z",
>     "utsname_hostname": "ud-01",
>     "utsname_machine": "x86_64",
>     "utsname_release": "5.15.0-84-generic",
>     "utsname_sysname": "Linux",
>     "utsname_version": "#93-Ubuntu SMP Tue Sep 5 17:16:10 UTC 2023"
> }
>
> Best regards.
[ceph-users] cephfs-top causes 16 mgr modules have recently crashed
Hello

When I run cephfs-top, it causes an mgr module crash. Can you please tell me the reason?

My environment:
Ceph version: 17.2.6
Operating System: Ubuntu 22.04.2 LTS
Kernel: Linux 5.15.0-84-generic

I created the cephfs-top user with the following command:

ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r' > /etc/ceph/ceph.client.fstop.keyring

This is the crash report:

root@ud-01:~# ceph crash info 2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801
{
    "backtrace": [
        "  File \"/usr/share/ceph/mgr/stats/module.py\", line 32, in notify\n    self.fs_perf_stats.notify_cmd(notify_id)",
        "  File \"/usr/share/ceph/mgr/stats/fs/perf_stats.py\", line 177, in notify_cmd\n    metric_features = int(metadata[CLIENT_METADATA_KEY][\"metric_spec\"][\"metric_flags\"][\"feature_bits\"], 16)",
        "ValueError: invalid literal for int() with base 16: '0x'"
    ],
    "ceph_version": "17.2.6",
    "crash_id": "2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801",
    "entity_name": "mgr.ud-01.qycnol",
    "mgr_module": "stats",
    "mgr_module_caller": "ActivePyModule::notify",
    "mgr_python_exception": "ValueError",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "8",
    "os_version_id": "8",
    "process_name": "ceph-mgr",
    "stack_sig": "971ae170f1fff7f7bc0b7ae86d164b2b0136a8bd5ca7956166ea5161e51ad42c",
    "timestamp": "2024-01-22T21:25:59.313305Z",
    "utsname_hostname": "ud-01",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-84-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#93-Ubuntu SMP Tue Sep 5 17:16:10 UTC 2023"
}

Best regards.
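[Editor's note: the failing line in the backtrace is easy to reproduce outside the mgr — a client reported `feature_bits` as a bare "0x" with no digits, which `int(..., 16)` rejects. A minimal sketch follows; `parse_feature_bits` is a hypothetical helper for illustration, not the actual Ceph code (the real fix is in tracker issue 59551).]

```python
# The exact failure from the backtrace: a bare "0x" is not a valid
# base-16 literal for int().
try:
    int("0x", 16)
except ValueError as e:
    print(e)  # invalid literal for int() with base 16: '0x'

# A defensive parse along the lines of what the fix needs to do
# (hypothetical helper, not the actual Ceph code):
def parse_feature_bits(raw: str) -> int:
    s = raw.strip().lower()
    if s in ("", "0x"):
        return 0  # treat "no bits reported" as zero feature bits
    return int(s, 16)

print(parse_feature_bits("0x"))    # 0
print(parse_feature_bits("0xff"))  # 255
```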
[ceph-users] Re: Degraded PGs on EC pool when marking an OSD out
On 2024/01/22 19:06, Frank Schilder wrote:
> You seem to have a problem with your crush rule(s):
>
> 14.3d ... [18,17,16,3,1,0,NONE,NONE,12]
>
> If you really just took out 1 OSD, having 2x NONE in the acting set
> indicates that your crush rule can't find valid mappings. You might need
> to tune crush tunables:
> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-pg/?highlight=crush%20gives%20up#troubleshooting-pgs

Look closely: that's the *acting* (second column) OSD set, not the *up* (first column) OSD set. It's supposed to be the *previous* set of OSDs assigned to that PG, but inexplicably some OSDs just "fall off" when the PGs get remapped. Simply waiting lets the data recover. At no point are any of my PGs actually missing OSDs according to the current cluster state, and CRUSH always finds a valid mapping. Rather, the problem is that the *previous* set of OSDs loses some entries for some reason. The same problem happens when I *add* an OSD to the cluster.

For example, right now, osd.15 is out. This is the state of one PG:

14.3d 1044 0 0 00 157307567310 0 1630 0 1630 active+clean 2024-01-22T20:15:46.684066+0900 15550'1630 15550:16184 [18,17,16,3,1,0,11,14,12] 18 [18,17,16,3,1,0,11,14,12] 18 15550'1629 2024-01-22T20:15:46.683491+0900 0'0 2024-01-08T15:18:21.654679+0900 02 periodic scrub scheduled @ 2024-01-31T07:34:27.297723+0900 10430

Note the OSD list ([18,17,16,3,1,0,11,14,12]). Then I bring osd.15 in, and:

14.3d 1044 0 1077 00 157307567310 0 1630 0 1630 active+recovery_wait+undersized+degraded+remapped 2024-01-22T22:52:22.700096+0900 15550'1630 15554:16163 [15,17,16,3,1,0,11,14,12] 15 [NONE,17,16,3,1,0,11,14,12] 17 15550'1629 2024-01-22T20:15:46.683491+0900 0'0 2024-01-08T15:18:21.654679+0900 02 periodic scrub scheduled @ 2024-01-31T02:31:53.342289+0900 10430

So somehow osd.18 "vanished" from the acting list ([NONE,17,16,3,1,0,11,14,12]) as it is being replaced by 15 in the new up list ([15,17,16,3,1,0,11,14,12]). The data is on osd.18, but somehow Ceph forgot.

> It is possible that your low OSD count causes the "crush gives up too
> soon" issue. You might also consider using a crush rule that places
> exactly 3 shards per host (examples were in posts just last week).
> Otherwise, it is not guaranteed that "... data remains available if a
> whole host goes down ..." because you might have 4 chunks on one of the
> hosts and fall below min_size (the failure domain of your crush rule for
> the EC profiles is OSD).

That should be what my CRUSH rule does. It picks 3 hosts, then picks 3 OSDs per host (IIUC). And oddly enough, everything works for the other EC pool even though it shares the same CRUSH rule (it just ignores one OSD slot).

> To test if your crush rules can generate valid mappings, you can pull
> the osdmap of your cluster and use osdmaptool to experiment with it
> without risk of destroying anything. It allows you to try different
> crush rules and failure scenarios on off-line but real cluster
> meta-data.

The CRUSH steady state isn't the issue here; it's the dynamic state while moving data that is the problem :)

> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> From: Hector Martin
> Sent: Friday, January 19, 2024 10:12 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Degraded PGs on EC pool when marking an OSD out
>
> I'm having a bit of a weird issue with cluster rebalances with a new EC
> pool. I have a 3-machine cluster, each machine with 4 HDD OSDs (+1 SSD).
> Until now I've been using an erasure-coded k=5 m=3 pool for most of my
> data. I've recently started to migrate to a k=5 m=4 pool, so I can
> configure the CRUSH rule to guarantee that data remains available if a
> whole host goes down (3 chunks per host, 9 total). I also moved the 5,3
> pool to this setup, although by nature I know its PGs will become
> inactive if a host goes down (need at least k+1 OSDs to be up).
>
> I've only just started migrating data to the 5,4 pool, but I've noticed
> that any time I trigger any kind of backfilling (e.g. take one OSD out),
> a bunch of PGs in the 5,4 pool become degraded (instead of just
> misplaced/backfilling). This always seems to happen on that pool only,
> and the object count is a significant fraction of the total pool object
> count (it's not just "a few recently written objects while PGs were
> repeering" or anything like that, I know about that effect).
>
> Here are the pools:
>
> pool 13 'cephfs2_data_hec5.3' erasure profile ec5.3 size 8 min_size 6 crush_rule 7 object_hash
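[Editor's note: the availability arithmetic being debated here can be sanity-checked in a few lines. This just restates the numbers from the pool, profile, and rule definitions quoted in this thread (k=5 m=4, min_size 6, 3 shards on each of 3 hosts) — it is not a CRUSH simulation.]

```python
# Shard math from the pool/profile definitions in this thread:
# k=5 data shards, m=4 coding shards, min_size 6, and the hdd-ec-x3
# rule placing 3 shards on each of 3 hosts.
k, m, min_size = 5, 4, 6
hosts, shards_per_host = 3, 3

# The rule's 9 slots exactly cover the 5,4 profile's shards.
assert k + m == hosts * shards_per_host

# Losing a whole host removes 3 shards; 6 remain, which still meets
# min_size, so the 5,4 pool's PGs stay active.
surviving = (hosts - 1) * shards_per_host
assert surviving >= min_size

# The older 5,3 pool (size 8, min_size 6) fills only 8 of the 9 slots,
# so one host can hold 3 of its shards; 8 - 3 = 5 < 6, and those PGs
# go inactive on a host failure, as noted in the original post.
assert 8 - shards_per_host < 6
```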
[ceph-users] Re: OSD read latency grows over time
Hi Mark, thank you for the prompt answer.

> The fact that changing the pg_num for the index pool drops the latency
> back down might be a clue. Do you have a lot of deletes happening on
> this cluster? If you have a lot of deletes and long pauses between
> writes, you could be accumulating tombstones that you have to keep
> iterating over during bucket listing.

What you describe looks very close to our case of periodic creation of checkpoints. Now it sounds like it could be our issue.

> Those get cleaned up during compaction. If there are no writes, you
> might not be compacting the tombstones away enough. Just a theory, but
> when you rearrange the PG counts, Ceph does a bunch of writes to move
> the data around, triggering compaction, and deleting the tombstones.
>
> In v17.2.7 we enabled a feature that automatically performs a compaction
> if too many tombstones are present during iteration in RocksDB. It
> might be worth upgrading to see if it helps (you might have to try
> tweaking the settings if the defaults aren't helping enough). The PR is
> here:
>
> https://github.com/ceph/ceph/pull/50893

Thank you very much for this idea! We'll upgrade the cluster to v17.2.7 and check whether it helps. If not, we'll try tuning the options you are referring to. Either way, I'll update the thread with the result.

Thank you once again for the well-explained suggestion, Mark!

--
Thank you,
Roman
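[Editor's note: for readers following along, once the cluster is on v17.2.7 the behavior from that PR is tunable. The sketch below is illustrative only — the option names come from the PR and should be double-checked with `ceph config help <option>` on your release, and the values shown are examples, not recommendations.]

```shell
# Tombstone-triggered compaction knobs from
# https://github.com/ceph/ceph/pull/50893 (verify names on your
# release with 'ceph config help' before setting):
ceph config set osd rocksdb_cf_compact_on_deletion true
ceph config set osd rocksdb_cf_compact_on_deletion_sliding_window 32768
ceph config set osd rocksdb_cf_compact_on_deletion_trigger 16384

# A one-off manual compaction can also clear existing tombstones
# immediately:
ceph tell osd.* compact
```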
[ceph-users] Re: Degraded PGs on EC pool when marking an OSD out
You seem to have a problem with your crush rule(s):

14.3d ... [18,17,16,3,1,0,NONE,NONE,12]

If you really just took out 1 OSD, having 2x NONE in the acting set indicates that your crush rule can't find valid mappings. You might need to tune crush tunables:
https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-pg/?highlight=crush%20gives%20up#troubleshooting-pgs

It is possible that your low OSD count causes the "crush gives up too soon" issue. You might also consider using a crush rule that places exactly 3 shards per host (examples were in posts just last week). Otherwise, it is not guaranteed that "... data remains available if a whole host goes down ..." because you might have 4 chunks on one of the hosts and fall below min_size (the failure domain of your crush rule for the EC profiles is OSD).

To test whether your crush rules can generate valid mappings, you can pull the osdmap of your cluster and use osdmaptool to experiment with it without risk of destroying anything. It allows you to try different crush rules and failure scenarios on offline but real cluster metadata.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

From: Hector Martin
Sent: Friday, January 19, 2024 10:12 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Degraded PGs on EC pool when marking an OSD out

I'm having a bit of a weird issue with cluster rebalances with a new EC pool. I have a 3-machine cluster, each machine with 4 HDD OSDs (+1 SSD). Until now I've been using an erasure-coded k=5 m=3 pool for most of my data. I've recently started to migrate to a k=5 m=4 pool, so I can configure the CRUSH rule to guarantee that data remains available if a whole host goes down (3 chunks per host, 9 total). I also moved the 5,3 pool to this setup, although by nature I know its PGs will become inactive if a host goes down (it needs at least k+1 OSDs to be up).

I've only just started migrating data to the 5,4 pool, but I've noticed that any time I trigger any kind of backfilling (e.g. take one OSD out), a bunch of PGs in the 5,4 pool become degraded (instead of just misplaced/backfilling). This always seems to happen on that pool only, and the object count is a significant fraction of the total pool object count (it's not just "a few recently written objects while PGs were repeering" or anything like that, I know about that effect).

Here are the pools:

pool 13 'cephfs2_data_hec5.3' erasure profile ec5.3 size 8 min_size 6 crush_rule 7 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode warn last_change 14133 lfor 0/11307/11305 flags hashpspool,ec_overwrites,bulk stripe_width 20480 application cephfs
pool 14 'cephfs2_data_hec5.4' erasure profile ec5.4 size 9 min_size 6 crush_rule 7 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode warn last_change 14509 lfor 0/0/14234 flags hashpspool,ec_overwrites,bulk stripe_width 20480 application cephfs

EC profiles:

# ceph osd erasure-code-profile get ec5.3
crush-device-class=
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=5
m=3
plugin=jerasure
technique=reed_sol_van
w=8

# ceph osd erasure-code-profile get ec5.4
crush-device-class=
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=5
m=4
plugin=jerasure
technique=reed_sol_van
w=8

They both use the same CRUSH rule, which is designed to select 9 OSDs balanced across the hosts (of which only 8 slots get used for the older 5,3 pool):

rule hdd-ec-x3 {
    id 7
    type erasure
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default class hdd
    step choose indep 3 type host
    step choose indep 3 type osd
    step emit
}

If I take out an OSD (14), I get something like this:

health: HEALTH_WARN
Degraded data redundancy: 37631/120155160 objects degraded (0.031%), 38 pgs degraded

All the degraded PGs are in the 5,4 pool, and the total object count is around 50k, so this is *most* of the data in the pool becoming degraded just because I marked an OSD out (without stopping it). If I mark the OSD in again, the degraded state goes away.

Example degraded PGs:

# ceph pg dump | grep degraded
dumped all
14.3c 812 0 838 00 119250277580 0 1088 0 1088 active+recovery_wait+undersized+degraded+remapped 2024-01-19T18:06:41.786745+0900 15440'1088 15486:10772 [18,17,16,1,3,2,11,13,12] 18 [18,17,16,1,3,2,11,NONE,12] 18 14537'432 2024-01-12T11:25:54.168048+0900 0'0 2024-01-08T15:18:21.654679+0900 02 periodic scrub scheduled @ 2024-01-21T08:00:23.572904+0900 2410
14.3d 772 0 1602 00 113032802230 0 1283 0 1283 active+recovery_wait+undersized+degraded+remapped
[ceph-users] Scrubbing?
Hello,

last week I got a HEALTH_OK on our Ceph cluster and I started upgrading the firmware in the network cards. When I had upgraded the sixth card of nine (one by one), that server didn't start correctly and our Proxmox had problems accessing disk images on Ceph.

rbd ls pool

was OK, but:

rbd ls pool -l

didn't work. Our virtual servers had trouble working with their disks.

After I resolved the network problem with the OSD server, everything returned to a normal state. But I found that every OSD node had very high activity: when I started 'iotop', there was a very high load, around 180 MB/s read and 20 MB/s write. At that time, the cluster was in the HEALTH_OK state. I found that there is massive scrubbing activity... After a few days, I still have around 90 MB/s read and 70 MB/s write on our OSD nodes, while 'ceph -s' reports client IO of 2.5 MB/s read and 50 MB/s write.

I found many lines in the log file of our mon server about scrubs starting, but there are many messages about starting a scrub of the same PG. I've grepped syslog for some of them and attached the result to this e-mail.

Is this activity OK? Why does Ceph start scrubbing this PG again and again?

And another question: is scrubbing part of the mClock scheduler?

Many thanks for an explanation.

Sincerely
Jan Marek
--
Ing. Jan Marek
University of South Bohemia
Academic Computer Centre
Phone: +420389032080
http://www.gnu.org/philosophy/no-word-attachments.cs.html

Jan 22 08:50:38 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:50:42 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:50:44 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:50:46 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:50:47 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:50:48 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:50:57 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:50:58 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:00 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:05 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:09 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:11 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:14 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:15 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:17 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:18 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:22 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:24 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:25 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:26 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:27 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:39 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:50 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:52 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:55 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:56 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:57 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:58 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:04 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:07 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:09 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:52:11 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:13 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:14 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:16 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:19 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:22 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:25 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:26 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:27 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:33 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:37 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:41 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:42 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:43 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:49 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:50 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:52 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:54 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:55 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:58 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:10 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:18 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:53:19 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:20 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:22 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:28 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:29 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:33 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:36 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:38 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:39 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:42 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:44
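[Editor's note: a few read-only commands that help investigate repeated "scrub starts" for one PG. These are standard Ceph commands, but the output shapes vary between releases, so treat this as a sketch to adapt rather than a recipe; they require a live cluster.]

```shell
# Last scrub/deep-scrub stamps and state for the PG in question:
ceph pg 1.15e query | grep -i scrub

# Scrub scheduling intervals currently in effect:
ceph config get osd osd_scrub_min_interval
ceph config get osd osd_scrub_max_interval
ceph config get osd osd_deep_scrub_interval

# Since Quincy, scrub IO is throttled by the mClock scheduler; the
# active profile can be checked with:
ceph config get osd osd_mclock_profile
```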