[ceph-users] Re: cephfs-top causes 16 mgr modules have recently crashed

2024-01-22 Thread Özkan Göksu
Hello Jos.
Thank you for the reply.

I can upgrade to 17.2.7, but I wonder: can I upgrade only the MON+MGR daemons
for this issue, or do I need to upgrade all components? Otherwise I will need
to wait a few weeks; I don't want to request a maintenance window during
delivery time. (A rough sketch of what I have in mind is below the version
list.)

root@ud-01:~# ceph orch upgrade ls
{
    "image": "quay.io/ceph/ceph",
    "registry": "quay.io",
    "bare_image": "ceph/ceph",
    "versions": [
        "18.2.1",
        "18.2.0",
        "18.1.3",
        "18.1.2",
        "18.1.1",
        "18.1.0",
        "17.2.7",
        "17.2.6",
        "17.2.5",
        "17.2.4",
        "17.2.3",
        "17.2.2",
        "17.2.1",
        "17.2.0"
    ]
}
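
If a staggered upgrade is possible on 17.2.6, I am thinking of something like
the following (untested on my side; the flags are the staggered-upgrade
options from the cephadm docs, so please correct me if they don't apply here):

ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.7 --daemon-types mgr
ceph orch upgrade status
ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.7 --daemon-types mon

That way only the mgr and mon daemons would move to 17.2.7 now, and the rest
of the cluster could follow after the delivery period.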

Best regards

Jos Collin wrote the following on Tue, 23 Jan 2024 at 07:42:

> Please see this fix: https://tracker.ceph.com/issues/59551. It has been
> backported to Quincy.
>
> On 23/01/24 03:11, Özkan Göksu wrote:
> > Hello
> >
> > When I run cephfs-top, it causes an mgr module crash. Can you please tell me
> > the reason?
> >
> > My environment:
> > Ceph version: 17.2.6
> > Operating System: Ubuntu 22.04.2 LTS
> > Kernel: Linux 5.15.0-84-generic
> >
> > I created the cephfs-top user with the following command:
> > ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd
> 'allow
> > r' mgr 'allow r' > /etc/ceph/ceph.client.fstop.keyring
> >
> > This is the crash report:
> >
> > root@ud-01:~# ceph crash info
> > 2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801
> > {
> >  "backtrace": [
> >  "  File \"/usr/share/ceph/mgr/stats/module.py\", line 32, in
> > notify\nself.fs_perf_stats.notify_cmd(notify_id)",
> >  "  File \"/usr/share/ceph/mgr/stats/fs/perf_stats.py\", line
> 177,
> > in notify_cmd\nmetric_features =
> >
> int(metadata[CLIENT_METADATA_KEY][\"metric_spec\"][\"metric_flags\"][\"feature_bits\"],
> > 16)",
> >  "ValueError: invalid literal for int() with base 16: '0x'"
> >  ],
> >  "ceph_version": "17.2.6",
> >  "crash_id":
> > "2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801",
> >  "entity_name": "mgr.ud-01.qycnol",
> >  "mgr_module": "stats",
> >  "mgr_module_caller": "ActivePyModule::notify",
> >  "mgr_python_exception": "ValueError",
> >  "os_id": "centos",
> >  "os_name": "CentOS Stream",
> >  "os_version": "8",
> >  "os_version_id": "8",
> >  "process_name": "ceph-mgr",
> >  "stack_sig":
> > "971ae170f1fff7f7bc0b7ae86d164b2b0136a8bd5ca7956166ea5161e51ad42c",
> >  "timestamp": "2024-01-22T21:25:59.313305Z",
> >  "utsname_hostname": "ud-01",
> >  "utsname_machine": "x86_64",
> >  "utsname_release": "5.15.0-84-generic",
> >  "utsname_sysname": "Linux",
> >  "utsname_version": "#93-Ubuntu SMP Tue Sep 5 17:16:10 UTC 2023"
> > }
> >
> >
> > Best regards.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs-top causes 16 mgr modules have recently crashed

2024-01-22 Thread Jos Collin
Please see this fix: https://tracker.ceph.com/issues/59551. It has been
backported to Quincy.
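
For reference, the crash comes from the stats module trying to parse the
client's metric feature bits: the client reported just "0x", and int(..., 16)
cannot parse a hex string with no digits. A one-line reproduction of the same
failure (illustration only, not the actual module code path):

python3 -c "int('0x', 16)"

That raises exactly the ValueError shown in the crash report; the tracker fix
above handles this case.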


On 23/01/24 03:11, Özkan Göksu wrote:

Hello

When I run cephfs-top, it causes an mgr module crash. Can you please tell me
the reason?

My environment:
Ceph version: 17.2.6
Operating System: Ubuntu 22.04.2 LTS
Kernel: Linux 5.15.0-84-generic

I created the cephfs-top user with the following command:
ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow
r' mgr 'allow r' > /etc/ceph/ceph.client.fstop.keyring

This is the crash report:

root@ud-01:~# ceph crash info
2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801
{
 "backtrace": [
 "  File \"/usr/share/ceph/mgr/stats/module.py\", line 32, in
notify\nself.fs_perf_stats.notify_cmd(notify_id)",
 "  File \"/usr/share/ceph/mgr/stats/fs/perf_stats.py\", line 177,
in notify_cmd\nmetric_features =
int(metadata[CLIENT_METADATA_KEY][\"metric_spec\"][\"metric_flags\"][\"feature_bits\"],
16)",
 "ValueError: invalid literal for int() with base 16: '0x'"
 ],
 "ceph_version": "17.2.6",
 "crash_id":
"2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801",
 "entity_name": "mgr.ud-01.qycnol",
 "mgr_module": "stats",
 "mgr_module_caller": "ActivePyModule::notify",
 "mgr_python_exception": "ValueError",
 "os_id": "centos",
 "os_name": "CentOS Stream",
 "os_version": "8",
 "os_version_id": "8",
 "process_name": "ceph-mgr",
 "stack_sig":
"971ae170f1fff7f7bc0b7ae86d164b2b0136a8bd5ca7956166ea5161e51ad42c",
 "timestamp": "2024-01-22T21:25:59.313305Z",
 "utsname_hostname": "ud-01",
 "utsname_machine": "x86_64",
 "utsname_release": "5.15.0-84-generic",
 "utsname_sysname": "Linux",
 "utsname_version": "#93-Ubuntu SMP Tue Sep 5 17:16:10 UTC 2023"
}


Best regards.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephfs-top causes 16 mgr modules have recently crashed

2024-01-22 Thread Özkan Göksu
Hello

When I run cephfs-top, it causes an mgr module crash. Can you please tell me
the reason?

My environment:
Ceph version: 17.2.6
Operating System: Ubuntu 22.04.2 LTS
Kernel: Linux 5.15.0-84-generic

I created the cephfs-top user with the following command:
ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow
r' mgr 'allow r' > /etc/ceph/ceph.client.fstop.keyring
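
For completeness, the stats mgr module is enabled and cephfs-top is started
with the client id created above, i.e. roughly:

ceph mgr module enable stats
cephfs-top --id fstop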

This is the crash report:

root@ud-01:~# ceph crash info
2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801
{
    "backtrace": [
        "  File \"/usr/share/ceph/mgr/stats/module.py\", line 32, in notify\nself.fs_perf_stats.notify_cmd(notify_id)",
        "  File \"/usr/share/ceph/mgr/stats/fs/perf_stats.py\", line 177, in notify_cmd\nmetric_features = int(metadata[CLIENT_METADATA_KEY][\"metric_spec\"][\"metric_flags\"][\"feature_bits\"], 16)",
        "ValueError: invalid literal for int() with base 16: '0x'"
    ],
    "ceph_version": "17.2.6",
    "crash_id": "2024-01-22T21:25:59.313305Z_526253e3-e8cc-4d2c-adcb-69a7c9986801",
    "entity_name": "mgr.ud-01.qycnol",
    "mgr_module": "stats",
    "mgr_module_caller": "ActivePyModule::notify",
    "mgr_python_exception": "ValueError",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "8",
    "os_version_id": "8",
    "process_name": "ceph-mgr",
    "stack_sig": "971ae170f1fff7f7bc0b7ae86d164b2b0136a8bd5ca7956166ea5161e51ad42c",
    "timestamp": "2024-01-22T21:25:59.313305Z",
    "utsname_hostname": "ud-01",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-84-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#93-Ubuntu SMP Tue Sep 5 17:16:10 UTC 2023"
}


Best regards.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Degraded PGs on EC pool when marking an OSD out

2024-01-22 Thread Hector Martin
On 2024/01/22 19:06, Frank Schilder wrote:
> You seem to have a problem with your crush rule(s):
> 
> 14.3d ... [18,17,16,3,1,0,NONE,NONE,12]
> 
> If you really just took out 1 OSD, having 2xNONE in the acting set indicates 
> that your crush rule can't find valid mappings. You might need to tune crush 
> tunables: 
> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-pg/?highlight=crush%20gives%20up#troubleshooting-pgs

Look closely: that's the *acting* (second column) OSD set, not the *up*
(first column) OSD set. It's supposed to be the *previous* set of OSDs
assigned to that PG, but inexplicably some OSDs just "fall off" when the
PGs get remapped around.

Simply waiting lets the data recover. At no point are any of my PGs
actually missing OSDs according to the current cluster state, and CRUSH
always finds a valid mapping. Rather, the problem is that the *previous*
set of OSDs just loses some entries for some reason.

The same problem happens when I *add* an OSD to the cluster. For
example, right now, osd.15 is out. This is the state of one pg:

14.3d   1044   0 0  00
157307567310   0  1630 0  1630
active+clean  2024-01-22T20:15:46.684066+0900 15550'1630
15550:16184  [18,17,16,3,1,0,11,14,12]  18
[18,17,16,3,1,0,11,14,12]  18 15550'1629
2024-01-22T20:15:46.683491+0900  0'0
2024-01-08T15:18:21.654679+0900  02
periodic scrub scheduled @ 2024-01-31T07:34:27.297723+0900
10430

Note the OSD list ([18,17,16,3,1,0,11,14,12])

Then I bring osd.15 in and:

14.3d   1044   0  1077  00
157307567310   0  1630 0  1630
active+recovery_wait+undersized+degraded+remapped
2024-01-22T22:52:22.700096+0900 15550'1630 15554:16163
[15,17,16,3,1,0,11,14,12]  15[NONE,17,16,3,1,0,11,14,12]
 17 15550'1629  2024-01-22T20:15:46.683491+0900
0'0  2024-01-08T15:18:21.654679+0900  02
 periodic scrub scheduled @ 2024-01-31T02:31:53.342289+0900
 10430

So somehow osd.18 "vanished" from the acting list
([NONE,17,16,3,1,0,11,14,12]) as it is being replaced by 15 in the new
up list ([15,17,16,3,1,0,11,14,12]). The data is in osd.18, but somehow
Ceph forgot.
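
(In case anyone wants to check a single PG directly, the up and acting sets
can also be read with, for example:

ceph pg map 14.3d
ceph pg 14.3d query

using the pg id from the dump above.)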

> 
> It is possible that your low OSD count causes the "crush gives up too soon" 
> issue. You might also consider to use a crush rule that places exactly 3 
> shards per host (examples were in posts just last week). Otherwise, it is not 
> guaranteed that "... data remains available if a whole host goes down ..." 
> because you might have 4 chunks on one of the hosts and fall below min_size 
> (the failure domain of your crush rule for the EC profiles is OSD).

That should be what my CRUSH rule does. It picks 3 hosts and then picks 3
OSDs per host (IIUC). And oddly enough everything works for the other EC
pool, even though it shares the same CRUSH rule (it just ignores one of the
nine OSD slots).

> To test if your crush rules can generate valid mappings, you can pull the 
> osdmap of your cluster and use osdmaptool to experiment with it without risk 
> of destroying anything. It allows you to try different crush rules and 
> failure scenarios on off-line but real cluster meta-data.

CRUSH steady state isn't the issue here, it's the dynamic state when
moving data that is the problem :)

> 
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> 
> 
> From: Hector Martin 
> Sent: Friday, January 19, 2024 10:12 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Degraded PGs on EC pool when marking an OSD out
> 
> I'm having a bit of a weird issue with cluster rebalances with a new EC
> pool. I have a 3-machine cluster, each machine with 4 HDD OSDs (+1 SSD).
> Until now I've been using an erasure coded k=5 m=3 pool for most of my
> data. I've recently started to migrate to a k=5 m=4 pool, so I can
> configure the CRUSH rule to guarantee that data remains available if a
> whole host goes down (3 chunks per host, 9 total). I also moved the 5,3
> pool to this setup, although by nature I know its PGs will become
> inactive if a host goes down (need at least k+1 OSDs to be up).
> 
> I've only just started migrating data to the 5,4 pool, but I've noticed
> that any time I trigger any kind of backfilling (e.g. take one OSD out),
> a bunch of PGs in the 5,4 pool become degraded (instead of just
> misplaced/backfilling). This always seems to happen on that pool only,
> and the object count is a significant fraction of the total pool object
> count (it's not just "a few recently written objects while PGs were
> repeering" or anything like that, I know about that effect).
> 
> Here are the pools:
> 
> pool 13 'cephfs2_data_hec5.3' erasure profile ec5.3 size 8 min_size 6
> crush_rule 7 object_hash 

[ceph-users] Re: OSD read latency grows over time

2024-01-22 Thread Roman Pashin
Hi Mark, thank you for the prompt answer.

> The fact that changing the pg_num for the index pool drops the latency
> back down might be a clue.  Do you have a lot of deletes happening on
> this cluster?  If you have a lot of deletes and long pauses between
> writes, you could be accumulating tombstones that you have to keep
> iterating over during bucket listing.

What you describe looks very close to our case of periodic checkpoint
creation. Now it sounds like that could indeed be our issue.

> Those get cleaned up during
> compaction.  If there are no writes, you might not be compacting the
> tombstones away enough.  Just a theory, but when you rearrange the PG
> counts, Ceph does a bunch of writes to move the data around, triggering
> compaction, and deleting the tombstones.
>
> In v17.2.7 we enabled a feature that automatically performs a compaction
> if too many tombstones are present during iteration in RocksDB.  It
> might be worth upgrading to see if it helps (you might have to try
> tweaking the settings if the defaults aren't helping enough).  The PR is
> here:
>
> https://github.com/ceph/ceph/pull/50893
>
Thank you very much for this idea! We'll upgrade the cluster to v17.2.7 and
check whether it helps. If not, we'll try tuning the options you are
referring to. Either way, I'll update the thread with the result.
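
In the meantime, if we understand the tombstone explanation correctly, a
manual compaction of the affected OSDs should also clear tombstones that
have already accumulated; something like this, per OSD (untested on our
side, the osd id is a placeholder):

ceph tell osd.<id> compact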

Thank you once again for the well-explained suggestion, Mark!

--
Thank you,
Roman
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Degraded PGs on EC pool when marking an OSD out

2024-01-22 Thread Frank Schilder
You seem to have a problem with your crush rule(s):

14.3d ... [18,17,16,3,1,0,NONE,NONE,12]

If you really just took out 1 OSD, having 2xNONE in the acting set indicates 
that your crush rule can't find valid mappings. You might need to tune crush 
tunables: 
https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-pg/?highlight=crush%20gives%20up#troubleshooting-pgs

It is possible that your low OSD count causes the "crush gives up too soon" 
issue. You might also consider using a crush rule that places exactly 3 shards
per host (examples were in posts just last week). Otherwise, it is not 
guaranteed that "... data remains available if a whole host goes down ..." 
because you might have 4 chunks on one of the hosts and fall below min_size 
(the failure domain of your crush rule for the EC profiles is OSD).

To test whether your crush rules can generate valid mappings, you can pull the
osdmap of your cluster and use osdmaptool to experiment with it without risk of
destroying anything. It allows you to try different crush rules and failure
scenarios offline, but against real cluster metadata.
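
A rough sketch of that workflow (file names and the pool id are placeholders):

ceph osd getmap -o /tmp/osdmap
osdmaptool /tmp/osdmap --test-map-pgs-dump --pool <ec-pool-id>
osdmaptool /tmp/osdmap --export-crush /tmp/crush.bin
crushtool -d /tmp/crush.bin -o /tmp/crush.txt
crushtool -c /tmp/crush.txt -o /tmp/crush.new
osdmaptool /tmp/osdmap --import-crush /tmp/crush.new --test-map-pgs-dump --pool <ec-pool-id>

Edit the decompiled rule in /tmp/crush.txt between the decompile and compile
steps, then re-run the last command to see how the modified rule maps the PGs.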

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Hector Martin 
Sent: Friday, January 19, 2024 10:12 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Degraded PGs on EC pool when marking an OSD out

I'm having a bit of a weird issue with cluster rebalances with a new EC
pool. I have a 3-machine cluster, each machine with 4 HDD OSDs (+1 SSD).
Until now I've been using an erasure coded k=5 m=3 pool for most of my
data. I've recently started to migrate to a k=5 m=4 pool, so I can
configure the CRUSH rule to guarantee that data remains available if a
whole host goes down (3 chunks per host, 9 total). I also moved the 5,3
pool to this setup, although by nature I know its PGs will become
inactive if a host goes down (need at least k+1 OSDs to be up).

I've only just started migrating data to the 5,4 pool, but I've noticed
that any time I trigger any kind of backfilling (e.g. take one OSD out),
a bunch of PGs in the 5,4 pool become degraded (instead of just
misplaced/backfilling). This always seems to happen on that pool only,
and the object count is a significant fraction of the total pool object
count (it's not just "a few recently written objects while PGs were
repeering" or anything like that, I know about that effect).

Here are the pools:

pool 13 'cephfs2_data_hec5.3' erasure profile ec5.3 size 8 min_size 6
crush_rule 7 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode
warn last_change 14133 lfor 0/11307/11305 flags
hashpspool,ec_overwrites,bulk stripe_width 20480 application cephfs
pool 14 'cephfs2_data_hec5.4' erasure profile ec5.4 size 9 min_size 6
crush_rule 7 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode
warn last_change 14509 lfor 0/0/14234 flags
hashpspool,ec_overwrites,bulk stripe_width 20480 application cephfs

EC profiles:

# ceph osd erasure-code-profile get ec5.3
crush-device-class=
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=5
m=3
plugin=jerasure
technique=reed_sol_van
w=8

# ceph osd erasure-code-profile get ec5.4
crush-device-class=
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=5
m=4
plugin=jerasure
technique=reed_sol_van
w=8

They both use the same CRUSH rule, which is designed to select 9 OSDs
balanced across the hosts (of which only 8 slots get used for the older
5,3 pool):

rule hdd-ec-x3 {
    id 7
    type erasure
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default class hdd
    step choose indep 3 type host
    step choose indep 3 type osd
    step emit
}

If I take out an OSD (14), I get something like this:

health: HEALTH_WARN
Degraded data redundancy: 37631/120155160 objects degraded
(0.031%), 38 pgs degraded

All the degraded PGs are in the 5,4 pool, and the total object count is
around 50k, so this is *most* of the data in the pool becoming degraded
just because I marked an OSD out (without stopping it). If I mark the
OSD in again, the degraded state goes away.
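
To be clear, "take out" above means just the plain out/in commands; nothing
else is stopped or restarted:

ceph osd out 14
ceph osd in 14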

Example degraded PGs:

# ceph pg dump | grep degraded
dumped all
14.3c812   0   838  00
119250277580   0  1088 0  1088
active+recovery_wait+undersized+degraded+remapped
2024-01-19T18:06:41.786745+0900 15440'1088 15486:10772
[18,17,16,1,3,2,11,13,12]  18[18,17,16,1,3,2,11,NONE,12]
 18  14537'432  2024-01-12T11:25:54.168048+0900
0'0  2024-01-08T15:18:21.654679+0900  02
 periodic scrub scheduled @ 2024-01-21T08:00:23.572904+0900
  2410
14.3d772   0  1602  00
113032802230   0  1283 0  1283
active+recovery_wait+undersized+degraded+remapped

[ceph-users] Scrubbing?

2024-01-22 Thread Jan Marek
Hello,

last week our Ceph cluster reached HEALTH_OK and I started
upgrading the firmware in the network cards.

When I had upgraded the sixth card of nine (one by one), this
server didn't start correctly and our Proxmox had problems
accessing disk images on Ceph.

rbd ls pool

was OK, but:

rbd ls pool -l

didn't work. Our virtual servers had trouble working with their
disks.

After I resolved the network problem with the OSD server, everything
returned to normal.

But I've found that every OSD node has very high activity: when
I started 'iotop', there was a very high load, around 180 MB/s
read and 20 MB/s write. At that time the cluster was in the
HEALTH_OK state. I found that there is massive scrubbing
activity...

After a few days, our OSD nodes still show around 90 MB/s read and
70 MB/s write, while 'ceph -s' reports client I/O of only 2.5 MB/s
read and 50 MB/s write.

In the log file of our mon server I've found many lines about
scrubs starting, but there are many messages about starting a
scrub of the same PG. I've grepped syslog for some of them and
attached the result to this e-mail.

Is this activity OK? Why does Ceph start scrubbing this PG again
and again?

And another question: Is scrubbing part of the mClock scheduler?
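
(In case it helps to answer: I believe the active scheduler and profile can
be checked with the commands below; the option names assume Quincy or newer,
so please correct me if they are wrong.)

ceph config get osd osd_op_queue
ceph config get osd osd_mclock_profile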

Many thanks for explanation.

Sincerely
Jan Marek
-- 
Ing. Jan Marek
University of South Bohemia
Academic Computer Centre
Phone: +420389032080
http://www.gnu.org/philosophy/no-word-attachments.cs.html
Jan 22 08:50:38 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:50:42 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:50:44 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:50:46 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:50:47 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:50:48 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:50:57 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:50:58 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:00 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:05 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:09 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:11 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:14 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:15 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:17 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:18 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:22 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:24 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:25 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:26 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:27 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:39 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:50 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:52 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:55 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:56 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:57 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:51:58 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:04 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:07 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:09 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:52:11 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:13 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:14 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:16 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:19 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:22 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:25 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:26 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:27 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:33 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:37 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:41 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:42 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:43 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:49 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:50 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:52 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:54 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:55 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:52:58 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:10 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:18 mon1 ceph-mon[1649]: 1.15e deep-scrub starts
Jan 22 08:53:19 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:20 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:22 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:28 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:29 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:33 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:36 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:38 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:39 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:42 mon1 ceph-mon[1649]: 1.15e scrub starts
Jan 22 08:53:44