This cluster has a long history of being unhealthy, which means this issue
is not happening out of the blue.
root@ld3955:~# ceph -s
  cluster:
    id:     6b1b5117-6e08-4843-93d6-2da3cf8a6bae
    health: HEALTH_WARN
            1 MDSs report slow metadata IOs
            noscrub,nodeep-scrub flag(s) set
            Reduced data availability: 1 pg inactive, 1 pg down
            1 subtrees have overcommitted pool target_size_bytes
            1 subtrees have overcommitted pool target_size_ratio
            18 slow requests are blocked > 32 sec
            mons ld5505,ld5506 are low on available space

  services:
    mon: 3 daemons, quorum ld5505,ld5506,ld5507 (age 2h)
    mgr: ld5507(active, since 28h), standbys: ld5506, ld5505
    mds: cephfs:1 {0=ld4465=up:active} 1 up:standby
    osd: 441 osds: 438 up, 438 in
         flags noscrub,nodeep-scrub

  data:
    pools:   6 pools, 8432 pgs
    objects: 63.28M objects, 241 TiB
    usage:   723 TiB used, 796 TiB / 1.5 PiB avail
    pgs:     0.012% pgs not active
             8431 active+clean
             1    creating+down

  io:
    client: 33 MiB/s rd, 14.20k op/s rd, 0 op/s wr
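
For reference, a usual first diagnostic step for a PG stuck in creating+down is to query it directly on the cluster; the sketch below uses standard Ceph CLI subcommands, with the PG ID 59.1c and OSD 426 taken from the health output above (actual output depends on cluster state, so the live commands are shown as comments):

```shell
# On the live cluster one would typically run:
#   ceph pg 59.1c query    # peering/recovery state, which OSDs block it
#   ceph pg map 59.1c      # confirm the up/acting sets, here [426,438]
#   ceph osd find 426      # locate the acting primary's host
# Offline, the stuck-PG table from `ceph pg dump_stuck inactive` can be
# parsed to list each stuck PG with its state and acting set:
dump='PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY
59.1c creating+down [426,438] 426 [426,438] 426'
echo "$dump" | awk 'NR > 1 { print $1, $2, $5 }'
# -> 59.1c creating+down [426,438]
```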
On 15.11.2019 at 13:24, Wido den Hollander wrote:
>
> On 11/15/19 11:22 AM, Thomas Schneider wrote:
>> Hi,
>> ceph health is reporting: pg 59.1c is creating+down, acting [426,438]
>>
>> root@ld3955:~# ceph health detail
>> HEALTH_WARN 1 MDSs report slow metadata IOs; noscrub,nodeep-scrub
>> flag(s) set; Reduced data availability: 1 pg inactive, 1 pg down; 1
>> subtrees have overcommitted pool target_size_bytes; 1 subtrees have
>> overcommitted pool target_size_ratio; mons ld5505,ld5506 are low on
>> available space
>> MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
>> mdsld4465(mds.0): 8 slow metadata IOs are blocked > 30 secs, oldest
>> blocked for 120721 secs
>> OSDMAP_FLAGS noscrub,nodeep-scrub flag(s) set
>> PG_AVAILABILITY Reduced data availability: 1 pg inactive, 1 pg down
>> pg 59.1c is creating+down, acting [426,438]
>> MON_DISK_LOW mons ld5505,ld5506 are low on available space
>> mon.ld5505 has 22% avail
>> mon.ld5506 has 29% avail
>>
>> root@ld3955:~# ceph pg dump_stuck inactive
>> ok
>> PG_STAT STATE         UP        UP_PRIMARY ACTING    ACTING_PRIMARY
>> 59.1c   creating+down [426,438] 426        [426,438] 426
>>
>> How can I fix this?
> Did you change anything to the cluster?
>
> Can you share this output:
>
> $ ceph status
>
> It seems that more things are wrong with this system. This doesn't
> happen out of the blue. Something must have happened.
>
> Wido
>
>> THX
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]