And a third one:
health: HEALTH_WARN
1 MDSs report slow metadata IOs
1 MDSs report slow requests
2018-10-13 21:44:08.150722 mds.cloud1-1473 [WRN] 7 slow requests, 1 included below; oldest blocked for > 199.922552 secs
2018-10-13 21:44:08.150725 mds.cloud1-1473 [WRN] slow request 34.829662 seconds old, received at 2018-10-13 21:43:33.321031: client_request(client.216121228:929114 lookup #0x1/.active.lock 2018-10-13 21:43:33.321594 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
The relevant OSDs are again bluestore and are running at 100% I/O utilization:
iostat shows:
Device:  rrqm/s  wrqm/s     r/s    w/s      rkB/s   wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdi       77,00    0,00   580,00  97,00  511032,00  972,00   1512,57     14,88  22,05    24,57     6,97   1,48  100,00
So it is reading at ~500 MB/s, which completely saturates the OSD, and it
keeps doing so for > 10 minutes.
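
As a rough back-of-the-envelope check (a minimal Python sketch; it assumes
the 511032,00 column above is rkB/s, i.e. kilobytes read per second, as in
`iostat -x`, and takes the 10-minute duration from above):

# sanity check of the iostat figures above
read_kb_per_s = 511032.0        # rkB/s reported for sdi (assumed column)
duration_s = 10 * 60            # "> 10 minutes" of sustained reads

read_mb_per_s = read_kb_per_s / 1024
total_gb_read = read_kb_per_s * duration_s / 1024 / 1024

print(f"read throughput: ~{read_mb_per_s:.0f} MB/s")   # ~499 MB/s
print(f"read in 10 minutes: ~{total_gb_read:.0f} GB")  # ~292 GB

That is on the order of 300 GB read from a single OSD within those ten
minutes, which matches the sustained 100% utilization.
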
Greets,
Stefan
On 13.10.2018 at 21:29, Stefan Priebe - Profihost AG wrote:
>
> osd.19 is a bluestore OSD on a healthy 2TB SSD.
>
> Log of osd.19 is here:
> https://pastebin.com/raw/6DWwhS0A
>
> On 13.10.2018 at 21:20, Stefan Priebe - Profihost AG wrote:
>> Hi David,
>>
>> I think this should be the problem - from a new log from today:
>>
>> 2018-10-13 20:57:20.367326 mon.a [WRN] Health check update: 4 osds down (OSD_DOWN)
>> ...
>> 2018-10-13 20:57:41.268674 mon.a [WRN] Health check update: Reduced data availability: 3 pgs peering (PG_AVAILABILITY)
>> ...
>> 2018-10-13 20:58:08.684451 mon.a [WRN] Health check failed: 1 osds down (OSD_DOWN)
>> ...
>> 2018-10-13 20:58:22.841210 mon.a [WRN] Health check failed: Reduced data availability: 8 pgs inactive (PG_AVAILABILITY)
>> ...
>> 2018-10-13 20:58:47.570017 mon.a [WRN] Health check update: Reduced data availability: 5 pgs inactive (PG_AVAILABILITY)
>> ...
>> 2018-10-13 20:58:49.142108 osd.19 [WRN] Monitor daemon marked osd.19 down, but it is still running
>> 2018-10-13 20:58:53.750164 mon.a [WRN] Health check update: Reduced data availability: 3 pgs inactive (PG_AVAILABILITY)
>> ...
>>
>> So there is a timeframe of > 90s where PGs are inactive and unavailable -
>> this would at least explain the stalled I/O to me?
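>>
>> For instance, a quick way to put a number on that window (a minimal
>> Python sketch, simply diffing the first and last mon timestamps quoted
>> above - only a rough figure, since the log is elided in between):
>>
>> from datetime import datetime
>>
>> fmt = "%Y-%m-%d %H:%M:%S.%f"
>> first = datetime.strptime("2018-10-13 20:57:20.367326", fmt)
>> last = datetime.strptime("2018-10-13 20:58:53.750164", fmt)
>> # ~93.4 s between the first OSD_DOWN warning and the last quoted update
>> print((last - first).total_seconds())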
>>
>> Greets,
>> Stefan
>>
>>
>> On 12.10.2018 at 15:59, David Turner wrote:
>>> The number of PGs per OSD does not change unless the OSDs are marked
>>> out. You have noout set, so that doesn't change at all during this
>>> test. All of your PGs peered quickly at the beginning and were
>>> active+undersized for the rest of the time, you never had any blocked
>>> requests, and you always had 100MB/s+ client IO. I didn't see anything
>>> wrong with your cluster to indicate that your clients had any problems
>>> whatsoever accessing data.
>>>
>>> Can you confirm that you saw the same problems while you were running
>>> those commands? The next possibility would be that a client isn't
>>> getting an updated OSD map telling it that the host and its OSDs are
>>> down, and is stuck trying to communicate with host7. That could point
>>> to a problem with the client being unable to communicate with the
>>> Mons, maybe? Have you completely ruled out any network problems
>>> between all nodes and all of the IPs in the cluster? What does your
>>> client log show during these times?
>>>
>>> On Fri, Oct 12, 2018 at 8:35 AM Nils Fahldieck - Profihost AG
>>> <[email protected]> wrote:
>>>
>>> Hi, in our `ceph.conf` we have:
>>>
>>> mon_max_pg_per_osd = 300
>>>
>>> While the host is offline (9 OSDs down):
>>>
>>> 4352 PGs * 3 / 62 OSDs ~ 210 PGs per OSD
>>>
>>> If all OSDs are online:
>>>
>>> 4352 PGs * 3 / 71 OSDs ~ 183 PGs per OSD
>>>
>>> ... so this doesn't seem to be the issue.
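>>>
>>> (For reference, the same estimate in a few lines of Python, using the
>>> PG and OSD counts from this thread:)
>>>
>>> pgs, size, osds_total, osds_down = 4352, 3, 71, 9
>>> pg_instances = pgs * size                        # 13056 PG copies
>>> print(pg_instances // osds_total)                # 183 per OSD, all OSDs up
>>> print(pg_instances // (osds_total - osds_down))  # 210 per OSD, host down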
>>>
>>> If I understood you right, that's what you meant. If I got you wrong,
>>> would you mind pointing me to one of those threads you mentioned?
>>>
>>> Thanks :)
>>>
>>> On 12.10.2018 at 14:03, Burkhard Linke wrote:
>>> > Hi,
>>> >
>>> >
>>> > On 10/12/2018 01:55 PM, Nils Fahldieck - Profihost AG wrote:
>>> >> I rebooted a Ceph host and logged `ceph status` & `ceph health detail`
>>> >> every 5 seconds. During this I encountered 'PG_AVAILABILITY Reduced data
>>> >> availability: pgs peering'. At the same time some VMs hung as described
>>> >> before.
>>> >
>>> > Just a wild guess... you have 71 OSDs and about 4500 PGs with size=3,
>>> > i.e. 13500 PG instances overall, resulting in ~190 PGs per OSD under
>>> > normal circumstances.
>>> >
>>> > If one host is down and the PGs have to re-peer, you might reach the
>>> > limit of 200 PGs per OSD on some of the OSDs, resulting in stuck peering.
>>> >
>>> > You can try to raise this limit. There are several threads on the
>>> > mailing list about this.
>>> >
>>> > Regards,
>>> > Burkhard
>>> >