Re: [ceph-users] slow requests due to scrubbing of very small pg
Hi Lukasz,

If this is filestore then most probably my comments are irrelevant; the issue I expected is BlueStore-specific. Unfortunately I'm not an expert in filestore, hence unable to help in further investigation. Sorry...

Thanks,
Igor

On 7/9/2019 11:39 AM, Luk wrote:
> We still have filestore on these OSDs.
>
> Regards
> Lukasz
>
>> Hi Igor,
>> [...]
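For completeness, one quick way to confirm which object store a given OSD is actually running is to query the OSD metadata. This is a generic Ceph command rather than something suggested in the thread, and osd.118 is simply the OSD discussed here:

    # Check which object store an OSD runs (filestore vs bluestore).
    # osd.118 is the OSD from this thread; substitute the relevant id.
    ceph osd metadata 118 | grep osd_objectstore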
Re: [ceph-users] slow requests due to scrubbing of very small pg
We still have filestore on these OSDs.

Regards
Lukasz

> Hi Igor,
> Thank you for your input, I will try your suggestion with ceph-objectstore-tool.
> But for now it looks like the main problem is this:
>
> 2019-07-09 09:29:25.410839 7f5e4b64f700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f5e20e87700' had timed out after 15
> 2019-07-09 09:29:25.410842 7f5e4b64f700  1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7f5e41651700' had timed out after 60
>
> After (a lot of) these logs the OSD becomes unresponsive to the monitors and is marked down for a few seconds/minutes; sometimes it hits the suicide timeout:
>
> 2019-07-09 09:29:32.271361 7f5e3d649700  0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.118 down, but it is still running
> 2019-07-09 09:29:32.271381 7f5e3d649700  0 log_channel(cluster) log [DBG] : map e71903 wrongly marked me down at e71902
> 2019-07-09 09:29:32.271393 7f5e3d649700  1 osd.118 71903 start_waiting_for_healthy
>
> Maybe you (or any other Cepher) know how to deal with this problem?
>
> Regards
> Lukasz
>
>> Hi Lukasz,
>> [...]

--
Regards,
Luk
Re: [ceph-users] slow requests due to scrubbing of very small pg
Hi Igor,

Thank you for your input, I will try your suggestion with ceph-objectstore-tool.

But for now it looks like the main problem is this:

2019-07-09 09:29:25.410839 7f5e4b64f700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f5e20e87700' had timed out after 15
2019-07-09 09:29:25.410842 7f5e4b64f700  1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7f5e41651700' had timed out after 60

After (a lot of) these logs the OSD becomes unresponsive to the monitors and is marked down for a few seconds/minutes; sometimes it hits the suicide timeout:

2019-07-09 09:29:32.271361 7f5e3d649700  0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.118 down, but it is still running
2019-07-09 09:29:32.271381 7f5e3d649700  0 log_channel(cluster) log [DBG] : map e71903 wrongly marked me down at e71902
2019-07-09 09:29:32.271393 7f5e3d649700  1 osd.118 71903 start_waiting_for_healthy

Maybe you (or any other Cepher) know how to deal with this problem?

Regards
Lukasz

> Hi Lukasz,
> I've seen something like that - slow requests and relevant OSD reboots on suicide timeout - at least twice with two different clusters. The root cause was slow omap listing for some objects, which had started to happen after massive removals from RocksDB.
> To verify if this is the case you can create a script that uses ceph-objectstore-tool to list objects for the specific PG and then list-omap for every returned object.
> If omap listing for some object(s) takes too long (minutes in my case) - you're facing the same issue.
> A PR that implements automatic lookup for such "slow" objects in ceph-objectstore-tool is under review:
> https://github.com/ceph/ceph/pull/27985
> The only known workaround for existing OSDs so far is manual DB compaction. And https://github.com/ceph/ceph/pull/27627 hopefully fixes the issue for newly deployed OSDs.
> Relevant upstream tickets are:
> http://tracker.ceph.com/issues/36482
> http://tracker.ceph.com/issues/40557
> Hope this helps,
> Igor
> On 7/3/2019 9:54 AM, Luk wrote:
>> [...]

--
Regards,
Luk
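The "manual DB compaction" workaround Igor refers to in the quoted reply can, on a filestore OSD, be applied to the OSD's omap store. The following is only a hedged sketch: the omap path, the backend type (leveldb vs rocksdb, governed by filestore_omap_backend), and whether compaction actually helps on this particular cluster are all assumptions to verify first, and the OSD must be stopped before touching its store:

    # Offline compaction of a filestore OSD's omap store - a sketch, not a
    # tested procedure for this cluster.
    systemctl stop ceph-osd@118

    # Use "rocksdb" instead of "leveldb" if filestore_omap_backend is rocksdb.
    ceph-kvstore-tool leveldb /var/lib/ceph/osd/ceph-118/current/omap compact

    systemctl start ceph-osd@118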
Re: [ceph-users] slow requests due to scrubbing of very small pg
Hi Lukasz,

I've seen something like that - slow requests and relevant OSD reboots on suicide timeout - at least twice with two different clusters. The root cause was slow omap listing for some objects, which had started to happen after massive removals from RocksDB.

To verify if this is the case you can create a script that uses ceph-objectstore-tool to list objects for the specific PG and then list-omap for every returned object. If omap listing for some object(s) takes too long (minutes in my case) - you're facing the same issue.

A PR that implements automatic lookup for such "slow" objects in ceph-objectstore-tool is under review:
https://github.com/ceph/ceph/pull/27985

The only known workaround for existing OSDs so far is manual DB compaction. And https://github.com/ceph/ceph/pull/27627 hopefully fixes the issue for newly deployed OSDs.

Relevant upstream tickets are:
http://tracker.ceph.com/issues/36482
http://tracker.ceph.com/issues/40557

Hope this helps,
Igor

On 7/3/2019 9:54 AM, Luk wrote:
> Hello,
>
> I have a strange problem with scrubbing.
>
> When scrubbing starts on a PG which belongs to the default.rgw.buckets.index pool, I can see that this OSD is very busy (see attachment) and starts showing many slow requests; after the scrubbing of this PG stops, the slow requests stop immediately.
>
> [root@stor-b02 /var/lib/ceph/osd/ceph-118/current]# zgrep scrub /var/log/ceph/ceph-osd.118.log.1.gz | grep -w 20.2
> 2019-07-03 00:14:57.496308 7fd4c7a09700  0 log_channel(cluster) log [DBG] : 20.2 deep-scrub starts
> 2019-07-03 05:36:13.274637 7fd4ca20e700  0 log_channel(cluster) log [DBG] : 20.2 deep-scrub ok
> [root@stor-b02 /var/lib/ceph/osd/ceph-118/current]#
>
> [root@stor-b02 /var/lib/ceph/osd/ceph-118/current]# du -sh 20.2_*
> 636K    20.2_head
> 0       20.2_TEMP
> [root@stor-b02 /var/lib/ceph/osd/ceph-118/current]# ls -1 -R 20.2_head | wc -l
> 4125
> [root@stor-b02 /var/lib/ceph/osd/ceph-118/current]#
>
> and on mon:
>
> 2019-07-03 00:48:44.793893 mon.ceph-mon-01 mon.0 10.10.8.221:6789/0 6231090 : cluster [WRN] Health check failed: 105 slow requests are blocked > 32 sec. Implicated osds 118 (REQUEST_SLOW)
> 2019-07-03 00:48:54.086446 mon.ceph-mon-01 mon.0 10.10.8.221:6789/0 6231097 : cluster [WRN] Health check update: 102 slow requests are blocked > 32 sec. Implicated osds 118 (REQUEST_SLOW)
> 2019-07-03 00:48:59.088240 mon.ceph-mon-01 mon.0 10.10.8.221:6789/0 6231099 : cluster [WRN] Health check update: 91 slow requests are blocked > 32 sec. Implicated osds 118 (REQUEST_SLOW)
>
> [...]
>
> 2019-07-03 05:36:19.695586 mon.ceph-mon-01 mon.0 10.10.8.221:6789/0 6243211 : cluster [INF] Health check cleared: REQUEST_SLOW (was: 23 slow requests are blocked > 32 sec. Implicated osds 118)
> 2019-07-03 05:36:19.695700 mon.ceph-mon-01 mon.0 10.10.8.221:6789/0 6243212 : cluster [INF] Cluster is now healthy
>
> ceph version 12.2.9
>
> Might it be related to this (taken from https://ceph.com/releases/v12-2-11-luminous-released/)?
>
> "There have been fixes to RGW dynamic and manual resharding, which no longer leaves behind stale bucket instances to be removed manually. For finding and cleaning up older instances from a reshard, a radosgw-admin command reshard stale-instances list and reshard stale-instances rm should do the necessary cleanup."
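As a rough illustration of the check Igor describes, a script along these lines could time list-omap for every object in the suspect PG. It is only a sketch: the data path and PG id below are assumptions to adjust, the OSD must be stopped while ceph-objectstore-tool runs, and filestore OSDs may additionally need --journal-path on each call:

    #!/bin/bash
    # Sketch: time omap listing for every object in one PG to spot "slow" objects.
    # Run on the OSD host with the OSD stopped.
    OSD_PATH=/var/lib/ceph/osd/ceph-118   # assumed path - adjust
    PGID=20.2                             # assumed PG - adjust

    ceph-objectstore-tool --data-path "$OSD_PATH" --pgid "$PGID" --op list |
    while read -r obj; do
        start=$(date +%s)
        # list-omap prints the object's omap keys; we only care how long it takes
        ceph-objectstore-tool --data-path "$OSD_PATH" "$obj" list-omap > /dev/null
        echo "$(( $(date +%s) - start ))s $obj"
    done | sort -rn

Objects that take minutes rather than seconds would point at the issue Igor describes; the PR linked above (https://github.com/ceph/ceph/pull/27985) aims to provide this lookup natively in ceph-objectstore-tool.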
Re: [ceph-users] slow requests due to scrubbing of very small pg
On Wed, Jul 3, 2019 at 4:47 PM Luk wrote:
>
> this pool is that 'big':
>
> [root@ceph-mon-01 ~]# rados df | grep -e index -e WR
> POOL_NAME                 USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED    RD_OPS      RD     WR_OPS WR
> default.rgw.buckets.index   0B   31763      0  95289                  0       0        0 147679875 3.88TiB 8014040021 0B
>

You are running Luminous, which doesn't account metadata properly here. 31763 objects in an rgw index pool is a non-trivial size. However, scrubbing that shouldn't be a problem for a properly configured cluster (i.e., no gigantic shards).

Have you tried to increase osd_scrub_sleep?

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

> [root@ceph-mon-01 ~]# ceph df | grep -e index -e N
> NAME                      ID USED %USED MAX AVAIL OBJECTS
> default.rgw.buckets.index 20   0B     0   37.0TiB   31763
> [root@ceph-mon-01 ~]#
>
> I don't think it should have such a big problem during scrubbing...
>
> --
> Regards
> Lukasz
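Paul's reply suggests two things one might check; both sketches below are generic and hedged. The osd_scrub_sleep value is purely illustrative (the right number depends on the cluster, and depending on the exact Luminous point release the injected change may only take full effect after an OSD restart), and the shard check is simply a way to look for the "gigantic shards" he mentions:

    # Throttle scrubbing by sleeping between scrub chunks (example value only).
    ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
    # To persist it across restarts, add to the [osd] section of ceph.conf:
    #   osd scrub sleep = 0.1

    # Look for oversized bucket index shards (objects per shard, fill status).
    radosgw-admin bucket limit check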
Re: [ceph-users] slow requests due to scrubbing of very small pg
On Wed, 3 Jul 2019 at 09:01, Luk wrote:

> Hello,
>
> I have a strange problem with scrubbing.
>
> When scrubbing starts on a PG which belongs to the default.rgw.buckets.index pool, I can see that this OSD is very busy (see attachment) and starts showing many slow requests; after the scrubbing of this PG stops, the slow requests stop immediately.
>
> [root@stor-b02 /var/lib/ceph/osd/ceph-118/current]# du -sh 20.2_*
> 636K    20.2_head
> 0       20.2_TEMP
> [root@stor-b02 /var/lib/ceph/osd/ceph-118/current]# ls -1 -R 20.2_head | wc -l
> 4125
> [root@stor-b02 /var/lib/ceph/osd/ceph-118/current]#

I think that pool might contain tons of metadata entries and non-obvious data that take a while to process. But I see similar things on my rgw pools during scrubs too. :-(

--
May the most significant bit of your life be positive.