Re: [ceph-users] Slow requests blocked. No rebalancing

2018-09-20 Thread Jaime Ibar
Hi all, after increasing mon_max_pg_per_osd number ceph starts rebalancing as usual. However, the slow requests warnings are still there, even after setting primary-affinity to 0 beforehand. By the other hand, if I destroy the osd, ceph will start rebalancing unless noout flag is set, am I

Re: [ceph-users] Slow requests blocked. No rebalancing

2018-09-20 Thread Paul Emmerich
You can prevent creation of the PGs on the old Filestore OSDs (which seems to be the culprit here) during replacement by replacing the disks the hard way: * ceph osd destroy osd.X * re-create with BlueStore under the same id (ceph-volume ... --osd-id X). It will then just backfill onto the same
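A sketch of that replacement sequence as shell commands; the id 12 and /dev/sdX are placeholders, and the exact ceph-volume flags are an assumption for Luminous 12.2.x:

  # mark the OSD as destroyed, keeping its id and CRUSH position
  ceph osd destroy 12 --yes-i-really-mean-it
  # re-create a BlueStore OSD under the same id on the replacement disk
  ceph-volume lvm create --bluestore --osd-id 12 --data /dev/sdX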

Re: [ceph-users] Slow requests blocked. No rebalancing

2018-09-20 Thread Eugen Block
Hi, to reduce the impact on clients during the migration I would set the OSD's primary-affinity to 0 beforehand. This should prevent the slow requests; at least this setting has helped us a lot with problematic OSDs. Regards, Eugen. Quoting Jaime Ibar: Hi all, we recently upgraded from
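For reference, a minimal sketch of that suggestion (osd.12 is a placeholder id):

  # make this OSD unlikely to be chosen as primary for its PGs, so client I/O goes to other replicas
  ceph osd primary-affinity osd.12 0
  # then take it out to start the migration
  ceph osd out osd.12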

Re: [ceph-users] Slow requests blocked. No rebalancing

2018-09-20 Thread Darius Kasparavičius
Hello, 2018-09-20 09:32:58.851160 mon.dri-ceph01 [WRN] Health check update: 249 PGs pending on creation (PENDING_CREATING_PGS) This warning might indicate that you are hitting the PG limit per OSD. Here is some information on it: https://ceph.com/community/new-luminous-pg-overdose-protection/ . You
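Two commands that can help check whether that limit is the issue; mon.dri-ceph01 is taken from the log line above, and the default of 200 PGs per OSD in Luminous is my recollection, not from the thread:

  # show PG counts per OSD (PGS column)
  ceph osd df tree
  # show the currently effective limit (run on the monitor host, via its admin socket)
  ceph daemon mon.dri-ceph01 config get mon_max_pg_per_osd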

[ceph-users] Slow requests blocked. No rebalancing

2018-09-20 Thread Jaime Ibar
Hi all, we recently upgraded from Jewel 10.2.10 to Luminous 12.2.7. Now we're trying to migrate the OSDs to BlueStore following this document[0]; however, when I mark the OSD as out, I'm getting warnings similar to these: 2018-09-20 09:32:46.079630 mon.dri-ceph01 [WRN] Health check
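Roughly the step that triggers the warnings, with osd.12 as a placeholder id:

  # mark the OSD out so its data starts migrating off
  ceph osd out osd.12
  # watch cluster health and the resulting warnings as backfill proceeds
  ceph -s
  ceph health detail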

[ceph-users] slow requests/blocked

2014-11-20 Thread Jeff
Hi, we have a five-node cluster that has been running for a long time (over a year). A few weeks ago we upgraded to 0.87 (Giant) and things continued to work well. Last week a drive failed on one of the nodes. We replaced the drive and things were working well again.

Re: [ceph-users] slow requests/blocked

2014-11-20 Thread Jean-Charles LOPEZ
Hi Jeff, it would probably be wise to first check what these slow requests are: 1) ceph health detail - This will tell you which OSDs are experiencing the slow requests. 2) ceph daemon osd.{id} dump_ops_in_flight - To be issued on one of the above OSDs; it will tell you what these ops are waiting for.
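Concretely, something like the following, using osd.3 as a placeholder id and run on the host where that OSD lives (ceph daemon talks to the local admin socket):

  # which OSDs are reporting slow/blocked requests
  ceph health detail
  # dump the in-flight ops of a suspect OSD
  ceph daemon osd.3 dump_ops_in_flight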

Re: [ceph-users] slow requests/blocked

2014-11-20 Thread Jeff
Thanks. I should have mentioned that the errors are pretty well distributed across the cluster:
ceph1: /var/log/ceph/ceph-osd.0.log 71
ceph1: /var/log/ceph/ceph-osd.1.log 112
ceph1: /var/log/ceph/ceph-osd.2.log 38
ceph2: /var/log/ceph/ceph-osd.3.log 88
ceph2:
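Those per-file counts look like grep output; a hedged guess at how they might be reproduced on each node (the 'slow request' pattern is an assumption, not stated in the thread):

  # count slow-request lines in every OSD log on this node
  grep -c 'slow request' /var/log/ceph/ceph-osd.*.log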