Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-02-05 Thread Jake Grimmett
Dear Nick & Wido, Many thanks for your helpful advice; our cluster has returned to HEALTH_OK. One caveat is that a small number of PGs remained at "activating". By increasing mon_max_pg_per_osd from 500 to 1000 these few PGs activated, allowing the cluster to rebalance fully. i.e. this was
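For reference, a minimal sketch of how such a limit change can be applied on a Luminous cluster; the syntax below is illustrative and not quoted from the thread, and whether the OSDs also need the new value injected depends on the release, so applying it to both mons and OSDs is the cautious option:

  # ceph.conf on the monitor hosts -- persists across restarts
  [global]
      mon_max_pg_per_osd = 1000

  # apply at runtime without restarting daemons
  ceph tell mon.* injectargs '--mon_max_pg_per_osd 1000'
  ceph tell osd.* injectargs '--mon_max_pg_per_osd 1000'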

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Wido den Hollander
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jake Grimmett Sent: 29 January 2018 12:46 To: ceph-users@lists.ceph.com Subject: [ceph-users] pgs down after adding 260 OSDs & increasing PGs Dear All, Our ceph luminous (12.2.2) cluster has just broken, due to either adding 260 OSD drives

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Jake Grimmett
Sent: 29 January 2018 12:46 To: ceph-users@lists.ceph.com Subject: [ceph-users] pgs down after adding 260 OSDs & increasing PGs Dear All, Our ceph luminous (12.2.2) cluster has just broken, due to either adding 260 OSD drives in one go, or increasing the PG number from 1024 to 4096 in one go

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Wido den Hollander
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jake Grimmett Sent: 29 January 2018 12:46 To: ceph-users@lists.ceph.com Subject: [ceph-users] pgs down after adding 260 OSDs & increasing PGs Dear All, Our ceph luminous (12.2.2) cluster has just broken

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Jake Grimmett > Sent: 29 January 2018 12:46 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] pgs down after adding 260 OSDs & increasing PGs > > Dear All, > >

[ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Jake Grimmett
Dear All, Our ceph luminous (12.2.2) cluster has just broken, due to either adding 260 OSD drives in one go, or increasing the PG number from 1024 to 4096 in one go, or a combination of both... Prior to the upgrade, the cluster consisted of 10 dual v4 Xeon nodes running SL7.4, each node
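For context, a PG count increase on a Luminous pool is done in two steps, raising pg_num and then pgp_num; the pool name "ecpool" below is a placeholder, not taken from the thread:

  # split the pool into more placement groups
  ceph osd pool set ecpool pg_num 4096
  # then allow data to remap onto the new PGs
  ceph osd pool set ecpool pgp_num 4096

Jumping straight from 1024 to 4096, at the same time as adding a large batch of OSDs, means all of the resulting peering and backfill happens at once rather than being spread over smaller increments.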