Re: [ceph-users] Rebuild after upgrade
Thanks for taking the time to answer! It's a really old cluster, so that does make sense. Thanks for confirming!

-Brent

-----Original Message-----
From: Hector Martin
Sent: Monday, March 18, 2019 1:07 AM
To: Brent Kennedy; 'Ceph Users'
Subject: Re: [ceph-users] Rebuild after upgrade

On 18/03/2019 13:24, Brent Kennedy wrote:
> I finally received approval to upgrade our old firefly (0.8.7) cluster
> to Luminous. I started the upgrade, upgrading to hammer (0.94.10),
> then jewel (10.2.11), but after jewel, I ran the "ceph osd crush
> tunables optimal" command, and then "ceph -s" showed 60% of the
> objects were misplaced. Now the cluster is just churning while it
> does the recovery for that.
>
> Is this something that happens when upgrading from firefly up? I had
> done a hammer upgrade to Jewel before; no rebalance occurred after
> issuing that command.

Any time you change the CRUSH tunables, you can expect data movement. The exact impact can vary from nothing (if no changes were made, or the changes don't affect your actual pools/CRUSH rules) to a lot of data movement. This is documented here:
http://docs.ceph.com/docs/master/rados/operations/crush-map/

In particular, you turned on CRUSH_TUNABLES5, which causes a large amount of data movement:
http://docs.ceph.com/docs/master/rados/operations/crush-map/#jewel-crush-tunables5

Going from Firefly to Hammer has a much smaller impact (see the CRUSH_V4 section).

--
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Rebuild after upgrade
On 18/03/2019 13:24, Brent Kennedy wrote:
> I finally received approval to upgrade our old firefly (0.8.7) cluster
> to Luminous. I started the upgrade, upgrading to hammer (0.94.10),
> then jewel (10.2.11), but after jewel, I ran the "ceph osd crush
> tunables optimal" command, and then "ceph -s" showed 60% of the
> objects were misplaced. Now the cluster is just churning while it
> does the recovery for that.
>
> Is this something that happens when upgrading from firefly up? I had
> done a hammer upgrade to Jewel before; no rebalance occurred after
> issuing that command.

Any time you change the CRUSH tunables, you can expect data movement. The exact impact can vary from nothing (if no changes were made, or the changes don't affect your actual pools/CRUSH rules) to a lot of data movement. This is documented here:
http://docs.ceph.com/docs/master/rados/operations/crush-map/

In particular, you turned on CRUSH_TUNABLES5, which causes a large amount of data movement:
http://docs.ceph.com/docs/master/rados/operations/crush-map/#jewel-crush-tunables5

Going from Firefly to Hammer has a much smaller impact (see the CRUSH_V4 section).

--
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub
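The data movement Hector describes can be inspected before the change and throttled while recovery runs. A minimal sketch of that workflow, assuming admin access to a Jewel-or-later cluster (the `norebalance` flag and the injectargs options shown are standard Ceph knobs, but verify them against your release's docs before use):

```shell
# Show the tunables currently in effect, so you know what "optimal" will change
ceph osd crush show-tunables

# Pause rebalancing so the tunables change doesn't start moving data immediately
ceph osd set norebalance

# Apply the new tunables profile; this is the step that remapped ~60% of
# objects here, because it newly enabled CRUSH_TUNABLES5
ceph osd crush tunables optimal

# Check how many PGs/objects are now misplaced before letting recovery run
ceph -s

# Throttle backfill so client I/O survives the rebalance, then release it
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
ceph osd unset norebalance
```

The `norebalance` step doesn't reduce the total data moved; it just lets you see the scope of the remap (and back out by restoring the old tunables) before the cluster starts churning.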
[ceph-users] Rebuild after upgrade
I finally received approval to upgrade our old firefly (0.8.7) cluster to Luminous. I started the upgrade, upgrading to hammer (0.94.10), then jewel (10.2.11), but after jewel, I ran the "ceph osd crush tunables optimal" command, and then "ceph -s" showed 60% of the objects were misplaced. Now the cluster is just churning while it does the recovery for that.

Is this something that happens when upgrading from firefly up? I had done a hammer upgrade to Jewel before; no rebalance occurred after issuing that command.

Regards,
-Brent

Existing Clusters:
Test: Luminous 12.2.10 with 3 osd servers, 1 mon/man, 1 gateway (all virtual)
US Production: Jewel 10.2.11 with 5 osd servers, 3 mons, 3 gateways behind haproxy LB
UK Production: Luminous 12.2.10 with 15 osd servers, 3 mons/man, 3 gateways behind haproxy LB
US Production all SSD: Luminous 12.2.10 with 6 osd servers, 3 mons/man, 3 gateways behind haproxy LB