Re: [ceph-users] strange remap on host failure

2017-05-30 Thread Brad Hubbard
See the release notes for the jewel releases, which include instructions for upgrading from hammer.

On Wed, May 31, 2017 at 1:53 PM, Laszlo Budai wrote:
> Hi Brad,
>
> Thank you for the answer.
> We are aware of the fact that hammer is close to retirement, and we are
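
As a rough, hedged sketch of the kind of pre- and post-upgrade checks those release notes walk through (the monitor name mon.a and osd.0 are illustrative, exact commands vary slightly between releases, and the require_jewel_osds flag only exists from 10.2.4 onward; take the authoritative steps from the notes themselves):

  $ ceph health                      # cluster should be HEALTH_OK before starting
  $ ceph tell mon.a version          # confirm what each monitor is running
  $ ceph tell osd.0 version          # confirm what each OSD is running
  # after every mon and osd has been restarted on jewel, per the release notes:
  $ ceph osd set sortbitwise
  $ ceph osd set require_jewel_osds  # available from 10.2.4 on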

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread Laszlo Budai
Hello Greg!

Thank you for the answer. Our pools have their size set to 3:

tv-dl360-1:~$ ceph osd pool ls detail
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 1 'images' replicated size 3
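
For readers following along, a minimal sketch of how the replication settings shown in that listing can be queried or changed per pool (the pool name 'rbd' is taken from the listing above; this is the hammer/jewel-era CLI):

  $ ceph osd pool get rbd size        # number of replicas kept per object
  $ ceph osd pool get rbd min_size    # replicas required for I/O to proceed
  $ ceph osd pool set rbd size 3
  $ ceph osd pool set rbd min_size 2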

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread Laszlo Budai
Hi Brad,

Thank you for the answer. We are aware of the fact that hammer is close to retirement, and we are planning for the upgrade. BTW: can you recommend some documentation to read before the hammer -> jewel upgrade? I know http://docs.ceph.com/docs/jewel/install/upgrading-ceph/ and that

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread Brad Hubbard
It should also be noted that hammer is pretty close to retirement and is a poor choice for new clusters.

On Wed, May 31, 2017 at 6:17 AM, Gregory Farnum wrote:
> On Mon, May 29, 2017 at 4:58 AM, Laszlo Budai wrote:
>>
>> Hello all,
>>
>> We have a

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread Gregory Farnum
On Mon, May 29, 2017 at 4:58 AM, Laszlo Budai wrote:
>
> Hello all,
>
> We have a ceph cluster with 72 OSDs distributed on 6 hosts, in 3 chassis. In
> our crush map we are distributing the PGs on chassis (complete crush map
> below):
>
> # rules
> rule
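
The quoted crush map is cut off above; purely as a hedged illustration (rule and root names here are generic, not the poster's actual map), a chassis-level replication rule in hammer-era crush syntax generally has this shape:

  rule replicated_ruleset {
          ruleset 0
          type replicated
          min_size 1
          max_size 10
          step take default
          # pick each replica from a leaf (osd) under a different chassis
          step chooseleaf firstn 0 type chassis
          step emit
  }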

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread Laszlo Budai
I agree with you that the crush map is changing all the time because of the changes in the cluster. Our problem is that it did not change as expected in this host failure situation.

Kind regards,
Laszlo

On 30.05.2017 21:28, David Turner wrote:
Adding osds and nodes to a cluster changes

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread David Turner
Adding osds and nodes to a cluster changes the crush map, an osd being marked out changes the crush map, an osd being removed from the cluster changes the crush map... The crush map changes all the time even if you aren't modifying it directly.

On Tue, May 30, 2017 at 2:08 PM Laszlo Budai
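
As a hedged sketch of the operations David is referring to (osd id 10 is just an example), each of these changes what CRUSH computes, and the resulting hierarchy can be inspected afterwards:

  $ ceph osd out 10                  # mark osd.10 out; its PGs get remapped
  $ ceph osd in 10                   # mark it back in; PGs move again
  $ ceph osd crush remove osd.10     # remove it from the crush map entirely
  $ ceph osd tree                    # show the current hierarchy and weights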

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread Laszlo Budai
We have not touched the crush map. We have only observed that the cluster is not responding to a failure as expected, and we wonder why. As I've mentioned in the previous post, we were able to reproduce the situation on a different ceph cluster, so I've filed a bug report. So far this is

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread David Turner
When you lose a host, the entire CRUSH map is affected. Any change to the crush map can affect any PG, OSD, host, or failure domain in the entire cluster. If you modified osd.10's weight in the crush map by increasing it by 0.5, you would likely see PGs in the entire cluster moving around, not
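
A hedged sketch of the kind of change David describes and how the resulting movement shows up (osd.10 is from his example, the weight value is illustrative, and the pg dump output format varies slightly between releases):

  $ ceph osd crush reweight osd.10 2.5       # change the crush weight (example value)
  $ ceph -s                                  # watch for 'remapped' / 'backfilling' PGs
  $ ceph pg dump pgs_brief | grep remapped   # list the PGs currently being moved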

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread Laszlo Budai
Hello David,

Thank you for your message. Indeed we were expecting to see the PGs from the lost host redistributed to the surviving host in the same chassis (failure domain), but the reality is different :( I can see a lot of PGs stuck active+undersized+degraded and active+remapped.
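
For anyone wanting to dig into states like these, a minimal sketch of the usual inspection commands (the pg id 3.1a below is purely illustrative):

  $ ceph health detail                       # lists the degraded/undersized PGs
  $ ceph pg dump_stuck unclean               # PGs that have not reached active+clean
  $ ceph pg 3.1a query                       # full peering/acting-set detail for one PG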

Re: [ceph-users] strange remap on host failure

2017-05-30 Thread David Turner
If you lose 1 of the hosts in a chassis, or a single drive, the pgs from that drive/host will be distributed to other drives in that chassis (because you only have 3 failure domains). That is to say that if you lose tv-c1-al01 then all of the pgs and data that were on that will be distributed to
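
One way to sanity-check that expectation offline is to run the crush map through crushtool; a hedged sketch (the osd ids passed to --weight are placeholders for the failed host's osds, and rule 0 / 3 replicas match the pools shown earlier in the thread):

  $ ceph osd getcrushmap -o crush.bin              # grab the compiled map
  $ crushtool -d crush.bin -o crush.txt            # decompile it for reading
  $ crushtool -i crush.bin --test --show-mappings --rule 0 --num-rep 3
  # simulate the failed host by zero-weighting its osds, then compare the mappings:
  $ crushtool -i crush.bin --test --show-mappings --rule 0 --num-rep 3 \
        --weight 12 0 --weight 13 0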

Re: [ceph-users] strange remap on host failure

2017-05-29 Thread Laszlo Budai
Dear all,

How should ceph react in case of a host failure when, out of a total of 72 OSDs, 12 are out? Is it normal that the remapping of the PGs does not follow the rule set in the crush map? (According to the rule, the OSDs should be selected from different chassis.) In the attached
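
A hedged sketch of how one can check whether a PG's acting set really spans different chassis (osd id 10 is only an example, and the crush location reported by 'ceph osd find' depends on how the hosts register themselves):

  $ ceph pg dump pgs_brief          # shows each PG's up and acting OSD sets
  $ ceph osd tree                   # shows which host/chassis each of those OSDs sits under
  $ ceph osd find 10                # reports the crush location of a single OSD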