All very true and worth considering, but I feel compelled to mention the
strategy of setting mon_osd_down_out_subtree_limit carefully to prevent
automatic rebalancing.
*If* the loss of a failure domain is temporary, i.e. something you can fix
fairly quickly, it can be preferable to not start
I agree that running in min_size of 1 is worse than running with only 3
failure domains. Even if it's just for a short time and you're monitoring
it closely... it takes mere seconds before you could have corrupt data with
min_size of 1 (depending on your use case). That right there is the key.
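To make the min_size point concrete, here is a minimal sketch of the rule being discussed: a placement group keeps serving client I/O only while the number of replicas that are up is at least min_size. This is a simplified model for illustration, not Ceph code; the function names `pg_accepts_io` and `single_copy_writes` are hypothetical.

```python
def pg_accepts_io(replicas_up: int, min_size: int) -> bool:
    """Simplified model: a PG serves client I/O only while at least
    min_size replicas are up."""
    return replicas_up >= min_size


def single_copy_writes(replicas_up: int, min_size: int) -> bool:
    """True when writes are still accepted but land on only one replica,
    so losing that one replica loses the newest data."""
    return pg_accepts_io(replicas_up, min_size) and replicas_up == 1


# Pool with size=3: compare min_size=1 and min_size=2 as replicas go down.
for min_size in (1, 2):
    for replicas_up in (3, 2, 1):
        print(
            f"min_size={min_size} replicas_up={replicas_up} "
            f"io={pg_accepts_io(replicas_up, min_size)} "
            f"single_copy={single_copy_writes(replicas_up, min_size)}"
        )
```

With min_size of 2 the pool blocks I/O once only one replica is left, which is inconvenient but safe. With min_size of 1 it keeps accepting writes onto that single copy.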
What you're saying is that if we only have 3 failure domains then ceph can do
nothing to maintain 3 copies if an entire failure domain is lost, and that
is correct.
BUT if you're losing 2 replicas out of 3 of your data, and your min_size is set
to 2 (the recommended minimum) then you have an
You wouldn't be able to guarantee that the cluster will not use 2 servers
from the same rack. The problem with 3 failure domains, however, is if you
lose a full failure domain ceph can do nothing to maintain 3 copies of your
data. It leaves you in a position where you need to rush to the
Hi David,
If I understand correctly your suggestion is the following:
If we have for instance 12 servers grouped into 3 racks (4/rack) then you would
build a crush map saying that you have 6 racks (virtual ones), and 2 servers in
each of them, right?
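The grouping described above can be sketched as follows. This is only an illustration of the arithmetic (3 physical racks of 4 servers becoming 6 virtual racks of 2 servers), not a crush map; the rack and node names and the helper `split_into_virtual_racks` are made up for the example.

```python
from typing import Dict, List


def split_into_virtual_racks(
    racks: Dict[str, List[str]], per_virtual: int
) -> Dict[str, List[str]]:
    """Split each physical rack into virtual racks of per_virtual hosts."""
    virtual: Dict[str, List[str]] = {}
    for rack, hosts in racks.items():
        for i in range(0, len(hosts), per_virtual):
            name = f"{rack}-v{i // per_virtual + 1}"
            virtual[name] = hosts[i:i + per_virtual]
    return virtual


# Hypothetical layout: 12 servers in 3 physical racks, 4 per rack.
physical = {
    f"rack{r}": [f"node{(r - 1) * 4 + n}" for n in range(1, 5)]
    for r in range(1, 4)
}

virtual = split_into_virtual_racks(physical, per_virtual=2)
for name, hosts in virtual.items():
    print(name, hosts)
```

Note that two virtual racks still share one physical rack (for example rack1-v1 and rack1-v2), so a rule that separates replicas by virtual rack alone does not guarantee that they sit in different physical racks.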
In this case if we are setting the failure
day, June 01, 2017 2:05 PM
To: Deepak Naidu; ceph-users
Subject: Re: [ceph-users] Crushmap from Rack aware to Node aware
If all 6 racks are tagged for Ceph storage nodes, I'd go ahead and just put the
nodes in there now and configure the crush map accordingly. That way you can
grow each of the
> *From:* David Turner [mailto:drakonst...@gmail.com]
> *Sent:* Thursday, June 01, 2017 12:23 PM
> *To:* Deepak Naidu; ceph-users
> *Subject:* Re: [ceph-users] Crushmap from Rack aware to Node aware
The way to do this is to download your crush map, modify it manually after
decompiling it to text format or modify it using the crushtool. Once you
have your crush map with the rules in place that you want, you will upload
the crush map to the cluster. When you change your failure domain from