> 
>> If I set min_size=k+1, does that mean I'd need a minimum of m=3 to avoid 
>> loss-of-write with a 2-node failure?
> 
> Yes.

Having at least k+m+1 failure domains (nodes, in your case) has a subtler 
benefit too: your failure domains can have different aggregate CRUSH weights 
without the extra capacity on the larger ones going unused.
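
For the min_size question quoted above, the arithmetic can be sketched in a 
few lines of Python.  This is only an illustration, not anything Ceph ships: 
the function names are made up, and it assumes one shard per node (failure 
domain = host), the worst case where every failed node held a shard of the 
PG, and no recovery having happened yet.

```python
def pg_stays_active(k: int, m: int, min_size: int, failed_nodes: int) -> bool:
    """True if a PG still has at least min_size shards right after the failure.

    Assumption: one shard per node, and every failed node held a shard.
    """
    surviving_shards = k + m - failed_nodes
    return surviving_shards >= min_size


def min_m_needed(k: int, min_size: int, failed_nodes: int) -> int:
    """Smallest m for which k + m - failed_nodes >= min_size."""
    return (min_size - k) + failed_nodes


if __name__ == "__main__":
    # Compare two profiles, each with min_size = k + 1 and two nodes lost.
    for k, m in [(4, 2), (3, 3)]:
        ok = pg_stays_active(k, m, min_size=k + 1, failed_nodes=2)
        print(f"{k}+{m}, min_size={k + 1}, 2 nodes down: active={ok}")
```

With min_size=k+1 and two nodes lost, min_m_needed() comes out to 3 for any 
k, which is the "Yes" above: 4+2 drops below min_size, 3+3 does not.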

> 
>> And to avoid loss-of-write with a 2-node failure, would I need:
>> a) a minimum of k+m nodes, or
> 
> This is the bare minimum for EC without any recovery location.


> 
>> b) a minimum of k+m+2 nodes?
> 
> This is recommended because with this number of nodes the cluster still has 
> "spare" nodes to recover to in case of a node failure.

Indeed, Ceph is all about strong consistency.  It would rather block writes 
than let you make a risky one.

If up-front CapEx is your concern, remember that you don't have to fully 
populate nodes with drives, at least not initially.  I sometimes recommend a 
minimum of 7 nodes so that 4+2 or 3+3 EC can be done safely.  Seven nodes 
half-full are better in multiple ways than 4 nodes fully populated.
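
The "spare node" reasoning from this thread can be checked the same way.  
Again a rough sketch with made-up helper names, assuming failure domain = 
host, one shard per host, and enough free capacity on the surviving nodes to 
absorb the rebuilt shards.

```python
def spare_nodes(nodes: int, k: int, m: int) -> int:
    """Nodes beyond the k+m needed just to place every shard."""
    return nodes - (k + m)


def can_heal_fully(nodes: int, k: int, m: int, failed_nodes: int) -> bool:
    """True if the surviving nodes can still hold k+m shards, one per node.

    Assumption: free capacity is not the limiting factor.
    """
    return nodes - failed_nodes >= k + m


if __name__ == "__main__":
    # 7 nodes with either of the profiles mentioned above (k+m = 6).
    for k, m in [(4, 2), (3, 3)]:
        print(f"{k}+{m} on 7 nodes: spares={spare_nodes(7, k, m)}, "
              f"heals after 1 failure={can_heal_fully(7, k, m, 1)}")
```

With exactly k+m nodes there is nowhere to rebuild a lost shard, so the PGs 
stay degraded until the node comes back or is replaced.  Seven nodes with a 
6-shard profile leave one spare, enough to heal after a single node failure.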

As for nodes, used Dell R640 can be had with 8 or 10 NVMe bays and lots of 
cores very inexpensively these days. Will they give another (let's be honest) 
10 years of service?  Hard to say. But today's SSDs can be transplanted into 
tomorrow's chassis, and when the latter is cheap, one can easily afford to lay 
in a couple of spares.  It's like with full-frame digital photography: a body 
from 10 years ago is meh by today's standards, but that $10,000 lens for 
shooting sports ball is still golden.
