On 2017-06-22T00:51:38, Blair Bethwaite <[email protected]> wrote:
> I'm doing some work to evaluate the risks involved in running 2r storage
> pools. On the face of it my naive disk failure calculations give me 4-5
> nines for a 2r pool of 100 OSDs (no copyset awareness, i.e., secondary disk
> failure based purely on chance of any 1 of the remaining 99 OSDs failing
> within recovery time). 5 nines is just fine for our purposes, but of course
> multiple disk failures are only part of the story.
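The naive calculation described above can be sketched as follows. The AFR and recovery window are illustrative assumptions, not figures from the thread:

```python
# Naive 2-replica data-loss estimate: after one OSD dies, what is the
# chance that any one of the remaining 99 OSDs also dies before
# recovery completes?  No copyset awareness is modelled.
from math import log10

n_osds = 100
afr = 0.02              # assumed annualised failure rate per disk (2%)
recovery_hours = 8      # assumed time to re-replicate the lost PGs
hours_per_year = 24 * 365

# Probability a given surviving disk fails during the recovery window.
p_disk = afr * recovery_hours / hours_per_year

# Probability at least one of the 99 surviving OSDs fails in that window.
p_second = 1 - (1 - p_disk) ** (n_osds - 1)

# Expected first-disk failures per year, times the chance of a fatal
# second failure, gives a rough annual data-loss probability.
p_loss_per_year = n_osds * afr * p_second
nines = -log10(p_loss_per_year)

print(f"P(second failure during recovery) = {p_second:.2e}")
print(f"Annual data-loss probability      = {p_loss_per_year:.2e}")
print(f"Durability ~ {nines:.1f} nines")
```

Note this only counts double disk failures; as the reply below points out, node outages and other availability events are a separate risk.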
You are confounding availability with data durability, too.
"Traditional" multi-node replicated storage solutions can get away with
only two nodes to mirror the data between because they typically have
an additional RAID5/6 at the local node level. (Which also helps with
the recovery impact of a single device failure.) Ceph typically doesn't.
(That's why rbd-mirror between two Ceph clusters can be OK too.)
A disk failing while a node is down, or being rebooted, ... — with only
two replicas, any of these scenarios leaves no surviving copy.
> thereof, e.g., something that would enable the functional equivalent of:
> "this OSD/node is going to go offline so please create a 3rd replica in
> every PG it is participating in before we shutdown that/those OSD/s"...?
You can evacuate the node by setting its weight to 0. It's a very
expensive operation, just like your proposal would be.
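A minimal sketch of that evacuation, assuming a hypothetical OSD id 12; this requires a running cluster and triggers a full rebalance of that OSD's data:

```shell
# Drain the OSD ahead of planned maintenance by removing its CRUSH
# weight; Ceph will backfill its PGs onto the remaining OSDs.
ceph osd crush reweight osd.12 0

# Watch recovery progress; do not stop the OSD until all of its PGs
# are active+clean elsewhere.
ceph -s
```

Repeat for each OSD on the node. Restoring the original weight afterwards moves the data back, which is why this is expensive in both directions.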
Regards,
Lars
--
Architect SDS
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284
(AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com