This is my planned OSD configuration:
root
  room1
    OSD host1
    OSD host2
  room2
    OSD host3
    OSD host4
There are 6 OSDs per host.
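To be concrete, I was going to build this hierarchy with something like the
following (just a sketch; I assume the root bucket keeps the default name
"default" and the hosts are already in the map):

  ceph osd crush add-bucket room1 room
  ceph osd crush add-bucket room2 room
  ceph osd crush move room1 root=default
  ceph osd crush move room2 root=default
  ceph osd crush move host1 room=room1
  ceph osd crush move host2 room=room1
  ceph osd crush move host3 room=room2
  ceph osd crush move host4 room=room2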
Is it possible to configure the CRUSH map so that it tolerates a "room"
failure? In my case there is one network switch and one power supply per room,
which makes each room a single point of failure. This is what I would like to
mitigate.
I could not find any CRUSH rule that would make this configuration redundant
and safe.
Namely, to tolerate a sudden room (switch, power) failure, there would have to
be a rule that "acks" a write only after BOTH rooms have acked it. The problem
is that such a rule only works while both rooms are up. As soon as one room
goes down, the cluster can no longer write any data, because the rule no
longer permits the "ack". This looks like an impossible task with a fixed
CRUSH rule; the cluster would somehow need to switch rules to stay both
redundant and writable. What am I missing?
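In pool terms, what I am describing amounts to something like this (a sketch;
the pool name "mypool" is made up):

  ceph osd pool set mypool size 4
  # PGs must have all four replicas (i.e. both rooms) available
  # before they accept writes:
  ceph osd pool set mypool min_size 4

With min_size equal to size, losing a room makes the PGs inactive and blocks
all writes, which is exactly the problem I describe above.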
In general: can Ceph tolerate the sudden loss of half of its OSDs?
If not, what is the best redundancy I can get out of my configuration?
Is there perhaps a workaround, maybe with some external tool, to detect such a
failure and reconfigure Ceph automatically?
regards,
Zoran