On 11/18/2010 09:45 AM, MORITA Kazutaka wrote:
> Hi,
>
> At Wed, 17 Nov 2010 14:44:34 +0100,
> Dennis Jacobfeuerborn wrote:
>> Hi,
>> I've been following Sheepdog for a while, and now that patches are being
>> sent to include it in libvirt I want to start testing it. One question I
>> have is how I can ensure the reliability of the Sheepdog cluster as a
>> whole. Specifically, I'm looking at two cases:
>>
>> Let's assume a setup with 4 nodes and a redundancy of 3.
>>
>> If one node fails, what are the effects for both the cluster and the
>> clients (e.g. potential I/O delays, messages, etc.)?
> Until Sheepdog starts a new round of membership, the cluster suspends
> all requests to data objects and client I/O is blocked. How long the
> cluster waits is determined by the value of totem/consensus in
> corosync.conf. The default value is 1200 ms. If you want to run
> Sheepdog with a large number of nodes, the value needs to be larger,
> and the delay grows accordingly.
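
For reference, if I read the corosync documentation correctly, that knob
lives in the totem section of corosync.conf; a minimal sketch, with the
values being just the defaults you mention:

    totem {
            version: 2
            token: 1000       # token loss timeout in ms
            consensus: 1200   # ms to reach consensus before starting a
                              # new membership round (default 1.2 * token)
    }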

Wouldn't it be better to decouple the client requests from these cluster
timings? This looks like an unnecessary bottleneck that gets worse as the
cluster grows. Why not give the client request its own timeout of, say,
1 second, and if no response arrives retry the request against one of the
nodes that holds a redundant copy of the blocks?
That way a node failure would have less of an impact on the applications,
and the delays seen by the application would become independent of the
cluster size.
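
To make the idea concrete, here is a rough sketch in Python of what I
mean. This is not Sheepdog's actual protocol; the function, the replica
list and the fixed-size read are purely illustrative:

    import socket

    REQUEST_TIMEOUT = 1.0  # per-request timeout in seconds

    def read_object(replica_nodes, request_bytes):
        """Try each node that holds a replica; fail over on timeout.

        replica_nodes: list of (host, port) tuples for the nodes that
        store a copy of the requested object.
        """
        last_error = None
        for host, port in replica_nodes:
            try:
                with socket.create_connection((host, port),
                                              timeout=REQUEST_TIMEOUT) as s:
                    s.settimeout(REQUEST_TIMEOUT)
                    s.sendall(request_bytes)
                    return s.recv(4 << 20)  # up to 4 MB, size illustrative
            except OSError as err:
                # timeout or connection failure: try the next replica
                last_error = err
        raise RuntimeError("all replicas failed: %s" % last_error)

The worst case for the client then becomes (number of replicas) *
REQUEST_TIMEOUT instead of however long corosync needs to settle
membership.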

>> and what needs to be done once
>> the node is replaced to get the cluster back into a healthy state?
> All you need to do is start a sheep daemon again. If it doesn't
> work, please let me know.
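
Just to confirm: on the replaced node that would be something like the
following, with /store standing in for whatever object store path the
node uses?

    $ sheep /store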

So when the node goes down, will the cluster automatically copy all of
the lost blocks to other nodes to re-establish the redundancy requirement
of 3 copies?

If a new node is added to the cluster, will it stay empty, or will the
cluster rebalance the blocks according to some load criterion?

>> What happens if *all* nodes fail due to e.g. a power outage? What needs to
>> be done to bring the cluster back up again?
> If no VM is running when all nodes fail, all you need to do is
> start all the sheep daemons again. However, if I/O requests were
> being processed when all nodes failed, Sheepdog needs to recover the
> objects whose replicas are in inconsistent states (and that is not
> implemented yet).
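
After a clean outage, then, the recovery would presumably look like this
on every node (again with /store as a stand-in for the real store path),
followed by a check that all nodes have rejoined? Correct me if the
collie commands are wrong:

    $ sheep /store
    $ collie cluster info
    $ collie node list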

What is the timeframe for implementing this recovery? After all, it has
to be in place before Sheepdog can go into production use.
Regards,
Dennis
--
sheepdog mailing list
[email protected]
http://lists.wpkg.org/mailman/listinfo/sheepdog