tl;dr: On EC2 we can't reuse IP addresses, so we need a reliable,
scriptable procedure for replacing a dead (guaranteed no longer running)
server with a new one without taking the remaining cluster members down.
I'm trying to build a Pacemaker solution using Percona Replication Manager
(https://github.com/jayjanssen/Percona-Pacemaker-Resource-Agents/blob/master/doc/PRM-setup-guide.rst)
within our EC2 environment. Essentially, the architecture would be 3
independent MySQL servers, running in different data centers, each of which
runs Pacemaker/Corosync with an agent that manages master/slave replication.
I have a script that builds a new instance from the base OS, which installs the
cluster software, generates the appropriate config files, and loads the CRM
configuration on boot. This is the method we use to launch servers; in the
event that a server dies, we don't attempt to recover it. Instead, we launch an
entirely new instance (possibly even in a different data center), which
means building a brand new server with a new private IP address. (Every
server has a private IP address that carries traffic within the data
center, and a public address whose traffic leaves the cloud only to come
back in, adding security implications, latency, and cost.) Ideally, the
boot script should be able to handle everything on its own -- we should be able
to create the instance, and by the time it's finished running, the new box
should be in the cluster as a slave, taking the place of whichever one had
previously died.
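For concreteness, the boot sequence described above might look roughly like
the sketch below. The package names, generator path, and CRM config path are
placeholders rather than the actual files in use; `crm configure load update`
is the crmsh command for loading a CRM configuration non-interactively.

```shell
#!/bin/sh
# Sketch of a node bootstrap, dry-run by default so the steps can be
# inspected. Paths and package names are illustrative placeholders.

run() {
    # Echo each step; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

bootstrap_cluster_node() {
    run yum -y install pacemaker corosync        # distro-specific
    run /usr/local/bin/generate-corosync-conf    # hypothetical generator
    run service corosync start
    run service pacemaker start                  # if pacemaker runs as its own service
    run crm configure load update /etc/cluster/crm-config.txt
}

DRY_RUN=1
bootstrap_cluster_node
```

By the time the last step finishes, the node should be a slave in the
cluster; the hard part, as described below, is making the membership step
survive the new IP address.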
The problem I'm running into is that because we're on EC2, we don't control our
IP address allocation. If we did, we'd start a new server with the same IP as
the one that it's replacing, and my understanding is that Pacemaker would pick
right back up and let it join the cluster. Instead, because it has a new IP, we
always end up in a split-brain situation, where the two original members of the
cluster see each other but think the third is down, and the new one thinks it's
the first member of a new cluster whose other two members are down. The
only way I've found to correct this is to stop pacemaker/corosync on all
instances, regenerate the config files, and start everything up again --
hardly an ideal procedure.
Does anyone have experience with, or suggestions for, working in this kind
of situation? Moving off of EC2 is not an option; creating a private
network (Amazon VPC) so that we can get static addresses has performance
implications we'd rather avoid. Any ideas for solutions or reliable
workarounds, especially scriptable ones, would be extremely helpful. (That
is, we won't have
any process that automatically replaces a server after one goes down, but we
would like to be able to have the chef boot script, which is kicked off
manually, be able to go from software installation to rejoining the cluster
automatically.)
Some options we have available, along with some things we've tried:
- We can create DNS entries for the three servers under known names (i.e.
mysql-01, mysql-02, mysql-03) that point to the private network IP
addresses. We can put those hostnames into the config files, or resolve
them at boot time and write the IP addresses in directly. However, this
requires that all three servers be online before the installation scripts
run on any one box.
The ideal solution would use only hostnames and re-resolve the IP any time the
cluster needs to configure membership, thus letting any new server take over
the DNS entry but not the IP address.
- We can create an Elastic IP, which provides a static public IP even
before any of the servers are running. The config can then always
reference that IP and always be reachable, but traffic to that IP must
leave the cloud, which we'd like to avoid. Given that pacemaker/corosync
traffic is relatively light, running only those services over the public
IP would be acceptable; so far, though, that has not solved our
split-brain problem.
- We can always ensure that there is only one server corresponding to one of
the DNS entries at any given time. (That is, no running server thinks that it's
mysql-02 if we launch another one with the same name.)
- We can regenerate corosync.conf at any time without stopping the
services, if there is a way to have the new config take effect without a
service restart.
- We can always determine the current IPs of all members from external scripts
via DNS.
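The first and fourth options above can be sketched together: resolve the
well-known names to their current private IPs, render the member list for
corosync.conf, and (only on corosync releases whose corosync-cfgtool
supports -R, which 1.x does not) ask corosync to re-read the file without a
restart. The mysql-0N names, the bindnetaddr, and the use of the udpu
(unicast) transport are assumptions about the setup, not something the
original configuration confirms.

```shell
#!/bin/sh
# Sketch: resolve member hostnames via DNS and regenerate the totem
# section of a udpu-based corosync.conf. Names and addresses here are
# illustrative assumptions.

resolve() {
    # getent prints "ADDRESS NAME ..."; keep the first address only.
    getent hosts "$1" | awk '{ print $1; exit }'
}

gen_totem() {
    # $1 = bindnetaddr, remaining args = member IPs
    bindnet=$1; shift
    printf 'totem {\n  version: 2\n  transport: udpu\n  interface {\n'
    printf '    ringnumber: 0\n    bindnetaddr: %s\n    mcastport: 5405\n' "$bindnet"
    for ip in "$@"; do
        printf '    member {\n      memberaddr: %s\n    }\n' "$ip"
    done
    printf '  }\n}\n'
}

members=""
for host in mysql-01 mysql-02 mysql-03; do
    ip=$(resolve "$host")
    [ -n "$ip" ] && members="$members $ip"
done
gen_totem 10.0.0.0 $members > /tmp/corosync.totem.new

# To activate without a full restart (newer corosync releases only):
# install the rendered file as /etc/corosync/corosync.conf and run
#   corosync-cfgtool -R
```

Whether a runtime reload can actually change ring membership depends on the
corosync version; on the 1.x series a restart of the stack is still needed,
which is exactly the limitation described above.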
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org