I'm beginning to think that your starting point is going to make
improvement difficult ...

On Tue, Aug 2, 2016 at 5:48 PM, Adrian <[email protected]> wrote:
> For now yes, this particular server is self contained. If either the
> front or the back end fails, one is no good without the other and
> separating them would mean doubling resources. Also don't want to touch
> existing systems. Not yet at least.

If you want to get more resiliency, you're going to have to commit
more resources. Now I appreciate that this is to be spread out over
time, but if your end goal is "resiliency using a single production
host with a warm spare" you can only achieve that by changing your
production software to also cluster into a warm-spare model, not by
fixing it out-of-band from the OS.

You can certainly improve things from the OS (rsync, DRBD and so on)
but you'll never end up with a perfect fit. Obviously you're realistic
about this :-)
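
To make the OS-level approach concrete, here is a minimal sketch of a script that pushes a directory to the warm spare with rsync on a fixed interval. The paths, the host name and the 60-second interval are made up for illustration, and it assumes rsync over ssh already works between the two hosts. It only copies file system state, so anything held in memory or in an open database is not covered.

```python
import subprocess
import time


def build_rsync_command(source_dir, spare_host, dest_dir):
    """Build the rsync command line (all arguments are hypothetical examples).

    -a        archive mode: keep permissions, times, symlinks and so on
    --delete  remove files on the spare that no longer exist on the source
    The trailing slashes make rsync copy the directory contents.
    """
    source = source_dir.rstrip("/") + "/"
    dest = f"{spare_host}:{dest_dir.rstrip('/')}/"
    return ["rsync", "-a", "--delete", source, dest]


def replicate_forever(source_dir, spare_host, dest_dir, interval_seconds=60):
    """Run rsync in a loop; updates made between runs are at risk."""
    cmd = build_rsync_command(source_dir, spare_host, dest_dir)
    while True:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("rsync failed with exit code", result.returncode)
        time.sleep(interval_seconds)


# Example call (not run here):
# replicate_forever("/srv/app", "spare.example", "/srv/app")
```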

> Ideally all the data created before the crash will be replicated to the
> mirror so after the switch the work can resume without loss of data.

A replication model usually says "once every period, we'll
synchronise state", and therefore you will lose any update that
happened between the last sync event and the crash. A clustered model
says "every update is committed to multiple locations automatically",
and you get to balance risk against performance; but if both members
of the cluster are 'identical' and 'on the same LAN', the commits
should pretty much always complete at about the same time.
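
A toy illustration of the difference, under stated assumptions: the update times, the 10-second sync interval and the crash time are invented, syncs are assumed to run at 0, 10, 20, ... seconds, and the clustered case assumes an update only counts once both members have it. It is not a model of any particular product.

```python
def surviving_updates_periodic(update_times, sync_interval, crash_time):
    """Updates that reached the spare if syncs run at 0, interval, 2*interval, ..."""
    last_sync = (crash_time // sync_interval) * sync_interval
    return [t for t in update_times if t <= last_sync]


def surviving_updates_clustered(update_times, crash_time):
    """Every update is committed to both members before it is acknowledged."""
    return [t for t in update_times if t <= crash_time]


updates = [1, 4, 7, 12, 14, 18]  # seconds at which the application wrote something
crash = 19                       # the production host dies here

periodic = surviving_updates_periodic(updates, sync_interval=10, crash_time=crash)
clustered = surviving_updates_clustered(updates, crash)
lost = [t for t in clustered if t not in periodic]

print("periodic :", periodic)   # [1, 4, 7]
print("clustered:", clustered)  # [1, 4, 7, 12, 14, 18]
print("lost     :", lost)       # [12, 14, 18]
```

Shortening the sync interval shrinks the window of lost updates but never closes it; only committing each update to both members does that, at some cost in write latency.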

> I have to keep an open mind here while also
> balancing some budgets.

Absolutely. You need a Risk assessment - take the simple Risk =
Likelihood * Impact model, work out the cost of failure and this will
give you a guide to how much money to sink into addressing the
situation.
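
As a worked example of that model, here is a small sketch. The numbers are invented, and it assumes Likelihood is read as the chance of a failure in a given year and Impact as the cost of one failure, so that Risk comes out as an expected yearly loss.

```python
def risk(likelihood_per_year, impact_cost):
    """Risk = Likelihood * Impact, read here as expected loss per year."""
    return likelihood_per_year * impact_cost


likelihood = 0.25       # assumed: one failure every four years on average
impact_now = 4_000      # assumed cost of an outage today
impact_later = 40_000   # assumed cost once the business depends on the system

risk_now = risk(likelihood, impact_now)
risk_later = risk(likelihood, impact_later)

print("risk now  :", risk_now)    # 1000.0
print("risk later:", risk_later)  # 10000.0
```

With these made-up figures, spending much more than about 1000 a year on resiliency today would be hard to justify, while the same spend looks cheap once the impact has grown tenfold.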

So right now, the business impact seems to be very low. We don't
expect the likelihood of failure to change, but in the future the
impact will be higher, so the combined Risk will be higher too. The
business should therefore invest in the production platform's resiliency
at a rate that matches the business growth/dependency. Many people
forget to review the Risk over time, and usually that results in an
under-resourced system that will let them down when it fails. On the
other hand, many people want "perfect from the start", and therefore
spend too much on the initial solution.

> I have already organised a test and a development environment with
> separate hardware.

It would be interesting if the test environment could be a full
production cluster member when not being used for testing; that way
you also get to regularly *test* how to change the cluster
membership/state :-) Just like restores :-)

> Yes, with multiple VMs that is the first thing that springs to mind,
> replication in pairs with each VM responsible for its mirror. I guess
> in the end it will be the most cost effective option, providing that I
> can reduce everything to file system changes. Of which I'm not sure yet
> at this stage. But if I can, looping a script will be interesting.

I'm assuming that the host box is running Linux, but I'm not assuming
anything about the VMs. However, if the VMs are also Linux, I'd
question why you need to maintain them in such a separate and
difficult-to-manage manner. You have to do OS updates on the host and
on each VM ... and backups ... and everything. You might achieve
better *replication* if the processes in these VMs were all on the
same box. Containers might be more manageable than VMs, too.

> If you or anyone else knows of such solution, I'm open to suggestion.

Really sorry that I don't actually have "a solution" to offer though :-(

-jim
_______________________________________________
Linux-users mailing list
[email protected]
http://lists.canterbury.ac.nz/mailman/listinfo/linux-users
