That's a great list of tips, Nathan.

Interestingly, our datacentre (Primus) suffered a loss of power 5 days
after I posed this question to the list. The backup generators failed
and some customers were without power for over four hours.

I visited the datacentre as one server failed to boot due to a full
log in an IPMI card*. When I got there I saw *lots* of sysadmins. It
appears a lot of systems didn't recover well after the power cut.

- fsck can take a *long* time on large disks.
- RAID arrays don't take kindly to having the power yoinked.
- Ensure BIOS is set to boot server after power outage.

I would like to have been able to cut over to Amazon EC2 during the
outage. This would be a *relatively* cheap DR/continuity option and
wouldn't cost much when not in use (mainly storage and backup
traffic).
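
As a rough illustration of why the low TTLs Nathan mentions matter for
a cutover like this, here is a minimal Python sketch of the worst case.
It assumes resolvers honour the TTL. The function name and every figure
are made up for illustration and are not measurements from this outage.

```python
def worst_case_cutover_seconds(ttl_seconds, detect_seconds, update_seconds):
    """Upper bound on how long a client may keep using the old address.

    Assumes a resolver cached the old record just before the DNS change
    went live and keeps it for the full TTL.
    """
    return detect_seconds + update_seconds + ttl_seconds


# Hypothetical figures: 5 minutes to notice the outage, 2 minutes to
# repoint the DNS record at the standby site.
low_ttl = worst_case_cutover_seconds(300, 300, 120)      # 5 minute TTL
high_ttl = worst_case_cutover_seconds(86400, 300, 120)   # 24 hour TTL
```

With those made-up numbers the low TTL gives a worst case of 720
seconds (12 minutes), while the 24 hour TTL gives 86820 seconds (a
little over a day), which would have outlasted the outage itself.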

I'm getting a group together to talk about the technical side of using
the cloud for this sort of thing. Get in touch if you're interested.

- Mike

* It turns out there is a firmware fix to avoid this.

On Tue, Jan 27, 2009 at 11:14 PM, Nathan de Vries <[email protected]> wrote:
>
> On 27/01/2009, at 4:34 PM, Mike Bailey wrote:
>> I'm interested in hearing from people who have good Disaster
>> Recovery setups.
>
> Some simple steps I try and abide by (if it's a serious app, for
> novelty apps I generally don't care):
>
> * Keep deployment automated and cheap
> * Set low TTLs for all DNS records
> * MySQL master/slave replication with slave used for backing up. Live
> dumps from a non-replicated DB is fine with low traffic, though.
> * s3sync.rb backing up MySQL dumps from the slave, as well as user-
> generated files/content (if applicable)
> * Pre-configured maintenance pages with the ability to include
> downtime messages to users
> * Configuration options to disable critical features (e.g. checkout if
> payment gateway is down), re-configurable without code-redeployment
> * No architectural astronauting
> * Don't let disaster-resilience get in the way of normal development
>
> Simplest seems to be the best. Working with up-front "code-for-scale"
> tends to make programming unbearable for me. I prefer being reactive
> on the whole, but with the above proactive measures depending on the
> seriousness of the project.
>
>
> Cheers,
>
> --
> Nathan de Vries
>
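
One of the tips above, disabling a critical feature without a code
redeployment, could be sketched roughly like this in Python. This is
only one possible approach, not Nathan's setup: the `features.json`
file name, the `feature_enabled` helper and the `checkout` flag are all
hypothetical. The file is re-read on every call, so editing it on the
server takes effect immediately.

```python
import json


def feature_enabled(name, path="features.json", default=True):
    """Return whether a feature is switched on in the flags file.

    The file is expected to hold a JSON object such as
    {"checkout": false}. If the file is missing, unreadable or not a
    JSON object, the default is returned so a bad edit cannot take the
    whole site down.
    """
    try:
        with open(path) as f:
            flags = json.load(f)
    except (OSError, ValueError):
        # Missing file or invalid JSON: fall back to the default.
        return default
    if not isinstance(flags, dict):
        return default
    return bool(flags.get(name, default))
```

The app would then wrap the risky code path in something like
`if feature_enabled("checkout"):` and show a maintenance message
otherwise. Re-reading the file on each call is simple but adds a small
cost per request; caching with a short expiry would be a reasonable
refinement.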

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby 
or Rails Oceania" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/rails-oceania?hl=en
-~----------~----~----~----~------~----~------~--~---
