> You made a comment about the outage in April.  I know that we have taken it
> very seriously and have identified and implemented things to prevent similar
> things from happening.  It was an all-hands-on-deck situation.  On the side,
> I manage a friend's site, and even though he was affected and is a
> single-instance user (the site doesn't get enough traffic to justify
> anything else, yet), I was able to take a snapshot of the instance, launch
> it in a different AZ, and get him back up after a few hours, even though the
> main instance that was affected was stuck for the whole duration of the
> outage.


On the other hand, I had a client lose an entire site during the
outage.  He was in a single AZ, his data was stored on an EBS volume,
and that volume somehow became unrecoverable even after the outage
was over.  Thankfully, his weekly backup had run a few days before, so
we were able to move his data to a shared host temporarily and restore
once things were worked out.

Still, I agree that if you design it with a different mindset, you can
quickly build highly survivable infrastructure, but in my opinion
it really does require a completely different mindset.

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
