Re: Summary of the Amazon EC2 and Amazon RDS Service Disruption

Andrew S. Baker Fri, 29 Apr 2011 14:03:11 -0700

In my organization, we have those particular items in two different
hosted data centers, and we have a DR plan that allows failover to the
other facility.


We would likely do something similar, within the constraints of the
particular technologies involved.

I'm sure that it has not escaped your notice that not everyone in the Amazon
EC2 cloud (in the affected zone) experienced a failure -- particularly those
who availed themselves of several DR services that Amazon offers.

Two months ago, we experienced a failure in a colocation facility in our
<some country> data center.  If it had been live at the time, or if it had
been the only one, then that would have represented a huge problem for us.
Organizations that cannot afford to have more than one site, or elect not
to have more than one for some other reason, are exposed to more risk.

This is true regardless of whether the failure happens at the physical
layer.

We've experienced network-level outages with hosting providers -- both at a
colo and at our primary offices.  Physical proximity offers nothing if the
backup plan isn't already in place.

So, the issue here is really with the planning and prior arrangements...


ASB (Professional Bio <http://about.me/Andrew.S.Baker/bio>)
Harnessing the Advantages of Technology for the SMB market...


On Fri, Apr 29, 2011 at 4:00 PM, Kurt Buff <[email protected]> wrote:

> Take as your starting point the recent Amazon outage, in which your
> hypothetical business has placed its Exchange/DC/File-print/ERP/CRM
> infrastructure (or any portion thereof you deem reasonable) and it's
> unavailable to your company. Let's, for the moment, not worry about
> whether data has been lost - you're worried about losing sales, etc.,
> because of the unavailability.
>
> On Fri, Apr 29, 2011 at 12:05, Andrew S. Baker <[email protected]> wrote:
> > Define: Downed Cloud
> > Certainly, the appropriate solution will depend on the type of problem
> > being faced.
> >
> >
> > ASB (Professional Bio)
> > Harnessing the Advantages of Technology for the SMB market...
> >
> >
> >
> >
> > On Fri, Apr 29, 2011 at 1:48 PM, Kurt Buff <[email protected]> wrote:
> >>
> >> Indeed. But, the market is touting that the cloud is especially
> >> suitable for SMBs, and the large/complex failures that Amazon suffered
> >> are not normally experienced in those environments. The SMB market is
> >> where I live, and by going to the cloud I would subject my company to
> >> a risk for which I don't see a good, or indeed any, mitigation.
> >>
> >> Hence the question: What's the mitigation for a downed cloud?
> >>
> >>
> >> Kurt
> >>
> >> On Fri, Apr 29, 2011 at 10:39, Steven M. Caesare <[email protected]>
> >> wrote:
> >> > Not all outages are simple failed component swaps.
> >> >
> >> > As a matter of fact, those tend to be the simple issues. The complex
> >> > ones often involve human error, or unanticipated system interactions
> >> > and/or cascading events, as in the case with this recent AWS event.
> >> > As ASB points out, those can (and do) happen in any datacenter,
> >> > regardless if it's yours, or in the cloud.
> >> >
> >> > -sc
> >> >
> >> >> -----Original Message-----
> >> >> From: Kurt Buff [mailto:[email protected]]
> >> >> Sent: Friday, April 29, 2011 1:37 PM
> >> >> To: NT System Admin Issues
> >> >> Subject: Re: Summary of the Amazon EC2 and Amazon RDS Service
> >> >> Disruption
> >> >>
> >> >> Where's my 4 hour response time when the service fails?
> >> >>
> >> >> I can get a new mobo or hard drive, router or switch when I call the
> >> >> vendor, or I can even pre-order one and have it on the shelf if it's
> >> >> that critical - where's the new cloud when the current one is down?
> >> >>
> >> >> Is there high availability between cloud vendors - or do you have
> >> >> some other mitigation strategy in mind?
> >> >>
> >> >> Kurt
> >> >>
> >> >> On Fri, Apr 29, 2011 at 09:32, Andrew S. Baker <[email protected]>
> >> >> wrote:
> >> >> > "The Cloud" is already ready for production.
> >> >> > Hosted Data Centers, which is a concept we are all acquainted with,
> >> >> > are not immune to failures either.
> >> >> > Failures happen.  Just make sure your entire mitigation plan is not
> >> >> > "the vendor's got it".
> >> >> >
> >> >> >
> >> >> > ASB (Professional Bio)
> >> >> > Harnessing the Advantages of Technology for the SMB market...
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Fri, Apr 29, 2011 at 12:16 PM, Kurt Buff <[email protected]>
> >> >> > wrote:
> >> >> >>
> >> >> >> Lesson: Large distributed systems management is hard.
> >> >> >>
> >> >> >> That's a really good read.
> >> >> >>
> >> >> >> I think after three or four more large scale problems like that,
> >> >> >> some clouds might be ready for production.
> >> >> >>
> >> >> >> On Fri, Apr 29, 2011 at 06:03, Andrew S. Baker <[email protected]>
> >> >> >> wrote:
> >> >> >> > http://aws.amazon.com/message/65648/
> >> >> >> >
> >> >> >> > This is a very good read...
> >> >> >> >
> >> >> >> > -ASB: http://about.me/Andrew.S.Baker
> >> >> >> >
> >> >> >> > Sent from my Motorola Droid
> >> >> >> >
>
