On Jul 2, 2012, at 7:03 PM, James Downs <e...@egon.cc> wrote:

> 
> On Jul 2, 2012, at 1:20 PM, david raistrick wrote:
> 
>> Amazon resources are controlled (from a consumer viewpoint) by API - that 
>> API is also used by amazon's internal toolkits that support ELB (and RDS..). 
>>   Those (http accessed) API interfaces were unavailable for a good portion 
>> of the outages.
> 
> Right, and other toolkits like boto. Each AZ has a different endpoint (url), 
> and as I have no resources running in East, I saw no problems with the API 
> endpoints I use. So, as you note, US-EAST Region was "not controllable".
> 
>> I know nothing of the netflix side of it - but that's what -we- saw. (and 
>> that caused all us-east RDS instances in every AZ to appear 
> 
> 
> And, if you lose US-EAST, you need to run *somewhere*. Netflix did not 
> cutover www.netflix.com to another Region. Why not is another question.

At what point are you guys going to realize that no matter how much 
resiliency, redundancy and fault tolerance you plan into an infrastructure, 
there are always unforeseen failures that just don't make sense to plan for? 

Four major decision factors are cost, complexity, time and failure rate. At 
some point a business needs to focus on its core business. IT, like any other 
business resource, has to be managed efficiently, and its sole purpose is to 
enable said business, nothing more. 

Some of the posts here are highly laughable and so unrealistic. 

People are acting as if Netflix is part of some critical service. They stream 
movies, for Christ's sake. Some acceptable level of loss is fine for 99.99% of 
Netflix's user base. Just like cable, electricity and running water: I suffer a 
few hours of losses each year from those services. Does it suck? Yes. Is it the 
end of the world? No. 

This horse is dead! 

