Primary Outage
At 12:47 AM PDT on April 21st, a network change was performed as part of our 
normal AWS scaling activities in a single Availability Zone in the US East 
Region. The configuration change was to upgrade the capacity of the primary 
network. During the change, one of the standard steps is to shift traffic off of
one of the redundant routers in the primary EBS network to allow the upgrade to 
happen. The traffic shift was executed incorrectly and rather than routing the 
traffic to the other router on the primary network, the traffic was routed onto 
the lower capacity redundant EBS network. For a portion of the EBS cluster in 
the affected Availability Zone, this meant that they did not have a functioning 
primary or secondary network because traffic was purposely shifted away from the
primary network and the secondary network couldn’t handle the traffic level it 
was receiving. As a result, many EBS nodes in the affected Availability Zone 
were completely isolated from other EBS nodes in its cluster. Unlike a normal 
network interruption, this change disconnected both the primary and secondary 
network simultaneously, leaving the affected nodes completely isolated from one 
another.

http://aws.amazon.com/message/65648/

The item gives a broad background on the "cloud" and is interesting reading if 
you haven't kept up with the PC world.

Ed

