[Once upon a time, single points-of-failure were addressed by various forms of redundancy. It's expensive, and it's exponentially difficult to implement as the number of elements within the system increases.

[Yet an expectation arose that critical services would have features such as hotsites that could pick up the load of a failed primary site in real-time, and warmsites that could re-start a service in minutes to an hour, while people scratched their heads trying to work out why the primary site was down, and what to do about it.

[In recent years, any number of services have had long outages, some of them with serious consequences. Some of those were still in-house rather than be-clouded. Clearly the multiple bank and airline outages should have had hotsite or at least warmsite recovery plans, and didn't.

[But, once you've switched to the cloud, surely it's easy, even inherent. We were told by the spruikers that supply is elastic, and more instances are run up in real-time. And it's all highly dispersed and therefore single-point-of-failure issues are more manageable.

[I'm not sure how critical the ACT ESA's website is. It might be used only to inform the public; or it might deliver operational services. But, either way, you'd have expected inexpensive warmsite-like features to be part of what an emergency services site would be about.

[What am I missing here?]


AWS outage cripples ACT Emergency Services Agency website as Canberra bushfire rages
Wobble drags on through Thursday
Julian Bajkowski
itNews
Thu Jan 23 2020

The ACT Government’s Emergency Services Agency (ESA) has attributed a website outage that hit in the middle of a rapidly escalating bushfire between Canberra Airport and Queanbeyan to Thursday’s AWS outage in Sydney.

Capping off an already bad day for AWS after significant availability problems hit its Sydney region, the ESA took to Twitter to redirect Canberrans to Facebook and local media to obtain current information on the fire hitting the national capital that remains at a watch and act level.

The outage hit as Canberra Airport was shut to commercial traffic because of the fire, with residents around Oaks Estate warned to get out of the road of the oncoming blaze after two fires merged and engulfed a rubbish tip.

It is still unclear why the ESA website was hit by a single point of failure; however, the blaze, known as the Beard fire, is burning close to the industrial suburb of Fyshwick which houses several data centres.

The blaze near the airport is also within a stone’s throw of the Australian Signals Directorate’s Australian Cyber Security Centre offices at the Brindabella Park office complex that houses a clutch of other technology, consulting and miltech tenants.

AWS users started noting problems with services around 11.15am AEDT with the problems continuing at 4.00pm.

The issues affect services including EC2, elastic load balancing (ELB), relational database service (RDS), AppStream 2.0, ElastiCache, WorkSpaces and Lambda.

Update: The ESA's website was restored on Thursday evening as the fire was downgraded to 'advice' level overnight.


--
Roger Clarke                            mailto:[email protected]
T: +61 2 6288 6916   http://www.xamax.com.au  http://www.rogerclarke.com

Xamax Consultancy Pty Ltd 78 Sidaway St, Chapman ACT 2611 AUSTRALIA
Visiting Professor in the Faculty of Law            University of N.S.W.
Visiting Professor in Computer Science    Australian National University
_______________________________________________
Link mailing list
[email protected]
http://mailman.anu.edu.au/mailman/listinfo/link
