A month or so ago, I saw an article in the Boston Globe about horrible things happening to the network over at Beth Israel/Deaconess. The hospital had no email or other electronic services for the better part of three days, knocking the hospital back to the 1970s the hard way. The Globe article was unusual in that it had a major organization 'fessing up about its mistakes, and also in the amount of technical detail included. Apparently the main culprit in the collapse (aside from cumulative mistakes and such) was spanning tree algorithms. I thought this was very interesting and emailed the Globe author asking if there was more public information about what happened.
This is one of the original articles:

http://www.boston.com/dailyglobe2/323/metro/Hospital_computer_crash_a_lesson_to_the_industry+.shtml

This is not the article that mentioned spanning trees. That one was a week or so later in the Health/Science section.

Today the Globe author sent out a mass mailing to the many of us who had asked for more information. She referenced this page at BI:

http://home.caregroup.org/templatesnew/departments/BID/network_outage

Here's some of the key information:

"When Cisco TAC was first able to access and assess the network, they found the Layer 2 structure of the network to be unstable and out of specification with 802.1d standards. The management vlan (vlan 1) had in some locations 10 Layer2 hops from root.

"The conservative default values for the Spanning Tree Protocol (STP) impose a maximum network diameter of seven. This means that two distinct bridges in the network should not be more than seven hops away from one to the other."

There are other related pages on this site that detail different aspects of the outage.

We should be grateful to BI for making this information public, so that we have examples of how things shouldn't be done and of what this kind of mistake can cost.

This is also a good illustration of something that network geeks have been telling me, but that I didn't understand at a gut level: the importance of keeping routing and similar messes on layer 3, not layer 2.

Have fun,
Lauren

P.S. Still looking for work. http://www.linnaean.org/~lpb/r.html
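
To make the seven-hop limit quoted above a bit more concrete, here is a rough Python sketch that counts Layer 2 hops from a root bridge and lists the bridges that sit too far out. It is only an illustration under stated assumptions: the switch names and the daisy-chain topology are invented (this is not the BI network), a "hop" is counted as one link between bridges, and the hop count is taken as the shortest path in the physical graph rather than the path STP would actually select.

```python
from collections import deque

MAX_DIAMETER = 7  # the conservative default quoted from the BI page


def hops_from(start, links):
    """Breadth-first search: fewest Layer 2 hops from `start` to each bridge."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        bridge = queue.popleft()
        for neighbor in links.get(bridge, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[bridge] + 1
                queue.append(neighbor)
    return dist


def too_far(root, links, limit=MAX_DIAMETER):
    """Return bridges more than `limit` hops from `root`, nearest first."""
    dist = hops_from(root, links)
    far = [b for b, d in dist.items() if d > limit]
    return sorted(far, key=lambda b: dist[b])


def build_chain(n):
    """Hypothetical topology: sw0 - sw1 - ... - sw<n>, one long daisy chain."""
    links = {}
    for i in range(n):
        a, b = f"sw{i}", f"sw{i + 1}"
        links.setdefault(a, []).append(b)
        links.setdefault(b, []).append(a)
    return links


# A chain whose far end is 10 hops from the root, like the worst case in the quote.
chain = build_chain(10)
depths = hops_from("sw0", chain)
print("deepest bridge is", max(depths.values()), "hops from root")
print("beyond the limit:", too_far("sw0", chain))
```

With the root at one end of the chain, the last three switches fall outside the limit. Putting the root in the middle (say `sw5`) brings every bridge within five hops of it, which gives some feel for why root placement matters, though the diameter rule in the quote is about the distance between any two bridges, not only the distance from root.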
