No one had hit the IS-IS bug before the IETF-enforced maintenance freeze because 
no one in their right mind would be running three-week-old code back then. I 
don't think things have changed that much. ;)

-dorian

On Feb 7, 2013, at 4:19 PM, Siegel, David wrote:

> I remember being glued to my workstation for 10 straight hours due to an OSPF 
> bug that took down the whole of net99's network.
> 
> I was pretty proud of our size at the time...about 30Mbps at peak.  Times are 
> different and so are expectations.  :-)
> 
> Dave
> 
> 
> -----Original Message-----
> From: Brett Watson [mailto:[email protected]] 
> Sent: Wednesday, February 06, 2013 6:07 PM
> To: [email protected]
> Subject: Re: Level3 worldwide emergency upgrade?
> 
> Hell, we used to not have to bother notifying customers of anything, we just 
> fixed the problem. Reminds me of a story I've probably shared in the past. 
> 
> 1995, IETF in Dallas. The "big ISP" I worked for at the time got tripped up 
> on a 24-day IS-IS timer bug (maybe all of them at the time did, I don't 
> recall)  where all adjacencies reset at once. That's like, entire network 
> down. Working with our engineering team in the *terminal* lab mind you, and 
> Ravi Chandra (then at Cisco) we reloaded the entire network of routers with 
> new code from Cisco once they'd fixed the bug. I seem to remember this being 
> my first exposure to Tony Li's infamous line, "... Confidence Level: boots in 
> the lab."
> 
> Good times.
> 
> -b
> 
> 
> On Feb 6, 2013, at 5:41 PM, Brandt, Ralph wrote:
> 
>> David. I am on an evening shift and am just now reading this thread.   
>> 
>> I was almost tempted to write an explanation that would have had 
>> identical content with yours based simply on Level3 doing something 
>> and keeping the information close.
>> 
>> Responsible Vendors do not try to hide what is being done unless it is 
>> an Op Sec issue and I have never seen Level3 act with less than 
>> responsibility so it had to be Op Sec.
>> 
>> When it is that, it is best if the remainder of us sit quietly on the 
>> sidelines.
>> 
>> Ralph Brandt
>> 
>> 
>> -----Original Message-----
>> From: Siegel, David [mailto:[email protected]]
>> Sent: Wednesday, February 06, 2013 12:01 PM
>> To: 'Ray Wong'; [email protected]
>> Subject: RE: Level3 worldwide emergency upgrade?
>> 
>> Hi Ray,
>> 
>> This topic reminds me of yesterday's discussion in the conference 
>> around getting some BCOPs drafted.  It would be useful to confirm my 
>> own view of the BCOP around communicating security issues.  My 
>> understanding of the best practice is to limit knowledge distribution 
>> of security-related problems both before and after the patches are 
>> deployed.  You limit knowledge before the patch is deployed to prevent 
>> yourself from being exploited, but you also limit knowledge afterwards 
>> in order to limit potential damage to others (customers, 
>> competitors...the Internet at large).  You also do not want to 
>> announce that you will be deploying a security patch until you have a 
>> fix in hand and know when you will deploy it (typically, next 
>> available maintenance window unless the cat is out of the bag and danger is 
>> real and imminent).
>> 
>> As a service provider, you should stay on top of security alerts from 
>> your vendors so that you can make your own decision about what action 
>> is required.  I would not recommend relying on service provider 
>> maintenance bulletins or public operations mailing lists for obtaining 
>> this type of information.  There is some information that can cause 
>> more harm than good if it is distributed in the wrong way and 
>> information relating to security vulnerabilities definitely falls into that 
>> category.
>> 
>> Dave
>> 
>> -----Original Message-----
>> From: Ray Wong [mailto:[email protected]]
>> Sent: Wednesday, February 06, 2013 9:16 AM
>> To: [email protected]
>> Subject: Re: Level3 worldwide emergency upgrade?
>> 
>> 
>> OK, having had that first cup of coffee, I can say perhaps the main 
>> reason I was wondering is I've gotten used to Level3 always being on 
>> top of things (and admittedly, rarely communicating). They've reached 
>> the top by often being a black box of reliability, so it's (perhaps
>> unrealistically) surprising to see them caught by surprise. Anything 
>> that pushes them into scramble mode causes me to lose a little sleep 
>> anyway. The alternative to what they did seems likely for at least a 
>> few providers who'll NOT manage to fix things in time, so I may well 
>> be looking at longer outages from other providers, and need to issue 
>> guidance to others on what to do if/when other links go down for 
>> periods long enough that all the cost-bounding monitoring alarms start 
>> to scream even louder.
>> 
>> I was also grumpy at myself for having not noticed advance 
>> communication, which I still don't seem to have, though since I 
>> outsourced my email to bigG, I've noticed I'm more likely to miss 
>> things. Perhaps giving up maintaining that massive set of procmail 
>> rules has cost me a bit more edge.
>> 
>> Related, of course, just because you design/run your network to 
>> tolerate some issues doesn't mean you can also budget for a support 
>> contract as well. :) Knowing more about the exploit/fix might mean 
>> trying to find a way to get free upgrades to some kit to prevent more 
>> localized attacks on other types of gear as well, though in this case 
>> it's all about Juniper PR839412 then, so vendor-specific, it seems?
>> 
>> There are probably more reasons to wish for more info, too. There's 
>> still more of them (exploiters/attackers) than there are those of us 
>> trying to keep things running smoothly and transparently, so anything 
>> that smells of "OMG new exploit found!" also triggers my desire to 
>> share information. The network bad guys share information far more 
>> quickly and effectively than we do, it often seems.
>> 
>> -R>
>> 
>> 
>> 
> 
> 
> 

