Re: NANOG Digest, Vol 54, Issue 3 (Comcast's IPv6 Information Site Unreachable)

2012-07-02 Thread Brzozowski, John
Folks, We will report back shortly with some updates. Thanks for the mail. John = John Jason Brzozowski Comcast Cable m) +1-609-377-6594 e) mailto:john_brzozow...@cable.comcast.com o) +1-484-962-0060 w) http://www.comcast6.net

Re: How do the lowest layers of the DSL stack work?

2012-07-02 Thread Stefan Bethke
On 01.07.2012 at 21:01, James Bensley wrote: [15.24 Mbit/s raw bit rate compared to 8.128 Mbit/s net] is quite a drop in speed and I'm trying to understand where this is happening. ... According to that extract, it all disappeared because of [Reed-Solomon] encoding, which is hugely vague.
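To put the two quoted figures side by side, here is a minimal sketch of the arithmetic. Only the two rates come from the message; the function and variable names are illustrative, and the calculation says nothing about which layer (Reed-Solomon coding, framing, or anything else) accounts for the difference.

```python
# Rates quoted in the thread, in Mbit/s. Everything else is illustrative.
RAW_RATE_MBPS = 15.24
NET_RATE_MBPS = 8.128


def overhead_fraction(raw: float, net: float) -> float:
    """Return the fraction of the raw bit rate that is not net data rate."""
    if raw <= 0 or net < 0 or net > raw:
        raise ValueError("expected 0 <= net <= raw and raw > 0")
    return (raw - net) / raw


net_share = NET_RATE_MBPS / RAW_RATE_MBPS
lost_share = overhead_fraction(RAW_RATE_MBPS, NET_RATE_MBPS)

print(f"net share of raw rate: {net_share:.4f}")
print(f"share not available as net rate: {lost_share:.4f}")
```

With these numbers the net rate is roughly 53% of the raw rate, so a little under half of the raw bits are unaccounted for in the net figure.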

RE: FYI Netflix is down

2012-07-02 Thread Dan Golding
-Original Message- From: Todd Underwood [mailto:toddun...@gmail.com] scott, This was not a cascading failure.  It was a simple power outage Actually, it was a very complex power outage. I'm going to assume that what happened this weekend was similar to the event that happened

Re: FYI Netflix is down

2012-07-02 Thread Todd Underwood
Actually, it was a very complex power outage. I'm going to assume that what happened this weekend was similar to the event that happened at the same facility approximately two weeks ago (it's immaterial - the details are probably different, but it illustrates the complexity of a data center

Re: FYI Netflix is down

2012-07-02 Thread AP NANOG
While I was working for a wireless telecom company, our primary datacenter was knocked off the power grid due to weather. The generators kicked on and everything was fine, until one generator was struck by lightning and that same strike fried the control panel on the second one. Considering the

Re: F-ckin Leap Seconds, how do they work?

2012-07-02 Thread AP NANOG
Do you happen to know all the kernels and versions affected by this? -- Thank you, Robert Miller http://www.armoredpackets.com Twitter: @arch3angel On 7/1/12 12:44 PM, George Bonser wrote: -Original Message- From: Roy Sent: Saturday, June 30, 2012 10:03 PM To: nanog@nanog.org

Re: F-ckin Leap Seconds, how do they work?

2012-07-02 Thread Alex Harrowell
On 02/07/12 16:47, AP NANOG wrote: Do you happen to know all the kernels and versions affected by this? 2.6.26 to 3.3 inclusive per news.ycombinator.com/item?id=4183122
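As a rough illustration of the range quoted above, the following sketch checks whether a kernel release string falls within 2.6.26 to 3.3 inclusive. The function names are hypothetical, and the range is only what the linked discussion reports. Distribution kernels often carry backported patches, so a version match is a first-pass filter and not proof that a machine is affected.

```python
import re

# Range as reported in the thread: 2.6.26 through 3.3 inclusive.
AFFECTED_MIN = (2, 6, 26)
AFFECTED_MAX = (3, 3)  # any 3.3.x counts as inside the range


def parse_release(release: str) -> tuple:
    """Extract the leading numeric version from a string like '2.6.32-220.el6'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def in_reported_range(release: str) -> bool:
    """True if the release falls inside the reported range (assumption: inclusive)."""
    version = parse_release(release)
    padded = (version + (0, 0))[:3]  # treat '3.3' as 3.3.0
    return AFFECTED_MIN <= padded and padded[:2] <= AFFECTED_MAX


for sample in ("2.6.25", "2.6.32-220.el6.x86_64", "3.3.8", "3.4.0"):
    print(sample, in_reported_range(sample))
```

On a Linux host, the same check could be applied to the string returned by `platform.release()` from the standard library.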

Re: F-ckin Leap Seconds, how do they work?

2012-07-02 Thread Jay Ashworth
- Original Message - From: Alex Harrowell a.harrow...@gmail.com On 02/07/12 16:47, AP NANOG wrote: Do you happen to know all the kernels and versions affected by this? 2.6.26 to 3.3 inclusive per news.ycombinator.com/item?id=4183122 Well, my 2.6.32 CentOS6/64 machine, which is not

Re: FYI Netflix is down

2012-07-02 Thread Leo Bicknell
In a message written on Mon, Jul 02, 2012 at 11:30:06AM -0400, Todd Underwood wrote: from the perspective of people watching B-rate movies: this was a failure to implement and test a reliable system for streaming those movies in the face of a power outage at one facility. I want to emphasize

Re: F-ckin Leap Seconds, how do they work?

2012-07-02 Thread Michael Thomas
On 07/02/2012 09:04 AM, Jay Ashworth wrote: - Original Message - From: Alex Harrowell a.harrow...@gmail.com On 02/07/12 16:47, AP NANOG wrote: Do you happen to know all the kernels and versions affected by this? 2.6.26 to 3.3 inclusive per news.ycombinator.com/item?id=4183122 Well,

Re: FYI Netflix is down

2012-07-02 Thread david raistrick
On Mon, 2 Jul 2012, Leo Bicknell wrote: I used to work with a guy who had a simple test for these things, and if I was a VP at Amazon, Netflix, or any other large company I would do the same. About once a month he would walk out on the you mean like this?

Re: FYI Netflix is down

2012-07-02 Thread Leo Bicknell
In a message written on Mon, Jul 02, 2012 at 12:13:22PM -0400, david raistrick wrote: you mean like this? http://techblog.netflix.com/2011/07/netflix-simian-army.html Yes, Netflix seems to get it, and I think their Simian Army is a great QA tool. However, it is not a complete testing

Re: FYI Netflix is down

2012-07-02 Thread david raistrick
On Mon, 2 Jul 2012, Leo Bicknell wrote: http://techblog.netflix.com/2011/07/netflix-simian-army.html Yes, Netflix seems to get it, and I think their Simian Army is a great QA tool. However, it is not a complete testing system, I have never seen them talk about testing non-software

Re: FYI Netflix is down

2012-07-02 Thread AP NANOG
This is an excellent example of how tests should be run; unfortunately, far too many places don't do this... -- Thank you, Robert Miller http://www.armoredpackets.com Twitter: @arch3angel On 7/2/12 12:09 PM, Leo Bicknell wrote: In a message written on Mon, Jul 02, 2012 at 11:30:06AM -0400,

Re: FYI Netflix is down

2012-07-02 Thread Grant Ridder
The problem is that large-scale tests take a lot of time and planning. For it to be done right, you really need a dedicated DR team. -Grant On Mon, Jul 2, 2012 at 11:31 AM, AP NANOG na...@armoredpackets.com wrote: This is an excellent example of how tests should be run; unfortunately, far too many

Re: F-ckin Leap Seconds, how do they work?

2012-07-02 Thread Joly MacFie
Made the press.. http://www.washingtonpost.com/business/technology/leap-second-bug-takes-down-reddit-and-a-bunch-of-other-sites/2012/07/02/gJQAlXg1HW_story.html -- --- Joly MacFie 218 565 9365 Skype:punkcast WWWhatsup NYC -

Re: FYI Netflix is down

2012-07-02 Thread Leo Bicknell
In a message written on Mon, Jul 02, 2012 at 12:23:57PM -0400, david raistrick wrote: When the hardware is outsourced how would you propose testing the non-software components? They do simulate availability zone issues (and AZ is as close as you get to controlling which internal

Re: FYI Netflix is down

2012-07-02 Thread Cameron Byrne
On Jul 2, 2012 10:53 AM, Leo Bicknell bickn...@ufp.org wrote: In a message written on Mon, Jul 02, 2012 at 12:23:57PM -0400, david raistrick wrote: When the hardware is outsourced how would you propose testing the non-software components? They do simulate availability zone issues (and AZ

Re: FYI Netflix is down

2012-07-02 Thread James Downs
On Jul 2, 2012, at 9:23 AM, david raistrick wrote: When the hardware is outsourced how would you propose testing the non-software components? They do simulate availability zone issues (and AZ is as close as you get to controlling which internal power/network/etc grid you're attached to).

Re: FYI Netflix is down

2012-07-02 Thread Tony McCrory
On 2 July 2012 19:20, Cameron Byrne cb.li...@gmail.com wrote: Make your chaos animal go after sites and regions instead of individual VMs. CB From a previous post mortem http://techblog.netflix.com/2011_04_01_archive.html Create More Failures Currently, Netflix uses a service called

Re: FYI Netflix is down

2012-07-02 Thread Paul Graydon
On 07/02/2012 08:53 AM, Tony McCrory wrote: On 2 July 2012 19:20, Cameron Byrne cb.li...@gmail.com wrote: Make your chaos animal go after sites and regions instead of individual VMs. CB From a previous post mortem http://techblog.netflix.com/2011_04_01_archive.html Create More Failures

RE: FYI Netflix is down

2012-07-02 Thread Dan Golding
-Original Message- From: Leo Bicknell [mailto:bickn...@ufp.org] I want to emphasize _and test_. [snip] I used to work with a guy who had a simple test for these things, and if I was a VP at Amazon, Netflix, or any other large company I would do the same. About once a month

Re: FYI Netflix is down

2012-07-02 Thread AP NANOG
I believe in my dictionary Chaos Gorilla translates into Time To Go Home, with a rough definition of Everything just crapped out - The world is ending; but then again I may have that incorrect :-) -- Thank you, Robert Miller http://www.armoredpackets.com Twitter: @arch3angel On 7/2/12 2:59

Re: FYI Netflix is down

2012-07-02 Thread Joly MacFie
Good band name. Chaos Gorilla -- --- Joly MacFie 218 565 9365 Skype:punkcast WWWhatsup NYC - http://wwwhatsup.com http://pinstand.com - http://punkcast.com VP (Admin) - ISOC-NY - http://isoc-ny.org

Re: FYI Netflix is down

2012-07-02 Thread Greg D. Moore
At 03:08 PM 7/2/2012, George Herbert wrote: If folks have not read it, I would suggest reading Normal Accidents by Charles Perrow. The it can't happen is almost guaranteed to happen. ;-) And when it does, it'll often interact in ways we can't predict or sometimes even understand. As for

Re: FYI Netflix is down

2012-07-02 Thread david raistrick
On Mon, 2 Jul 2012, James Downs wrote: back-plane / control-plane was unable to cope with the requests. Netflix uses Amazon's ELB to balance the traffic and no back-plane meant they were unable to reconfigure it to route around the problem. Someone needs to define back-plane/control-plane

Re: FYI Netflix is down

2012-07-02 Thread Brett Frankenberger
On Mon, Jul 02, 2012 at 09:09:09AM -0700, Leo Bicknell wrote: In a message written on Mon, Jul 02, 2012 at 11:30:06AM -0400, Todd Underwood wrote: from the perspective of people watching B-rate movies: this was a failure to implement and test a reliable system for streaming those movies

RE: FYI Netflix is down

2012-07-02 Thread Dan Golding
-Original Message- From: Greg D. Moore [mailto:moor...@greenms.com] If folks have not read it, I would suggest reading Normal Accidents by Charles Perrow. Also, Human Error by James Reason.

Re: FYI Netflix is down

2012-07-02 Thread George Herbert
On Mon, Jul 2, 2012 at 12:43 PM, Greg D. Moore moor...@greenms.com wrote: At 03:08 PM 7/2/2012, George Herbert wrote: If folks have not read it, I would suggest reading Normal Accidents by Charles Perrow. The it can't happen is almost guaranteed to happen. ;-)  And when it does, it'll often

Re: FYI Netflix is down

2012-07-02 Thread Greg D. Moore
At 05:04 PM 7/2/2012, George Herbert wrote: On Mon, Jul 2, 2012 at 12:43 PM, Greg D. Moore moor...@greenms.com wrote: At 03:08 PM 7/2/2012, George Herbert wrote: If folks have not read it, I would suggest reading Normal Accidents by Charles Perrow. The it can't happen is almost guaranteed

Re: F-ckin Leap Seconds, how do they work?

2012-07-02 Thread Steven Bellovin
On Jul 2, 2012, at 11:47 AM, AP NANOG wrote: Do you happen to know all the kernels and versions affected by this? See http://landslidecoding.blogspot.com/2012/07/linuxs-leap-second-deadlocks.html --Steve Bellovin, https://www.cs.columbia.edu/~smb

Re: FYI Netflix is down

2012-07-02 Thread Steven Bellovin
On Jul 2, 2012, at 3:43 PM, Greg D. Moore wrote: At 03:08 PM 7/2/2012, George Herbert wrote: If folks have not read it, I would suggest reading Normal Accidents by Charles Perrow. Strong second to that suggestion. --Steve Bellovin, https://www.cs.columbia.edu/~smb

Re: FYI Netflix is down

2012-07-02 Thread James Downs
On Jul 2, 2012, at 1:20 PM, david raistrick wrote: Amazon resources are controlled (from a consumer viewpoint) by API - that API is also used by amazon's internal toolkits that support ELB (and RDS..). Those (http accessed) API interfaces were unavailable for a good portion of the

Re: F-ckin Leap Seconds, how do they work?

2012-07-02 Thread Jimmy Hess
On 7/2/12, Steven Bellovin s...@cs.columbia.edu wrote: On Jul 2, 2012, at 11:47 AM, AP NANOG wrote: Do you happen to know all the kernels and versions affected by this? See http://landslidecoding.blogspot.com/2012/07/linuxs-leap-second-deadlocks.html --Steve Bellovin,

Re: F-ckin Leap Seconds, how do they work?

2012-07-02 Thread Joly MacFie
On Mon, Jul 2, 2012 at 8:46 PM, Jimmy Hess mysi...@gmail.com wrote: Someone should write a dastardly system clock daemon to cause the insertion of frequent spurious positive leap seconds, followed by the spurious insertion of negative leap seconds. Chaos time bandit? --

Re: FYI Netflix is down

2012-07-02 Thread Rodrick Brown
On Jul 2, 2012, at 7:03 PM, James Downs e...@egon.cc wrote: On Jul 2, 2012, at 1:20 PM, david raistrick wrote: Amazon resources are controlled (from a consumer viewpoint) by API - that API is also used by amazon's internal toolkits that support ELB (and RDS..). Those (http accessed)

Northern Virginia 9-1-1 service after storm

2012-07-02 Thread Sean Donelan
Probably not as interesting as talking about Amazon/Netflix. http://www.washingtonpost.com/local/after-storm-911-phone-service-remains-spotty/2012/07/02/gJQA33dHJW_story.html Fairfax County's 911 emergency center operated at just half capacity Monday as Verizon struggled to figure out why

Re: FYI Netflix is down

2012-07-02 Thread James Downs
On Jul 2, 2012, at 7:19 PM, Rodrick Brown wrote: People are acting as if Netflix is part of some critical service; they stream movies, for Christ's sake. Some acceptable level of loss is fine for 99.99% of Netflix's user base just like cable, electricity and running water I suffer a few

Contributing to the community

2012-07-02 Thread Matt Chung
I've been so fortunate and appreciative over the years to have colleagues (many of whom I consider my close friends) cultivate my career by providing sound advice that I will continue to pass on. In addition to those I've known personally, I have gleaned a substantial amount of information through

Re: FYI Netflix is down

2012-07-02 Thread Hal Murray
George Herbert george.herb...@gmail.com said: I worked for a Sun clone vendor (Axil) for a while and took some of our systems and storage to Comdex one year in the 90s. We had a RAID unit (Mylex controller) we had just introduced. Beforehand, I made REALLY REALLY SURE that the