Not everyone will peer with you, notably AS3356 (unless you're big
enough, which few can say).
On 8/31/20 4:33 PM, Tomas Lynch wrote:
Maybe we are idealizing these so-called tier-1 carriers and we, tier-ns,
should treat them as what they really are: another AS. Accept that they
are going to fail and do our best to mitigate the impact on our own
networks, i.e. more peering.
On Mon, Aug 31, 2020 at 9:54 AM Martijn Schmidt via NANOG
<nanog@nanog.org> wrote:
At this point you don't even know whether it's a human error
(example: generating a flowspec rule for port TCP/179), a filtering
issue (example: accepting a flowspec rule for port TCP/179), or a
software issue (example: certain flowspec update crashes the BGP
daemon). And in the third scenario I think that at least some
portion of the blame shifts from the carrier to its vendors,
assuming the thing that crashed was not a home-grown BGP implementation.
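To make the first two scenarios concrete, here is a minimal sketch of a sanity check that refuses discard rules naming TCP/179, applied either when generating or when accepting a rule. The `FlowspecRule` class and its fields are hypothetical and much simpler than real flowspec components; this is not any vendor's or carrier's actual validation logic. It only flags rules that explicitly list a port range covering 179, so a rule with no port constraint at all is not caught.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

BGP_PORT = 179   # TCP port used by BGP sessions
TCP = 6          # IP protocol number for TCP

PortRange = Tuple[int, int]  # inclusive (low, high)


@dataclass
class FlowspecRule:
    """Hypothetical, heavily simplified view of a flowspec rule."""
    protocol: Optional[int] = None              # None means "any protocol"
    dst_ports: List[PortRange] = field(default_factory=list)
    src_ports: List[PortRange] = field(default_factory=list)
    action: str = "discard"


def names_bgp_port(rule: FlowspecRule) -> bool:
    """True if the rule can apply to TCP and explicitly lists a port range covering 179."""
    if rule.protocol not in (None, TCP):
        return False
    ranges = rule.dst_ports + rule.src_ports
    return any(low <= BGP_PORT <= high for low, high in ranges)


def accept(rule: FlowspecRule) -> bool:
    """Reject discard rules that would drop BGP traffic; accept everything else."""
    return not (rule.action == "discard" and names_bgp_port(rule))


if __name__ == "__main__":
    bad = FlowspecRule(protocol=TCP, dst_ports=[(179, 179)])
    ok = FlowspecRule(protocol=TCP, dst_ports=[(80, 80)])
    print(accept(bad), accept(ok))
```

The same predicate could sit in the tool that generates rules and in the import policy that accepts them, so that one mistake does not depend on the other layer catching it.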
With the route optimizer incidents - because let's face it, Honest
Networker is on the money as usual
https://honestnetworker.net/2020/08/06/as10990-routing/ - there is
really no excuse for any tier-1 carrier; they should at the very
least have strict prefix-list based filtering in place for
customer-facing EBGP sessions. In those cases it's much easier to
state who's not taking care of their proverbial lawn.
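As a rough illustration of what strict prefix-list based filtering means on a customer-facing EBGP session, the sketch below accepts an announcement only if it falls inside a prefix registered for that customer and is no more specific than an allowed maximum length. The function names and the (prefix, maximum length) layout are assumptions made for this example, and the prefixes are documentation ranges, not real customer data.

```python
import ipaddress
from typing import List, Tuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]
PrefixList = List[Tuple[Network, int]]  # (allowed prefix, maximum prefix length)


def build_prefix_list(entries: List[Tuple[str, int]]) -> PrefixList:
    """Parse (prefix, max_len) pairs into network objects."""
    return [(ipaddress.ip_network(prefix), max_len) for prefix, max_len in entries]


def permitted(announcement: str, prefix_list: PrefixList) -> bool:
    """Accept only routes covered by an entry and not more specific than max_len."""
    net = ipaddress.ip_network(announcement)
    for allowed, max_len in prefix_list:
        if net.version != allowed.version:
            continue  # never compare IPv4 against IPv6
        if net.subnet_of(allowed) and net.prefixlen <= max_len:
            return True
    return False  # implicit deny, as in a router prefix-list


if __name__ == "__main__":
    customer = build_prefix_list([
        ("198.51.100.0/24", 24),   # exact match only
        ("2001:db8::/32", 48),     # allow more-specifics up to /48
    ])
    for route in ("198.51.100.0/24", "198.51.100.0/25", "203.0.113.0/24"):
        print(route, permitted(route, customer))
```

With a filter like this, the more-specific routes that an optimizer fabricates, and prefixes that belong to someone else entirely, are dropped at the edge instead of propagating.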
Best regards,
Martijn
On 8/31/20 3:25 PM, Tom Beecher wrote:
https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
I definitely found Mr. Prince's writing about yesterday's events
fascinating.
Verizon makes a mistake with BGP filters that allows a secondary
mistake from leaked "optimizer" routes to propagate, and Mr.
Prince takes every opportunity to lob large chunks of granite
about how terrible they are.
L3 allows an erroneous flowspec announcement to cause massive
global connectivity issues, and Mr. Prince shrugs and says
"Incidents happen."
On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher
<h...@interall.co.il> wrote:
On 30/08/2020 20:08, Baldur Norddahl wrote:
https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
Sounds like Flowspec possibly blocking tcp/179 might be the cause.
But that is Cloudflare speculation.
Regards,
Hank
Caveat: The views expressed above are solely my own and do not
express the views or opinions of my employer
An outage is what it is. I am not worried about outages. We
have multiple transits to deal with that.
It is the continued announcement of prefixes after peers and
customers have withdrawn them that is the huge problem here. That is
killing all the effort and money I put into having
redundancy. It is sabotage of my network after I cut the
ties. I do not want to be a customer of a provider whose
system will do that. Luckily we do not currently have a
contract and now they will have to convince me it is safe for
me to make a contract with them. If that is impossible I
guess I won't be getting a contract with them.
But I disagree in that it would be impossible. They need to
make a good report telling exactly what went wrong and how
they changed the design, so something like this cannot
happen again. The basic design of BGP is such that this
should not happen easily if at all. They did something
unwise. Did they make a route reflector based on a database
or something?
Regards,
Baldur
On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho
<mikeboli...@gmail.com> wrote:
Exactly. And asking that they somehow prove this won't
happen again is impossible.
- Mike Bolitho
On Sun, Aug 30, 2020, 8:10 AM Drew Weaver
<drew.wea...@thenap.com> wrote:
I’m not defending them but I am sure it isn’t
intentional.
*From:* NANOG <nanog-bounces+drew.weaver=thenap....@nanog.org>
*On Behalf Of* Baldur Norddahl
*Sent:* Sunday, August 30, 2020 9:28 AM
*To:* nanog@nanog.org
*Subject:* Re: Centurylink having a bad morning?
How is that acceptable behaviour? I shall remember
never to make a contract with these guys until they
can prove that they won't advertise my prefixes after
I pull them. Under any circumstances.
On Sun, Aug 30, 2020 at 3:14 PM Joseph Jenkins
<j...@breathe-underwater.com> wrote:
Finally got through on their support line and
spoke to level 1. The only thing the tech could
say was it was an issue with BGP route reflectors
and it started about 3 AM (Pacific). They were
still trying to isolate the issue. I've tried
failing over my circuits and no go, the traffic
just dies as L3 won't stop advertising my routes.
On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via
NANOG <nanog@nanog.org> wrote:
Hello,
Woke up this morning to a bunch of reports of
connectivity issues and had to shut down
some Level3/CTL connections to get things to
return to normal.
As of right now their support portal won’t
load: https://www.centurylink.com/business/login/
Just wondering what others are seeing.