Re: Global Akamai Outage

2021-07-27 Thread Lukas Tribus
Hello, On Tue, 27 Jul 2021 at 21:02, heasley wrote: > > But I have to emphasize that all those are just examples. Unknown bugs > > or corner cases can lead to similar behavior in "all in one" daemons > > like Fort and Routinator. That's why specific improvements absolutely > > do not mean we

Re: Global Akamai Outage

2021-07-27 Thread Lukas Tribus
On Tue, 27 Jul 2021 at 16:10, Mark Tinka wrote: > > > > On 7/26/21 19:04, Lukas Tribus wrote: > > > rpki-client can only remove outdated VRP's, if it a) actually runs and > > b) if it successfully completes a validation cycle. It also needs to > > do this BEFORE the RTR server distributes data. >

Re: Global Akamai Outage

2021-07-27 Thread heasley
Mon, Jul 26, 2021 at 07:04:41PM +0200, Lukas Tribus: > Hello! > > On Mon, 26 Jul 2021 at 17:50, heasley wrote: > > > > Mon, Jul 26, 2021 at 02:20:39PM +0200, Lukas Tribus: > > > rpki-client 7.1 emits a new per VRP attribute: expires, which makes it > > > possible for RTR servers to stop

Re: Global Akamai Outage

2021-07-27 Thread Mark Tinka
On 7/26/21 19:04, Lukas Tribus wrote: rpki-client can only remove outdated VRP's, if it a) actually runs and b) if it successfully completes a validation cycle. It also needs to do this BEFORE the RTR server distributes data. If rpki-client for whatever reason doesn't complete a validation

Re: Global Akamai Outage

2021-07-26 Thread Lukas Tribus
Hello! On Mon, 26 Jul 2021 at 17:50, heasley wrote: > > Mon, Jul 26, 2021 at 02:20:39PM +0200, Lukas Tribus: > > rpki-client 7.1 emits a new per VRP attribute: expires, which makes it > > possible for RTR servers to stop considering outdated VRP's: > >

Re: Global Akamai Outage

2021-07-26 Thread Mark Tinka
On 7/26/21 17:50, heasley wrote: Since rpki-client removes "outdated" (expired) VRPs, how does an RTR server "stop considering" something that does not exist from its PoV? Did you mean that it can warn about impending expiration? StayRTR reads the VRP data generated by rpki-client. Mark.

Re: Global Akamai Outage

2021-07-26 Thread heasley
Mon, Jul 26, 2021 at 02:20:39PM +0200, Lukas Tribus: > rpki-client 7.1 emits a new per VRP attribute: expires, which makes it > possible for RTR servers to stop considering outdated VRP's: > https://github.com/rpki-client/rpki-client-openbsd/commit/9e48b3b6ad416f40ac3b5b265351ae0bb13ca925 Since

Re: Global Akamai Outage

2021-07-26 Thread Mark Tinka
On 7/26/21 14:20, Lukas Tribus wrote: Some specific failure scenarios are currently being addressed, but this doesn't make monitoring optional: rpki-client 7.1 emits a new per VRP attribute: expires, which makes it possible for RTR servers to stop considering outdated VRP's:

Re: Global Akamai Outage

2021-07-26 Thread Lukas Tribus
Hello, On Mon, 26 Jul 2021 at 11:40, Mark Tinka wrote: > I can count, on my hands, the number of RPKI-related outages that we > have experienced, and all of them have turned out to be a > misunderstanding of how ROA's work, either by customers or some other > network on the Internet. The good

Re: Global Akamai Outage

2021-07-26 Thread Mark Tinka
On 7/26/21 07:25, Saku Ytti wrote: Doesn't matter. And I'm not trying to say RPKI is a bad thing. I like that we have good AS:origin mapping that is verifiable and machine readable, that part of the solution will be needed for many applications which intend to improve the Internet by some

Re: Global Akamai Outage

2021-07-25 Thread Saku Ytti
On Sun, 25 Jul 2021 at 21:41, Mark Tinka wrote: > Are you speaking globally, or for NTT? Doesn't matter. And I'm not trying to say RPKI is a bad thing. I like that we have good AS:origin mapping that is verifiable and machine readable, that part of the solution will be needed for many

Re: Global Akamai Outage

2021-07-25 Thread Mark Tinka
On 7/25/21 17:32, Saku Ytti wrote: Steering dangerously off-topic from this thread, we have so far had more operational and availability issues from RPKI than from hijacks. And it is a bit more embarrassing to say 'we cocked up' than to say 'someone leaked to internet, it be like it do'.

Re: Global Akamai Outage

2021-07-25 Thread Randy Bush
> Very often the corrective and preventive actions appear to be > different versions and wordings of 'don't make mistakes', in this case: > > - Reviewing and improving input safety checks for mapping components > - Validate and strengthen the safety checks for the configuration > deployment zoning

Re: Global Akamai Outage

2021-07-25 Thread Saku Ytti
On Sun, 25 Jul 2021 at 18:14, Jared Mauch wrote: > How can we improve response times when things are routed poorly? Time to > mitigate hijacks is improved by majority of providers doing RPKI OV, but > interprovider response time scales are much longer. I also think about the > two big CTL

Re: Global Akamai Outage

2021-07-25 Thread Jared Mauch
Work hat is not on, but context is included from prior workplaces etc. > On Jul 25, 2021, at 2:22 AM, Saku Ytti wrote: > > It doesn't seem like a tenable solution, when the solution is 'do > better', since I'm sure whoever did those checks did their best in the > first place. So we must assume

Re: Global Akamai Outage

2021-07-25 Thread Mark Tinka
On 7/25/21 08:18, Saku Ytti wrote: Hey, Not a critique against Akamai specifically, it applies just the same to me. Everything seems so complex and fragile. Very often the corrective and preventive actions appear to be different versions and wordings of 'don't make mistakes', in this case:

Re: Global Akamai Outage

2021-07-25 Thread Miles Fidelman
Indeed.  Worth rereading for that reason alone (or in particular). Miles Fidelman Hank Nussbacher wrote: On 23/07/2021 09:24, Hank Nussbacher wrote: From Akamai.  How companies and vendors should report outages: [07:35 UTC on July 24, 2021] Update: Root Cause: This configuration directive

Re: Global Akamai Outage

2021-07-25 Thread Hank Nussbacher
On 25/07/2021 09:18, Saku Ytti wrote: Hey, Not a critique against Akamai specifically, it applies just the same to me. Everything seems so complex and fragile. Complex systems are apt to break and only a very limited set of tier-3 engineers will understand what needs to be done to fix it.

Re: Global Akamai Outage

2021-07-25 Thread Saku Ytti
Hey, Not a critique against Akamai specifically, it applies just the same to me. Everything seems so complex and fragile. Very often the corrective and preventive actions appear to be different versions and wordings of 'don't make mistakes', in this case: - Reviewing and improving input safety

Re: Global Akamai Outage

2021-07-24 Thread Hank Nussbacher
On 23/07/2021 09:24, Hank Nussbacher wrote: From Akamai.  How companies and vendors should report outages: [07:35 UTC on July 24, 2021] Update: Root Cause: This configuration directive was sent as part of preparation for independent load balancing control of a forthcoming product. Updates to

Re: Global Akamai Outage

2021-07-23 Thread Hank Nussbacher
On 22/07/2021 19:34, Mark Tinka wrote: https://edgedns.status.akamai.com/ Mark. [18:30 UTC on July 22, 2021] Update: Akamai experienced a disruption with our DNS service on July 22, 2021. The disruption began at 15:45 UTC and lasted for approximately one hour. Affected customer sites were

Re: Global Akamai Outage

2021-07-22 Thread Andy Ringsmuth
> On Jul 22, 2021, at 12:38 PM, Grant Taylor via NANOG wrote: > > On 7/22/21 10:56 AM, Andy Ringsmuth wrote: >> The outage appears to have, ironically, taken out the outages and >> outages-discussion lists too. > > I received multiple messages from the Outages (proper) mailing list, >

Re: Global Akamai Outage

2021-07-22 Thread Grant Taylor via NANOG
On 7/22/21 10:56 AM, Andy Ringsmuth wrote: The outage appears to have, ironically, taken out the outages and outages-discussion lists too. I received multiple messages from the Outages (proper) mailing list, including messages about the Akamai issue. I'd be surprised if the Outages

Re: Global Akamai Outage

2021-07-22 Thread Jared Mauch
> On Jul 22, 2021, at 12:56 PM, Andy Ringsmuth wrote: > > The outage appears to have, ironically, taken out the outages and > outages-discussion lists too. > > Kinda like having a fire at the 911 dispatch center… Should not have impacted me in my hosting of the list. Obviously if the

Re: Global Akamai Outage

2021-07-22 Thread Andy Ringsmuth
The outage appears to have, ironically, taken out the outages and outages-discussion lists too. Kinda like having a fire at the 911 dispatch center... Andy Ringsmuth

Re: Global Akamai Outage

2021-07-22 Thread Mark Tinka
On 7/22/21 18:50, Matt Harris wrote: Seems to be clearing up at this point, was able to get to a site just now that I wasn't a little bit ago. Yes, seems to be restoring... https://twitter.com/akamai/status/1418251400660889603?s=28 Mark.

Re: Global Akamai Outage

2021-07-22 Thread Matt Harris
On Thu, Jul 22, 2021 at 11:35 AM Mark Tinka wrote: > https://edgedns.status.akamai.com/ > > Mark. > Seems to be clearing up

Global Akamai Outage

2021-07-22 Thread Mark Tinka
https://edgedns.status.akamai.com/ Mark.