My apologies. I missed the reference to our Technical Emergency Hotline:

[1] RIPE NCC Technical Emergency Hotline:
https://www.ripe.net/support/contact/technical-emergency-hotline


> On 6 Apr 2020, at 16:19, Felipe Victolla Silveira <[email protected]> wrote:
> 
> Dear Danny and all,
> 
> Thank you for your email.
> 
> We understand the importance of RPKI for Internet operations and we are
> taking recent outages very seriously.
> 
> We already have alerting systems in place that did not report the
> deletions because deletion of ROAs is sometimes a normal and necessary
> action. However, as Nathalie mentions in her post-mortem, we have
> already taken steps to ensure our systems prevent this from happening again.
> 
> We are also carrying out a separate investigation on the impact this
> outage had on networks in terms of hijacking and route leaks.
> 
> There is a 24/7 hotline[1] in place that people can use to report
> outages outside of office hours. In this case, none of the people who
> contacted us used this method to alert us.
> 
> In our Activity Plan and Budget 2020, we requested a significant budget
> allocation for resiliency of RPKI in anticipation of increased global
> demand and operational reliance on this system. Lessons learned from
> these outages will be incorporated into the RPKI activity and we will
> take all necessary steps to ensure the stability of the system.
> 
> Kind regards,
> 
> Felipe Victolla Silveira
> Chief Operations Officer
> RIPE NCC
> 
>> On 3 Apr 2020, at 22:56, Danny McPherson <[email protected]> wrote:
>> 
>> 
>> Agreed, thanks for this Nathalie.
>> 
>> Given the operational importance of RPKI now and each RIR's role therein, can
>> you say anything about what plans RIPE has to provide 24x7 monitoring /
>> support for these services (i.e., beyond your current "office hours")?
>> 
>> I also look forward to [your] analysis of the Rostelecom incident that 
>> occurred in the same timeframe.
>> 
>> Thanks,
>> 
>> 
>> -danny
>> 
>> 
>> 
>> On 2020-04-03 08:55, Nathalie Trenaman wrote:
>>> Dear colleagues,
>>> After our accidental deletion of RPKI ROAs on Wednesday evening, we have
>>> a post-mortem report to share with the working group.
>>> Following an update to our internal registry software on 1 April at
>>> 18:16 (UTC+2), 2,669 ROAs were deleted from Provider Independent (PI)
>>> address assignments.
>>> This was caused by our registry software classifying these assignments
>>> as not-certifiable. From our logs, we can confirm that these blocks
>>> never left the RIPE Registry, and within 15 minutes the registry was
>>> back to normal. However, by that time the ROAs had already been deleted
>>> and could not be restored without intervention from our engineers.
>>> Affected users with alerts set up in the LIR Portal received a
>>> notification email on 31 March at 22:23, stating that their ROAs were
>>> missing. Some of these users emailed our Customer Service Department to
>>> ask why their ROAs had been deleted. As this was outside of office
>>> hours, our staff did not discover the issue until the next morning.
>>> Our engineers were able to reinstate all of the missing ROAs by 13:15 on
>>> 2 April. We then informed our membership via ncc-announce and notified
>>> the affected users directly.
>>> We have since implemented stricter checks on both our registry and RPKI
>>> software.
>>> We are also investigating whether any of these PI assignments suffered
>>> from route-leaks or hijacks after their ROAs were deleted.
>>> We apologise for any inconvenience this may have caused and we are
>>> taking all necessary steps to ensure this does not happen again in the
>>> future.
>>> Kind regards,
>>> Nathalie Trenaman
>>> Routing Security Programme Manager
>>> RIPE NCC
>> 
>> 
> 
> 
