Nishal,

I have read your response to the last AFRINIC outage several times, and I
still cannot understand your rationale, or what you were explaining and
defending.

AFRINIC, as a Regional Internet Registry, has committed to a level of
technical expertise, as described in section 5 of ICP-2:

=====
5) Technical expertise

The new RIR must be technically capable of providing the required
allocation and registration services to the community in its region.
Specific technical requirements include provisioning by the RIR of:

- production grade global Internet connectivity, in order to provide
  access to all services offered and for exchange of registry data to and
  from the other RIR-whois database server(s);
- DNS servers to support Reverse DNS delegation;
- suitable internal infrastructure for operational purposes; and
- enough technically capable staff to ensure appropriate service levels
  to the LIRs, and to the Internet community.
============

This commitment was carried over into the AFRINIC Service Level
Commitment (SLC), where, in section 3, AFRINIC commits itself to 99.8%
availability of services and network (*).

I am teaching you nothing with the references above; as a former CTO of
AFRINIC, you are supposed to know them. They may still be useful for the
community. Given this commitment, how are we to understand that AFRINIC
suffered a downtime of almost 4 hours, which you tried to downplay?

The post-mortem report published (**) said:

" Upon further investigation we determined that all AFRINIC's equipment
at our main data centre in Johannesburg had lost power "

So these are the questions which come to mind:

- Had the whole datacenter lost all power sources?
- If not, how did AFRINIC lose all power sources to its equipment?
- Why did it take 4 hours to restore power?

The report further said:

" It is unfortunate that this incident happened while the AFRINIC
infrastructure enhancement plan is still in his implementation phase "

This sounds like an admission of non-readiness to meet the service level.
I urge AFRINIC to take this commitment as seriously as it should, and to
deploy the effort and resources it requires. Let's respect the separation
of roles and responsibilities, and take responsibilities seriously.
AFRINIC is 15 years old; Africa should no longer accept such breaches.

--
Gregoire

(*) https://www.afrinic.net/commitment
(**) https://lists.afrinic.net/pipermail/announce/2019/002063.html

------ Original message ------
From: Nishal Goburdhan
Date: Sat, Jun 1, 2019 3:03 PM
To: [email protected];
Cc: [email protected];
Subject: Re: [Community-Discuss] Afrinic Services DOWN

On 31 May 2019, at 18:46, francis asiboh via Community-Discuss wrote:

> Dear board members and all members of the community
> All Afrinic services (Whois, RPKI, Afrinic.net, etc..) were down 
> yesterday

you didn't mention DNS.  if you did, it would have invalidated the 
"all" in your sentence.


> the 30th of May 2019 for a very long time. Board, where is the 
> Disaster
> Recovery Strategy in this particular kind of incident ?

the first thing you learn about a disaster recovery plan, is knowing 
_when_ to deploy it  (ie. what is classed as a disaster).   afrinic 
suffered from an outage of services in JNB for 3h53m.  nothing more.  it 
was not a disaster by any stretch of the imagination.  being hyperbolic, 
doesn't help anyone.

at worst, your mail _to_ afrinic might have been slightly delayed (heck, 
you wouldn't even get the 4h smtp warning!).  if you were doing 
validation, your RPKI cache would not have had the _most up to date_ 
ROAs  (but validation would have *still* worked!), and you wouldn't 
have been able to make a few DNS or WHOIS updates.

meanwhile, the internet, still carried on ..


> Since now, no Root Cause Analysis were sent for transparency to the
> community.

this is a good thing to ask for;  and i expect that this will be made 
available, as was the RFO after the last incident.
however, any RFO also includes a "how are we going to make sure that 
this is not going to happen again" part.  that bit actually takes time 
for analysis, planning, and, sign-off.  if anything, *this* is the part 
that you want afrinic to actually spend time and effort on, rather than 
simply whipping out a half-assed response to an outage.
so, it's reasonable to believe that *this* part of the report, isn't 
necessarily complete.  yet.

however, if you prefer to fit into the current climate ..

. maybe the BoD is still debating whether to release ..



> I am surprised that the Infrastructure Unit Manager, Mr Cedric MBEYET
> turned a deaf ear and did not learnt his lesson from the last incident
> where RPKI services were down.

if you read the outage incident from the last time, you _should_ be able 
to figure out that an incident that occurred previously, occurred in 
mauritius, and, is, in no way, related to thursday's incident in 
johannesburg.  which is in a different city.  thousands of kilometres 
away.  the golden hint _should_ have been, that last time around, it was 
a certificate renewal that failed;  which won't impact afrinic's 
website, or whois services, or .. that were not available now.


unless, you have direct evidence that they are related.  in which case, 
for transparency, you should release that, eh?



> Instead of taking care of Afrinic services, the current board of 
> directors
> is busy hiding public document from its members.

the BoD is not meant to run afrinic operations;  they have enough to do. 
  whilst i share your distaste for their actions, in reading some of the 
shocking revelations that are emerging in other threads, this, at least, 
is not something you should be equating to the BoD.

read the RFO when it's released;  and, if you think there's 
something that's glaringly poorly done, then, feel free to point it 
out.  posturing, without data, is just poor form.

--n.
network engineer  (retired)

ps.  btw, even with almost 4h of outage, depending on what you're 
measuring, afrinic are still on track for 99.9% uptime, even if 
there's a second 4h outage this year.
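
The arithmetic behind that claim can be checked with a short sketch. It 
rests on assumptions that are not stated in the thread: availability is 
taken as simple wall-clock uptime over a 365-day year (or a 30-day month 
for comparison), and the function names are illustrative only. How the 
SLC actually defines its measurement window is not specified here.

```python
# Rough availability arithmetic; the measurement windows are assumptions.
HOURS_PER_YEAR = 365 * 24   # 8760 hours, non-leap year
HOURS_PER_MONTH = 30 * 24   # 720 hours, assuming a 30-day month


def availability(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Percentage of the period during which the service was up."""
    return 100.0 * (1 - downtime_hours / period_hours)


def downtime_budget(target_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed in the period for a given target."""
    return period_hours * (1 - target_percent / 100.0)


outage = 3 + 53 / 60  # the 3h53m outage quoted in this thread

# Measured over a year: one outage, then the same plus a second 4h outage.
print(f"one outage, yearly:   {availability(outage):.3f}%")      # ~99.956%
print(f"two outages, yearly:  {availability(outage + 4):.3f}%")  # ~99.910%

# Yearly downtime budgets for the two targets mentioned in the thread.
print(f"99.9% yearly budget:  {downtime_budget(99.9):.2f} h")    # ~8.76 h
print(f"99.8% yearly budget:  {downtime_budget(99.8):.2f} h")    # ~17.52 h

# The same outage measured over a single 30-day month.
print(f"one outage, monthly:  {availability(outage, HOURS_PER_MONTH):.3f}%")
```

Over a year, both the 99.9% figure and the 99.8% SLC target survive two 
such outages. Over a single 30-day month, one 3h53m outage alone comes to 
roughly 99.46%, which is why the measurement window matters.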

pps.  i happen to know that afrinic *does* have at least one copy of a 
disaster recovery plan.  i am somewhat still familiar with the contents 
of this, and, given the nature of what is involved in activating that 
(and then, reverting) in cedrick's shoes, i would have made the same 
call for a 4h outage.  you're free to disagree.

_______________________________________________
Community-Discuss mailing list
[email protected]
https://lists.afrinic.net/mailman/listinfo/community-discuss