AD review for draft-ietf-rtgwg-net2cloud-problem-statement

James Guichard Fri, 16 Aug 2024 06:55:07 -0700

Dear authors,

Please find my comments for draft-ietf-rtgwg-net2cloud-problem-statement (I 
have included line numbers from nits to help identify where in the document the 
comment is relevant):


Please update references below.


  == Outdated reference: A later version (-13) exists of
     draft-ietf-idr-sdwan-edge-discovery-12

  == Outdated reference: A later version (-12) exists of
     draft-ietf-opsawg-ntw-attachment-circuit-08

  == Outdated reference: A later version (-23) exists of
     draft-ietf-idr-5g-edge-service-metadata-16

  == Outdated reference: A later version (-15) exists of
     draft-ietf-opsawg-teas-attachment-circuit-10

  == Outdated reference: A later version (-14) exists of
     draft-ietf-add-split-horizon-authority-07



109     Cloud services are generally exposed, on-demand services that claim
110     to be scalable, highly available, and have usage-based billing. Most



Jim> The above sentence is difficult to parse. Do you mean “Cloud services are 
generally exposed as on-demand…” rather than “Cloud services are generally 
exposed,…”?



115     hosts services to many customers.



Jim> s/to/too



137                 "edge" locations. <https://cloud.google.com/learn/what-
138                 is-hybrid-cloud>.



Jim> Please remove the in-text reference and replace with a [] reference as 
either normative or informative.



144                 https://en.wikipedia.org/wiki/Internet_exchange_point.



Jim> Please remove in-text reference and replace with a [] reference as either 
normative or informative.



186       - If a Cloud Gateway (GW), a BGP speaker, receives from its BGP
187           peer a capability that it does not itself support or recognize,
188           it need to ignore that capability, and the BGP session need not



Jim> As per RFC5492 it MUST ignore that capability and the BGP session MUST NOT 
be terminated. See section 3 of RFC5492 and correct the above text.



189           be terminated per [RFC5492]. When receiving a BGP UPDATE with a
190           malformed attribute, the revised BGP error handling procedure
191           in [RFC7606] should be followed instead of session resetting.



Jim> The above paragraph seems to be confused. The first sentence is talking 
about BGP OPEN and how to handle capabilities, and then the second sentence 
talks about BGP UPDATE messages that have malformed attributes. These are two 
completely different things, so I am struggling to understand why they are 
referenced in the same paragraph and what exactly they have to do with each 
other in the context of a Cloud Gateway. Everything referenced is existing 
behavior, nothing new, so why is it here and what are the authors trying to 
convey? If they are simply trying to say that a Cloud Gateway should adhere to 
the procedures specified in RFCs 5492 and 7606, then why not simply say that? 
If the authors wish to keep the text I would suggest a rewrite as follows:



      - If a Cloud Gateway (GW), a BGP speaker, receives from its BGP peer a
        BGP OPEN with a capability that it does not support or recognize, it
        MUST ignore that capability, and the BGP session MUST NOT be
        terminated, as per [RFC5492].

      - When receiving a BGP UPDATE with a malformed attribute, the revised
        BGP error handling procedures in [RFC7606] should be followed instead
        of resetting the BGP session.
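
To illustrate the first bullet, here is a minimal sketch of how a receiver might walk the capability triples carried in a BGP OPEN and simply skip the ones it does not support, leaving the session up. The names `SUPPORTED`, `parse_capabilities`, and `negotiate` are hypothetical and not taken from the draft or from any particular BGP implementation; the supported set is an arbitrary example.

```python
from typing import List, Set, Tuple

# Hypothetical set of capability codes this speaker supports:
# 1 = Multiprotocol Extensions, 2 = Route Refresh, 65 = 4-octet AS number.
SUPPORTED: Set[int] = {1, 2, 65}


def parse_capabilities(data: bytes) -> List[Tuple[int, bytes]]:
    """Parse a sequence of <code, length, value> capability triples."""
    caps: List[Tuple[int, bytes]] = []
    i = 0
    while i + 2 <= len(data):
        code, length = data[i], data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated capability value")
        caps.append((code, value))
        i += 2 + length
    if i != len(data):
        raise ValueError("trailing octet after last capability")
    return caps


def negotiate(data: bytes, supported: Set[int] = SUPPORTED) -> Tuple[List[int], List[int]]:
    """Split the peer's capabilities into usable and ignored codes.

    Unsupported or unrecognized codes are only recorded as ignored;
    nothing here tears the session down because of them.
    """
    usable: List[int] = []
    ignored: List[int] = []
    for code, _value in parse_capabilities(data):
        (usable if code in supported else ignored).append(code)
    return usable, ignored


if __name__ == "__main__":
    # Route Refresh, 4-octet AS (AS 65000), and an unknown code 200.
    peer_caps = bytes([2, 0, 65, 4, 0, 0, 0xFD, 0xE8, 200, 1, 0xFF])
    print(negotiate(peer_caps))
```

The point of the sketch is only that the unknown code ends up in the ignored list and is never treated as an error.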



196       - When a Cloud DC eBGP session supports a limited number of
197           routes from external entities, the on-premises DCs need to set
198           up default routes and filter as many routes as practical
199           replacing them with a default in the eBGP advertisement to
200           minimize the number of routes to be exchanged with the Cloud DC
201           eBGP peers.



Jim> I do not understand the above paragraph. Is a Cloud DC different from an 
on-premises DC? Who is advertising a default route to whom? The scenario that 
you are trying to convey above is non-obvious, at least to me, so please clarify.



202       - When a Cloud GW receives inbound routes exceeding the maximum
203           routes threshold for a peer, the currently common practice is
204           generating out-of-band alerts (e.g., Syslog entries) via the
205           management system or terminating the BGP session (with cease
206           notification messages [RFC4486] being sent). Although out of
207           the scope of this document, more discussion is needed in the
208           IETF Inter-Domain Routing (IDR) Working Group for potential in-
209           band or autonomous notification directly to the peers when the
210           inbound routes exceed the maximum routes threshold.



Jim> More explanation is needed here including a reference to section 4 of 
RFC4486 that describes the procedure for terminating a peering with a 
NOTIFICATION message and error code providing a reason e.g. “Maximum number of 
prefixes reached”.
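
As a rough illustration of the procedure I am referring to, the sketch below builds a Cease NOTIFICATION (error code 6) with subcode 1, "Maximum Number of Prefixes Reached", and fills the optional Data field with AFI, SAFI, and the prefix upper bound as RFC4486 permits. The function names and the default AFI/SAFI values (IPv4 unicast) are hypothetical choices for the example, not anything taken from the draft.

```python
import struct
from typing import Optional

BGP_MARKER = b"\xff" * 16   # 16-octet all-ones marker in the BGP header
MSG_NOTIFICATION = 3        # BGP message type: NOTIFICATION
ERR_CEASE = 6               # Error code: Cease
SUB_MAX_PREFIXES = 1        # Cease subcode: Maximum Number of Prefixes Reached


def build_max_prefix_cease(afi: int, safi: int, upper_bound: int) -> bytes:
    """Build a Cease NOTIFICATION reporting that the prefix limit was hit."""
    # Optional Data field: AFI (2 octets), SAFI (1 octet), upper bound (4 octets).
    data = struct.pack("!HBI", afi, safi, upper_bound)
    body = struct.pack("!BB", ERR_CEASE, SUB_MAX_PREFIXES) + data
    length = 19 + len(body)  # 19-octet BGP header plus the NOTIFICATION body
    return BGP_MARKER + struct.pack("!HB", length, MSG_NOTIFICATION) + body


def check_prefix_limit(received: int, limit: int,
                       afi: int = 1, safi: int = 1) -> Optional[bytes]:
    """Return a NOTIFICATION to send if the peer exceeded the limit, else None."""
    if received > limit:
        return build_max_prefix_cease(afi, safi, limit)
    return None


if __name__ == "__main__":
    msg = check_prefix_limit(received=1001, limit=1000)
    print(msg.hex() if msg else "within limit")
```

A peer receiving this message learns both why the session was closed and which limit it ran into, which is the kind of detail the text should point to.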



222     Failures within a Cloud site, which can be a building, a floor, a
223     pod, or a server rack, include capacity degradation or complete out-
224     of-service failure. Here are some events that can trigger a site
225     failure: a) fiber cut for links connecting to the site or among pods
226     within the site; b) cooling failures; c) insufficient backup power
227     during a power failure; d) cyber threat attacks; e) too many changes
228     outside of the maintenance window; etc. A fiber-cut is not uncommon
229     in a Cloud site or between sites.



Jim> I would suggest stating above that the listed events are just some 
examples and not an exhaustive list.



244     [RFC7432] specifies a mass withdrawal mechanism for EVPN to signal a
245     large number of routes being changed to remote PE nodes as quickly
246     as possible.



Jim> I am not sure that RFC 7432 is relevant here or why EVPN is even 
mentioned. Is there a reason to mention this or should the text simply be 
removed?



597     premesis CPEs to a Cloud DC via a private VPN requires the private



Jim> s/premesis/premises



691     necessary. Alternative encapsulations, like SRH (Segment Routing



Jim> Please provide a reference to RFC 8754 (SRH)



695   6. Requirements for Networks Connecting Cloud Data Centers



Jim> Why are there requirements in a problem statement document? Did the WG 
discuss splitting these out into a separate document?



Thanks!



Jim

