Rob,

Did you want to spin a new version of the draft and get final comments from Shane, then move this along to IESG-land?
Or are there still comments/issues to deal with from other folk? (The
russ/robert discussion seemed to peter out as well.)

-chris

On Wed, Jul 18, 2012 at 1:54 PM, Rob Shakir <[email protected]> wrote:
> Hi Shane,
>
> Thanks for the comments again, and apologies (again!) for the delay in
> responding.
>
> Please find my responses in-line as [rjs].
>
> On 11 Jul 2012, at 17:50, Shane Amante wrote:
>
>>>> [...snip...]
>>>
>>> [rjs]: I tried to add something to cover this that fits in with Section 1.1:
>>>
>>> <t>
>>> The combination of the increased number of
>>> deployments of BGP-4 as an intra-AS routing protocol, its use for the
>>> propagation of additional types of routing and service information, and the
>>> growth of IP services has resulted in a substantial increase in the volume
>>> of information carried within BGP-4. In numerous networks, RIB sizes of the
>>> order of millions of entries exist, with particular high-scale points
>>> existing at BGP speakers performing aggregation or functionality designed
>>> to improve utilisation of network resources (e.g., route reflector
>>> hierarchies). Whilst clearly an increase in the amount of routing
>>> information carried in BGP results in greater impact to services during
>>> failures, it is also critical to their recovery time. The increased time to
>>> compute new paths following a failure and subsequently re-learn them
>>> following recoveries results in greater impact of failures within the
>>> protocol, and hence adds further weight to the requirement to avoid
>>> failures affecting all routing, or service, information carried via a
>>> particular adjacency. Whilst an argument could be made that the convergence
>>> time of BGP-4 can be reduced through additional computational resource
>>> being deployed, it is notable that significant challenges continue to exist
>>> for operators of scaling BGP-4, and hence mechanisms which improve the
>>> scalability of the protocol are of particular note.
>>> </t>
>>
>>
>> The above looks good, but I've made some minor modifications. See below.
>> ---snip---
>> The combination of the increased number of deployments of BGP-4 as an
>> intra-AS routing protocol, its use for the propagation of additional types
>> of routing and service information, and the growth of IP services has
>> resulted in a substantial increase in the volume of information carried
>> within BGP-4. In numerous networks, RIB sizes of the order of millions of
>> entries exist within individual BGP speakers, with particularly high-scale
>> points exhibited at BGP speakers performing aggregation or functionality
>> designed to improve utilisation of network resources (e.g., route reflector
>> hierarchies). Clearly, an increase in the amount of routing information
>> carried in BGP results in greater impact to services during failures, which
>> is only amplified by a corresponding increase in recovery times. Following a
>> failure, there is a substantial recovery time to learn, compute and
>> distribute new paths, which results in a greater observed impact to services
>> affected, and hence adds further weight to the requirement to avoid failures
>> altogether or, at least, mitigate their impact to the narrowest scope
>> possible (e.g., a specific NLRI). Whilst an argument could be made that
>> convergence time of BGP-4 could potentially be reduced through deployment of
>> additional computational resource, it is notable that this solution is not
>> necessarily straightforward from an implementation or deployment
>> point-of-view (e.g., scaling computation resources within a single
>> address-family is difficult). Thus, significant challenges continue to exist
>> for operators when scaling BGP-4 deployments, and hence mechanisms which
>> improve the scalability of BGP-4 are very important.
>> ---snip---
>
> [rjs]: Thanks, other than some minor editorial changes I adopted this
> paragraph -- it seems like a good hybrid.
>
>
>>>> [...snip...]
>>>
>>> [rjs]: I'm not quite clear on whether this gets the point across completely
>>> - do we think that it is just that things have become in the realm of
>>> provisioning activities, or rather is it that there are more and more
>>> functions that are overloading onto BGP? I agree that this sentence doesn't
>>> necessarily capture that - but do you think that it's the generic
>>> information transfer protocol between PEs, as well as replacing
>>> provisioning mechanisms?
>>
>> I believe that you are correct, and better off, in stating "more and more
>> functions that are overloaded (sic) onto BGP". Although, I'm not sure that
>> "overloaded" is an appropriate adjective.
>
> [rjs]: I guess there may be negative connotations of 'overloaded'; I guess
> what I really mean is maybe "layered" onto BGP -- poor wording perhaps.
>
>> The point I was trying to get at is as follows. I think there's a continuum
>> of information exchanged within BGP from real-time information
>> (reachability) to less dynamic (perhaps, even static) information, with
>> _examples_ of the latter being auto-discovery/provisioning use cases. While
>> traditional applications, such as vanilla Internet service for which BGP was
>> originally designed, only fall into the "real-time information" category ...
>> there are a lot of new(er) applications that do not fit "neatly" in a single
>> category and, in fact, span the range of real-time to less dynamic
>> categories depending on which facet of a particular protocol you look at,
>> (examples being: IPVPN, MVPN, VPLS-BGP, etc.). Regardless, I don't think
>> it's prudent to make value judgements (particularly at this point in time
>> when these protocols are already widely deployed and successful) as to the
>> "correctness" of these functions/services being in BGP, since that's bound
>> to be very subjective. Rather, we need to recognize the world for what it is
>> today, which is why I think use of the word "overloaded" may be
>> inappropriate.
>> Furthermore, I think that talking about this in such a context is only
>> recognizing a symptom (the more complex the system, the higher the
>> probability is to introduce errors), when in reality we should be trying to
>> focus in on the root problem: since we've put so many eggs in one basket, we
>> need unnoticeable (or, faster) recovery from errors that affect real-time,
>> reachability information.
>
> [rjs]: Completely agree with this. I think my poor choice of wording perhaps
> portrayed my view as negative -- rather, the key point for me is that the
> robustness and error handling that we are discussing here is designed with
> the vanilla Internet service as the baseline - and as we extend the protocol
> to different deployment cases (no judgement about the value of which is
> made), then some of the initial assumptions perhaps don't hold true. I think
> this is in agreement with yourself, insofar as I think we would both assert
> that for the real-time information, potentially the behaviour required in a
> number of areas of the protocol is not the same as the behaviour required for
> relatively static information.
>
>>>
>>> [rjs]: Yes - the intention is to define this based on the narrowest set
>>> possible, the reason that I used this wording is that (in my view) this is
>>> defined by the NLRI actually in the message (if there were differing path
>>> attributes for NLRI, then we expect that this is packed into a second
>>> UPDATE message). Perhaps a hybrid of our wording would clarify this (unless
>>> you think the assertion above is erroneous?).
>>
>> I see your point now. How about the following hybrid text?
>> ---snip---
>> ... it is a requirement of any enhanced error handling mechanism to
>> constrain the error handling so that it is narrowly focused on the NLRI
>> contained within the bad UPDATE message.
>> ---snip---
>
> [rjs]: Sure, this sounds good.
>
>>>> 3) Section 2:
>>>> ---snip---
>>>> contained within the message.
>>>> Since in this case, the message
>>>> received from the remote peer is syntactically valid, it is
>>>> considered that such an UPDATE is indicative of erroneous data within
>>>> a path attribute. [...]
>>>> ---snip---
>>>> s/path attribute/path attributes/
>>>
>>> [rjs]: Is the point here "one or more path attributes"? I'm not sure I
>>> quite understand the nit? :-)
>>
>> Yes, sorry: "one or more path attributes". (My point was you can't predict,
>> here anyway, that it will only be a single path attribute that is a problem.
>> Ideally, a more robust error-handling solution would not make such
>> assumptions :-).
>
> [rjs]: ACK, updated this to 'one or more' :-)
>
>>> Many thanks again for your comments - if you could cast your eyes over the
>>> above corrections, and let me know if you feel they're sufficient, that'd
>>> be fantastic.
>>
>> And, thank you Rob for your excellent work on this.
>
> [rjs]: No worries - I'll take a read through and submit an -05 of the draft
> that merges the edits we've discussed in this thread.
>
> Thanks again for the comments,
> r.
>
> _______________________________________________
> GROW mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/grow
