Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-17 Thread Gyan Mishra
Alvaro

Very good points on the limitations of the Prefix-SID sub-TLV of the Tunnel
Encapsulation attribute: it can only be used with the BGP LU AFI/SAFI.

Kind Regards

Gyan


On Mon, May 17, 2021 at 4:12 PM Alvaro Retana 
wrote:

> On May 14, 2021 at 1:04:53 PM, Adrian Farrel wrote:
>
>
> Hi!
>
> I share some of John's concerns -- quick comment on the first one.
>
>
> ...
> > > 1. There’s surprisingly little in this document that seems to be
> > > SR-specific (and what there is, has some problems, see below). Is there
> > > some reason you rule out interconnecting domains using other tunneling
> > > technologies? I ask this question first because if the answer were to be
> > > “oh huh, we don’t need to make this SR-specific after all” some of the
> > > other things I’m asking about might go away.
> >
> > I'm sorry this isn't clear, but the use of other tunnelling technologies is
> > very much in scope. As the Introduction says:
> >
> > The various ASes that provide connectivity between the Ingress and Egress
> > Domains could each be constructed differently and use different technologies
> > such as IP, MPLS with global table routing native BGP to the edge, MPLS IP
> > VPN, SR-MPLS IP VPN, or SRv6 IP VPN.
> >
> > SR is used to identify the tunnels and provide end-to-end SR paths because the
> > ingress and egress domains are SR domains, and the objective is to provide an
> > end-to-end SR path.
> >
> > So we are not "making this SR aware" so much as enabling "SR-over-foo" using
> > SIDs to identify the path segments that are tunnels.
> >
> > I don't know how to make this clearer except maybe using some red paint. We
> > would write...
> >
> > The various ASes that provide connectivity between the Ingress and Egress
> > Domains could each be constructed differently and use different technologies
> > such as IP, MPLS with global table routing native BGP to the edge, MPLS IP
> > VPN, SR-MPLS IP VPN, or SRv6 IP VPN. That is, the Ingress and Egress SR
> > Domains can be connected by tunnels across a variety of technologies. This
> > document describes how SR identifiers (SIDs) are used to identify the paths
> > between the Ingress and Egress and the techniques in this document apply to
> > routes of all AFI/SAFIs.
>
> From §5: "To achieve this, each Tunnel TLV in the Tunnel Encapsulation
> attribute contains a Prefix SID sub-TLV [I-D.ietf-idr-tunnel-encaps]
> for X."  But rfc9012 restricts the use of the Prefix-SID sub-TLV:
>
>    [RFC8669] only defines behavior when the BGP Prefix-SID attribute is
>    attached to routes of type IPv4/IPv6 Labeled Unicast


Gyan> RFC8669 limitation


3.1.  Label-Index TLV

   The Label-Index TLV MUST be present in the BGP Prefix-SID attribute
   attached to IPv4/IPv6 Labeled Unicast prefixes ([RFC8277]).  It MUST
   be ignored when received for other BGP AFI/SAFI combinations.


RFC 9012 Tunnel encapsulation attribute

3.7.  Prefix-SID Sub-TLV (Type Code 11)

   [RFC8669] defines a BGP path attribute known as the "BGP Prefix-SID
   attribute".  This attribute is defined to contain a sequence of one
   or more TLVs, where each TLV is either a Label-Index TLV or an
   Originator SRGB (Source Routing Global Block) TLV.

   This document defines a Prefix-SID (Prefix Segment Identifier) sub-
   TLV.  The Value field of the Prefix-SID sub-TLV can be set to any
   permitted value of the Value field of a BGP Prefix-SID attribute
   [RFC8669].

   [RFC8669] only defines behavior when the BGP Prefix-SID attribute is
   attached to routes of type IPv4/IPv6 Labeled Unicast
   [RFC4760][RFC8277], and it only defines values of the BGP Prefix-SID
   attribute for those cases.  Therefore, similar limitations exist for
   the Prefix-SID sub-TLV: it SHOULD only be included in a BGP UPDATE
   message for one of the address families for which [RFC8669] has a
   defined behavior, namely BGP IPv4/IPv6 Labeled Unicast [RFC4760]
   [RFC8277].  If included in a BGP UPDATE for any other address family,
   it MUST be ignored.




>    [RFC4760][RFC8277], and it only defines values of the BGP Prefix-SID
>    attribute for those cases.  Therefore, similar limitations exist for
>    the Prefix-SID sub-TLV: it SHOULD only be included in a BGP UPDATE
>    message for one of the address families for which [RFC8669] has a
>    defined behavior, namely BGP IPv4/IPv6 Labeled Unicast [RFC4760]
>    [RFC8277].  If included in a BGP UPDATE for any other address family,
>    it MUST be ignored.
>
> IOW, even though the overall mechanism could not be SR-specific, the
> SR solution can't 

Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-17 Thread Gyan Mishra
Adrian

In the introduction it mentions the following backbone transport:

   The various ASes that provide connectivity between the Ingress and Egress
   Domains could each be constructed differently and use different
   technologies such as IP, MPLS with global table routing native BGP to
   the edge, MPLS IP VPN, SR-MPLS IP VPN, or SRv6 IP VPN.


The draft talks about SR steering; however, can RSVP be used in the
backbone transport?


I think that, to clarify, we should state the MUST-level data plane
requirements for the backbone transport.


On the call last week we confirmed that the ingress and egress
domains (now called “sites” rather than domains) do not have to be
SR-enabled.


Kind Regards


Gyan


On Mon, May 17, 2021 at 5:04 PM Gyan Mishra  wrote:

> Hi John
>
> I agree with your comments that the scenario I mentioned is covered in
> Section 3 and agree as well on the RFC 2119 keyword usage scrub.
>
> In-line
>
> On Mon, May 17, 2021 at 3:55 PM John Scudder  wrote:
>
>> Hi Gyan,
>>
>> > On May 17, 2021, at 1:50 PM, Gyan Mishra  wrote:
>> >
>> > So if GW2 connection to external was down but GW1 still has its
>> connection to external.  GW2 would auto discover GW1 over iBGP and GW2
>> would advertise both GW1 and GW2 as reachable gateways.  However GW2 has
>> its external peer down.  So if GW1 continues to advertised GW2 as we stated
>> GW1 will auto discover  GW2 over iBGP.
>>
>> Isn’t this scenario covered? From §3:
>>
>>If a gateway becomes disconnected from the backbone network, or if
>>the SR domain operator decides to terminate the gateway's activity,
>>it withdraws the advertisements described above.  This means that
>>remote gateways at other sites will stop seeing advertisements from
>>this gateway.
>
>
>Gyan> Yes.  Agreed.  I wanted to draw some more attention to this to
> the authors on the withdrawal that it’s critical and agreed a MUST.
>
>>
>>
>> So when GW2’s external peering goes down, GW2 withdraws its auto
>> discovery route, and therefore GW1 re-advertises its routes externally
>> without GW2 listed in the tunnel attribute.
>>
>
>
>
>
>> I will say that reviewing the above-quoted text — which seems tailor-made
>> for a “MUST withdraw” — made me notice that the draft makes only sporadic
>> and desultory use of RFC2119 keywords. In fact there are so few used, that
>> it seems like it might be better to scrub those two SHOULD and two MUST out
>> and remove the 2119 citation.
>
>
> Gyan> Agreed.  I will parse the draft for RFC 2119 keyword  placement  in
> my final GEN-ART review update
>
>>
>>
>> —John
>
> --
>
> 
>
> *Gyan Mishra*
>
> *Network Solutions A**rchitect *
>
> *Email gyan.s.mis...@verizon.com *
>
>
>
> *M 301 502-1347*
>
> --



Gyan Mishra
Network Solutions Architect
Email gyan.s.mis...@verizon.com
M 301 502-1347
___
BESS mailing list
BESS@ietf.org
https://www.ietf.org/mailman/listinfo/bess


Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-17 Thread Gyan Mishra
Hi John

I agree with your comments that the scenario I mentioned is covered in
Section 3 and agree as well on the RFC 2119 keyword usage scrub.

In-line

On Mon, May 17, 2021 at 3:55 PM John Scudder  wrote:

> Hi Gyan,
>
> > On May 17, 2021, at 1:50 PM, Gyan Mishra  wrote:
> >
> > So if GW2 connection to external was down but GW1 still has its
> connection to external.  GW2 would auto discover GW1 over iBGP and GW2
> would advertise both GW1 and GW2 as reachable gateways.  However GW2 has
> its external peer down.  So if GW1 continues to advertised GW2 as we stated
> GW1 will auto discover  GW2 over iBGP.
>
> Isn’t this scenario covered? From §3:
>
>If a gateway becomes disconnected from the backbone network, or if
>the SR domain operator decides to terminate the gateway's activity,
>it withdraws the advertisements described above.  This means that
>remote gateways at other sites will stop seeing advertisements from
>this gateway.


   Gyan> Yes, agreed.  I wanted to draw the authors' attention to the
withdrawal: it is critical, and I agree it should be a MUST.

>
>
> So when GW2’s external peering goes down, GW2 withdraws its auto discovery
> route, and therefore GW1 re-advertises its routes externally without GW2
> listed in the tunnel attribute.
>




> I will say that reviewing the above-quoted text — which seems tailor-made
> for a “MUST withdraw” — made me notice that the draft makes only sporadic
> and desultory use of RFC2119 keywords. In fact there are so few used, that
> it seems like it might be better to scrub those two SHOULD and two MUST out
> and remove the 2119 citation.


Gyan> Agreed.  I will check the draft's RFC 2119 keyword placement in
my final Gen-ART review update.

>
>
> —John

-- 



Gyan Mishra
Network Solutions Architect
Email gyan.s.mis...@verizon.com
M 301 502-1347


Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-17 Thread John Scudder
Hi Adrian,

Comments in line below.

> On May 14, 2021, at 1:04 PM, Adrian Farrel  wrote:
> 
> 
> Hi John,
> 
> Thanks for the careful review.
> 
>> DISCUSS:
>> 
>> I have several points I’d like to discuss, listed below from most
>> general to most specific.
>> 
>> 1. There’s surprisingly little in this document that seems to be SR-specific
>> (and what there is, has some problems, see below). Is there some reason you
>> rule out interconnecting domains using other tunneling technologies? I ask
>> this question first because if the answer were to be “oh huh, we don’t need
>> to make this SR-specific after all” some of the other things I’m asking
>> about might go away.
> 
> I'm sorry this isn't clear, but the use of other tunnelling technologies is 
> very much in scope. As the Introduction says:
> 
>   The various ASes that provide connectivity between the Ingress and
>   Egress Domains could each be constructed differently and use different
>   technologies such as IP, MPLS with global table routing native BGP to
>   the edge, MPLS IP VPN, SR-MPLS IP VPN, or SRv6 IP VPN.
> 
> SR is used to identify the tunnels and provide end-to-end SR paths because 
> the ingress and egress domains are SR domains, and the objective is to 
> provide an end-to-end SR path.
> 
> So we are not "making this SR aware" so much as enabling "SR-over-foo" using 
> SIDs to identify the path segments that are tunnels.
> 
> I don't know how to make this clearer except maybe using some red paint.

That would be exclusionary to the colo(u)r-blind.

> We would write...
> 
>   The various ASes that provide connectivity between the Ingress and
>   Egress Domains could each be constructed differently and use different
>   technologies such as IP, MPLS with global table routing native BGP to
>   the edge, MPLS IP VPN, SR-MPLS IP VPN, or SRv6 IP VPN.  That is, the
>   Ingress and Egress SR Domains can be connected by tunnels across a
>   variety of technologies.  This document describes how SR identifiers
>   (SIDs) are used to identify the paths between the Ingress and Egress
>   and the techniques in this document apply to routes of all AFI/SAFIs.

If you want, you could expand the paragraph as you’ve suggested, but I don’t 
think it’s necessary — now that you’ve pointed out the paragraph, it’s clear 
enough. However, I think the document is still misleading and even inconsistent 
about this. Let me quote some other paragraphs to you.

Section 1:

   Segment Routing (SR) [RFC8402] is a protocol mechanism that can be
   used within a DC, and also for steering traffic that flows between
   two DC sites.  

The “steering traffic that flows between two DC sites” can easily be read as 
meaning, steering it *through* the backbone network. I take it your intent is 
to mean, steering it *over* the backbone network. 

   In order for a source (ingress) DC that uses SR to
   load balance the flows it sends to a destination (egress) DC, it
   needs to know the complete set of entry nodes (i.e., GWs) for that
   egress DC from the backbone network connecting the two DCs.  Note
   that it is assumed that the connected set of DCs and the backbone
   network connecting them are part of the same SR BGP Link State (LS)
   instance ([RFC7752] and [I-D.ietf-idr-bgpls-segment-routing-epe]) so
   that traffic engineering using SR may be used for these flows.

The requirement that the sites *and the backbone network connecting them* must 
all be part of the same BGP-LS instance caused me to raise my eyebrows up into 
my hairline, but there it is in the text. This surprising assumption (most 
service providers do not, to my knowledge, allow their customers to consume 
their LSDB), plus “traffic engineering using SR may be used for these flows”, 
plus the sentence noted above, led me a long way down the garden path of 
thinking you were proposing end-to-end SR forwarding.

And then we have Section 4:

   When a remote GW receives a route to a prefix X it uses the Tunnel
   Egress Endpoint Sub-TLVs in the containing Tunnel Encapsulation
   attribute to identify the GWs through which X can be reached.  It
   uses this information to compute SR Traffic Engineering (SR TE) paths
   *across the backbone network*

(emphasis added). This serves to confirm my misapprehension that this is an 
exclusively SR solution. 

So now on the one hand, I accept that you were completely serious about the 
paragraph you quoted, and that I mentally elided, having been dazzled by the 
parts I just quoted. On the other hand, I wonder what I’m misunderstanding 
about the parts I’ve just quoted, or if I’m not misunderstanding them, how we 
can square this circle.

>> 2. There’s no discussion about what trust model you’re assuming. SR
>> brings with it its own assumed trust model, laid out in RFC 8402 as “SR
>> operates within a trusted domain” (whatever *that* means). On the one
>> hand, given you’re 

Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-17 Thread Alvaro Retana
On May 14, 2021 at 1:04:53 PM, Adrian Farrel wrote:


Hi!

I share some of John's concerns -- quick comment on the first one.


...
> > 1. There’s surprisingly little in this document that seems to be
> > SR-specific (and what there is, has some problems, see below). Is there
> > some reason you rule out interconnecting domains using other tunneling
> > technologies? I ask this question first because if the answer were to be
> > “oh huh, we don’t need to make this SR-specific after all” some of the
> > other things I’m asking about might go away.
>
> I'm sorry this isn't clear, but the use of other tunnelling technologies is
> very much in scope. As the Introduction says:
>
> The various ASes that provide connectivity between the Ingress and Egress
> Domains could each be constructed differently and use different technologies
> such as IP, MPLS with global table routing native BGP to the edge, MPLS IP
> VPN, SR-MPLS IP VPN, or SRv6 IP VPN.
>
> SR is used to identify the tunnels and provide end-to-end SR paths because the
> ingress and egress domains are SR domains, and the objective is to provide an
> end-to-end SR path.
>
> So we are not "making this SR aware" so much as enabling "SR-over-foo" using
> SIDs to identify the path segments that are tunnels.
>
> I don't know how to make this clearer except maybe using some red paint. We
> would write...
>
> The various ASes that provide connectivity between the Ingress and Egress
> Domains could each be constructed differently and use different technologies
> such as IP, MPLS with global table routing native BGP to the edge, MPLS IP
> VPN, SR-MPLS IP VPN, or SRv6 IP VPN. That is, the Ingress and Egress SR
> Domains can be connected by tunnels across a variety of technologies. This
> document describes how SR identifiers (SIDs) are used to identify the paths
> between the Ingress and Egress and the techniques in this document apply to
> routes of all AFI/SAFIs.

From §5: "To achieve this, each Tunnel TLV in the Tunnel Encapsulation
attribute contains a Prefix SID sub-TLV [I-D.ietf-idr-tunnel-encaps]
for X."  But rfc9012 restricts the use of the Prefix-SID sub-TLV:

   [RFC8669] only defines behavior when the BGP Prefix-SID attribute is
   attached to routes of type IPv4/IPv6 Labeled Unicast
   [RFC4760][RFC8277], and it only defines values of the BGP Prefix-SID
   attribute for those cases.  Therefore, similar limitations exist for
   the Prefix-SID sub-TLV: it SHOULD only be included in a BGP UPDATE
   message for one of the address families for which [RFC8669] has a
   defined behavior, namely BGP IPv4/IPv6 Labeled Unicast [RFC4760]
   [RFC8277].  If included in a BGP UPDATE for any other address family,
   it MUST be ignored.

IOW, even though the overall mechanism need not be SR-specific, the
SR solution can't be implemented in a general way (without more
consideration).
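
To make the restriction quoted above concrete, here is a minimal sketch of
how a receiver might apply it.  It is only an illustration under stated
assumptions: the function and constant names are hypothetical and do not
come from the draft or from any BGP implementation.  The AFI/SAFI numbers
are the IANA values for IPv4 (1), IPv6 (2) and labeled unicast (4), and 11
is the Prefix-SID sub-TLV type code quoted from RFC 9012.

```python
# Hypothetical helper names; only the numeric code points are standard.
AFI_IPV4 = 1
AFI_IPV6 = 2
SAFI_LABELED_UNICAST = 4
PREFIX_SID_SUBTLV_TYPE = 11  # RFC 9012, Section 3.7

# Address families for which RFC 8669 defines Prefix-SID behavior.
LABELED_UNICAST_FAMILIES = {
    (AFI_IPV4, SAFI_LABELED_UNICAST),
    (AFI_IPV6, SAFI_LABELED_UNICAST),
}


def prefix_sid_subtlv_is_honored(afi: int, safi: int) -> bool:
    """Return True if a Prefix-SID sub-TLV is acted on for this family."""
    return (afi, safi) in LABELED_UNICAST_FAMILIES


def process_tunnel_subtlvs(afi, safi, subtlvs):
    """Keep every (type, value) sub-TLV except a Prefix-SID sub-TLV that
    arrives on a family where it MUST be ignored."""
    return [
        (t, v)
        for (t, v) in subtlvs
        if t != PREFIX_SID_SUBTLV_TYPE or prefix_sid_subtlv_is_honored(afi, safi)
    ]
```

Under this reading, a route of any family other than IPv4/IPv6 Labeled
Unicast loses its Prefix-SID sub-TLV at the receiver, which is why the "all
AFI/SAFIs" wording needs more consideration.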

Alvaro.



Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-17 Thread John Scudder
Hi Gyan, 

> On May 17, 2021, at 1:50 PM, Gyan Mishra  wrote:
> 
> So if GW2 connection to external was down but GW1 still has its connection to 
> external.  GW2 would auto discover GW1 over iBGP and GW2 would advertise both 
> GW1 and GW2 as reachable gateways.  However GW2 has its external peer down.  
> So if GW1 continues to advertised GW2 as we stated GW1 will auto discover  
> GW2 over iBGP.  

Isn’t this scenario covered? From §3:

   If a gateway becomes disconnected from the backbone network, or if
   the SR domain operator decides to terminate the gateway's activity,
   it withdraws the advertisements described above.  This means that
   remote gateways at other sites will stop seeing advertisements from
   this gateway.

So when GW2’s external peering goes down, GW2 withdraws its auto discovery 
route, and therefore GW1 re-advertises its routes externally without GW2 listed 
in the tunnel attribute.
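
A rough sketch of that sequence, assuming a gateway simply tracks the peers
whose auto-discovery routes it currently holds.  The class and method names
are hypothetical and are not taken from the draft.

```python
class Gateway:
    """Toy model of a site gateway; names are illustrative only."""

    def __init__(self, name):
        self.name = name
        self.discovered = set()  # peers with a live auto-discovery route

    def tunnel_gateways(self):
        """GWs to list in the tunnel attribute: itself plus live peers."""
        return sorted({self.name} | self.discovered)

    def on_autodiscovery_update(self, peer):
        """A peer's auto-discovery route arrived; return the new GW list."""
        self.discovered.add(peer)
        return self.tunnel_gateways()

    def on_autodiscovery_withdraw(self, peer):
        """A peer withdrew its auto-discovery route; return the new list."""
        self.discovered.discard(peer)
        return self.tunnel_gateways()


gw1 = Gateway("GW1")
before = gw1.on_autodiscovery_update("GW2")    # GW2 is up
after = gw1.on_autodiscovery_withdraw("GW2")   # GW2 lost its external peering
```

Here `before` lists both gateways and `after` lists only GW1, which is the
re-advertisement described in the paragraph above.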

I will say that reviewing the above-quoted text — which seems tailor-made for a 
“MUST withdraw” — made me notice that the draft makes only sporadic and 
desultory use of RFC2119 keywords. In fact there are so few used, that it seems 
like it might be better to scrub those two SHOULD and two MUST out and remove 
the 2119 citation.

—John


Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-17 Thread John Scudder
ivity within a domain as I’ve described is a 
broken situation to begin with, but stuff happens, and your spec would make 
matters worse. It might be worth acknowledging this issue somewhere in the 
document?”

I hope this is clearer now.

Thanks,

—John

> Cheers,
> Adrian
>  
> From: John Scudder  
> Sent: 14 May 2021 22:25
> To: Adrian Farrel 
> Cc: The IESG ; draft-ietf-bess-datacenter-gate...@ietf.org; 
> bess-cha...@ietf.org; bess@ietf.org; Matthew Bocci 
> Subject: Re: John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: 
> (with DISCUSS and COMMENT)
>  
> Having re-read Section 3 carefully (and skimmed the rest) I still think what 
> the document says (as opposed to what’s in the authors’ heads?) is the first 
> description I give below. Let me know if you want me to walk through my 
> reasoning in detail with reference to the document. 
>  
> —John
> 
> 
>> On May 14, 2021, at 4:12 PM, John Scudder  wrote:
>> 
>>  Hi Adrian, 
>>  
>> Thanks for your reply. Pressed for time at the moment but one partial 
>> response:
>> 
>> 
>>> On May 14, 2021, at 1:04 PM, Adrian Farrel  wrote:
>>>  
>>> Agree with you that "stuff happens." I think that what you have described 
>>> is a window not a permanent situation.
>>> When GW2 knows it can't reach X any more, it will stop advertising X, and 
>>> GW1 will receive that and will update what it advertises on behalf of GW2.
>>  
>> Ah, perhaps I have badly misunderstood the way this works. I had thought it 
>> went something like this:
>>  
>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>> - GW1 knows the set S of internal prefixes it can reach
>> - GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel 
>> attribute
>>  
>> In the description above, there’s no notion of GW2 telling GW1 what internal 
>> prefixes GW2 can reach, or GW1 caring.  Now I suppose you are telling me 
>> that it goes:
>>  
>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>> - GW1 knows the full set of prefixes GW2 can reach. _How does it know this?_
>> - GW1 constructs each advertisement listing only the correct set of gateways 
>> in the tunnel attribute
>>  
>> The key question is the one I’ve highlighted: how does GW1 come to know 
>> GW2’s internally-reachable prefixes? I didn’t notice any of this in the 
>> spec. Maybe it was just my sloppy reading, I’ll look again.
>> 
>> 
>>> Further, if GW1 can no longer receive advertisements from GW2 then it will 
>>> stop advertising on behalf of GW2.
>>  
>> Yes, that’s understood, but I was positing a case where just because GW1 can 
>> reach GW2 stably, and just because GW1 can reach X stably, it does not imply 
>> GW2 can reach X.
>>  
>> —John



Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-17 Thread Gyan Mishra
Adrian

I am wrapping up the Gen-ART review update.

The normative draft helped tremendously in understanding the problem and
solution.  Please add it to the beginning of the introduction in your next
update.

https://datatracker.ietf.org/doc/html/draft-farrel-spring-sr-domain-interconnect-05#section-9

I have a question related to the partitioning scenario.

Comments in-line

On Sun, May 16, 2021 at 7:25 AM Adrian Farrel  wrote:

> Hi John,
>
>
>
> Trying to dismantle this…
>
>
>
> We are saying that a site is integral. You are asking: what happens if
> a site becomes partitioned so that some prefixes are accessible through one
> GW and some through another?
>
>
>
> Consider a site with a set of prefixes S
>
> Consider two GWs: GW1 and GW2
>
> Initially GW1 and GW2 discover each other.
>
> So GW1 advertises reachability to S, and by the way GW2 exists
>
> GW2 advertises reachability to S, and by the way GW1 exists
>

Gyan> Ack

> Now the site becomes partitioned so that GW1 can reach S1 and GW2 can
> reach S2. (S = S1 ∪ S2, S1 ∩ S2 = ∅)
>
>  Gyan> Ack
>
> You ask:
>
>    1. What happens to packets for S2 arriving at GW1?
>    2. What is the remedy in the protocol?
>
>
>
> My answer to 1. is that the packets will be black-holed either at GW1 or
> inside the site.
>
> My observation is that:
>
>    a. GW1 cannot reach GW2 inside the site. If it could, then S2 would be
>       reachable via GW1
>    b. It is contrary to BCP38 for GW1 to forward a packet back into the
>       external AS to be routed to GW2
>
>  Gyan> How would “b” be possible? Loop avoidance would drop a packet that
> GW1 sends out to the external AS and that loops back in at GW2, because GW2
> is part of the same AS, so the two GWs would not be able to rediscover
> each other via the external AS.
>
> My answer to 2. is that when the site becomes partitioned:
>
>    - GW1 will stop advertising the whole of S and will fall back to
>      advertising just S1
>    - GW2 will stop advertising the whole of S and will fall back to
>      advertising just S2
>    - Initially, GW1 and GW2 will still advertise each other’s existence,
>      but will “soon” un-auto-discover each other
>
> At this point the site is effectively two sites that use the same site
> identifier.
>
>  Gyan> Ack
>
> How quickly this takes place depends on precisely what the failure case
> is, how fast the failure detection is done, and how fast BGP converges.
>
>  Gyan> Ack
>


> *Perhaps* there is a wrinkle *if* the autodetection advertisements are
> sent external to the site. In this case, GW1 would continue to discover GW2
> and so would readvertise it (and vice versa).
>
Gyan> As I stated above, “b” would not be possible, so the continuing
rediscovery would not occur, and GW1 and GW2 would, as you stated,
effectively act as two sites.

> This would continue to lead to the broken condition you noted.
>

Gyan> As stated, due to AS-path loop avoidance the broken condition would
not occur.

> I think we assumed that the peering between GW1 and GW2 would be internal
> to the site (because otherwise it would constitute traffic leaving the site
> and re-entering it, breaking BCP38 again). If it would help, we could make
> this point clear by saying that the peering between GW1 and GW2 must be
> within the site.
>
Gyan> I agree we should add explicit text stating that the peering between
GW1 and GW2 is iBGP within the site, per BCP38, to prevent site
partitioning. This is also required for GW1 and GW2 to be part of the same
AS and to be able to fail over and back each other up in case of a failure.

Gyan> I believe John mentioned this slightly different scenario, but it got
lost in the thread. I tried to answer early on, but I would like to get
your take on the behavior, as this is a critical component for completing my
Gen-ART review.

Suppose GW2's connection to the external network is down, but GW1 still has
its external connection.  GW2 would auto-discover GW1 over iBGP, and GW2
would advertise both GW1 and GW2 as reachable gateways, even though GW2's
external peer is down.  Likewise, as we stated, GW1 will auto-discover GW2
over iBGP, so GW1 continues to advertise GW2.

So now, for any site trying to reach the Data Center AS that GW1 and GW2
are part of, traffic using GW2 to get to S1 and S2 would be black-holed.
How do we remedy this situation?

>
> Cheers,
>
> Adrian
>
>
>
> *From:* John Scudder 
> *Sent:* 14 May 2021 22:25
> *To:* Adrian Farrel 
> *Cc:* The IESG ;
> draft-ietf-bess-datacenter-gate...@ietf.org; bess-cha...@ietf.org;
> bess@ietf.org; Matthew Bocci 
> *Subject:* Re: John Scudder's Discuss on
> draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)
>
>
>
> Having re-read Section 3 carefully (and skimmed the rest) I still think

Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-16 Thread Adrian Farrel
Hi John,

 

Trying to dismantle this…

 

We are saying that a site is integral. You are asking: what happens if a
site becomes partitioned so that some prefixes are accessible through one GW
and some through another?

 

Consider a site with a set of prefixes S

Consider two GWs: GW1 and GW2

Initially GW1 and GW2 discover each other.

So GW1 advertises reachability to S, and by the way GW2 exists

GW2 advertises reachability to S, and by the way GW1 exists

Now the site becomes partitioned so that GW1 can reach S1 and GW2 can reach S2.
(S = S1 ∪ S2, S1 ∩ S2 = ∅)

 

You ask:

1.  What happens to packets for S2 arriving at GW1?
2.  What is the remedy in the protocol?

 

My answer to 1. is that the packets will be black-holed either at GW1 or inside 
the site.

My observation is that:

a.  GW1 cannot reach GW2 inside the site. If it could, then S2 would be 
reachable via GW1
b.  It is contrary to BCP38 for GW1 to forward a packet back into the 
external AS to be routed to GW2

 

My answer to 2. is that when the site becomes partitioned:

*   GW1 will stop advertising the whole of S and will fall back to 
advertising just S1
*   GW2 will stop advertising the whole of S and will fall back to 
advertising just S2
*   Initially, GW1 and GW2 will still advertise each other’s existence, but 
will “soon” un-auto-discover each other

At this point the site is effectively two sites that use the same site 
identifier.
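
The partition argument can be sketched with plain sets.  This is a toy model
only: the prefix values are made up, and `black_holed` is a hypothetical
helper, not something defined in the draft.

```python
# Made-up prefixes standing in for the two halves of the site.
S1 = {"10.0.1.0/24"}
S2 = {"10.0.2.0/24"}
S = S1 | S2            # S = S1 union S2, and S1 and S2 are disjoint


def black_holed(advertised, reachable):
    """Prefixes a gateway still attracts traffic for but cannot deliver."""
    return advertised - reachable


# Window right after the partition: GW1 still advertises all of S
# but can only reach S1.
during_window = black_holed(advertised=S, reachable=S1)

# After convergence: GW1 has fallen back to advertising just S1.
after_convergence = black_holed(advertised=S1, reachable=S1)
```

During the window GW1 black-holes exactly S2, and once it falls back to
advertising S1 the set is empty.  How long the window lasts is not modelled
here.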

 

How quickly this takes place depends on precisely what the failure case is, how 
fast the failure detection is done, and how fast BGP converges.  

 

*Perhaps* there is a wrinkle *if* the autodetection advertisements are sent 
external to the site. In this case, GW1 would continue to discover GW2 and so 
would readvertise it (and vice versa). This would continue to lead to the 
broken condition you noted. I think we assumed that the peering between GW1 and 
GW2 would be internal to the site (because otherwise it would constitute 
traffic leaving the site and re-entering it, breaking BCP38 again). If it would 
help, we could make this point clear by saying that the peering between GW1 and 
GW2 must be within the site.



Cheers,

Adrian

 

From: John Scudder  
Sent: 14 May 2021 22:25
To: Adrian Farrel 
Cc: The IESG ; draft-ietf-bess-datacenter-gate...@ietf.org; 
bess-cha...@ietf.org; bess@ietf.org; Matthew Bocci 
Subject: Re: John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: 
(with DISCUSS and COMMENT)

 

Having re-read Section 3 carefully (and skimmed the rest) I still think what 
the document says (as opposed to what’s in the authors’ heads?) is the first 
description I give below. Let me know if you want me to walk through my 
reasoning in detail with reference to the document. 

 

—John





On May 14, 2021, at 4:12 PM, John Scudder  wrote:

 Hi Adrian, 

 

Thanks for your reply. Pressed for time at the moment but one partial response:





On May 14, 2021, at 1:04 PM, Adrian Farrel <adr...@olddog.co.uk> wrote:

 

Agree with you that "stuff happens." I think that what you have described is a 
window not a permanent situation.
When GW2 knows it can't reach X any more, it will stop advertising X, and GW1 
will receive that and will update what it advertises on behalf of GW2.

 

Ah, perhaps I have badly misunderstood the way this works. I had thought it 
went something like this:

 

- GW1 knows it can reach GW2 because of GW2’s auto discovery route

- GW1 knows the set S of internal prefixes it can reach

- GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel 
attribute

 

In the description above, there’s no notion of GW2 telling GW1 what internal 
prefixes GW2 can reach, or GW1 caring.  Now I suppose you are telling me that 
it goes:

 

- GW1 knows it can reach GW2 because of GW2’s auto discovery route

- GW1 knows the full set of prefixes GW2 can reach. _How does it know this?_

- GW1 constructs each advertisement listing only the correct set of gateways in 
the tunnel attribute

 

The key question is the one I’ve highlighted: how does GW1 come to know GW2’s 
internally-reachable prefixes? I didn’t notice any of this in the spec. Maybe 
it was just my sloppy reading, I’ll look again.
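
The difference between the two readings can be sketched as follows.  Both
functions are hypothetical illustrations, not behavior taken from the draft.
In particular `reach_by_gw` is an assumed input: where that information
would come from is exactly the open question above.

```python
def advertise_first_reading(local_gw, discovered_gws, local_prefixes):
    """Every prefix the local GW can reach is advertised with all
    discovered GWs, regardless of what those GWs can reach."""
    gws = sorted({local_gw} | set(discovered_gws))
    return {prefix: gws for prefix in local_prefixes}


def advertise_second_reading(local_gw, reach_by_gw, local_prefixes):
    """Each prefix lists only the GWs known to reach it.  reach_by_gw maps
    a peer GW to its reachable prefixes (assumed to be known somehow)."""
    adverts = {}
    for prefix in local_prefixes:
        gws = {local_gw}
        for gw, prefixes in reach_by_gw.items():
            if prefix in prefixes:
                gws.add(gw)
        adverts[prefix] = sorted(gws)
    return adverts


# GW1 reaches X; GW2 is discovered but cannot reach X.
first = advertise_first_reading("GW1", ["GW2"], ["X"])
second = advertise_second_reading("GW1", {"GW2": set()}, ["X"])
```

Under the first reading X is advertised with both gateways even though GW2
cannot reach it; under the second reading only GW1 is listed.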





Further, if GW1 can no longer receive advertisements from GW2 then it will stop 
advertising on behalf of GW2.

 

Yes, that’s understood, but I was positing a case where just because GW1 can 
reach GW2 stably, and just because GW1 can reach X stably, it does not imply 
GW2 can reach X.

 

—John



Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-15 Thread Gyan Mishra
The Section 3 text below describes the re-advertisement of the current
set of GWs when a GW is added or removed.

So the black hole John mentioned, caused by a GW being disconnected from
the backbone, should not occur.

   As described in Section 1, each GW will include a Tunnel
   Encapsulation attribute with the GW encapsulation information for
   each of the SR domain's active GWs (including itself) in every route
   advertised externally to that SR domain.  As the current set of
   active GWs changes (due to the addition of a new GW or the failure/
   removal of an existing GW) each externally advertised route will be
   re-advertised with a new Tunnel Encapsulation attribute which
   reflects current set of active GWs.

   If a gateway becomes disconnected from the backbone network, or if
   the SR domain operator decides to terminate the gateway's activity,
   it withdraws the advertisements described above.  This means that
   remote gateways at other sites will stop seeing advertisements from
   this gateway.
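
As a rough illustration of the Section 3 behavior quoted above, the sketch
below models a gateway that re-advertises every externally advertised route
whenever its view of the active GW set changes. The class name SiteGateway,
its fields, and the prefixes are hypothetical and are not taken from the
draft; this is a toy model under those assumptions, not an implementation of
the BGP procedures.

```python
from dataclasses import dataclass, field


@dataclass
class SiteGateway:
    """Toy model of one gateway (GW) in an SR domain; all names are hypothetical."""
    name: str
    active_gws: set = field(default_factory=set)       # GWs learned via auto-discovery
    external_routes: set = field(default_factory=set)  # prefixes advertised outside the domain

    def __post_init__(self):
        # A GW always counts itself among the active GWs.
        self.active_gws.add(self.name)

    def build_advertisements(self):
        """Attach the current active-GW set to every externally advertised route."""
        gws = tuple(sorted(self.active_gws))
        return {prefix: gws for prefix in sorted(self.external_routes)}

    def on_gw_change(self, gw, up):
        """Update the active set; re-advertise all routes only if the set changed."""
        before = set(self.active_gws)
        if up:
            self.active_gws.add(gw)
        else:
            self.active_gws.discard(gw)
        if self.active_gws != before:
            return self.build_advertisements()
        return None  # nothing changed, so nothing is re-advertised


gw1 = SiteGateway("GW1", external_routes={"10.0.1.0/24", "10.0.2.0/24"})
print(gw1.on_gw_change("GW2", up=True))   # both routes now list GW1 and GW2
print(gw1.on_gw_change("GW2", up=False))  # both routes fall back to GW1 only
```

Note that the model lists the same GW set on every route, which is exactly
the property John's question probes: nothing here checks whether each listed
GW can reach each prefix.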


I support this very important draft that provides a critical solution
for remote site GW auto-discovery over a Multi-AS transit SR / MPLS
domain.


For those who have not read the normative reference below, I suggest
reading it, as it helps provide context for the problem statement and
solution.


https://datatracker.ietf.org/doc/html/draft-farrel-spring-sr-domain-interconnect-05



Kind Regards


Gyan


On Sat, May 15, 2021 at 12:35 AM Gyan Mishra  wrote:

>
> Adrian & Authors, please correct me if I have misread the draft.
>
> I did not see the draft state explicitly how the internal DC GW
> routes are advertised; I believe this is done implicitly via standard BGP
> AFI / SAFI route propagation natively over the SR domain.  So, for example,
> if the internal GW routes are SAFI 1 unicast, they are propagated natively,
> global table routed, as SAFI 1 inter-domain from ingress to egress DC GW.
>
> For VPN overlay, SAFI 128 and 129 routes are propagated natively as SAFI 128
> and 129 over the SR core.
>
> So all GW internal route AFI / SAFIs are supported, meaning SAFI 1, 128,
> and 129.
>
> Kind Regards
>
> Gyan
>
> On Fri, May 14, 2021 at 10:45 PM Gyan Mishra 
> wrote:
>
>> Hi Adrian
>>
>> I may have missed this in the draft, but the solution for this failover
>> scenario is for each GW to advertise only itself, which I think is
>> stated in Section 3. Then GW1 can only advertise itself via the tunnel
>> encapsulation attribute and not GW2, as GW2 can only advertise itself when
>> its eBGP tie to the core comes up.  Problem solved.
>>
>> Kind Regards
>>
>> Gyan
>>
>> On Fri, May 14, 2021 at 7:02 PM Gyan Mishra 
>> wrote:
>>
>>>
>>> Hi Adrian
>>>
>>> I believe what John is describing is a valid failure scenario where one
>>> of the GWs is no longer a valid gateway because its eBGP peering to the
>>> core domain is down, while the routing underlay is stable between the GWs
>>> within the DC site. We are assuming the GWs at the site run an IGP to
>>> advertise next hop attribute and maybe also have next hop self configured
>>> between them for iBGP.  So the GW that still has its eBGP peering to the
>>> core is able to send the auto discovery loopback prefix tunnel sub TLV for
>>> both GWs. That’s the problem.
>>>
>>> So that would cause the black hole of traffic between sites for the GW
>>> that has its eBGP link to the core down.
>>>
>>> The other question asked was about the set of internal prefixes: how are
>>> they advertised, and is that just over the native AFI SAFI iBGP peering
>>> between the GWs?
>>>
>>> So GW1 that is up advertises via iBGP the set of internal prefixes
>>> learned from the other domain.
>>>
>>> Kind Regards
>>>
>>> Gyan
>>>
>>> On Fri, May 14, 2021 at 5:25 PM John Scudder wrote:
>>>
>>>> Having re-read Section 3 carefully (and skimmed the rest) I still think
>>>> what the document says (as opposed to what’s in the authors’ heads?) is the
>>>> first description I give below. Let me know if you want me to walk through
>>>> my reasoning in detail with reference to the document.
>>>>
>>>> —John
>>>>
>>>> On May 14, 2021, at 4:12 PM, John Scudder wrote:
>>>>
>>>> Hi Adrian,
>>>>
>>>> Thanks for your reply. Pressed for time at the moment but one partial
>>>> response:
>>>>
>>>> On May 14, 2021, at 1:04 PM, Adrian Farrel wrote:
>>>>
>>>> Agree with you that "stuff happens." I think that what you have
>>>> described is a window not a permanent situation.
>>>> When GW2 knows it can't reach X any more, it will stop advertising X,
>>>> and GW1 will receive that and will update what it advertises on behalf of
>>>> GW2.
>>>>
>>>> Ah, perhaps I have badly misunderstood the way this works. I had
>>>> thought it went something like this:
>>>>
>>>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>>>> - GW1 knows the set S of internal prefixes it can reach
>>>> - GW1 advertises

Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-14 Thread Gyan Mishra
Adrian & Authors, please correct me if I have misread the draft.

I did not see the draft state explicitly how the internal DC GW routes
are advertised; I believe this is done implicitly via standard BGP AFI / SAFI
route propagation natively over the SR domain.  So, for example, if the
internal GW routes are SAFI 1 unicast, they are propagated natively, global
table routed, as SAFI 1 inter-domain from ingress to egress DC GW.

For VPN overlay, SAFI 128 and 129 routes are propagated natively as SAFI 128
and 129 over the SR core.

So all GW internal route AFI / SAFIs are supported, meaning SAFI 1, 128,
and 129.

Kind Regards

Gyan

On Fri, May 14, 2021 at 10:45 PM Gyan Mishra  wrote:

> Hi Adrian
>
> I may have missed this in the draft, but the solution for this failover
> scenario is for each GW to advertise only itself, which I think is
> stated in Section 3. Then GW1 can only advertise itself via the tunnel
> encapsulation attribute and not GW2, as GW2 can only advertise itself when
> its eBGP tie to the core comes up.  Problem solved.
>
> Kind Regards
>
> Gyan
>
> On Fri, May 14, 2021 at 7:02 PM Gyan Mishra  wrote:
>
>>
>> Hi Adrian
>>
>> I believe what John is describing is a valid failure scenario where one
>> of the GWs is no longer a valid gateway because its eBGP peering to the
>> core domain is down, while the routing underlay is stable between the GWs
>> within the DC site. We are assuming the GWs at the site run an IGP to
>> advertise next hop attribute and maybe also have next hop self configured
>> between them for iBGP.  So the GW that still has its eBGP peering to the
>> core is able to send the auto discovery loopback prefix tunnel sub TLV for
>> both GWs. That’s the problem.
>>
>> So that would cause the black hole of traffic between sites for the GW
>> that has its eBGP link to the core down.
>>
>> The other question asked was about the set of internal prefixes: how are
>> they advertised, and is that just over the native AFI SAFI iBGP peering
>> between the GWs?
>>
>> So GW1 that is up advertises via iBGP the set of internal prefixes
>> learned from the other domain.
>>
>> Kind Regards
>>
>> Gyan
>>
>> On Fri, May 14, 2021 at 5:25 PM John Scudder wrote:
>>
>>> Having re-read Section 3 carefully (and skimmed the rest) I still think
>>> what the document says (as opposed to what’s in the authors’ heads?) is the
>>> first description I give below. Let me know if you want me to walk through
>>> my reasoning in detail with reference to the document.
>>>
>>> —John
>>>
>>> On May 14, 2021, at 4:12 PM, John Scudder  wrote:
>>>
>>>  Hi Adrian,
>>>
>>>
>>> Thanks for your reply. Pressed for time at the moment but one partial
>>> response:
>>>
>>> On May 14, 2021, at 1:04 PM, Adrian Farrel  wrote:
>>>
>>> Agree with you that "stuff happens." I think that what you have
>>> described is a window not a permanent situation.
>>> When GW2 knows it can't reach X any more, it will stop advertising X,
>>> and GW1 will receive that and will update what it advertises on behalf of
>>> GW2.
>>>
>>>
>>> Ah, perhaps I have badly misunderstood the way this works. I had thought
>>> it went something like this:
>>>
>>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>>> - GW1 knows the set S of internal prefixes it can reach
>>> - GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel
>>> attribute
>>>
>>> In the description above, there’s no notion of GW2 telling GW1 what
>>> internal prefixes GW2 can reach, or GW1 caring.  Now I suppose you are
>>> telling me that it goes:
>>>
>>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>>> - GW1 knows the full set of prefixes GW2 can reach. _How does it know
>>> this?_
>>> - GW1 constructs each advertisement listing only the correct set of
>>> gateways in the tunnel attribute
>>>
>>> The key question is the one I’ve highlighted: how does GW1 come to know
>>> GW2’s internally-reachable prefixes? I didn’t notice any of this in the
>>> spec. Maybe it was just my sloppy reading, I’ll look again.
>>>
>>> Further, if GW1 can no longer receive advertisements from GW2 then it
>>> will stop advertising on behalf of GW2.
>>>
>>>
>>> Yes, that’s understood, but I was positing a case where just because GW1
>>> can reach GW2 stably, and just because GW1 can reach X stably, it does not
>>> imply GW2 can reach X.
>>>
>>> —John
>>>



*Gyan Mishra*

*Network Solutions Architect*

*Email gyan.s.mis...@verizon.com *



*M 301 502-1347*

Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-14 Thread Gyan Mishra
Hi Adrian

I may have missed this in the draft, but the solution for this failover
scenario is for each GW to advertise only itself, which I think is
stated in Section 3. Then GW1 can only advertise itself via the tunnel
encapsulation attribute and not GW2, as GW2 can only advertise itself when
its eBGP tie to the core comes up.  Problem solved.

Kind Regards

Gyan

On Fri, May 14, 2021 at 7:02 PM Gyan Mishra  wrote:

>
> Hi Adrian
>
> I believe what John is describing is a valid failure scenario where one of
> the GWs is no longer a valid gateway because its eBGP peering to the core
> domain is down, while the routing underlay is stable between the GWs
> within the DC site. We are assuming the GWs at the site run an IGP to
> advertise next hop attribute and maybe also have next hop self configured
> between them for iBGP.  So the GW that still has its eBGP peering to the
> core is able to send the auto discovery loopback prefix tunnel sub TLV for
> both GWs. That’s the problem.
>
> So that would cause the black hole of traffic between sites for the GW
> that has its eBGP link to the core down.
>
> The other question asked was about the set of internal prefixes: how are
> they advertised, and is that just over the native AFI SAFI iBGP peering
> between the GWs?
>
> So GW1 that is up advertises via iBGP the set of internal prefixes learned
> from the other domain.
>
> Kind Regards
>
> Gyan
>
> On Fri, May 14, 2021 at 5:25 PM John Scudder wrote:
>
>> Having re-read Section 3 carefully (and skimmed the rest) I still think
>> what the document says (as opposed to what’s in the authors’ heads?) is the
>> first description I give below. Let me know if you want me to walk through
>> my reasoning in detail with reference to the document.
>>
>> —John
>>
>> On May 14, 2021, at 4:12 PM, John Scudder  wrote:
>>
>>  Hi Adrian,
>>
>>
>> Thanks for your reply. Pressed for time at the moment but one partial
>> response:
>>
>> On May 14, 2021, at 1:04 PM, Adrian Farrel  wrote:
>>
>> Agree with you that "stuff happens." I think that what you have described
>> is a window not a permanent situation.
>> When GW2 knows it can't reach X any more, it will stop advertising X, and
>> GW1 will receive that and will update what it advertises on behalf of GW2.
>>
>>
>> Ah, perhaps I have badly misunderstood the way this works. I had thought
>> it went something like this:
>>
>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>> - GW1 knows the set S of internal prefixes it can reach
>> - GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel
>> attribute
>>
>> In the description above, there’s no notion of GW2 telling GW1 what
>> internal prefixes GW2 can reach, or GW1 caring.  Now I suppose you are
>> telling me that it goes:
>>
>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>> - GW1 knows the full set of prefixes GW2 can reach. _How does it know
>> this?_
>> - GW1 constructs each advertisement listing only the correct set of
>> gateways in the tunnel attribute
>>
>> The key question is the one I’ve highlighted: how does GW1 come to know
>> GW2’s internally-reachable prefixes? I didn’t notice any of this in the
>> spec. Maybe it was just my sloppy reading, I’ll look again.
>>
>> Further, if GW1 can no longer receive advertisements from GW2 then it
>> will stop advertising on behalf of GW2.
>>
>>
>> Yes, that’s understood, but I was positing a case where just because GW1
>> can reach GW2 stably, and just because GW1 can reach X stably, it does not
>> imply GW2 can reach X.
>>
>> —John
>>


Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-14 Thread Gyan Mishra
Hi Adrian

I believe what John is describing is a valid failure scenario where one of
the GWs is no longer a valid gateway because its eBGP peering to the core
domain is down, while the routing underlay is stable between the GWs
within the DC site. We are assuming the GWs at the site run an IGP to
advertise next hop attribute and maybe also have next hop self configured
between them for iBGP.  So the GW that still has its eBGP peering to the
core is able to send the auto discovery loopback prefix tunnel sub TLV for
both GWs. That’s the problem.

So that would cause the black hole of traffic between sites for the GW that
has its eBGP link to the core down.
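
To make the failure case concrete, here is a minimal sketch contrasting two
ways a GW might decide which site GWs to list. The function names and the
igp_reachable and ebgp_up maps are hypothetical; in particular, ebgp_up
stands in for whatever signal tells one GW that another has lost its core
peering, which is an assumption and not something the draft is quoted as
specifying here.

```python
def gws_listed_by_igp(discovered_gws, igp_reachable):
    """List every discovered GW that is reachable inside the site (the problem case)."""
    return sorted(g for g in discovered_gws if igp_reachable.get(g, False))


def gws_listed_by_core_state(discovered_gws, igp_reachable, ebgp_up):
    """List only GWs that are reachable inside the site AND still peer with the core."""
    return sorted(
        g for g in discovered_gws
        if igp_reachable.get(g, False) and ebgp_up.get(g, False)
    )


discovered = ["GW1", "GW2"]
igp_reachable = {"GW1": True, "GW2": True}  # site underlay is stable
ebgp_up = {"GW1": True, "GW2": False}       # GW2 has lost its eBGP peering to the core

print(gws_listed_by_igp(discovered, igp_reachable))                  # GW2 is still listed
print(gws_listed_by_core_state(discovered, igp_reachable, ebgp_up))  # GW2 is dropped
```

With the first function, remote sites would still see GW2 as a usable tunnel
endpoint even though traffic sent toward it across the core cannot arrive.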

The other question asked was about the set of internal prefixes: how are
they advertised, and is that just over the native AFI SAFI iBGP peering
between the GWs?

So GW1 that is up advertises via iBGP the set of internal prefixes learned
from the other domain.

Kind Regards

Gyan

On Fri, May 14, 2021 at 5:25 PM John Scudder  wrote:

> Having re-read Section 3 carefully (and skimmed the rest) I still think
> what the document says (as opposed to what’s in the authors’ heads?) is the
> first description I give below. Let me know if you want me to walk through
> my reasoning in detail with reference to the document.
>
> —John
>
> On May 14, 2021, at 4:12 PM, John Scudder  wrote:
>
>  Hi Adrian,
>
>
> Thanks for your reply. Pressed for time at the moment but one partial
> response:
>
> On May 14, 2021, at 1:04 PM, Adrian Farrel  wrote:
>
> Agree with you that "stuff happens." I think that what you have described
> is a window not a permanent situation.
> When GW2 knows it can't reach X any more, it will stop advertising X, and
> GW1 will receive that and will update what it advertises on behalf of GW2.
>
>
> Ah, perhaps I have badly misunderstood the way this works. I had thought
> it went something like this:
>
> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
> - GW1 knows the set S of internal prefixes it can reach
> - GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel
> attribute
>
> In the description above, there’s no notion of GW2 telling GW1 what
> internal prefixes GW2 can reach, or GW1 caring.  Now I suppose you are
> telling me that it goes:
>
> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
> - GW1 knows the full set of prefixes GW2 can reach. _How does it know
> this?_
> - GW1 constructs each advertisement listing only the correct set of
> gateways in the tunnel attribute
>
> The key question is the one I’ve highlighted: how does GW1 come to know
> GW2’s internally-reachable prefixes? I didn’t notice any of this in the
> spec. Maybe it was just my sloppy reading, I’ll look again.
>
> Further, if GW1 can no longer receive advertisements from GW2 then it will
> stop advertising on behalf of GW2.
>
>
> Yes, that’s understood, but I was positing a case where just because GW1
> can reach GW2 stably, and just because GW1 can reach X stably, it does not
> imply GW2 can reach X.
>
> —John
>


Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-14 Thread John Scudder
Having re-read Section 3 carefully (and skimmed the rest) I still think what 
the document says (as opposed to what’s in the authors’ heads?) is the first 
description I give below. Let me know if you want me to walk through my 
reasoning in detail with reference to the document.

—John

On May 14, 2021, at 4:12 PM, John Scudder  wrote:

 Hi Adrian,

Thanks for your reply. Pressed for time at the moment but one partial response:

On May 14, 2021, at 1:04 PM, Adrian Farrel wrote:

Agree with you that "stuff happens." I think that what you have described is a 
window not a permanent situation.
When GW2 knows it can't reach X any more, it will stop advertising X, and GW1 
will receive that and will update what it advertises on behalf of GW2.

Ah, perhaps I have badly misunderstood the way this works. I had thought it 
went something like this:

- GW1 knows it can reach GW2 because of GW2’s auto discovery route
- GW1 knows the set S of internal prefixes it can reach
- GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel 
attribute

In the description above, there’s no notion of GW2 telling GW1 what internal 
prefixes GW2 can reach, or GW1 caring.  Now I suppose you are telling me that 
it goes:

- GW1 knows it can reach GW2 because of GW2’s auto discovery route
- GW1 knows the full set of prefixes GW2 can reach. _How does it know this?_
- GW1 constructs each advertisement listing only the correct set of gateways in 
the tunnel attribute

The key question is the one I’ve highlighted: how does GW1 come to know GW2’s 
internally-reachable prefixes? I didn’t notice any of this in the spec. Maybe 
it was just my sloppy reading, I’ll look again.

Further, if GW1 can no longer receive advertisements from GW2 then it will stop 
advertising on behalf of GW2.

Yes, that’s understood, but I was positing a case where just because GW1 can 
reach GW2 stably, and just because GW1 can reach X stably, it does not imply 
GW2 can reach X.

—John


Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-14 Thread Robert Raszuk
Hi John,

The way I understand this is intended to work in practice is simply to
create an IBGP session between GW1 and GW2.

If we have this IBGP session then there are two cases:

* we receive a route to X from the peer GW, so we know the peer GW can reach
X; hence it is safe to advertise X with both GWs as NHs
* we send a route to X to the peer GW, so we know that the peer GW can reach
X (at least via the sender's GW); hence it is again OK to advertise X with
both GWs as NHs

Seems like this logic can solve your question ...
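
The two cases above can be sketched as follows. This is only an illustration
of the reasoning in this message: the function names and the sets
received_from_peer and sent_to_peer are hypothetical stand-ins for the routes
exchanged over the IBGP session, not anything defined in the draft.

```python
def peer_can_reach(prefix, received_from_peer, sent_to_peer):
    """Apply the two cases: the peer advertised the prefix to us, or we advertised it to the peer."""
    return prefix in received_from_peer or prefix in sent_to_peer


def next_hops_for(prefix, local_gw, peer_gw, received_from_peer, sent_to_peer):
    """Return the GWs to list for a prefix: always the local GW, plus the peer when a case applies."""
    gws = [local_gw]
    if peer_can_reach(prefix, received_from_peer, sent_to_peer):
        gws.append(peer_gw)
    return gws


received = {"X"}  # routes GW1 received from GW2 over IBGP
sent = {"Y"}      # routes GW1 sent to GW2 over IBGP

for prefix in ("X", "Y", "Z"):
    print(prefix, next_hops_for(prefix, "GW1", "GW2", received, sent))
```

In the second case the peer reaches the prefix only through the sending GW,
so listing both GWs is safe for reachability but relies on the intra-site
path between them.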

But good catch :)

Cheers,
Robert





On Fri, May 14, 2021 at 10:12 PM John Scudder  wrote:

> Hi Adrian,
>
> Thanks for your reply. Pressed for time at the moment but one partial
> response:
>
> On May 14, 2021, at 1:04 PM, Adrian Farrel  wrote:
>
> Agree with you that "stuff happens." I think that what you have described
> is a window not a permanent situation.
> When GW2 knows it can't reach X any more, it will stop advertising X, and
> GW1 will receive that and will update what it advertises on behalf of GW2.
>
>
> Ah, perhaps I have badly misunderstood the way this works. I had thought
> it went something like this:
>
> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
> - GW1 knows the set S of internal prefixes it can reach
> - GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel
> attribute
>
> In the description above, there’s no notion of GW2 telling GW1 what
> internal prefixes GW2 can reach, or GW1 caring.  Now I suppose you are
> telling me that it goes:
>
> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
> - GW1 knows the full set of prefixes GW2 can reach. _How does it know
> this?_
> - GW1 constructs each advertisement listing only the correct set of
> gateways in the tunnel attribute
>
> The key question is the one I’ve highlighted: how does GW1 come to know
> GW2’s internally-reachable prefixes? I didn’t notice any of this in the
> spec. Maybe it was just my sloppy reading, I’ll look again.
>
> Further, if GW1 can no longer receive advertisements from GW2 then it will
> stop advertising on behalf of GW2.
>
>
> Yes, that’s understood, but I was positing a case where just because GW1
> can reach GW2 stably, and just because GW1 can reach X stably, it does not
> imply GW2 can reach X.
>
> —John


Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-14 Thread John Scudder
Hi Adrian,

Thanks for your reply. Pressed for time at the moment but one partial response:

On May 14, 2021, at 1:04 PM, Adrian Farrel wrote:

Agree with you that "stuff happens." I think that what you have described is a 
window not a permanent situation.
When GW2 knows it can't reach X any more, it will stop advertising X, and GW1 
will receive that and will update what it advertises on behalf of GW2.

Ah, perhaps I have badly misunderstood the way this works. I had thought it 
went something like this:

- GW1 knows it can reach GW2 because of GW2’s auto discovery route
- GW1 knows the set S of internal prefixes it can reach
- GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel 
attribute

In the description above, there’s no notion of GW2 telling GW1 what internal 
prefixes GW2 can reach, or GW1 caring.  Now I suppose you are telling me that 
it goes:

- GW1 knows it can reach GW2 because of GW2’s auto discovery route
- GW1 knows the full set of prefixes GW2 can reach. _How does it know this?_
- GW1 constructs each advertisement listing only the correct set of gateways in 
the tunnel attribute

The key question is the one I’ve highlighted: how does GW1 come to know GW2’s 
internally-reachable prefixes? I didn’t notice any of this in the spec. Maybe 
it was just my sloppy reading, I’ll look again.

Further, if GW1 can no longer receive advertisements from GW2 then it will stop 
advertising on behalf of GW2.

Yes, that’s understood, but I was positing a case where just because GW1 can 
reach GW2 stably, and just because GW1 can reach X stably, it does not imply 
GW2 can reach X.
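
The difference between the two readings can be shown with a small sketch.
Everything here is hypothetical (function names, the reachable map, the
prefixes X and Y). The second function simply assumes GW1 somehow knows what
GW2 can reach, which is the very point in question above.

```python
def advertise_first_reading(local_gw, discovered_gws, reachable):
    """First reading: list every discovered GW on every prefix the local GW can reach."""
    gws = sorted({local_gw} | set(discovered_gws))
    return {prefix: gws for prefix in sorted(reachable[local_gw])}


def advertise_second_reading(local_gw, discovered_gws, reachable):
    """Second reading: list only GWs known to reach each prefix (assumes that knowledge exists)."""
    candidates = {local_gw} | set(discovered_gws)
    return {
        prefix: sorted(g for g in candidates if prefix in reachable.get(g, set()))
        for prefix in sorted(reachable[local_gw])
    }


def black_holes(adverts, reachable):
    """Return (prefix, gw) pairs where a listed GW cannot actually reach the prefix."""
    return [
        (prefix, gw)
        for prefix, gws in adverts.items()
        for gw in gws
        if prefix not in reachable.get(gw, set())
    ]


# GW1 reaches X and Y; GW2 reaches only Y, yet GW1 can reach GW2 stably.
reachable = {"GW1": {"X", "Y"}, "GW2": {"Y"}}

first = advertise_first_reading("GW1", ["GW2"], reachable)
second = advertise_second_reading("GW1", ["GW2"], reachable)
print(black_holes(first, reachable))   # GW2 is listed for X but cannot reach it
print(black_holes(second, reachable))  # no mismatches
```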

—John


Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-14 Thread Adrian Farrel
Hi John,

Thanks for the careful review.

> DISCUSS:
>
> I have several points I’d like to discuss, listed below from most
> general to most specific.
>
> 1. There’s surprisingly little in this document that seems to be SR-specific
> (and what there is, has some problems, see below). Is there some reason you
> rule out interconnecting domains using other tunneling technologies? I ask this
> question first because if the answer were to be “oh huh, we don’t need to make
> this SR-specific after all” some of the other things I’m asking about might go
> away.

I'm sorry this isn't clear, but the use of other tunnelling technologies is 
very much in scope. As the Introduction says:

   The
   various ASes that provide connectivity between the Ingress and Egress
   Domains could each be constructed differently and use different
   technologies such as IP, MPLS with global table routing native BGP to
   the edge, MPLS IP VPN, SR-MPLS IP VPN, or SRv6 IP VPN.

SR is used to identify the tunnels and provide end-to-end SR paths because the 
ingress and egress domains are SR domains, and the objective is to provide an 
end-to-end SR path.

So we are not "making this SR aware" so much as enabling "SR-over-foo" using 
SIDs to identify the path segments that are tunnels.

I don't know how to make this clearer except maybe using some red paint. We 
would write...

   The
   various ASes that provide connectivity between the Ingress and Egress
   Domains could each be constructed differently and use different
   technologies such as IP, MPLS with global table routing native BGP to
   the edge, MPLS IP VPN, SR-MPLS IP VPN, or SRv6 IP VPN.  That is, the
   Ingress and Egress SR Domains can be connected by tunnels across a
   variety of technologies.  This document describes how SR identifiers
   (SIDs) are used to identify the paths between the Ingress and Egress
   and the techniques in this document apply to routes of all AFI/SAFIs.

> 2. There’s no discussion about what trust model you’re assuming. SR
> brings with it its own assumed trust model, laid out in RFC 8402 as “SR
> operates within a trusted domain” (whatever *that* means). On the one
> hand, given you’re tying yourself to SR you presumably are tied to its trust
> model. On the other hand, there are some tantalizing tidbits that suggest
> otherwise. I would be happier if there were some explicit description of
> the trust model you’re presuming. It’s hard to evaluate some aspects of
> the document without knowing if you’re assuming the RFC 8402 closed
> domain model, or something else.

I believe that the term "SR domain" in 8402 is basically defined as "a set of 
nodes that support SR". 
The description in the (ever-so-skimpy) Section 8 of 8402 says:

   By default, SR operates within a trusted domain.  Traffic MUST be
   filtered at the domain boundaries.

What does "by default" mean in that context? 

I think there are two things to think about:

1. Forwarding plane trust model. Can packets get into the SR system? The answer 
to that remains, "No, because traffic MUST be filtered at the domain 
boundaries." This requires that the domain boundary is the interface between an 
SR-capable node, and a non-SR node. In this document all GWs and ASBRs are part 
of the SR domain connected by tunnels across the transit ASes (although the 
nodes in the transit ASes are not part of that domain). I dare say that 
draft-farrel-spring-sr-domain-interconnect explains this better through 
examples, but the chairs of SPRING told us that that draft had no chance of 
progressing.

2. Control plane trust model. What is the trust model in the BGP system? I'm 
pretty sure that Section 8 of our draft is adequate for this discussion.

> 3. The use of the term “SR domain” in this document appears inconsistent with
> its definition in RFC 8402. Here’s that definition, from §2:
>
>   Segment Routing domain (SR domain): the set of nodes participating in
>   the source-based routing model.  These nodes may be connected to the
>   same physical infrastructure (e.g., a Service Provider's network).
>   They may as well be remotely connected to each other (e.g., an
>   enterprise VPN or an overlay).  If multiple protocol instances are
>   deployed, the SR domain most commonly includes all of the protocol
>   instances in a network.  However, some deployments may wish to
>   subdivide the network into multiple SR domains, each of which
>   includes one or more protocol instances.  It is expected that all
>   nodes in an SR domain are managed by the same administrative entity.
>
> And notably, later in 8402 §8 we are told that
>
>   By default, SR operates within a trusted domain.  Traffic MUST be
>   filtered at the domain boundaries.
>
> Which specifically means, to take the MPLS instantiation of SR (§8.1):
>
>   SR domain boundary routers MUST filter any external traffic destined
>   to a label associated with a segment within the trusted domain.  This
>   includes labels within the SRGB 

Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

2021-05-14 Thread Adrian Farrel
Hi John,

I'm currently constructing a reply to your points. Extensive review deserves 
extensive answers. May take another day or two.

Cheers,
Adrian

-Original Message-
From: John Scudder via Datatracker  
Sent: 13 May 2021 22:41
To: The IESG 
Cc: draft-ietf-bess-datacenter-gate...@ietf.org; bess-cha...@ietf.org; 
bess@ietf.org; Matthew Bocci ; matthew.bo...@nokia.com
Subject: John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with 
DISCUSS and COMMENT)

John Scudder has entered the following ballot position for
draft-ietf-bess-datacenter-gateway-10: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-bess-datacenter-gateway/



--
DISCUSS:
--

I have several points I’d like to discuss, listed below from most general to
most specific.

1. There’s surprisingly little in this document that seems to be SR-specific
(and what there is, has some problems, see below). Is there some reason you
rule out interconnecting domains using other tunneling technologies? I ask this
question first because if the answer were to be “oh huh, we don’t need to make
this SR-specific after all” some of the other things I’m asking about might go
away.

2. There’s no discussion about what trust model you’re assuming. SR brings with
it its own assumed trust model, laid out in RFC 8402 as “SR operates within a
trusted domain” (whatever *that* means). On the one hand, given you’re tying
yourself to SR you presumably are tied to its trust model. On the other hand,
there are some tantalizing tidbits that suggest otherwise. I would be happier
if there were some explicit description of the trust model you’re presuming.
It’s hard to evaluate some aspects of the document without knowing if you’re
assuming the RFC 8402 closed domain model, or something else.

3. The use of the term “SR domain” in this document appears inconsistent with
its definition in RFC 8402. Here’s that definition, from §2:

   Segment Routing domain (SR domain): the set of nodes participating in
   the source-based routing model.  These nodes may be connected to the
   same physical infrastructure (e.g., a Service Provider's network).
   They may as well be remotely connected to each other (e.g., an
   enterprise VPN or an overlay).  If multiple protocol instances are
   deployed, the SR domain most commonly includes all of the protocol
   instances in a network.  However, some deployments may wish to
   subdivide the network into multiple SR domains, each of which
   includes one or more protocol instances.  It is expected that all
   nodes in an SR domain are managed by the same administrative entity.

And notably, later in 8402 §8 we are told that

   By default, SR operates within a trusted domain.  Traffic MUST be
   filtered at the domain boundaries.

Which specifically means, to take the MPLS instantiation of SR (§8.1):

   SR domain boundary routers MUST filter any external traffic destined
   to a label associated with a segment within the trusted domain.  This
   includes labels within the SRGB of the trusted domain, labels within
   the SRLB of the specific boundary router, and labels outside either
   of these blocks.  External traffic is any traffic received from an
   interface connected to a node outside the domain of trust.
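
To make the §8.1 rule concrete, here is a minimal sketch in Python of the
check a boundary router would apply. It is an illustration only, not text
from RFC 8402 or the draft: the names LabelRange and must_filter are
hypothetical, the SRGB/SRLB ranges and the extra label are assumed example
values, and only the top label of the stack is examined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelRange:
    """Inclusive MPLS label block (hypothetical helper for this sketch)."""
    first: int
    last: int

    def __contains__(self, label: int) -> bool:
        return self.first <= label <= self.last


# Assumed example values, not taken from RFC 8402 or the draft.
SRGB = LabelRange(16000, 23999)   # SR Global Block of the trusted domain
SRLB = LabelRange(15000, 15999)   # SR Local Block of this boundary router
OTHER_SEGMENT_LABELS = {24001}    # segment labels outside both blocks


def must_filter(top_label: int, from_external_interface: bool) -> bool:
    """Return True if the boundary router has to drop the packet.

    Traffic from inside the domain of trust is never filtered by this rule.
    External traffic is filtered when its top label is associated with a
    segment of the trusted domain, wherever that label happens to sit.
    """
    if not from_external_interface:
        return False
    return (
        top_label in SRGB
        or top_label in SRLB
        or top_label in OTHER_SEGMENT_LABELS
    )
```

Under this reading, a labeled packet that reaches a gateway from a different
SR domain is dropped whenever its top label maps to a local segment.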

More simply put, RFC 8402 says you can’t send an SR packet from outside an SR
domain into that domain. But your document is written in terms of a
multiplicity of SR domains, for example this in Section 1:

   Tunnel Encapsulation attribute.  The gateway in the ingress SR domain
   can now see all possible paths to X in the egress SR domain

Maybe a quick fix, assuming you really do subscribe to the RFC 8402 trust
model, is to invent, define, and use the term “SR subdomain” and deem all the
subdomains to comprise one SR domain, in the sense of RFC 8402 §2 — “They may
as well be remotely connected to each other (e.g., an enterprise VPN or an
overlay)” seems to describe your situation pretty well.


--
COMMENT:
--

1. Abstract

   This document defines a mechanism using the BGP Tunnel Encapsulation
   attribute to allow each gateway router to advertise the routes to the
   prefixes in the Segment Routing domains to which it provides access,
   and also to advertise on behalf of each other gateway to the same
   Segment Routing domain.

The last clause has no object. To