Re: [Lsr] Prefix Unreachable Announcement Use Cases

2020-11-14 Thread 王爱俊

Hi, Acee:
I think Robert has given a good explanation of the purpose of this draft.
The aim of this draft is to improve service convergence time by quickly 
signalling the failure of an underlying network link or node.
BFD is another possible solution, but it requires massive configuration and 
has other costs, as Robert pointed out.


With PUA, the failure information of a link or node will be advertised 
automatically and quickly. It also keeps the summary behaviour on ABRs to limit 
the number of reachable prefix advertisements. 


More detailed responses are inline below.

Thanks in advance.
Aijun Wang
China Telecom



From: Robert Raszuk 
Date: 2020-11-15 06:30:20
To: "Acee Lindem (acee)" 
Cc: "lsr@ietf.org" 
Subject: Re: [Lsr] Prefix Unreachable Announcement Use Cases
Hi Acee,

> 3.1 Inter-Area Node Failure Scenario – With respect to this use case, the 
> node in question is actually unreachable. In this case, the ABRs will 
> normally install a reject route for the advertised summary and will send an 
> ICMP unreachable when the packets are received for the unreachable prefix. 


And what will the network do with such ICMP unreachable ? Is there some draft I 
missed where encapsulating PE will choose a path with different tunnel endpoint 
upon reception of ICMP unreachable message ? 


See the entire idea behind this draft is to trigger faster switchover to other 
PEs in the case of a multihomed attached site. 


Option 1 - withdraw a service route in BGP. Use aggregate withdraw to speed 
this up (say withdraw just RDs) 


Option 2 - signal next hop unreachability (aka negative route) of a more 
specific prefix than the aggregate itself. 


While I think just option 1 is ok for the vast majority of services your answer 
seems to be talking about ICMP unreachable which IMO would not help much with 
the issue. The proposal is not about failing ... the proposal is about faster 
connectivity restoration. 


> If faster detection is required, BFD or other mechanisms are available.  



Now running a full mesh of BFD multihop sessions from each PE to each other PE 
may be ok in theory but rather no-op in practice. Just think 1000 PEs with 100 
ms BFD timers in a full mesh of BFD sessions. Then rethink the same with  BFD 
packets maxed to 1500 or 4K bytes packets as per some proposals floating 
around. 


If we want to move that way I would rather suggest we define a local BFD anchor 
explorers (one per area) which will probe all "interesting" next hops in a 
given area. Upon failure it would signal to those remote PEs which indicated 
interest in such tracking the event of failure. 


Again using BFD for this in any form or shape needs to be weighed for 
cost/benefit against option 1 and option 2 above. 


Thx,
Robert. 


PS . Now option 1 can easily be sub second if BFD is enabled on IBGP sessions 
between RRs and PEs. However I think there were some concerns expressed in the 
past by a vendor for this type of deployment of BFD between loopbacks. Maybe it 
would be beneficial for this discussion to better understand this concern. If 
valid I think the option 2 which IMO is the objective of this draft does 
present a valid problem to be solved. Today practically speaking networks flood 
in IGPs globally 1000s of /32 prefixes instead of summarizing them as this is 
the only way they can signal liveness of the remote PEs. 








On Sat, Nov 14, 2020 at 10:34 PM Acee Lindem (acee) wrote:

Speaking as WG member…
 
With respect to the use cases in section 3:
 
  3.1 Inter-Area Node Failure Scenario – With respect to this use case, the 
node in question is actually unreachable. In this case, the ABRs will normally 
install a reject route for the advertised summary and will send an ICMP 
unreachable when the packets are received for the unreachable prefix. This is 
the expected behavior and there really isn’t that much of advantage to move the 
point of unreachable detection a couple hops closer. If faster detection is 
required, BFD or other mechanisms are available.
[WAJ] Please see the explanations above from Robert.
 
  3.3 Intra-Area Node Failure Scenario – In the first place, multiple areas 
with overlapping summaries is just a bad network design. If the prefix is 
unreachable, the case digresses to getting the ICMP unreachable from the ABR 
with the invalid overlapping summary.
[WAJ] It is common. For example, an ISIS level-1-2 router will announce the 
default route into the level 1 area, and the same happens in an OSPF totally 
stubby area. 
 
3.2 Inter-Area Links Failure Scenario – This is the case where the prefix is 
reachable but only through a subset of the area ABRs. This is really the only 
valid use case. IMO, it is better to solve this case with intra-area tunnels 
through the backbone as described in section 6.1. I think this is preferable to 
the complexity proposed in this draft and especially section 6. It is 
“interesting” when non-implementors specify implementation details.
[WAJ] The 

Re: [Lsr] Prefix Unreachable Announcement Use Cases

2020-11-14 Thread Robert Raszuk
Hi Acee,

> 3.1 *Inter-Area Node Failure Scenario – *With respect to this use case,
> the node in question is actually unreachable. In this case, the ABRs will
> normally install a reject route for the advertised summary and will send an
> ICMP unreachable when the packets are received for the unreachable prefix.

And what will the network do with such ICMP unreachable ? Is there some
draft I missed where encapsulating PE will choose a path with different
tunnel endpoint upon reception of ICMP unreachable message ?

See the entire idea behind this draft is to trigger faster switchover to
other PEs in the case of a multihomed attached site.

Option 1 - withdraw a service route in BGP. Use aggregate withdraw to speed
this up (say withdraw just RDs)

Option 2 - signal next hop unreachability (aka negative route) of a more
specific prefix than the aggregate itself.
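
To make option 2 concrete, here is a minimal Python sketch of how an ingress PE might combine a summary with a more specific unreachable announcement when validating BGP next hops. The function names, the data layout, and the documentation prefix 192.0.2.0/24 are illustrative assumptions and are not taken from the draft.

```python
import ipaddress
from typing import Iterable, Optional


def next_hop_usable(next_hop: str,
                    reachable: Iterable[str],
                    unreachable: Iterable[str]) -> bool:
    """Longest-prefix match over reachable and unreachable prefixes.

    A next hop is usable only if its longest match is a reachable prefix.
    On equal prefix length the unreachable announcement wins (assumption).
    """
    addr = ipaddress.ip_address(next_hop)
    best_len, best_ok = -1, False
    for prefix in reachable:
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_ok = net.prefixlen, True
    for prefix in unreachable:
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen >= best_len:
            best_len, best_ok = net.prefixlen, False
    return best_ok


def pick_egress(candidates: Iterable[str],
                reachable: Iterable[str],
                unreachable: Iterable[str]) -> Optional[str]:
    """Return the first candidate egress PE whose next hop is still usable."""
    reachable, unreachable = list(reachable), list(unreachable)
    for pe in candidates:
        if next_hop_usable(pe, reachable, unreachable):
            return pe
    return None


if __name__ == "__main__":
    summary = ["192.0.2.0/24"]      # summary advertised by the ABR
    negative = ["192.0.2.7/32"]     # hypothetical unreachable announcement
    print(pick_egress(["192.0.2.7", "192.0.2.8"], summary, negative))
```

With only the summary installed, both PEs look reachable; once the more specific negative route is present, the multihomed site's traffic can move to the second PE without waiting for the service route to be withdrawn.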

While I think just option 1 is ok for the vast majority of services your
answer seems to be talking about ICMP unreachable which IMO would not help
much with the issue. The proposal is not about failing ... the proposal is
about faster connectivity restoration.

> If faster detection is required, BFD or other mechanisms are available.

Now running a full mesh of BFD multihop sessions from each PE to each other
PE may be ok in theory but rather no-op in practice. Just think 1000 PEs
with 100 ms BFD timers in a full mesh of BFD sessions. Then rethink the
same with  BFD packets maxed to 1500 or 4K bytes packets as per some
proposals floating around.
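
The scaling concern can be put into rough numbers. The following Python sketch is a back-of-the-envelope estimate only: it assumes every PE runs one BFD session to every other PE, transmits at exactly the configured interval, and that the given packet size is the full size on the wire. The function and field names are made up for illustration.

```python
def bfd_full_mesh_load(num_pes: int, tx_interval_ms: float, packet_bytes: int) -> dict:
    """Estimate session count and per-PE transmit load for a BFD full mesh."""
    sessions_per_pe = num_pes - 1
    total_sessions = num_pes * (num_pes - 1) // 2
    pps_per_session = 1000.0 / tx_interval_ms          # one direction only
    tx_pps_per_pe = sessions_per_pe * pps_per_session
    tx_mbps_per_pe = tx_pps_per_pe * packet_bytes * 8 / 1e6
    return {
        "sessions_per_pe": sessions_per_pe,
        "total_sessions": total_sessions,
        "tx_pps_per_pe": tx_pps_per_pe,
        "tx_mbps_per_pe": tx_mbps_per_pe,
    }


if __name__ == "__main__":
    for size in (1500, 4096):
        print(size, bfd_full_mesh_load(1000, 100, size))
```

For 1000 PEs and 100 ms timers this gives 499,500 sessions in total and 9,990 packets per second transmitted by each PE, with the same rate again to receive; at 1500 bytes per packet that is roughly 120 Mbit/s per PE in each direction.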

If we want to move that way I would rather suggest we define a local BFD
anchor explorers (one per area) which will probe all "interesting" next
hops in a given area. Upon failure it would signal to those remote PEs
which indicated interest in such tracking the event of failure.

Again using BFD for this in any form or shape needs to be weighed for
cost/benefit against option 1 and option 2 above.

Thx,
Robert.

PS . Now option 1 can easily be sub second if BFD is enabled on IBGP
sessions between RRs and PEs. However I think there were some concerns
expressed in the past by a vendor for this type of deployment of BFD
between loopbacks. Maybe it would be beneficial for this discussion to
better understand this concern. If valid I think the option 2 which IMO is
the objective of this draft does present a valid problem to be solved.
Today practically speaking networks flood in IGPs globally 1000s of /32
prefixes instead of summarizing them as this is the only way they can
signal liveness of the remote PEs.




On Sat, Nov 14, 2020 at 10:34 PM Acee Lindem (acee)  wrote:

> Speaking as WG member…
>
>
>
> With respect to the use cases in section 3:
>
>
>
>   3.1 *Inter-Area Node Failure Scenario – *With respect to this use case,
> the node in question is actually unreachable. In this case, the ABRs will
> normally install a reject route for the advertised summary and will send an
> ICMP unreachable when the packets are received for the unreachable prefix.
> This is the expected behavior and there really isn’t that much of advantage
> to move the point of unreachable detection a couple hops closer. If faster
> detection is required, BFD or other mechanisms are available.
>
>
>
>   3.3 *Intra-Area Node Failure Scenario *– In the first place, multiple
> areas with overlapping summaries is just a bad network design. If the
> prefix is unreachable, the case digresses to getting the ICMP unreachable
> from the ABR with the invalid overlapping summary.
>
>
>
> 3.2 *Inter-Area Links Failure Scenario – *This is the case where the
> prefix is reachable but only through a subset of the area ABRs. This is
> really the only valid use case. IMO, it is better to solve this case with
> intra-area tunnels through the backbone as described in section 6.1. I
> think this is preferable to the complexity proposed in this draft and
> especially section 6. It is “interesting” when non-implementors specify
> implementation details.
>
>
>
> Thanks,
> Acee
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


[Lsr] Prefix Unreachable Announcement Use Cases

2020-11-14 Thread Acee Lindem (acee)
Speaking as WG member…

With respect to the use cases in section 3:

  3.1 Inter-Area Node Failure Scenario – With respect to this use case, the 
node in question is actually unreachable. In this case, the ABRs will normally 
install a reject route for the advertised summary and will send an ICMP 
unreachable when the packets are received for the unreachable prefix. This is 
the expected behavior and there really isn’t that much of advantage to move the 
point of unreachable detection a couple hops closer. If faster detection is 
required, BFD or other mechanisms are available.

  3.3 Intra-Area Node Failure Scenario – In the first place, multiple areas 
with overlapping summaries is just a bad network design. If the prefix is 
unreachable, the case digresses to getting the ICMP unreachable from the ABR 
with the invalid overlapping summary.

3.2 Inter-Area Links Failure Scenario – This is the case where the prefix is 
reachable but only through a subset of the area ABRs. This is really the only 
valid use case. IMO, it is better to solve this case with intra-area tunnels 
through the backbone as described in section 6.1. I think this is preferable to 
the complexity proposed in this draft and especially section 6. It is 
“interesting” when non-implementors specify implementation details.

Thanks,
Acee










Re: [Lsr] [Idr] WG Adoption for draft-zhu-idr-bgp-ls-path-mtu (11/1/2020 to 11/16/2020)

2020-11-14 Thread Les Ginsberg (ginsberg)
Zhibo –

It is good of you to “keep me honest” as regards my past comments.

In reviewing the relevant material, the best I can say as regards my comments 
from 2 years ago is that they were made with insufficient diligence. Apologies 
for any resulting confusion.

https://www.rfc-editor.org/rfc/rfc7176.html#section-2.4 clearly indicates that 
the advertised value is dependent on the results of MTU-probe testing as 
specified in https://www.rfc-editor.org/rfc/rfc6325#section-4.3.2 .

Particularly relevant is the statement:

“o  MTU: This field is set to the largest successfully tested MTU size
  for this link or zero if it has not been tested, as specified in
  Section 4.3.2 of [RFC6325].”

So, as currently defined, IS-IS is not allowed to advertise a non-zero MTU 
value unless MTU-probes/acks have been exchanged.
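
As a small illustration of that rule, the Python sketch below derives the value to advertise from a set of probe results. The dictionary layout (tested size mapped to whether the probe was acknowledged) and the function name are assumptions made for this example; the actual MTU-probe/MTU-ack procedure is the one specified in RFC 6325.

```python
from typing import Dict


def advertised_mtu(probe_results: Dict[int, bool]) -> int:
    """Return the largest successfully tested MTU, or 0 if nothing passed.

    probe_results maps a tested size in bytes to True if the probe for
    that size was acknowledged.
    """
    passed = [size for size, acked in probe_results.items() if acked]
    return max(passed) if passed else 0


if __name__ == "__main__":
    print(advertised_mtu({}))                          # link never tested
    print(advertised_mtu({1470: True, 9000: False}))   # only 1470 succeeded
```

An untested link therefore advertises zero regardless of what is configured on the interface, which is why the sub-TLV cannot simply carry the configured MTU under the current definition.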

https://www.rfc-editor.org/rfc/rfc7177#section-5 further clarifies that:

“The purpose of MTU testing is to ensure that the links used in the
   campus topology can pass TRILL IS-IS packets, particularly LSP PDUs,
   at the TRILL campus MTU.”

So the stated purpose of the TRILL definitions is NOT to provide assurances of 
data packet MTU, but more specifically to ensure that the IS-IS protocol can 
function correctly.

Could the MTU sub-TLV be repurposed to meet the requirements being discussed in 
the context of draft-zhu-idr-bgp-ls-path-mtu? Yes – I think that is possible - 
but it requires further  work.

draft-hu-lsr-isis-path-mtu was published over two years ago to define IS-IS 
extensions to advertise MTU. As you have noted, you received feedback from both 
Acee and myself which suggested that the TRILL defined MTU sub-TLV might be a 
better choice than the node property you had proposed. It seems based on that 
feedback you abandoned draft-hu-lsr-isis-path-mtu and focused only on 
draft-zhu-idr-bgp-ls-path-mtu. But as others have also noted, there is a gap 
regarding IGP support. OSPF has no ability to support MTU advertisement 
currently and – as this thread explains – even in IS-IS there is work to be 
done.

I would like to restate that I am – like others – supportive of this work – but 
I think WG adoption at this stage (in ANY WG) is premature.

   Les


From: Huzhibo 
Sent: Friday, November 13, 2020 7:20 PM
To: Les Ginsberg (ginsberg); Ketan Talaulikar (ketant); Susan Hares; 
'Jeff Tantsura'; Stephane Litkowski (slitkows); i...@ietf.org; 
Acee Lindem (acee)
Cc: lsr@ietf.org
Subject: Re: [Idr] WG Adoption for draft-zhu-idr-bgp-ls-path-mtu (11/1/2020 to 
11/16/2020)

Hi Les, Acee:

Actually, we already discussed this and reached agreement about two years ago. 
You may have forgotten. Please find the archive below.

https://mailarchive.ietf.org/arch/msg/lsr/C2bhd2ff2UJf4e_Gr-7j2W0g-SU/

I have followed your advice: "there already is a per link MTU sub-TLV defined 
by RFC 7176 ... in which case the existing sub-TLV is a perfect fit ...  IS-IS 
already has what is needed and therefore does not need any additional protocol 
extension". So we let our draft on ISIS extensions for MTU, i.e. 
draft-hu-lsr-isis-path-mtu, expire. To be honest, I was a bit surprised when I 
saw your comments.

In this draft we focused only on the extensions of BGP-LS for MTU, which do 
not have to be fed by the IGP LSDB. This draft has been discussed and revised 
over a few rounds, based on the feedback both at IETF meetings and on the 
mailing list, and the document is mature enough to move forward.

Thanks
Zhibo


From: Idr [mailto:idr-boun...@ietf.org] On Behalf Of Les Ginsberg (ginsberg)
Sent: November 14, 2020 7:58
To: Ketan Talaulikar (ketant); Susan Hares; 'Jeff Tantsura'; 
'Stephane Litkowski (slitkows)'; i...@ietf.org; 'Acee Lindem (acee)'
Cc: lsr@ietf.org
Subject: Re: [Idr] WG Adoption for draft-zhu-idr-bgp-ls-path-mtu (11/1/2020 to 
11/16/2020)

The points which Ketan has made regarding the use of MTU advertisements defined 
in RFC 7176 are very valid. Indeed, the contents of the sub-TLV defined in 
https://www.rfc-editor.org/rfc/rfc7176.html#section-2.4 depend upon the TRILL 
specific MTU-probe/MTU-ack procedures defined in 
https://www.rfc-editor.org/rfc/rfc6325#section-4.4.3. These procedures are not 
currently applicable to non-TRILL environments.

So, there are no existing IGP advertisements defined which can support the 
goals of this draft.

As Ketan has also indicated, there is no discussion in the draft of how a BGP 
only network (for example) could provide the information of interest.

From my perspective, WG adoption of this draft in ANY WG is premature.
This might be a useful functionality to have – but at the moment we simply have 
an idea with no definition of how to implement/deploy it.

So I therefore oppose WG adoption of this draft by IDR.

Continuing the 

Re: [Lsr] [Idr] Re: WG Adoption for draft-zhu-idr-bgp-ls-path-mtu (11/1/2020 to 11/16/2020)

2020-11-14 Thread Robert Raszuk
Hi Huzhibo,

I would like to highlight another aspect of this draft, independent of which
(if any) WG it ends up in.

Many network operators today purchase circuits to interconnect their
routers. However, the vast majority of those circuits use the brilliant
technology of VPWS, L2VPN, EVPN ... you name it. So effectively the circuit is
just an illusion, and what is really sold is an emulated circuit running as
IP-encapsulated L2 packets on someone else's IP backbone.

And here comes the crux of the issue - depending on the time of day and on
state and events in the carrier's underlay, the real MTU changes. And what is
worse, routers today have no good way to even detect it. Some attempts pop up
here and there (like stuffing BFD packets to 1500 or so), but the point is
that what could be promised, sold and configured on the interface may not be
what is really under the hood.

Things get even more colorful when only some discrete packet sizes fail
while smaller and bigger go through just fine - Swiss-MTU if you will :)

Sure, your proposal just uses a static MTU, like any other consumer of that
information mentioned in your draft, but I am bringing it up here for two
reasons:

A) The draft needs to consider this problem and discuss the dynamics related
to real MTU changes

B) A discussion needs to be started on whether it would not be much more
effective to simply detect MTU in the data plane between the src and dst in an
end to end fashion, rather than using it in the control plane as an atomic
piece of assumed truth used to make any path calculation.
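
To show why the "Swiss-MTU" case is awkward for end to end detection, here is a hedged Python sketch using a simulated path rather than real probes. The predicate `swiss_path`, its hole between 1000 and 1100 bytes, and both function names are invented for the example.

```python
from typing import Callable, List


def binary_search_mtu(passes: Callable[[int], bool], low: int, high: int) -> int:
    """Largest passing size in [low, high], ASSUMING smaller sizes always pass."""
    best = 0
    while low <= high:
        mid = (low + high) // 2
        if passes(mid):
            best, low = mid, mid + 1
        else:
            high = mid - 1
    return best


def failing_sizes(passes: Callable[[int], bool], low: int, high: int) -> List[int]:
    """Exhaustive sweep: every size in [low, high] that does not pass."""
    return [size for size in range(low, high + 1) if not passes(size)]


def swiss_path(size: int) -> bool:
    """Simulated path: passes up to 1500 bytes except a hole at 1000..1100."""
    return size <= 1500 and not (1000 <= size <= 1100)


if __name__ == "__main__":
    print(binary_search_mtu(swiss_path, 64, 1500))
    holes = failing_sizes(swiss_path, 64, 1500)
    print(holes[0], holes[-1], len(holes))
```

In this simulated case the binary search never lands inside the hole and reports 1500 as usable, while the sweep finds 101 failing sizes. A single advertised number, whether configured or probed, cannot express that kind of behaviour.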

Kind regards,
R.


Re: [Lsr] [Idr] WG Adoption for draft-zhu-idr-bgp-ls-path-mtu (11/1/2020 to 11/16/2020)

2020-11-14 Thread Wanghaibo (Rainsword)
Hi Les,

The inter-AS E2E SR Policy scenario also needs this. The inter-AS link info 
will be collected by BGP EPE.
The MTU is a link attribute, so we need an independent attribute TLV for the 
link NLRI of all protocols.
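
As a rough sketch of what such an attribute TLV could look like on the wire, the Python below uses the generic BGP-LS TLV layout (2-octet type, 2-octet length, then the value). The type code 65000 is a placeholder and not an assigned code point, and the 2-octet MTU value is an assumption made for this example.

```python
import struct
from typing import Tuple

HYPOTHETICAL_MTU_TLV_TYPE = 65000  # placeholder only, not an assigned code point


def encode_mtu_tlv(mtu: int, tlv_type: int = HYPOTHETICAL_MTU_TLV_TYPE) -> bytes:
    """Encode a link MTU as type (2 octets), length (2 octets), value (2 octets)."""
    if not 0 <= mtu <= 0xFFFF:
        raise ValueError("MTU does not fit in the assumed 2-octet value")
    value = struct.pack("!H", mtu)
    return struct.pack("!HH", tlv_type, len(value)) + value


def decode_tlv(data: bytes) -> Tuple[int, bytes]:
    """Split one TLV into its type and raw value bytes."""
    tlv_type, length = struct.unpack("!HH", data[:4])
    return tlv_type, data[4:4 + length]


if __name__ == "__main__":
    tlv = encode_mtu_tlv(1500)
    print(tlv.hex())
    print(decode_tlv(tlv))
```

Because the TLV describes the link itself, the same encoding could in principle be attached to a link NLRI whether the link was learned from IS-IS, OSPF, or BGP EPE.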

Regards,
Haibo

From: Idr [mailto:idr-boun...@ietf.org] On Behalf Of Jeff Tantsura
Sent: Saturday, November 14, 2020 9:52 AM
To: Les Ginsberg (ginsberg)
Cc: i...@ietf.org; Ketan Talaulikar (ketant); Stephane Litkowski (slitkows); 
Acee Lindem (acee); lsr@ietf.org; Susan Hares
Subject: Re: [Idr] WG Adoption for draft-zhu-idr-bgp-ls-path-mtu (11/1/2020 to 
11/16/2020)

To add to Les’s point of BGP only scenario, during MSD IESG reviews, BGP-LS 
only deployment was found not well characterized and had been removed from the 
draft. It will require much better discussion to have it included.
Regards,
Jeff


On Nov 13, 2020, at 15:57, Les Ginsberg (ginsberg) wrote:

The points which Ketan has made regarding the use of MTU advertisements defined 
in RFC 7176 are very valid. Indeed, the contents of the sub-TLV defined in 
https://www.rfc-editor.org/rfc/rfc7176.html#section-2.4 depend upon the TRILL 
specific MTU-probe/MTU-ack procedures defined in 
https://www.rfc-editor.org/rfc/rfc6325#section-4.4.3. These procedures are not 
currently applicable to non-TRILL environments.

So, there are no existing IGP advertisements defined which can support the 
goals of this draft.

As Ketan has also indicated, there is no discussion in the draft of how a BGP 
only network (for example) could provide the information of interest.

From my perspective, WG adoption of this draft in ANY WG is premature.
This might be a useful functionality to have – but at the moment we simply have 
an idea with no definition of how to implement/deploy it.

So I therefore oppose WG adoption of this draft by IDR.

Continuing the discussion is certainly useful – and I would encourage the 
current authors to investigate and propose relevant mechanisms in all the 
protocols of interest in some future version of the draft.
At that point we could then have a far more meaningful WG adoption call.

   Les


From: Idr [mailto:idr-boun...@ietf.org] On Behalf Of Ketan Talaulikar (ketant)
Sent: Friday, November 13, 2020 1:35 AM
To: Susan Hares; 'Jeff Tantsura'; 'Stephane Litkowski (slitkows)'; 
i...@ietf.org; 'Acee Lindem (acee)'
Cc: lsr@ietf.org
Subject: Re: [Idr] WG Adoption for draft-zhu-idr-bgp-ls-path-mtu (11/1/2020 to 
11/16/2020)

Hi Authors,

I believe this work is useful and should be taken up. It has value in providing 
the link MTU as part of the topology information via BGP-LS. However, as 
pointed out by others on this thread, the draft should remain scoped to just 
that – i.e. providing link MTU information. The discussion related to Path MTU 
and applicability to SR networks are best left outside the scope of this 
standards track draft.

Hi Sue,

I would support the points made by Acee for not taking up this draft in IDR WG 
and instead doing this in LSR.

Besides the missing coverage for OSPFv2/v3, there are also issues with how this 
would work with ISIS. The reference to the ISIS Trill specification points to 
this TLV https://tools.ietf.org/html/rfc7176#section-2.4 – if you see, there is 
more here than just the MTU value. What is more critical is the ISIS procedures 
(in non-Trill deployments) on how this value is determined. Please do not mix 
the following :

The MTU sub-TLV is used to optionally announce the MTU of a link as
   specified in [RFC6325], Section 4.2.4.4.

Are the authors trying to specify that these Trill procedures for testing MTU 
need to be adopted for regular ISIS deployments?

My take is that while the ISIS TLV defined for Trill may be re-used in normal 
ISIS deployments, its usage and procedures need to be specified. Copying the 
LSR WG so that I may be corrected if I am wrong here.

Coming to the point of BGP-only networks, the draft has zero text related to 
that scenario. Moreover, the procedures for BGP-LS advertisements in BGP only 
fabric need to be specified as a base. The 
https://datatracker.ietf.org/doc/draft-ketant-idr-bgp-ls-bgp-only-fabric/ was 
my attempt to specify those procedures and it would be great if the IDR WG 
could review and provide feedback to this document and consider for adoption so 
we can cover the BGP-only networks.

I would also like to offer support/help to the authors in adding the necessary 
OSPFv2/v3 support for the same in an LSR draft where we could tackle both the 
IGPs and BGP-LS encoding and procedures together.

Thanks,
Ketan

From: Idr [mailto:idr-boun...@ietf.org] On Behalf Of Susan Hares
Sent: 13 November 2020 00:20
To: 'Jeff Tantsura';