Re: [Lsr] UPA

2022-07-07 Thread Robert Raszuk
Peter,

In the scenario described there is really nothing to be tuned as you are
limited by the quality of local telco carriers.

Apparently you are not willing to consider it. Thank you.

Cheers,
R.


On Thu, Jul 7, 2022 at 2:43 PM Peter Psenak  wrote:

> Robert,
>
> people know how to tune IGPs for faster convergence. They may or may not do so;
> it's their decision based on their requirements. BFD is a standard
> mechanism used by IGPs for fast detection of the adjacency loss. I see
> no reason to require anything more or special for the UPA.
>
> thanks,
> Peter
>
> On 07/07/2022 14:28, Robert Raszuk wrote:
> > Peter,
> >
> > I think you are still not clear on some deployment scenarios.
> >
> > So allow me to restate ...
> >
> > It is pretty often (if not always) a valid requirement to redundantly
> > connect your PEs over different physical paths to the P nodes in the
> > area.
> >
> > For simplicity let's assume there are two links (it could be more than
> > two which only makes the situation worse from perspective of UPA).
> >
> > One link belongs to telco A and is clean and solid; BFD runs on it and
> > can detect link/peer down in 10s or 100s of milliseconds. The other link
> > is pretty bad (yet is used as backup as there is no physical
> > alternative) and BFD timers on it are set to 2 sec probing x 3 = 6 sec
> > detection of link/peer down.
> >
> > I think we all agree (including Aijun) that you MUST not advertise UPA
> > before you receive all flooding from all nodes adjacent to the failed PE -
> > that in the above case may take 6 sec.
> >
> > So I was asking if you see it feasible to run multihop BFD from ABRs to
> > PEs to detect node down much faster than long BFD timers would otherwise
> > permit you to achieve.
> >
> > And it can be just say milliseconds slower than fastest BFD timers so
> > effectively you get much faster detection than slowest BFD on links
> > would expose.
> >
> > That's the real life scenario which I am trying to map to UPA (or in
> > fact also DROID) mechanism.
> >
> > Many thx,
> > Robert

Re: [Lsr] UPA

2022-07-07 Thread Robert Raszuk
Peter,

I think you are still not clear on some deployment scenarios.

So allow me to restate ...

It is pretty often (if not always) a valid requirement to redundantly
connect your PEs over different physical paths to the P nodes in the area.

For simplicity let's assume there are two links (it could be more than two
which only makes the situation worse from perspective of UPA).

One link belongs to telco A and is clean and solid; BFD runs on it and can
detect link/peer down in 10s or 100s of milliseconds. The other link is
pretty bad (yet is used as backup as there is no physical alternative) and
BFD timers on it are set to 2 sec probing x 3 = 6 sec detection of
link/peer down.

I think we all agree (including Aijun) that you MUST not advertise UPA
before you receive all flooding from all nodes adjacent to the failed PE -
that in the above case may take 6 sec.

So I was asking if you see it feasible to run multihop BFD from ABRs to PEs
to detect node down much faster than long BFD timers would otherwise permit
you to achieve.

And it can be just say milliseconds slower than fastest BFD timers so
effectively you get much faster detection than slowest BFD on links would
expose.
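
To put rough numbers on the above, here is a minimal sketch of the timing argument. The 2 sec x 3 timers come from the scenario; the 50 ms per-link timer and the 100 ms multihop timer are assumptions picked only for illustration, and the function names are hypothetical.

```python
# Sketch of the detection-time comparison; all values are in milliseconds.

def bfd_detection_time(tx_interval_ms: int, multiplier: int) -> int:
    """Detection time of one BFD session: probe interval times multiplier."""
    return tx_interval_ms * multiplier


def per_link_wait(links) -> int:
    """Time until every neighbour has declared its link to the PE down.

    If UPA must wait for all adjacent nodes, the slowest session dominates.
    """
    return max(bfd_detection_time(interval, mult) for interval, mult in links)


# Link via telco A: assumed 50 ms x 3. Backup link: 2 sec x 3 as in the text.
links = [(50, 3), (2000, 3)]

# Assumed multihop BFD session ABR <-> PE at 100 ms x 3.
multihop = bfd_detection_time(100, 3)

print("wait for all per-link sessions:", per_link_wait(links), "ms")
print("multihop BFD detection:", multihop, "ms")
```

With these assumed timers the per-link approach is bounded by the 6 sec backup link, while the multihop session would fire well before that.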

That's the real life scenario which I am trying to map to UPA (or in fact
also DROID) mechanism.

Many thx,
Robert


On Thu, Jul 7, 2022 at 2:03 PM Peter Psenak  wrote:

> On 07/07/2022 12:26, Robert Raszuk wrote:
> > That's true.
> >
> > I am pointing out that this in some networks may be much slower than
> > invalidating the next hops from BGP route reflectors by running *local*
> > multihop BFD sessions to subject PEs (all within an area).
> >
> > So I have a question ... Let's forget about BGP and RRs and just stay
> > focused on IGP:
> >
> > Would it be feasible to trigger UPA on ABRs by running multihop BFD
> > sessions between ABRs and local area PEs and not wait for PE-P detection
> > of link down as well as flooding to carry the information to ABRs ?
>
> I would think running BFD on each individual link in the local area
> would be a much better solution. And people already often do that.
>
> thanks,
> Peter

Re: [Lsr] UPA

2022-07-07 Thread Robert Raszuk
That's true.

I am pointing out that this in some networks may be much slower than
invalidating the next hops from BGP route reflectors by running *local*
multihop BFD sessions to subject PEs (all within an area).

So I have a question ... Let's forget about BGP and RRs and just stay
focused on IGP:

Would it be feasible to trigger UPA on ABRs by running multihop BFD
sessions between ABRs and local area PEs and not wait for PE-P detection of
link down as well as flooding to carry the information to ABRs ?

Thx,
R.


On Thu, Jul 7, 2022 at 12:18 PM Peter Psenak  wrote:

> Robert,
>
> BGP PIC depends on the IGP convergence. We are not changing any of that
> by UPA.
>
> thanks,
> Peter
>
>
> On 07/07/2022 12:02, Robert Raszuk wrote:
> > Peter,
> >
> > All I am saying is that this may be pretty slow if even directly
> > attached P routers must wait, say, 6 seconds (3 x 2 sec BFD) to declare
> > peer down.
> >
> > And that is in contrast to running BFD from say BGP RR to all PEs in an
> > area.
> >
> > The fundamental point is that in the case of PUA you MUST wait for all P
> > routers to tell you that PE in fact went down. While in case of
> > invalidating service routes the first trigger is good enough.
> >
> > To me this is a significant architectural difference.
> >
> > Many thx,
> > R.
> >
Re: [Lsr] UPA

2022-07-07 Thread Robert Raszuk
Peter,

All I am saying is that this may be pretty slow if even directly attached P
routers must wait, say, 6 seconds (3 x 2 sec BFD) to declare peer down.

And that is in contrast to running BFD from say BGP RR to all PEs in an
area.

The fundamental point is that in the case of PUA you MUST wait for all P
routers to tell you that PE in fact went down. While in case of
invalidating service routes the first trigger is good enough.
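
A minimal sketch of why the "all" versus "first" trigger matters, assuming a toy two-link topology (the node names and the undirected adjacency model are illustrative assumptions, not how any IGP implementation stores its LSDB). The PE only becomes unreachable from the ABR once the last link to it has been withdrawn.

```python
from collections import deque


def reachable(adj, src, dst):
    """Breadth-first search: is dst reachable from src in the current topology?"""
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return False


def remove_link(adj, a, b):
    """Model a neighbour withdrawing its adjacency after detecting link down."""
    adj[a].discard(b)
    adj[b].discard(a)


# Hypothetical area: one ABR, two P routers, one dual-homed PE.
topology = {
    "ABR": {"P1", "P2"},
    "P1": {"ABR", "PE"},
    "P2": {"ABR", "PE"},
    "PE": {"P1", "P2"},
}

remove_link(topology, "P1", "PE")                # fast link detected down first
after_first = reachable(topology, "ABR", "PE")   # PE still reachable via P2

remove_link(topology, "P2", "PE")                # slow link detected seconds later
after_second = reachable(topology, "ABR", "PE")  # only now is the PE unreachable

print(after_first, after_second)
```

In this sketch the first withdrawal alone changes nothing for the ABR; a service-route invalidation could act on it, while an unreachability signal has to wait for the second.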

To me this is a significant architectural difference.

Many thx,
R.


On Thu, Jul 7, 2022 at 11:54 AM Peter Psenak  wrote:

> On 07/07/2022 11:38, Robert Raszuk wrote:
> >
> >  > there is no such thing.
> >
> > By far away ABR I mean ABR far away from failing PE connecting local
> > area to the area 0. There can be a number of P routers in between.
>
> ABR has the full visibility of the local area and knows when any node or
> prefix becomes unreachable. It is determined by the SPF computation and
> prefix processing that is triggered as a result of the change in the
> local area.
>
> thanks,
> Peter
>
> >
> > Let me provide you with an illustration:
> >
> > PE can be in Honolulu. ABR in Houston. All in one area. For me this ABR
> > is far away from PE.
> >

Re: [Lsr] UPA

2022-07-07 Thread Robert Raszuk
> there is no such thing.

By far away ABR I mean ABR far away from failing PE connecting local area to
the area 0. There can be a number of P routers in between.

Let me provide you with an illustration:

PE can be in Honolulu. ABR in Houston. All in one area. For me this ABR is
far away from PE.

On Thu, Jul 7, 2022 at 11:35 AM Peter Psenak  wrote:

> Robert,
>
> On 07/07/2022 11:25, Robert Raszuk wrote:
> > Hi Peter,
> >
> >  > Section 4:
> >  >
> >  > "The intent of UPA is to provide an event driven signal of the
> >   > transition of a destination from reachable to unreachable."
> >
> > That is too vague.
>
> it's all that is needed.
>
> >
> > I am asking how you detect that transition on a far away ABR.
>
> there is no such thing. The detection is done based on the prefix
> transition from reachable to unreachable in a local area by local ABRs.
> Remote ABRs just propagate the UPA.
>
> thanks,
> Peter
>
> >
> > For example, are you tracking flooding on all links to subject PE from
> > all its neighbours and only when all of them remove that link from
> > topology you signal PUA ?
> >
> > If so, in practice such a trigger may be pretty slow and inconsistent, as
> > in real networks the links over which PEs are connected are often of
> > very different quality, come from different carriers and may, for
> > stability, have varying BFD timers. So here you would have to wait for the
> > slowest link to be detected as down on the neighbouring P router.
> >
> > Thx,
> > R.
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] UPA

2022-07-07 Thread Robert Raszuk
Hi Peter,

> Section 4:
>
> "The intent of UPA is to provide an event driven signal of the
> transition of a destination from reachable to unreachable."

That is too vague.

I am asking how you detect that transition on a far away ABR.

For example, are you tracking flooding on all links to subject PE from all
its neighbours and only when all of them remove that link from topology you
signal PUA ?

If so, in practice such a trigger may be pretty slow and inconsistent, as in
real networks the links over which PEs are connected are often of
very different quality, come from different carriers and may, for
stability, have varying BFD timers. So here you would have to wait for the
slowest link to be detected as down on the neighbouring P router.

Thx,
R.

On Thu, Jul 7, 2022 at 10:16 AM Peter Psenak  wrote:

> Robert,
>
> On 06/07/2022 15:07, Robert Raszuk wrote:
> > Hi Peter,
> >
> > Can you please point me in the draft
> > https://www.ietf.org/id/draft-ppsenak-lsr-igp-ureach-prefix-announce-00.txt
> > to some section which specifies based on exactly what network flooding
> > changes UPA will be generated by ABRs ?
>
> Section 4:
>
> "The intent of UPA is to provide an event driven signal of the
>   transition of a destination from reachable to unreachable."
> >
> > I think such text is not an implementation detail, but it is critical
> > for mix vendor interoperability.
> >
> > Can UPA also be generated by P node(s) ?
>
> only if they are ABRs or ASBRs.
>
>
> >
> > Specifically I was looking to find some information on how you
> > achieve assurance that UPA really needs to be generated when using
> > various vendors' nodes with very different flooding behaviours and when
> > subject PEs may have a number of different links each with different
> > node/link down detection timers ?
>
> sorry, I don't understand the above.
>
> thanks,
> Peter
>
> >
> > Many thx,
> > R.
> >
>
>


[Lsr] UPA

2022-07-06 Thread Robert Raszuk
Hi Peter,

Can you please point me in the draft
https://www.ietf.org/id/draft-ppsenak-lsr-igp-ureach-prefix-announce-00.txt
to some section which specifies based on exactly what network flooding
changes UPA will be generated by ABRs ?

I think such text is not an implementation detail, but it is critical for
mix vendor interoperability.

Can UPA also be generated by P node(s) ?

Specifically I was looking to find some information on how you
achieve assurance that UPA really needs to be generated when using various
vendors' nodes with very different flooding behaviours and when subject
PEs may have a number of different links each with different node/link down
detection timers ?

Many thx,
R.


Re: [Lsr] YANG requirements for IDR drafts (was Re: [Idr] draft-head-idr-bgp-ls-isis-fr-01 - WG adoption call (6/6 to 6/20))

2022-07-05 Thread Robert Raszuk
Hi Jeff,

Many thx for your note. As I clarified to Sue my question was really about
LSR WG not IDR :)

And the trigger was Gunter's claim that his employer's OS is already
sending content of LSDB over YANG.

So I was a bit puzzled about what happens with new extensions, like ISIS
reflection, if they do not contain the YANG model from day one ? How is that
data being encoded if at all ?

That answer is also important to the "alternative to BGP-LS" discussion but
let's have a separate discussion on this in the coming weeks.

Best,
R.


On Tue, Jul 5, 2022 at 10:28 PM Jeffrey Haas  wrote:

> Robert,
>
>
> On Jun 30, 2022, at 6:56 PM, Robert Raszuk  wrote:
>
> Isn't the YANG section a requirement for all protocol extension
> documents before they are sent for publications these days ?
>
>
> We're not yet to the point where extensions to YANG modules are part of
> base IETF work, but we're probably going to need to have that discussion
> soon across IETF.
>
> This year will see base YANG modules for a number of protocols done.  I
> had hoped I could contribute toward the BGP YANG module getting done closer
> to start of year than not, the BGP module is more likely to be complete
> this fall.[1]
>
> Once we have the base modules out, augmentations for them covering various
> extensions will make sense.  Prior to the publication of the base modules,
> we wouldn't have had the documents advance due to MISREF dependencies.
>
> Once our base module is out, we'll have need of a number of small
> augmentation modules to fill in the missing features.  If you're looking to
> help with that work, there's probably room to start writing some drafts
> now.  I think the BGP YANG module is structurally solid for most
> configuration and operational state.  Policy is the remaining large piece
> of work.
>
> That said, I think we'll find trying to write YANG for BGP-LS challenging.
>
> The reason I am asking this is in fact in light of the other discussions
> we have on IDR list where at least one mode of link state state
> advertisement can be done using YANG encoding. Is YANG section optional in
> LSR WG documents which define new protocol extensions and new functionality
> ? If an implementation uses YANG to push LSDB how the new TLVs defined in
> the draft are going to be shared across ?
>
>
> I think your broader question about what a streaming protocol for IGP
> state looks like is probably best addressed in those threads.  But, as
> above, it's going to be an interesting modeling exercise.
>
> -- Jeff
>
>


Re: [Lsr] [Idr] draft-head-idr-bgp-ls-isis-fr-01 - WG adoption call (6/6 to 6/20)

2022-06-30 Thread Robert Raszuk
Hello,

I have a question ... likely to the WG chairs.

Why does
https://datatracker.ietf.org/doc/html/draft-ietf-lsr-isis-flood-reflection-07
not have a YANG section ? Is there a separate document for it, just
like we see a separate document for BGP-LS encoding ?

Isn't the YANG section a requirement for all protocol extension
documents before they are sent for publications these days ?

The reason I am asking this is in fact in light of the other discussions we
have on IDR list where at least one mode of link state state advertisement
can be done using YANG encoding. Is YANG section optional in LSR WG
documents which define new protocol extensions and new functionality ? If
an implementation uses YANG to push LSDB how the new TLVs defined in the
draft are going to be shared across ?

Many thx,
Robert


On Wed, Jun 29, 2022 at 10:56 PM Susan Hares  wrote:

> Greetings:
>
> I want to thank all the people who contributed to this WG adoption call.
>
> There are four points I pull from the adoption call:
>
> 1. IDR participants desire to discuss other potential ways to pass data
> currently passed in IDR
>
> I will start a thread noted as “BGP-LS” alternative.   This thread has 2
> subjects:
>
> 1) clear problem statements on BGP-LS
>
> 2) Discussion of alternatives (e.g. Droid (draft-li-lsr-droid-00.txt))
>
> If there is enough on list discussion, IDR may hold an interim on the
> topic.
>
> 2. Operators wish to deploy this draft
>
> I have confirmation on requests for deploying this draft.
> Like some other BGP features, it may not be widely deployed.
>
> 3. The WG wishes to add these features to this experimental draft.
>
> draft-ietf-idr-bgp-ls-isis-flood-reflection-00.txt
>
> Based on the call, this draft should include the changes promised on the
> call.
> The authors should resubmit the draft with this name.
>
> 4. The IDR + LSR chairs should review the agreements relating to
> BGP-LS TLVs at IETF-114 in their WG.
>
> The IDR chairs will request a time at IDR + LSR for this topic.
> Let me know if a short video would be better than slides.
> If so, we’ll post the video on YouTube before IETF and
> take questions at LSR or IDR.
>
> Cheers, Sue Hares
>
> *From:* Lsr  *On Behalf Of * Susan Hares
> *Sent:* Friday, June 24, 2022 9:29 AM
> *To:* Tony Przygienda ; Ketan Talaulikar <
> ketant.i...@gmail.com>
> *Cc:* Jordan Head ; i...@ietf.org; lsr 
> *Subject:* Re: [Lsr] [Idr] draft-head-idr-bgp-ls-isis-fr-01 - WG adoption
> call (6/6 to 6/20)
>
> Tony P, Ketan and IDR WG:
>
> Thank you for input on this draft.
> I am closing the WG adoption call for this draft.
> The IDR Chairs will discuss the results of this consensus call, and
> announce the results by July 8th.
>
> Cheers,
>
> Sue Hares
>
> *From:* Tony Przygienda 
> *Sent:* Wednesday, June 22, 2022 12:11 PM
> *To:* Ketan Talaulikar 
> *Cc:* Jordan Head ; Susan Hares ;
> i...@ietf.org; lsr 
> *Subject:* Re: [Idr] draft-head-idr-bgp-ls-isis-fr-01 - WG adoption call
> (6/6 to 6/20)
>
> hey Ketan, since as you know ;-) BGP-LS is not really an IGP 1:1 translation
> we try to put into BGP-LS here only the stuff that is strictly needed for
> topology discovery and understanding for advanced computation and nothing
> else. And hence, since the 1:1 TLV correspondence is nowhere to be seen by
> now if you look at ospf/isis encoding and what BGP-LS formats are today
> your requirements are interesting but I'm not sure where they came from.
>
> The flag indicates already whether something is client or reflector on an
> adjacency. cluster ID is there as well to differentiate between different
> clusters. L2 C/FR adjacencies will show up as well. good enough to
> understand topology and compute AFAIS.  All this is achievable by putting
> this element on the link TLV (the draft should explain it better, it just
> grabs codepoints in node/link/prefix & e'thing else ;-). Yes, we could
> annotate just the node assuming strict adherence to the IGP draft today but
> putting the element on the link descriptor follows the IGP spec itself and
> will allow to break the RFC if necessary later also in BGP-LS (by e.g.
> allowing a node to be client of two different clusters or even a node being
> reflector for 2 different clusters). Observe that this will not work in case
> of auto-discovery since that's on node caps ;-) But those are subtle
> discussions that need to be documented into the BGP-LS draft as procedures
> once adopted. Those discussions are natural and necessary since BGP-LS is
> NOT IGP  database but a distorted, simplified view for topology discovery.
> Or at least that's how it's used in reality based on the shortcomings of
> its design ;-)
>
> As I explained, unless L1 adjacencies are being formed IMO they don't
> belong into BGP-LS FR information, otherwise will show up in BGP-LS
> naturally. Neither does IMO auto-discovery of FR.
>
> As to mismatch of e.g. 

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Robert Raszuk
UPAs may not even contain the advertised locator in SIDs. It is not
clearly spelled out what exactly ABRs should advertise.

I presume:
a) something which was flooded in the local domain and was not being leaked
AND
b) something which stopped being flooded in a local domain
AND
c) there is local policy specifying such range
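
Purely as a sketch of how I read the three presumed conditions above (this is my presumption, not text from the draft; the function name, its arguments and the documentation prefixes are all hypothetical):

```python
import ipaddress


def should_advertise_upa(prefix, was_flooded, was_leaked, still_flooded,
                         policy_ranges):
    """Return True only when presumed conditions a), b) and c) all hold."""
    net = ipaddress.ip_network(prefix)

    # c) some local policy range covers the prefix (same address family only)
    in_policy = False
    for entry in policy_ranges:
        rng = ipaddress.ip_network(entry)
        if net.version == rng.version and net.subnet_of(rng):
            in_policy = True
            break

    cond_a = was_flooded and not was_leaked   # a) flooded locally, never leaked
    cond_b = not still_flooded                # b) no longer flooded locally
    return cond_a and cond_b and in_policy


# Hypothetical locator covered by a summary, now gone from the local area.
print(should_advertise_upa("2001:db8:1::1/128", True, False, False,
                           ["2001:db8::/32"]))
```

Any one of the three failing suppresses the advertisement, which is the AND semantics listed above.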

> agree with Bruno’s statement “If you believe that all you need is
> RFC5305/RFC5308 I guess this means that we don't need
> draft-ppsenak-lsr-igp-ureach-prefix-announce”
>

Well at this time this is an Informational draft.

But based on Bruno's comments I am worried that a node receiving something
with MAX_PATH_METRIC, which was not advertised before as a valid and reachable
prefix and did not make it into the LSDB or RIB/FIB, will simply introduce a
new unknown in implementation state as to how to handle such a prefix, which
may result in different interesting undefined behaviour(s).

Thx,
R.


Re: [Lsr] draft-ppsenak-lsr-igp-ureach-prefix-announce

2022-06-16 Thread Robert Raszuk
Bruno,

Actually I like your flag suggestion for an additional and different
reason.

If someone does not need to flood UPAs in any remote area, with an explicit
flag it is trivial to filter them on the ABRs connecting those areas to the
core. Without it such filtering could be more difficult, if possible at all.

Thx,
R.


> [Bruno] I agree that the encoding for the explicit signaling is totally
> open to discussion.
>
> I proposed a flag since:
>
> - all we need is binary information
>
> - given the use case, this RFC7794 sub-tlv (type 4) seems likely to be
> already present. (e.g. X or R flag, possibly N-flag)
>
> Thanks for your email and your contribution to this topic.
>
> Best regards,
>
> --Bruno
>


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-16 Thread Robert Raszuk
Gunter,

> (1) Multiple-ABRs
>
> I was wondering for example if an ingress router receives a PUA signaling
> that a given locator becomes unreachable, does that actually really signal
> that the SID ‘really’ is unreachable for a router?
>

As there is no association between node_id (perhaps loopback) and SIDs
(note that egress can use many SIDs) UPA really does not signal anything
about SID reachability or liveness.


>  For example (simple design to illustrate the corner-case):
>
>
>
> ingressPE#1---area#1---ABR#1---area---ABR#2---area#3---egressPE#2
>      |                                                     |
>      |                                                     |
>      +-------area#1---ABR#3---area---ABR#4---area#3--------+
>
>
>
> What if ABR#4 would lose connectivity to egressPE#2 and ABR#2 does not?
>
> In that case ABR#4 will originate a UPA/PUA and ABR#2 does not originate a
> PUA/UPA.
>
> How is ingressPE#1 supposed to handle this situation? The only thing
> ingressPE#1 sees is that suddenly there is a PUA/UPA, but reachability may
> not have changed at all and egressPE#2 remains perfectly reachable.
>

Valid case. But PE1 should only switch when an alternative backup path
exists. If there is a single path it should do nothing upon receiving a
UPA. We have discussed that case before and as you know the formal answer
was "out of scope" or "vendor's secret sauce" :).

The justification here is that switching to a healthy backup is better than
continuing to use a perhaps semi-sick path.
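
The rule above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function `react_to_upa` and both
dictionaries are hypothetical, and how a real ingress PE reacts is, as noted,
left to the implementation.

```python
def react_to_upa(active_egress, backup_egress, unreachable_pe):
    """Decide per service which egress PE to use after a UPA arrives.

    active_egress: dict service -> egress PE currently in use
    backup_egress: dict service -> alternative egress PE, or None
    unreachable_pe: the PE named in the received UPA
    """
    result = {}
    for service, pe in active_egress.items():
        backup = backup_egress.get(service)
        if pe == unreachable_pe and backup is not None:
            # A healthy backup beats a possibly semi-sick primary path.
            result[service] = backup
        else:
            # Not affected, or single path only: do nothing.
            result[service] = pe
    return result


active = {"vpn-a": "PE2", "vpn-b": "PE2", "vpn-c": "PE5"}
backup = {"vpn-a": "PE3", "vpn-b": None, "vpn-c": "PE6"}
print(react_to_upa(active, backup, "PE2"))
```

In this example only vpn-a moves (to PE3); vpn-b has no alternative and
keeps using PE2, and vpn-c is not affected by the UPA at all.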

Best,
R.


Re: [Lsr] Dynamic Flooding on Dense Graphs - draft-ietf-lsr-dynamic-flooding

2022-06-16 Thread Robert Raszuk
Hi Gyan,

While I agree with your final conclusion and description there is one
important detail missing.

PODs consist of both network elements and compute nodes. Virtualization
happens in the latter. Dynamic routing between those in almost all cases
runs BGP in the underlay, not an IGP, simply because there is no great open
source IGP code to leverage for running a link state protocol on the
compute nodes. We have known that for over two decades. Some were eager to
produce ISIS-lite but to the best of my knowledge that never fully
surfaced (Yes, there is ISIS and OSPFv2/v3 code in FRR or OSPF in BIRD, but
are those implementations ready for production scale ... not sure).

I would love to be corrected though and see any CNI deployment using ISIS.

Sure you can still interconnect clusters (collection of PODs) with IGP in
the underlay. But running IGP to computes I am unfortunately not seeing.

Thx,
Robert.




On Thu, Jun 16, 2022 at 6:42 AM Gyan Mishra  wrote:

>
> Hi Tony
>
> “So, can we PLEASE stop beating a dead horse?”
>
> As data center have evolved over the years prior to NVO overlay
> architectures becoming more prevalent, many operators had moved to from L2
> fault domains to an L3 POD based architecture carving the DC or MSDC into
> many smaller PODs sub domains where each POD is a BGP AS onto itself
> peering to a DCI core AS.
>
> From the DC POD architecture, operators have moved towards an NVO clos
> topology where  the size of the topology is not as massive as a single AS
> DC fabric, as now each POD becomes a fabric onto itself.  This does
> considerably reduce scope of the flooding as each POD is a BGP AS onto
> itself peering with a DCI core AS super spine.  As well within the POD, 2
> tier and now 3 or 4 tier micro fabrics within a POD can now further reduce
> the dense clos fabric as desired with scale up scale out as needed.  Each
> micro fabric could be carved up into mini AS or left as single AS per POD
> as desired.
>
> I  am sure there are BGP only MSDC installed base, however there are still
> plenty for of MSDC as I described above using POD architecture that have
> IGP based underlay that could definitely benefit from this draft.  As well
> operators with single AS MSDC that are not BGP Only that could take
> advantage of the draft.
>
>
> Kind Regards
>
> Gyan
>
>
> On Tue, Jun 14, 2022 at 5:01 PM Tony Li  wrote:
>
>>
>> Gyan,
>>
>> Cisco has (reportedly) implemented this, but done so with their own
>> proprietary, undocumented distributed algorithm.
>>
>> The responses that I have seen from operators have been somewhat
>> disappointing:
>>
>> “There is no  way that I would ever let a  IGP into
>> my data center.”
>>
>> Others have been more polite, but similarly dismissive.
>>
>> The fact of the matter is that there is an installed base of BGP and
>> folks are not open to experimenting with anything else.
>>
>> So, can we PLEASE stop beating a dead horse?
>>
>> Tony
>>
>>
>> On Jun 14, 2022, at 1:43 PM, Gyan Mishra  wrote:
>>
>> All
>>
>> I agree this is important work for operators in DC networks  NVO CLOS
>> architecture with extremely dense fabrics and massively scaled out spines.
>>
>> I agree we can move forward with progressing with only ISIS being
>> implemented.
>>
>> I do think that after the draft is published hopefully implementations
>> include OSPF as well as there is a lot of OSPF used by operators.
>>
>> NVO CLOS architecture I would say is being universally being deployed as
>> defacto standard  in the DC arena.  As well as most operators don’t want to
>> go for the BGP only solution in the DC due to the complexity as well as
>> having to provision many public ASNs.
>>
>> I support #1 first followed by #2.
>>
>> So far we have Arista implementation, and we have both Cisco and Juniper
>> Co-Authors as  well  on the draft.
>>
>> I think we have a good chance at #1 - Standards track.
>>
>> Les & Tony & Tony
>>
>> What is the chance of getting this implemented by Cisco & Juniper?
>>
>>
>> We also have a few major stakeholders in the industry supporting the
>> draft, Verizon, ATT and CenturyLink as co-authors which I think shows how
>> important this draft is for the industry.
>>
>> Kind Regards
>>
>> Gyan
>>
>> On Tue, Jun 14, 2022 at 4:05 PM John E Drake > 40juniper@dmarc.ietf.org> wrote:
>>
>>> Les,
>>>
>>> I'm happy with either 1 or 2.  It's good work and I think it will become
>>> important.
>>>
>>> Yours Irrespectively,
>>>
>>> John
>>>
>>>
>>> Juniper Business Use Only
>>>
>>> > -Original Message-
>>> > From: Les Ginsberg (ginsberg) 
>>> > Sent: Tuesday, June 14, 2022 4:01 PM
>>> > To: John E Drake ; Les Ginsberg (ginsberg)
>>> > ; John Scudder 
>>> > Cc: Tony Li ; tom petch ; Acee
>>> Lindem
>>> > (acee) ; lsr@ietf.org
>>> > Subject: RE: [Lsr] Dynamic Flooding on Dense Graphs -
>>> draft-ietf-lsr-dynamic-
>>> > flooding
>>> >
>>> > [External Email. Be cautious of content]
>>> >
>>> >
>>> > John -
>>> >
>>> > I would be inclined to agree with you - but...to my knowledge 

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Robert Raszuk
>
> looks to me that you are trying to steer the discussion towards the BGP
> based solution. Not something I'm interested on this thread.
>

Not at all. It was you, not me, who used the argument that UPA/PUA is useful
for networks with no BGP ... example:

Quote:



*"I have explained that several times to you. There are SP networks running
the services on top of p2p IP sec tunnels for example, with no BGP."*



> > Also not all tunnels have keepalives. I am talking about mGRE
> > encapsulation as an example where you simply encapsulate and have no
> > idea other than consulting RIB if the dst node is up or down.
>
> in such case you can not use summarization at all.
>

Ok. Good to know :).

Best,
R.

PS.

Btw important point. Yes, networks experience scale limits. But those limits
are usually not due to exponential growth in the number of PEs. Such growth
is often associated with moving network services from routers to compute
blades. And guess what protocol is used in the underlay to those compute
blades ... BGP :).


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Robert Raszuk
> Traffic will initially switch to alternate path, if any, and
> later the native mechanism (BGP signalling, tunnel keepalive, etc), will
> take over and bring it to its final state.
>

On one hand you are saying that UPA is useful where there is no BGP. So
let's talk about such a scenario.

Also not all tunnels have keepalives. I am talking about mGRE encapsulation
as an example where you simply encapsulate and have no idea other than
consulting RIB if the dst node is up or down.

In this discussed case it will keep sending packets to remote area only to
drop it there ... not good.

Thx,
R.


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Robert Raszuk
Peter,

My question is precise  your answer is pretty loose :)

Imagine I use summarization and, as you have said many times, there is no
BGP running. So how do I indicate planned scheduled maintenance in such
cases ? Say from either ABRs or PEs/Ps themselves ?

In fact, looking at it practically, that may be much more useful and needed
than signalling node failures.

And the issue I observed with using UPA is that as it is ephemeral it may
not work well during extended maintenance windows. Stateful solutions
however would work fine.

Thx,
R.







On Wed, Jun 15, 2022 at 2:34 PM Peter Psenak  wrote:

> Robert,
>
> On 15/06/2022 14:13, Robert Raszuk wrote:
> > Peter,
> >
> > the meaning of LSInfinity has been defined decades ago. No matter how
> >
> > much you may not like it, but it means unreachable.
> >
> >
> > True. But that brings another question ... Do you envision to use UPA
> > also to indicate planned maintenance of a node ?
>
> depends on how the planned maintenance is performed. If yo just turn the
> node off, UPA will catch it. If you instead set OL-bit, or use link max
> metric initially, it may or may not be used, depending on what the
> ABR/ASBR is programmed to do. There is quite some flexibility if needed.
>
> thanks,
> Peter
>
>
> >
> > Thx,
> > R.
> >
>
>


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-15 Thread Robert Raszuk
Peter,

> the meaning of LSInfinity has been defined decades ago. No matter how
> much you may not like it, but it means unreachable.


True. But that brings another question ... Do you envision to use UPA also
to indicate planned maintenance of a node ?

Thx,
R.


Re: [Lsr] Dynamic Flooding on Dense Graphs - draft-ietf-lsr-dynamic-flooding

2022-06-14 Thread Robert Raszuk
> Well, we can blame marketing all we want.  All I know is that we, as a
> group, failed to come together and present a unified front with
> interoperable implementations. That left us in a position where marketing
> is pushing rocks up hills and customers are waiting for the dust to settle.


I am not blaming marketing here. Real engineers never listen to marketing.

The main issue why BGP won in some MSDCs was that it was much easier (and
cheaper) to deploy on OEM white boxes than any alternative scalable IGP
(read use open source).

So yes - link state IGPs were late for MSDCs. Petr's draft then RFC was
like a hammer to this as well. But many use BGP as an underlay without
fully realizing the pros and cons. Some run BGP purely from positioning
perspective as they can be seen "weaker" as the largest hyperscalers. Many
still run BGP services with no hierarchy and clean decoupling.

IMHO even for many DCs this is still not a lost battle if IGPs are
positioned right. So let's progress this as Experimental and clearly explain
the benefits if a given implementation supports this.

Regards,
R.


Re: [Lsr] Dynamic Flooding on Dense Graphs - draft-ietf-lsr-dynamic-flooding

2022-06-14 Thread Robert Raszuk
Hi Tony,

> So, can we PLEASE stop beating a dead horse?

Are you stating that computing dynamic flooding topologies has no use case
outside of MSDCs or for that matter ANY-DCs ?

Thx,
R.

PS. It is true that folks even running 10 racks think BGP is the only
choice for the underlay, but to me this is a failure of deployment folks at
vendors to properly position each dynamic routing protocol more than
anything else.


On Tue, Jun 14, 2022 at 11:02 PM Tony Li  wrote:

>
> Gyan,
>
> Cisco has (reportedly) implemented this, but done so with their own
> proprietary, undocumented distributed algorithm.
>
> The responses that I have seen from operators have been somewhat
> disappointing:
>
> “There is no  way that I would ever let a  IGP into
> my data center.”
>
> Others have been more polite, but similarly dismissive.
>
> The fact of the matter is that there is an installed base of BGP and folks
> are not open to experimenting with anything else.
>
> So, can we PLEASE stop beating a dead horse?
>
> Tony
>
>
> On Jun 14, 2022, at 1:43 PM, Gyan Mishra  wrote:
>
> All
>
> I agree this is important work for operators in DC networks  NVO CLOS
> architecture with extremely dense fabrics and massively scaled out spines.
>
> I agree we can move forward with progressing with only ISIS being
> implemented.
>
> I do think that after the draft is published hopefully implementations
> include OSPF as well as there is a lot of OSPF used by operators.
>
> NVO CLOS architecture I would say is being universally being deployed as
> defacto standard  in the DC arena.  As well as most operators don’t want to
> go for the BGP only solution in the DC due to the complexity as well as
> having to provision many public ASNs.
>
> I support #1 first followed by #2.
>
> So far we have Arista implementation, and we have both Cisco and Juniper
> Co-Authors as  well  on the draft.
>
> I think we have a good chance at #1 - Standards track.
>
> Les & Tony & Tony
>
> What is the chance of getting this implemented by Cisco & Juniper?
>
>
> We also have a few major stakeholders in the industry supporting the
> draft, Verizon, ATT and CenturyLink as co-authors which I think shows how
> important this draft is for the industry.
>
> Kind Regards
>
> Gyan
>
> On Tue, Jun 14, 2022 at 4:05 PM John E Drake  40juniper@dmarc.ietf.org> wrote:
>
>> Les,
>>
>> I'm happy with either 1 or 2.  It's good work and I think it will become
>> important.
>>
>> Yours Irrespectively,
>>
>> John
>>
>>
>> Juniper Business Use Only
>>
>> > -Original Message-
>> > From: Les Ginsberg (ginsberg) 
>> > Sent: Tuesday, June 14, 2022 4:01 PM
>> > To: John E Drake ; Les Ginsberg (ginsberg)
>> > ; John Scudder 
>> > Cc: Tony Li ; tom petch ; Acee
>> Lindem
>> > (acee) ; lsr@ietf.org
>> > Subject: RE: [Lsr] Dynamic Flooding on Dense Graphs -
>> draft-ietf-lsr-dynamic-
>> > flooding
>> >
>> > [External Email. Be cautious of content]
>> >
>> >
>> > John -
>> >
>> > I would be inclined to agree with you - but...to my knowledge (happy to
>> be
>> > corrected...) -
>> >
>> > There has been no interoperability testing.
>> > It is really only possible to do interoperability testing on
>> centralized mode at
>> > present, since distributed mode requires standardization/multi-vendor
>> > implementation of at least one algorithm - which hasn’t happened yet.
>> > So, a significant portion of the protocol extensions remain untested.
>> And since
>> > enthusiasm for this work has waned - perhaps only temporarily - it seems
>> > unlikely that these gaps will be closed in the immediate future.
>> > Moving to standards track RFC with these gaps seems unwise and to some
>> > degree "irresponsible".
>> >
>> > I think there are then three viable paths:
>> >
>> > 1) Continue to refresh the draft until such time as the gaps are closed
>> > or it becomes clear the work is more permanently not of interest.
>> > 2) Capture the current contents as an Experimental RFC - noting the
>> > remaining work.
>> > 3) Capture the current contents as a Historic RFC - noting the
>> > remaining work.
>> > I am not in favor of #3.
>> > I would be OK with #1 or #2.
>> >
>> >Les
>> >
>> >
>> > > -Original Message-
>> > > From: Lsr  On Behalf Of John E Drake
>> > > Sent: Tuesday, June 14, 2022 11:23 AM
>> > > To: Les Ginsberg (ginsberg) ;
>> > > John Scudder 
>> > > Cc: Tony Li ; tom petch ; Acee
>> > > Lindem (acee) ; lsr@ietf.org
>> > > Subject: Re: [Lsr] Dynamic Flooding on Dense Graphs -
>> > > draft-ietf-lsr-dynamic- flooding
>> > >
>> > > Hi,
>> > >
>> > > I don't understand why we don't just go through the normal Standards
>> > > track process.  I am sure there are any number of Standards track RFCs
>> > > which are published and which are neither widely implemented nor
>> > > widely deployed, but which may become so in the future.
>> > >
>> > > As Peter noted in the context of another draft, we are starting to see
>> > > extreme growth in the size of IGPs  which to me indicates that the
>> > > subject 

Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Robert Raszuk
Acee,

> Note that any good implementation will allow one to punch holes in their
> area ranges so that critical prefixes are advertised or

Every PE address is critical. The story that one PE can be more important
than any other is just to mislead you at best.

And we have (I hope) scoped the discussion to summaries.

I realize PUE also wants to cover P failures so in this case each P is
also equally important.

Thx,
R,


On Tue, Jun 14, 2022 at 3:57 PM Acee Lindem (acee)  wrote:

> Speaking as WG member:
>
>
>
> *From: *Lsr  on behalf of Robert Raszuk <
> rob...@raszuk.net>
> *Date: *Tuesday, June 14, 2022 at 9:27 AM
> *To: *Christian Hopps 
> *Cc: *Gunter Van de Velde , lsr <
> lsr@ietf.org>, "draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org" <
> draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org>,
> draft-wang-lsr-prefix-unreachable-annoucement <
> draft-wang-lsr-prefix-unreachable-annoucem...@ietf.org>
> *Subject: *Re: [Lsr] Thoughts about PUAs - are we not over-engineering?
>
>
>
> All,
>
>
>
> > What is wrong with simply not doing summaries
>
>
>
> What's wrong is that you are reaching the scaling issue much sooner than
> when you inject summaries.
>
>
>
> Note that any good implementation will allow one to punch holes in their
> area ranges so that critical prefixes are advertised or included in the
> range existence criteria.
>
>
>
> Thanks,
>
> Acee
>
>
>
>
>
> Note that the number of those host routes is flooded irrespective of the
> actual need everywhere based on the sick assumption that perhaps they may
> be needed there. There is no today to the best of my knowledge controlled
> leaking to only subset to what is needed.
>
>
>
> But this is not the main worry. Main worry is that in redundant networks
> you are seeing many copies of the very same route being flooded all over
> the place. So in a not so big 1000 node network the number of host routes
> may exceed 8000 easily.
>
>
>
> Sure when things are stable all is cool. But we should prepare for the
> worst, not the best. In fact, the ability to encapsulate to an aggregate
> switch IP (GRE or UDP) or nowadays SRv6 has been one of the strongest
> advantages.
>
>
>
> So as started before the problem does exist. Neither PULSE nor PUE solve
> it which are both limited to PE failures detection which is not enough
> (maybe even not worth). But PE-CE failures need to be signalled in the case
> of injecting summaries. Maybe as I said in previous msg just BGP withdrawal
> is fine. If not we should seek a solution which addresses the real problem,
> not an infrequent one.
>
>
>
> Best,
>
> R.
>
>
>
>
>
>
>
> On Tue, Jun 14, 2022 at 2:51 PM Christian Hopps  wrote:
>
>
>
> > On Jun 14, 2022, at 04:59, Van De Velde, Gunter (Nokia - BE/Antwerp) <
> gunter.van_de_ve...@nokia.com> wrote:
> >
> > What is wrong with simply not doing summaries and forget about these
> PUAs to pinch holes in the summary prefixes? this worked very well during
> last two decennia. Are we not over-engineering with PUAs?
>
> 100% yes, IMO.
>
> Thanks,
> Chris.
> [as wg-member]


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Robert Raszuk
All,

> What is wrong with simply not doing summaries

What's wrong is that you are reaching the scaling issue much sooner than
when you inject summaries.

Note that those host routes are flooded everywhere irrespective of the
actual need, based on the sick assumption that perhaps they may be needed
there. To the best of my knowledge there is today no controlled leaking of
only the subset that is needed.

But this is not the main worry. Main worry is that in redundant networks
you are seeing many copies of the very same route being flooded all over
the place. So in a not so big 1000 node network the number of host routes
may exceed 8000 easily.
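
As a back-of-the-envelope illustration of that figure, here is a small
sketch. The numbers are assumptions picked only to reproduce the order of
magnitude mentioned above (1000 nodes, 10 areas, 8 redundant copies of each
route); they are not measurements from any network, and both function names
are hypothetical.

```python
def flooded_host_routes(num_nodes, copies_per_route):
    """Host route entries seen network-wide when nothing is summarized."""
    return num_nodes * copies_per_route


def flooded_summaries(num_areas, copies_per_route):
    """Entries seen when each area is hidden behind a single summary."""
    return num_areas * copies_per_route


# Assumed figures: 1000 nodes, 10 areas, 8 redundant copies of each route.
print(flooded_host_routes(1000, 8))
print(flooded_summaries(10, 8))
```

Under these assumptions the unsummarized case carries 8000 entries while the
summarized case carries 80, which is the gap being argued about.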

Sure when things are stable all is cool. But we should prepare for the
worst, not the best. In fact, the ability to encapsulate to an aggregate
switch IP (GRE or UDP) or nowadays SRv6 has been one of the strongest
advantages.

So as stated before the problem does exist. Neither PULSE nor PUE solves it,
as both are limited to PE failure detection, which is not enough (maybe
not even worth it). But PE-CE failures need to be signalled in the case of
injecting summaries. Maybe as I said in previous msg just BGP withdrawal is
fine. If not we should seek a solution which addresses the real problem,
not an infrequent one.

Best,
R.



On Tue, Jun 14, 2022 at 2:51 PM Christian Hopps  wrote:

>
>
> > On Jun 14, 2022, at 04:59, Van De Velde, Gunter (Nokia - BE/Antwerp) <
> gunter.van_de_ve...@nokia.com> wrote:
> >
> > What is wrong with simply not doing summaries and forget about these
> PUAs to pinch holes in the summary prefixes? this worked very well during
> last two decennia. Are we not over-engineering with PUAs?
>
> 100% yes, IMO.
>
> Thanks,
> Chris.
> [as wg-member]


Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

2022-06-14 Thread Robert Raszuk
Hello Gunter,

I agree with pretty much all you said except the conclusion - do nothing
:).

To me if you need to accelerate connectivity restoration upon an unlikely
event like a complete PE failure the right vehicle to signal this is
within the service layer itself. Let's keep in mind that links do fail a
lot in the networks - routers do not (or if they do, it is a multiple orders
of magnitude less frequent event). Especially links on the PE-CE boundaries
do fail a lot.

Removal of next hop reachability can be done with BGP and, based on native
BGP recursion, will have the exact same effect as the presented ideas.
Moreover it will be stateful for the endpoints which again to me is a
feature not a bug.
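
The recursion argument can be illustrated with a short sketch. It is a toy
model, not a description of any BGP implementation: the function `resolve`
and its data structures are hypothetical, and it assumes next hops are
themselves advertised (and withdrawn) as routes that service routes recurse
over.

```python
def resolve(service_routes, next_hop_routes):
    """Pick a usable next hop for every service prefix.

    service_routes: dict prefix -> list of BGP next hops, best first
    next_hop_routes: set of next hops that are currently advertised
    Prefixes with no resolvable next hop are left out of the result.
    """
    best = {}
    for prefix, next_hops in service_routes.items():
        for nh in next_hops:
            if nh in next_hop_routes:
                best[prefix] = nh
                break
    return best


services = {
    "10.1.0.0/16": ["PE1", "PE2"],
    "10.2.0.0/16": ["PE1", "PE2"],
    "10.3.0.0/16": ["PE1"],
}
print(resolve(services, {"PE1", "PE2"}))
# Withdrawing the single next hop route for PE1 moves every dependent prefix.
print(resolve(services, {"PE2"}))
```

One withdrawal of the PE1 next hop moves the first two prefixes to PE2 and
removes the third, and that state persists until PE1 is advertised again.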

Some suggested to define a new extension in BGP to signal it even without
using double recursion - well one of them has been proposed in the past -
https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt At that
time the feedback received was that native BGP withdraws are fast enough so
no need to bother. Well, those native withdrawals are working today, and
some claim that specific implementations can withdraw RD:* when the PE
hosting such RDs fails and RDs are allocated in a unique per-VRF fashion.

Then we have the DROID proposal which again may look like overkill for this
very problem, but if you consider the bigger picture of what networks
control plane pub-sub signalling needs, it establishes the foundation for
such.

Many thanks,
Robert


On Tue, Jun 14, 2022 at 10:59 AM Van De Velde, Gunter (Nokia - BE/Antwerp) <
gunter.van_de_ve...@nokia.com> wrote:

> Hi All,
>
> When reading both proposals about PUA's:
> * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
> * draft-wang-lsr-prefix-unreachable-annoucement-09
>
> The identified problem space seems a correct observation, and indeed
> summaries hide remote area network instabilities. It is one of the
> perceived benefits of using summaries. The place in the network where this
> hiding takes the most impact upon convergence is at service nodes (PE's for
> L3/L2/transport) where due to the summarization its difficult to detect
> that the transport tunnel end-point suddenly becomes unreachable. My
> concern however is if it really is a problem that is worthy for LSR WG to
> solve.
>
> To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09" is not
> a preferred solution due to the expectation that all nodes in an area must
> be upgraded to support the IGP capability. From this operational
> perspective the draft "draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is
> more elegant, as only the A(S)BR's and particular PEs must be upgraded to
> support PUA's. I do have concerns about the number of PUA advertisements in
> hierarchically summarized networks (/24 (site) -> /20 (region) -> /16
> (core)). More specific, in the /16 backbone area, how many of these PUAs
> will be floating around creating LSP LSDB update churns? How to control the
> potentially exponential number of observed PUAs from floating everywhere?
> (will this lead to OSPF type NSSA areas where areas will be purged from
> these PUAs for scaling stability?)
>
> Long story short, should we not take a step back and re-think this
> identified problem space? Is the proposed solution space not more evil as
> the problem space? We do summarization because it brings stability and
> reduce the number of link state updates within an area. And now with PUA we
> re-introduce additional link state updates (PUAs), we blow up the LSDB with
> information opaque to SPF best-path calculation. In addition there is
> suggestion of new state-machinery to track the igp reachability of
> 'protected' prefixes and there is maybe desire to contain or filter updates
> cross inter-area boundaries. And finally, how will we represent and track
> PUA in the RTM?
>
> What is wrong with simply not doing summaries and forget about these PUAs
> to pinch holes in the summary prefixes? this worked very well during last
> two decennia. Are we not over-engineering with PUAs?
>
> G/
>


[Lsr] Protection between flex-algo topologies

2022-05-18 Thread Robert Raszuk
/* As this departs from ip-flexalgo topic adjusting the subject line */

Hi Les,

I am not so much focusing on fallback, just making sure I did not miss any
paragraph or draft which would already describe how to provide protection
other than on a per-topology basis.

Yes, fallbacks are tricky if you relax constraints. To put it in better
perspective fallbacks are easy for overlays. For underlays they may work
(as mentioned to Peter under unidirectional single protect constraint). But
"may work" perhaps is too weak.

Your suggestion to add fallback links to topologies with higher metric is
actually pretty cool. I did not think about it so having this little thread
seems already fruitful :)
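
To see why that works, here is a minimal sketch using a plain Dijkstra
shortest path computation. The four-node topology, the metric values and the
function name `shortest_path` are all assumptions made up for illustration;
nothing here is taken from the flex-algo drafts.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra over bidirectional links given as dict (a, b) -> metric."""
    graph = {}
    for (a, b), metric in links.items():
        graph.setdefault(a, []).append((b, metric))
        graph.setdefault(b, []).append((a, metric))
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, metric in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(heap, (cost + metric, neighbor, path + [neighbor]))
    return None  # destination not reachable in this topology


algo_x = {
    ("A", "B"): 10, ("B", "D"): 10,        # preferred Algo-X links
    ("A", "C"): 10000, ("C", "D"): 10000,  # fallback links with a high metric
}
print(shortest_path(algo_x, "A", "D"))

# Simulate failure of the preferred link B-D.
after_failure = {k: v for k, v in algo_x.items() if k != ("B", "D")}
print(shortest_path(after_failure, "A", "D"))
```

While the preferred links are up the path is A-B-D at cost 20; once B-D
fails the same topology yields A-C-D at cost 20000, with no switch to a
different algorithm needed.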

But that still requires you to run computations for your favourite LFA type
multiple times (topo by topo) even if those algorithms differ only in
some additional non forwarding related processing functions (a different
function on a per slice basis). Speaking of which it seems that the ability
to specify per flex-algo additional processing on packets could have valid
use cases. But forwarding wise the topologies can be identical.

In such cases it seems that it could be very useful to still recognize in
the control plane the uniqueness while the data plane would be common. It
looks to me like we have all pieces of the puzzle here and all that is
needed is to rearrange them a bit.

What do you think ?

Thx,
Robert


On Wed, May 18, 2022 at 9:39 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> It isn’t clear to me why you are focused on “fallback” as a solution here.
>
> If you are willing to allow traffic that prefers the “Algo-X topology” to
> use other paths in the event of link/node failures, it seems
> straightforward – using the new metrics being defined in
> https://datatracker.ietf.org/doc/draft-ietf-lsr-flex-algo-bw-con/ - to
> include other links in the Algo-X topology and apply a high cost to them so
> they won’t be used unless the more preferred links are  unavailable.
>
>
>
> I think you and I both have some experience with “fallback”. It is complex
> to implement – especially in the forwarding plane – and would not be my
> first choice as a solution.
>
>
>
> ??
>
>
>
> Les
>
>
>
>
>
> *From:* Lsr  *On Behalf Of * Robert Raszuk
> *Sent:* Tuesday, May 17, 2022 10:58 AM
> *To:* Peter Psenak (ppsenak) 
> *Cc:* lsr 
> *Subject:* Re: [Lsr] Publication has been requested for
> draft-ietf-lsr-ip-flexalgo-06
>
>
>
> Hi Peter,
>
>
>
> Enabling local protection on all nodes in all topologies may also not be
> the best thing to do (for various reasons).
>
>
>
> While I agree that general fallback may be fragile, how about limited
> fallback and only to one special "protection topology" which would have few
> constraints allowing us to do such fallback safely ?
>
>
>
> I guess for ip flex-algo which is a subject of this thread this would not
> be possible, but for SR flex-algo I think this may work pretty well
> allowing N:1 fast connectivity restoration.
>
>
>
> Thx,
>
> Robert
>
>
>
> On Tue, May 17, 2022 at 2:19 PM Peter Psenak  wrote:
>
> Robert,
>
> On 17/05/2022 14:14, Robert Raszuk wrote:
> > Ok cool - thx Peter !
> >
> > More general question - for any FlexAlgo model (incl. SR):
> >
> > Is fallback between topologies - say during failure of primary one -
> > only allowed on the ingress to the network ?
>
> no. Fallback between flex-algos has never been a requirement and is not
> part of the flex-algo specification.
>
> I consider it a dangerous thing to do. It may work under certain
> conditions, but may cause loops under different ones.
>
> thanks,
> Peter
>
>
> >
> > If so the repair must be setup on each topology, otherwise repair will
> > be long as it would need to wait for igp flooding and ingress switchover
> > trigger ?
> >
> > Obviously for IP flex algo it would be much much longer as given prefix
> > needs to be completely reflooded network wide and purged from original
> > topo. Ouch considering time to trigger such action.
> >
> > Many thanks,
> > R.
> >
> > On Tue, May 17, 2022, 13:35 Peter Psenak  > <mailto:ppse...@cisco.com>> wrote:
> >
> > Hi Robert,
> >
> >
> > On 17/05/2022 12:11, Robert Raszuk wrote:
> >  >
> >  > Actually I would like to further clarify if workaround 1 is even
> > doable ...
> >  >
> >  > It seems to me that the IP flexalgo paradigm does not have a way
> for
> >  > more granular then destination prefix forwarding.
> >
> > that is correct. In IP flex-algo 

Re: [Lsr] Publication has been requested for draft-ietf-lsr-ip-flexalgo-06

2022-05-18 Thread Robert Raszuk
Peter,

This was not my question ...

Section 10 of the soon-to-be-published RFC clearly states that *"IGP
restoration will be fast and additional protection mechanisms will not be
required." *

Those "additional mechanisms" are listed further like LFA, FRR with all its
flavors which of course can be enabled in each topology.

So if (as co-author) you make such a bold statement in the document I am
asking what makes networks where IGP Flex-Algo is used so good in terms of
*native* connectivity restoration that "additional protection will not be
required" ?

This is 2022 and while many folks are still locked into the archaic model
that for connectivity/service restoration you need to wait for protocol
convergence, I would actually observe that during failure you should first
repair the base topology and then worry about flex-algos.

- - -

See, one of the valid deployment models which often gets presented for
flex-algo is the ability to run some special function on each node of a
given topology yet make no changes whatsoever to path selection criteria.
With that, running 100s of compute cycles for LFA in each topologically
identical flex-algo seems like a huge waste. And that deployment model IMO
should get attention and be addressed in base flex-algo specs.
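
One way to picture the saving is to key the protection computation by the
topology itself rather than by the algorithm number. The sketch below is
purely illustrative: `compute_backup_paths` is a hypothetical stand-in for
an expensive LFA run, and the cache is an assumption about how an
implementation could share results, not something the flex-algo specs
describe.

```python
def topology_key(links):
    """Hashable fingerprint of a topology: its set of (a, b, metric) links."""
    return frozenset(links)


def compute_backup_paths(links):
    """Hypothetical placeholder for an expensive per-topology LFA run."""
    compute_backup_paths.calls += 1
    return {"links_considered": len(links)}


compute_backup_paths.calls = 0


def protect_all(flex_algos):
    """Compute protection once per distinct topology.

    flex_algos: dict algo_id -> iterable of (a, b, metric) tuples
    """
    cache = {}
    result = {}
    for algo_id, links in flex_algos.items():
        key = topology_key(links)
        if key not in cache:
            cache[key] = compute_backup_paths(key)
        result[algo_id] = cache[key]
    return result


base = [("A", "B", 10), ("B", "C", 10), ("A", "C", 30)]
algos = {128: base, 129: list(base), 130: base + [("C", "D", 10)]}
protect_all(algos)
print(compute_backup_paths.calls)
```

Algorithms 128 and 129 are topologically identical here, so the expensive
computation runs twice for three flex-algos instead of three times.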

Cheers,
Robert


On Wed, May 18, 2022 at 11:46 AM Peter Psenak  wrote:

> Robert,
>
> On 18/05/2022 10:53, Robert Raszuk wrote:
> > Peter,
> >
> > It is not about someone thinking if this is a good idea or not. It is
> > about practical aspects of real deployments.
> >
> > But ok section 10 of the subject draft says something pretty interesting:
> >
> > /10.  Protection
> >
> > In many networks where IGP Flexible Algorithms are deployed, IGP
> > restoration will be fast and additional protection mechanisms will
> > not be required.
> > /
> >
> > *Question:* What makes networks with IGP flex-algo running any better
> > than networks without it in terms of protection needed or not ?
>
> the protection is provided within the same algo, not between them. And
> one can use all existing LFA mechanisms to do so.
>
> thanks,
> Peter
>
>
> >
> > Sure when applicable ECMP can be used to locally protect the traffic.
> > But when you need to run flex-algo for mobile slicing requirements (as
> > discussed in section 3) the load on control plane CPUs and data plane
> > FIBs may become significant (especially when we are talking about lots
> > of "slices").
> >
> > Thx,
> > R.
> >
> > On Wed, May 18, 2022 at 9:45 AM Peter Psenak wrote:
> >
> > Robert,
> >
> > I really do not want to get into fallback between algorithms. If
> > someone
> > really thinks it is a good idea, he can write a separate document and
> > describe the use case and how to do that safely. But please not in
> the
> > base flex-algo specification.
> >
> > thanks,
> > Peter
> >
> >
> >
> > On 17/05/2022 19:58, Robert Raszuk wrote:
> >  > Hi Peter,
> >  >
> >  > Enabling local protection on all nodes in all topologies may also
> > not be
> >  > the best thing to do (for various reasons).
> >  >
> >  > While I agree that general fallback may be fragile, how about
> > limited
> >  > fallback and only to one special "protection topology" which
> > would have
> >  > few constraints allowing us to do such fallback safely ?
> >  >
> >  > I guess for ip flex-algo which is a subject of this thread this
> > would
> >  > not be possible, but for SR flex-algo I think this may work
> > pretty well
> >  > allowing N:1 fast connectivity restoration.
> >  >
> >  > Thx,
> >  > Robert
> >  >
> >  > On Tue, May 17, 2022 at 2:19 PM Peter Psenak  > <mailto:ppse...@cisco.com>
> >  > <mailto:ppse...@cisco.com <mailto:ppse...@cisco.com>>> wrote:
> >  >
> >  > Robert,
> >  >
> >  > On 17/05/2022 14:14, Robert Raszuk wrote:
> >  >  > Ok cool - thx Peter !
> >  >  >
> >  >  > More general question - for any FlexAlgo model (incl. SR):
> >  >  >
> >  >  > Is fallback between topologies - say during failure of
> > primary o

Re: [Lsr] Publication has been requested for draft-ietf-lsr-ip-flexalgo-06

2022-05-18 Thread Robert Raszuk
Missed it - sorry:

s/ control plane CPUs and data plane FIBs / control plane CPUs and data
plane FIBs with LFA or R-LFA enabled per topo/



On Wed, May 18, 2022 at 10:53 AM Robert Raszuk  wrote:

> Peter,
>
> It is not about someone thinking if this is a good idea or not. It is
> about practical aspects of real deployments.
>
> But ok section 10 of the subject draft says something pretty interesting:
>
> *10.  Protection*
>
> *In many networks where IGP Flexible Algorithms are deployed, IGP
> restoration will be fast and additional protection mechanisms will
> not be required.*
>
> *Question:* What makes networks with IGP flex-algo running any better
> than networks without it in terms of protection needed or not ?
>
> Sure when applicable ECMP can be used to locally protect the traffic. But
> when you need to run flex-algo for mobile slicing requirements (as
> discussed in section 3) the load on control plane CPUs and data plane FIBs
> may become significant (especially when we are talking about lots of
> "slices").
>
> Thx,
> R.
>
> On Wed, May 18, 2022 at 9:45 AM Peter Psenak  wrote:
>
>> Robert,
>>
>> I really do not want to get into fallback between algorithms. If someone
>> really thinks it is a good idea, he can write a separate document and
>> describe the use case and how to do that safely. But please not in the
>> base flex-algo specification.
>>
>> thanks,
>> Peter
>>
>>
>>
>> On 17/05/2022 19:58, Robert Raszuk wrote:
>> > Hi Peter,
>> >
>> > Enabling local protection on all nodes in all topologies may also not
>> be
>> > the best thing to do (for various reasons).
>> >
>> > While I agree that general fallback may be fragile, how about limited
>> > fallback and only to one special "protection topology" which would have
>> > few constraints allowing us to do such fallback safely ?
>> >
>> > I guess for ip flex-algo which is a subject of this thread this would
>> > not be possible, but for SR flex-algo I think this may work pretty well
>> > allowing N:1 fast connectivity restoration.
>> >
>> > Thx,
>> > Robert
>> >
>> > On Tue, May 17, 2022 at 2:19 PM Peter Psenak > > <mailto:ppse...@cisco.com>> wrote:
>> >
>> > Robert,
>> >
>> > On 17/05/2022 14:14, Robert Raszuk wrote:
>> >  > Ok cool - thx Peter !
>> >  >
>> >  > More general question - for any FlexAlgo model (incl. SR):
>> >  >
>> >  > Is fallback between topologies - say during failure of primary
>> one -
>> >  > only allowed on the ingress to the network ?
>> >
>> > no. Fallback between flex-algos has never been a requirement and is
>> not
>> > part of the flex-algo specification.
>> >
>> > I consider it a dangerous thing to do. It may work under certain
>> > conditions, but may cause loops under different ones.
>> >
>> > thanks,
>> > Peter
>> >
>> >
>> >  >
>> >  > If so the repair must be setup on each topology, otherwise repair
>> > will
>> >  > be long as it would need to wait for igp flooding and ingress
>> > switchover
>> >  > trigger ?
>> >  >
>> >  > Obviously for IP flex algo it would be much much longer as given
>> > prefix
>> >  > needs to be completely reflooded network wide and purged from
>> > original
>> >  > topo. Ouch considering time to trigger such action.
>> >  >
>> >  > Many thanks,
>> >  > R.
>> >  >
>> >  > On Tue, May 17, 2022, 13:35 Peter Psenak > > <mailto:ppse...@cisco.com>
>> >  > <mailto:ppse...@cisco.com <mailto:ppse...@cisco.com>>> wrote:
>> >  >
>> >  > Hi Robert,
>> >  >
>> >  >
>> >  > On 17/05/2022 12:11, Robert Raszuk wrote:
>> >  >  >
>> >  >  > Actually I would like to further clarify if workaround 1
>> > is even
>> >  > doable ...
>> >  >  >
>> >  >  > It seems to me that the IP flexalgo paradigm does not have
>> > a way for
>> >  >  > more granular then destination pre

Re: [Lsr] Publication has been requested for draft-ietf-lsr-ip-flexalgo-06

2022-05-18 Thread Robert Raszuk
Peter,

It is not about someone thinking if this is a good idea or not. It is
about practical aspects of real deployments.

But ok section 10 of the subject draft says something pretty interesting:

*10.  Protection*

*In many networks where IGP Flexible Algorithms are deployed, IGP
restoration will be fast and additional protection mechanisms will
not be required.*

*Question:* What makes networks with IGP flex-algo running any better than
networks without it in terms of protection needed or not ?

Sure when applicable ECMP can be used to locally protect the traffic. But
when you need to run flex-algo for mobile slicing requirements (as
discussed in section 3) the load on control plane CPUs and data plane FIBs
may become significant (especially when we are talking about lots of
"slices").

Thx,
R.

On Wed, May 18, 2022 at 9:45 AM Peter Psenak  wrote:

> Robert,
>
> I really do not want to get into fallback between algorithms. If someone
> really thinks it is a good idea, he can write a separate document and
> describe the use case and how to do that safely. But please not in the
> base flex-algo specification.
>
> thanks,
> Peter
>
>
>
> On 17/05/2022 19:58, Robert Raszuk wrote:
> > Hi Peter,
> >
> > Enabling local protection on all nodes in all topologies may also not be
> > the best thing to do (for various reasons).
> >
> > While I agree that general fallback may be fragile, how about limited
> > fallback and only to one special "protection topology" which would have
> > few constraints allowing us to do such fallback safely ?
> >
> > I guess for ip flex-algo which is a subject of this thread this would
> > not be possible, but for SR flex-algo I think this may work pretty well
> > allowing N:1 fast connectivity restoration.
> >
> > Thx,
> > Robert
> >
> > On Tue, May 17, 2022 at 2:19 PM Peter Psenak  > <mailto:ppse...@cisco.com>> wrote:
> >
> > Robert,
> >
> > On 17/05/2022 14:14, Robert Raszuk wrote:
> >  > Ok cool - thx Peter !
> >  >
> >  > More general question - for any FlexAlgo model (incl. SR):
> >  >
> >  > Is fallback between topologies - say during failure of primary
> one -
> >  > only allowed on the ingress to the network ?
> >
> > no. Fallback between flex-algos has never been a requirement and is
> not
> > part of the flex-algo specification.
> >
> > I consider it a dangerous thing to do. It may work under certain
> > conditions, but may cause loops under different ones.
> >
> > thanks,
> > Peter
> >
> >
> >  >
> >  > If so the repair must be setup on each topology, otherwise repair
> > will
> >  > be long as it would need to wait for igp flooding and ingress
> > switchover
> >  > trigger ?
> >  >
> >  > Obviously for IP flex algo it would be much much longer as given
> > prefix
> >  > needs to be completely reflooded network wide and purged from
> > original
> >  > topo. Ouch considering time to trigger such action.
> >  >
> >  > Many thanks,
> >  > R.
> >  >
> >  > On Tue, May 17, 2022, 13:35 Peter Psenak  > <mailto:ppse...@cisco.com>
> >  > <mailto:ppse...@cisco.com <mailto:ppse...@cisco.com>>> wrote:
> >  >
> >  > Hi Robert,
> >  >
> >  >
> >  > On 17/05/2022 12:11, Robert Raszuk wrote:
> >  >  >
> >  >  > Actually I would like to further clarify if workaround 1
> > is even
> >  > doable ...
> >  >  >
> >  >  > It seems to me that the IP flexalgo paradigm does not have
> > a way for
> >  >  > more granular then destination prefix forwarding.
> >  >
> >  > that is correct. In IP flex-algo the prefix itself is bound
> > to the
> >  > algorithm.
> >  >
> >  >  >
> >  >  > So if I have http traffic vs streaming vs voice going to
> > the same
> >  > load
> >  >  > balancer (same dst IP address) there seems to be no way to
> > map some
> >  >  > traffic (based on say port number) to take specific
> topology.
> >  >
> >  > no, you can not do that with IP flex-algo.
> >  >
> >  >
> >

Re: [Lsr] Publication has been requested for draft-ietf-lsr-ip-flexalgo-06

2022-05-17 Thread Robert Raszuk
Hi Peter,

Enabling local protection on all nodes in all topologies may also not be
the best thing to do (for various reasons).

While I agree that general fallback may be fragile, how about limited
fallback and only to one special "protection topology" which would have few
constraints allowing us to do such fallback safely ?

I guess for ip flex-algo which is a subject of this thread this would not
be possible, but for SR flex-algo I think this may work pretty well
allowing N:1 fast connectivity restoration.

Thx,
Robert

On Tue, May 17, 2022 at 2:19 PM Peter Psenak  wrote:

> Robert,
>
> On 17/05/2022 14:14, Robert Raszuk wrote:
> > Ok cool - thx Peter !
> >
> > More general question - for any FlexAlgo model (incl. SR):
> >
> > Is fallback between topologies - say during failure of primary one -
> > only allowed on the ingress to the network ?
>
> no. Fallback between flex-algos has never been a requirement and is not
> part of the flex-algo specification.
>
> I consider it a dangerous thing to do. It may work under certain
> conditions, but may cause loops under different ones.
>
> thanks,
> Peter
>
>
> >
> > If so the repair must be setup on each topology, otherwise repair will
> > be long as it would need to wait for igp flooding and ingress switchover
> > trigger ?
> >
> > Obviously for IP flex algo it would be much much longer as given prefix
> > needs to be completely reflooded network wide and purged from original
> > topo. Ouch considering time to trigger such action.
> >
> > Many thanks,
> > R.
> >
> > On Tue, May 17, 2022, 13:35 Peter Psenak wrote:
> >
> > Hi Robert,
> >
> >
> > On 17/05/2022 12:11, Robert Raszuk wrote:
> >  >
> >  > Actually I would like to further clarify if workaround 1 is even
> > doable ...
> >  >
> >  > It seems to me that the IP flexalgo paradigm does not have a way
> for
> >  > more granular then destination prefix forwarding.
> >
> > that is correct. In IP flex-algo the prefix itself is bound to the
> > algorithm.
> >
> >  >
> >  > So if I have http traffic vs streaming vs voice going to the same
> > load
> >  > balancer (same dst IP address) there seems to be no way to map
> some
> >  > traffic (based on say port number) to take specific topology.
> >
> > no, you can not do that with IP flex-algo.
> >
> >
> >  >
> >  > That's pretty coarse and frankly very limiting for applicability
> > of IP
> >  > flex-algo. If I am correct the draft should be very
> > explicit about this
> >  > before publication.
> >
> > please look at the latest version of the draft, section 3:
> >
> >
> >
> https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-3
> > <
> https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-3
> >
> >
> > thanks,
> > Peter
> >
> >  >
> >  > Kind regards
> >  > R.
> >  >
> >  > On Tue, May 17, 2022 at 12:01 PM Robert Raszuk  > <mailto:rob...@raszuk.net>
> >  > <mailto:rob...@raszuk.net <mailto:rob...@raszuk.net>>> wrote:
> >  >
> >  > Folks,
> >  >
> >  > A bit related to Aijun's point but I have question to
> > the text from
> >  > the draft he quoted:
> >  >
> >  > In cases where a prefix advertisement is received in both
> > a IPv4
> >  > Prefix Reachability TLV and an IPv4 Algorithm Prefix
> > Reachability
> >  > TLV, the IPv4 Prefix Reachability advertisement MUST be
> > preferred
> >  > when installing entries in the forwarding plane.
> >  >
> >  > Does this really mean that I can not for a given prefix say
> > /24 use
> >  > default topology for best effort traffic and new flex-algo
> > topology
> >  > for specific application ?
> >  >
> >  > Is the "workaround 1" to always build two new topologies for
> such
> >  > /24 prefix (one following base topo and one new) and never
> > advertise
> >  > it in base topology ?
> >  >
> >  > Is the "workaround 2" to forget about native forwarding and
>

Re: [Lsr] Publication has been requested for draft-ietf-lsr-ip-flexalgo-06

2022-05-17 Thread Robert Raszuk
Ok cool - thx Peter !

More general question - for any FlexAlgo model (incl. SR):

Is fallback between topologies - say during failure of primary one - only
allowed on the ingress to the network ?

If so the repair must be setup on each topology, otherwise repair will be
long as it would need to wait for igp flooding and ingress switchover
trigger ?

Obviously for IP flex algo it would be much much longer as given prefix
needs to be completely reflooded network wide and purged from original
topo. Ouch considering time to trigger such action.

Many thanks,
R.

On Tue, May 17, 2022, 13:35 Peter Psenak  wrote:

> Hi Robert,
>
>
> On 17/05/2022 12:11, Robert Raszuk wrote:
> >
> > Actually I would like to further clarify if workaround 1 is even doable
> ...
> >
> > It seems to me that the IP flexalgo paradigm does not have a way for
> > more granular then destination prefix forwarding.
>
> that is correct. In IP flex-algo the prefix itself is bound to the
> algorithm.
>
> >
> > So if I have http traffic vs streaming vs voice going to the same load
> > balancer (same dst IP address) there seems to be no way to map some
> > traffic (based on say port number) to take specific topology.
>
> no, you can not do that with IP flex-algo.
>
>
> >
> > That's pretty coarse and frankly very limiting for applicability of IP
> > flex-algo. If I am correct the draft should be very explicit about this
> > before publication.
>
> please look at the latest version of the draft, section 3:
>
>
> https://datatracker.ietf.org/doc/html/draft-ietf-lsr-ip-flexalgo#section-3
>
> thanks,
> Peter
>
> >
> > Kind regards
> > R.
> >
> > On Tue, May 17, 2022 at 12:01 PM Robert Raszuk wrote:
> >
> > Folks,
> >
> > A bit related to Aijun's point but I have question to the text from
> > the draft he quoted:
> >
> > In cases where a prefix advertisement is received in both a IPv4
> > Prefix Reachability TLV and an IPv4 Algorithm Prefix Reachability
> > TLV, the IPv4 Prefix Reachability advertisement MUST be preferred
> > when installing entries in the forwarding plane.
> >
> > Does this really mean that I can not for a given prefix say /24 use
> > default topology for best effort traffic and new flex-algo topology
> > for specific application ?
> >
> > Is the "workaround 1" to always build two new topologies for such
> > /24 prefix (one following base topo and one new) and never advertise
> > it in base topology ?
> >
> > Is the "workaround 2" to forget about native forwarding and use for
> > example SR and mark the packets such that SID pool corresponding to
> > base topology forwarding will be separate from SID pool
> > corresponding to new flex-algo topology ?
> >
> > Many thx,
> > Robert
> >
> >
> > -- Forwarded message -
> > From: *Acee Lindem via Datatracker*  > <mailto:nore...@ietf.org>>
> > Date: Mon, May 16, 2022 at 3:36 PM
> > Subject: [Lsr] Publication has been requested for
> > draft-ietf-lsr-ip-flexalgo-06
> > To: mailto:j...@juniper.net>>
> > Cc: mailto:a...@cisco.com>>,
> > mailto:iesg-secret...@ietf.org>>,
> > mailto:lsr-cha...@ietf.org>>,  > <mailto:lsr@ietf.org>>
> >
> >
> > Acee Lindem has requested publication of
> > draft-ietf-lsr-ip-flexalgo-06 as Proposed Standard on behalf of the
> > LSR working group.
> >
> > Please verify the document's state at
> > https://datatracker.ietf.org/doc/draft-ietf-lsr-ip-flexalgo/
> > <https://datatracker.ietf.org/doc/draft-ietf-lsr-ip-flexalgo/>
> >
> >
> > ___
> > Lsr mailing list
> > Lsr@ietf.org <mailto:Lsr@ietf.org>
> > https://www.ietf.org/mailman/listinfo/lsr
> > <https://www.ietf.org/mailman/listinfo/lsr>
> >
>
>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] Publication has been requested for draft-ietf-lsr-ip-flexalgo-06

2022-05-17 Thread Robert Raszuk
Actually I would like to further clarify whether workaround 1 is even
doable ...

It seems to me that the IP flex-algo paradigm does not have a way to forward
at anything more granular than the destination prefix.

So if I have http traffic vs streaming vs voice going to the same load
balancer (same dst IP address) there seems to be no way to map some
traffic (based on, say, port number) onto a specific topology.

That's pretty coarse and frankly very limiting for the applicability of IP
flex-algo. If I am correct, the draft should be very explicit about this
before publication.
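
A minimal Python sketch of that limitation, under stated assumptions: the
`PREFIX_TO_ALGO` table, the prefixes in it and the `algo_for_packet` function
are hypothetical and only model the idea that a prefix is bound to exactly
one algorithm. The lookup is a longest-prefix match on the destination
address, and the port number plays no role in it.

```python
import ipaddress

# Hypothetical binding: each destination prefix belongs to exactly one algo
# (0 stands for the default topology here).
PREFIX_TO_ALGO = {
    ipaddress.ip_network("203.0.113.0/24"): 128,
    ipaddress.ip_network("0.0.0.0/0"): 0,
}


def algo_for_packet(dst_ip: str, dst_port: int) -> int:
    """Return the algo used to forward a packet.

    Longest-prefix match on the destination address only; `dst_port` is
    accepted but deliberately unused, which is the whole point.
    """
    addr = ipaddress.ip_address(dst_ip)
    matches = [net for net in PREFIX_TO_ALGO if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return PREFIX_TO_ALGO[best]


# http (80) and SIP (5060) towards the same load balancer address.
print(algo_for_packet("203.0.113.10", 80), algo_for_packet("203.0.113.10", 5060))
```

Both calls resolve to the same algo because the destination address is the
only key, so traffic classes that share a destination cannot be split across
topologies in this model.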

Kind regards
R.

On Tue, May 17, 2022 at 12:01 PM Robert Raszuk  wrote:

> Folks,
>
> A bit related to Aijun's point but I have question to the text from the
> draft he quoted:
>
>In cases where a prefix advertisement is received in both a IPv4
>Prefix Reachability TLV and an IPv4 Algorithm Prefix Reachability
>TLV, the IPv4 Prefix Reachability advertisement MUST be preferred
>when installing entries in the forwarding plane.
>
> Does this really mean that I can not for a given prefix say /24 use
> default topology for best effort traffic and new flex-algo topology for
> specific application ?
>
> Is the "workaround 1" to always build two new topologies for such /24
> prefix (one following base topo and one new) and never advertise it in base
> topology ?
>
> Is the "workaround 2" to forget about native forwarding and use for
> example SR and mark the packets such that SID pool corresponding to base
> topology forwarding will be separate from SID pool corresponding to new
> flex-algo topology ?
>
> Many thx,
> Robert
>
>
> -- Forwarded message -
> From: Acee Lindem via Datatracker 
> Date: Mon, May 16, 2022 at 3:36 PM
> Subject: [Lsr] Publication has been requested for
> draft-ietf-lsr-ip-flexalgo-06
> To: 
> Cc: , , , <
> lsr@ietf.org>
>
>
> Acee Lindem has requested publication of draft-ietf-lsr-ip-flexalgo-06 as
> Proposed Standard on behalf of the LSR working group.
>
> Please verify the document's state at
> https://datatracker.ietf.org/doc/draft-ietf-lsr-ip-flexalgo/
>
>
> ___
> Lsr mailing list
> Lsr@ietf.org
> https://www.ietf.org/mailman/listinfo/lsr
>


[Lsr] Fwd: Publication has been requested for draft-ietf-lsr-ip-flexalgo-06

2022-05-17 Thread Robert Raszuk
Folks,

A bit related to Aijun's point, but I have a question about the text from
the draft he quoted:

   In cases where a prefix advertisement is received in both a IPv4
   Prefix Reachability TLV and an IPv4 Algorithm Prefix Reachability
   TLV, the IPv4 Prefix Reachability advertisement MUST be preferred
   when installing entries in the forwarding plane.

Does this really mean that for a given prefix, say a /24, I can not use the
default topology for best effort traffic and a new flex-algo topology for a
specific application ?
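
The preference rule in the quoted text can be sketched in Python as follows.
This is only an illustration under stated assumptions: the `Advertisement`
class and the `select_for_fib` function are hypothetical names, an
advertisement received in the plain IPv4 Prefix Reachability TLV is modelled
as algo 0, and conflicts between two different flex-algos for the same
prefix are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    prefix: str    # e.g. "192.0.2.0/24"
    algo: int      # 0 = plain Prefix Reachability TLV, otherwise a flex-algo
    next_hop: str  # hypothetical next-hop label


def select_for_fib(advs: list[Advertisement]) -> dict[str, Advertisement]:
    """Pick one advertisement per prefix for the forwarding plane.

    If a prefix is advertised both in the default topology (algo 0) and in a
    flex-algo, the algo 0 advertisement is the one installed.
    """
    fib: dict[str, Advertisement] = {}
    for adv in advs:
        current = fib.get(adv.prefix)
        if current is None or (adv.algo == 0 and current.algo != 0):
            fib[adv.prefix] = adv
    return fib


# Hypothetical input: the same /24 advertised in algo 128 and in algo 0.
ADVS = [
    Advertisement("192.0.2.0/24", 128, "P1"),
    Advertisement("192.0.2.0/24", 0, "P2"),
    Advertisement("198.51.100.0/24", 129, "P3"),
]
fib = select_for_fib(ADVS)
print(fib["192.0.2.0/24"].next_hop)
```

Under this reading, a /24 advertised both ways always ends up forwarded on
the default topology, which is exactly the limitation being asked about.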

Is "workaround 1" to always build two new topologies for such a /24
prefix (one following the base topo and one new) and never advertise it in
the base topology ?

Is "workaround 2" to forget about native forwarding and use, for example,
SR, marking the packets such that the SID pool corresponding to base
topology forwarding is separate from the SID pool corresponding to the new
flex-algo topology ?

Many thx,
Robert


---------- Forwarded message ---------
From: Acee Lindem via Datatracker 
Date: Mon, May 16, 2022 at 3:36 PM
Subject: [Lsr] Publication has been requested for
draft-ietf-lsr-ip-flexalgo-06
To: 
Cc: , , , <
lsr@ietf.org>


Acee Lindem has requested publication of draft-ietf-lsr-ip-flexalgo-06 as
Proposed Standard on behalf of the LSR working group.

Please verify the document's state at
https://datatracker.ietf.org/doc/draft-ietf-lsr-ip-flexalgo/


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] Working Group Last Call for draft-ietf-lsr-ip-flexalgo-04 - "IGP Flexible Algorithms (Flex-Algorithm) In IP Networks"

2022-04-14 Thread Robert Raszuk
Hi John,

> In the IETF context I have always associated ‘data plane’ with packet
> forwarding,

No one disputes that.

But the fact that various sub-data-planes are built on top of base physical
data planes needs to be clearly distinguished.

Kind regards,
R.


Re: [Lsr] Working Group Last Call for draft-ietf-lsr-ip-flexalgo-04 - "IGP Flexible Algorithms (Flex-Algorithm) In IP Networks"

2022-04-14 Thread Robert Raszuk
Hey Peter,

It seems that we have a double layer of confusion here.

The first layer is what you are referring to as applications in the context
of ASLA:

RSVP-TE
FRR (ex: LFA)
Flex-Algo

My suggestion about custom/logical topology/dataplane was in reference to
the above.

Then apparently we have a further, nested confusion: I am afraid the items
in the list you mentioned are also being called applications:

SR-MPLS
SRv6
IPv4
IPv6
BIER
etc ..

To me those are forwarding behaviours/paradigms.

Even further confusion is caused by the fact that, from a packet forwarding
perspective, RSVP-TE is comparable to SR-MPLS or SRv6.

And irrespective of how you/we choose to call all of the above, please just
do not use the term application, as application is well understood to mean
user applications, which can be http traffic, financial tickers, voip, iptv
etc ...

Cheers,
R.



On Thu, Apr 14, 2022 at 11:29 AM Peter Psenak  wrote:

> Hi Robert,
>
> On 14/04/2022 11:21, Robert Raszuk wrote:
> > Hi Peter,
> >
> > Term "data-plane" usually means physical resources links, switch fabric,
> > ASIC etc ... so I am afraid it will also generate confusion with default
> > data plane.
> >
> > How about two alternatives:
> >
> > - custom/logical topology
> > - logical-data-plane
>
> flex-algo has been defined so far for:
>
> - SR-MPLS
> - SRv6
> - IP
> - BIER
>
> Would you call them "custom/logical topology" or "logical-data-plane"?
> I would not.
>
> thanks,
> Peter
>
>
> >
> > Thx,
> > Robert.
> >
> >
> >
> >
> >
> >
> > On Thu, Apr 14, 2022 at 9:27 AM Peter Psenak
> > mailto:40cisco@dmarc.ietf.org>>
>
> > wrote:
> >
> > Hi Ketan,
> >
> > On 13/04/2022 15:56, Ketan Talaulikar wrote:
> >  > Hi Peter,
> >  >
> >  > I would still reiterate the need to clarify the usage of
> > "application"
> >  > terminology in the base FlexAlgo spec. We don't need to call it
> >  > "data-plane", I was suggesting "forwarding mechanism" or it can be
> >  > something else as well.
> >
> > I will replace with data-plane. That's the best from what we have.
> >
> > thanks,
> > Peter
> >
> >
> >
> >  >
> >  > Just my 2c
> >  >
> >  > Thanks,
> >  > Ketan
> >  >
> >  >
> >  > On Wed, Apr 13, 2022 at 2:35 PM Peter Psenak  > <mailto:ppse...@cisco.com>
> >  > <mailto:ppse...@cisco.com <mailto:ppse...@cisco.com>>> wrote:
> >  >
> >  > Hi Ketan,
> >  >
> >  > please see inline (##PP4):
> >  >
> >  >
> >  > On 13/04/2022 10:52, Ketan Talaulikar wrote:
> >  >  > Hi Peter,
> >  >  >
> >  >  > I will not press this point further if I am the only one
> that
> >  > finds this
> >  >  > complexity without any benefit. :-)
> >  >  >
> >  >  > Please check inline below for some clarifications with KT3.
> >  >  >
> >  >  >
> >  >  > On Wed, Apr 13, 2022 at 12:47 PM Peter Psenak
> > mailto:ppse...@cisco.com>
> >  > <mailto:ppse...@cisco.com <mailto:ppse...@cisco.com>>
> >  >  > <mailto:ppse...@cisco.com <mailto:ppse...@cisco.com>
> > <mailto:ppse...@cisco.com <mailto:ppse...@cisco.com>>>> wrote:
> >  >  >
> >  >  > Hi Ketan,
> >  >  >
> >  >  >
> >  >  > please see inline (##PP3):
> >  >  >
> >  >  > On 13/04/2022 06:00, Ketan Talaulikar wrote:
> >  >  >  > Hi Peter,
> >  >  >  >
> >  >  >  > Please check inline below with KT2. I am trimming
> > everything
> >  >  > other than
> >  >  >  > the one point of continuing debate.
> >  >  >  >
> >  >  >  >  >  >
> >  >  >  >  >  > 2) The relationship between the algo
> > usage
> >  > for IP
> >  >  > FlexAlgo
> >  >  >  > and other
> >  >  >  >  >  > d

Re: [Lsr] Working Group Last Call for draft-ietf-lsr-ip-flexalgo-04 - "IGP Flexible Algorithms (Flex-Algorithm) In IP Networks"

2022-04-14 Thread Robert Raszuk
Hi Peter,

The term "data-plane" usually means physical resources: links, switch
fabric, ASIC etc ... so I am afraid it will also generate confusion with the
default data plane.

How about two alternatives:

- custom/logical topology
- logical-data-plane

Thx,
Robert.

On Thu, Apr 14, 2022 at 9:27 AM Peter Psenak  wrote:

> Hi Ketan,
>
> On 13/04/2022 15:56, Ketan Talaulikar wrote:
> > Hi Peter,
> >
> > I would still reiterate the need to clarify the usage of "application"
> > terminology in the base FlexAlgo spec. We don't need to call it
> > "data-plane", I was suggesting "forwarding mechanism" or it can be
> > something else as well.
>
> I will replace with data-plane. That's the best from what we have.
>
> thanks,
> Peter
>
>
>
> >
> > Just my 2c
> >
> > Thanks,
> > Ketan
> >
> >
> > On Wed, Apr 13, 2022 at 2:35 PM Peter Psenak wrote:
> >
> > Hi Ketan,
> >
> > please see inline (##PP4):
> >
> >
> > On 13/04/2022 10:52, Ketan Talaulikar wrote:
> >  > Hi Peter,
> >  >
> >  > I will not press this point further if I am the only one that
> > finds this
> >  > complexity without any benefit. :-)
> >  >
> >  > Please check inline below for some clarifications with KT3.
> >  >
> >  >
> >  > On Wed, Apr 13, 2022 at 12:47 PM Peter Psenak wrote:
> >  >
> >  > Hi Ketan,
> >  >
> >  >
> >  > please see inline (##PP3):
> >  >
> >  > On 13/04/2022 06:00, Ketan Talaulikar wrote:
> >  >  > Hi Peter,
> >  >  >
> >  >  > Please check inline below with KT2. I am trimming
> everything
> >  > other than
> >  >  > the one point of continuing debate.
> >  >  >
> >  >  >  >  >
> >  >  >  >  > 2) The relationship between the algo usage
> > for IP
> >  > FlexAlgo
> >  >  > and other
> >  >  >  >  > data planes (e.g. FlexAlgo with SR) is not
> > very clear.
> >  >  > There arise
> >  >  >  >  > complications when the algo usage for IP
> > FlexAlgo
> >  > overlap
> >  >  > with other
> >  >  >  >  > (say SR) data planes since the FAD is shared
> but
> >  > the node
> >  >  >  > participation
> >  >  >  >  > is not shared. While Sec 9 suggests that we
> > can work
> >  >  > through these
> >  >  >  >  > complications, I question the need for such
> > complexity.
> >  >  > The FlexAlgo
> >  >  >  >  > space is large enough to allow it to be
> > shared between
> >  >  > various data
> >  >  >  >  > planes without overlap. My suggestion would
> > be to
> >  > neither
> >  >  > carve out
> >  >  >  >  > parallel algo spaces within IGPs for various
> > types of
> >  >  > FlexAlgo data
> >  >  >  >  > planes nor allow the same algo to be used by
> > both
> >  > IP and
> >  >  > SR data
> >  >  >  > planes.
> >  >  >  >  > So that we have a single topology
> computation in
> >  > the IGP
> >  >  > for a given
> >  >  >  >  > algo based on its FAD and data plane
> > participation and
> >  >  > then when it
> >  >  >  >  > comes to prefix calculation, the results
> > could involve
> >  >  >  > programming of
> >  >  >  >  > entries in respective forwarding planes
> > based on the
> >  >  > signaling of
> >  >  >  > the
> >  >  >  >  > respective prefix reachabilities. The
> > coverage of these
> >  >  > aspects in a
> >  >  >  >  > dedicated section upfront will help.
> >  >  >  >
> >  >  >  > ##PP
> >  >  >  > I strongly disagree.
> >  >  >  >
> >  >  >  > FAD is data-plane/app independent. Participation
> is
> >  > data-plane/app
> >  >  >  > dependent. Base flex-algo specification is very
> > clear
> >  > about
> >  >  > that. That
> >  >  >  > has advantages and we do not want to modify
> > that part.
> >  >  >  >
> >  >  >  >
> >  >  >  > KT> No issue with this part.
> >  >  >  >
> >  >  >  >
> >  >  >  > Topology calculation for algo/data-plane needs
> > to take
> >  > both
> >  >  > FAD and
> >  >  >  > participation into account. You need independent
> >  > calculation
> >  >  > for each
> >  >  >  > 

Re: [Lsr] Working Group Last Call for draft-ietf-lsr-ip-flexalgo-04 - "IGP Flexible Algorithms (Flex-Algorithm) In IP Networks"

2022-04-13 Thread Robert Raszuk
Hi Ketan,

> KT2> I see the primary use case for IP FlexAlgo (or another data plane)
> > to be that the data plane is used by itself. In the (rare?) case where
> > multiple data planes are required to coexist, it is simpler both from
> > implementation and deployment POV to use different algos. It would be
> > good to have operator inputs here. The only cost that I see for this is
> > that the same FAD may get advertised twice only in the case where it is
> > identical for multiple data planes. So I am still not seeing the benefit
> > of enabling multiple (i.e. per data plane) computations for a single
> > algo rather than just keeping it a single computation per algo where a
> > single data plane is associated with a specific algo.
>
> ##PP3
> I really do not see the problem. As you stated above repeating the same
> FAD for multiple algos would be inefficient. The beauty of FAD is that
> it is app independent and can be used by many of them.
>


As I had the very same doubts as you, I think the advantage here is that
even for the same FAD you can have different link attribute/metric values
advertised on a per "app" basis. Hence you may effectively get different
topologies on a per "application" basis while still using the same algorithm.
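
A small Python sketch of that effect, under stated assumptions: the four-node
graph, the `app1`/`app2` keys and the metric values are all hypothetical, and
plain Dijkstra stands in for whatever calculation type the FAD actually
specifies. The same computation runs twice over the same links, once per set
of per-"app" metric values, and the resulting best path differs.

```python
import heapq


def shortest_path(links, metric_key, src, dst):
    """Dijkstra over `links`, using the per-app metric named `metric_key`.

    `links` is a list of (node_a, node_b, {app_name: metric}) tuples, treated
    as bidirectional. Returns (cost, path) or None if dst is unreachable.
    """
    graph = {}
    for a, b, metrics in links:
        graph.setdefault(a, []).append((b, metrics[metric_key]))
        graph.setdefault(b, []).append((a, metrics[metric_key]))
    heap = [(0, src, [src])]
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, metric in graph.get(node, []):
            if nbr not in seen:
                heapq.heappush(heap, (cost + metric, nbr, path + [nbr]))
    return None


# Hypothetical topology: same links, different metric values per "app".
LINKS = [
    ("A", "B", {"app1": 10, "app2": 10}),
    ("B", "D", {"app1": 10, "app2": 50}),
    ("A", "C", {"app1": 20, "app2": 10}),
    ("C", "D", {"app1": 20, "app2": 10}),
]
print(shortest_path(LINKS, "app1", "A", "D"))
print(shortest_path(LINKS, "app2", "A", "D"))
```

With these values the first call goes via B and the second via C, even though
the algorithm and the set of links are identical.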

Again, as I and others have said a few times, the name "app" is badly chosen
to describe forwarding behaviour in the data plane, but I guess no one is
going to listen and change that name now :)

Practically, whether folks will use different algorithms to construct
different topologies or use the same algorithm with different metrics all
depends on what real user _applications_ the network is to carry, modulo
what flexibility the network elements used to construct such a network
provide.

Thx,
R.
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-droid-00.txt

2022-04-04 Thread Robert Raszuk
Hi Tony,

Just two follow up points,

#1 - I can't shake the feeling that DROID is very IGP centric and not
generic enough. Do you think we need DROID-2 to start offloading BGP, or at
least stop trashing it ? Or do you think that DROID as proposed could take on,
from day one, flowspec v2 with its various extensions as an example ?

#2 - When I mentioned anycast I did not mean anycast redundancy
- too many issues with that. I meant it just for session/service
bootstrapping, then falling back to a dedicated pair of IPs. Simple redundancy
could still be easy with two such domain wide addresses.

Thx,
Robert.


On Mon, Apr 4, 2022 at 11:11 PM Tony Li  wrote:

>
> Hi Robert,
>
> > Very happy to see this draft.
>
>
> Thanks.
>
>
> > First question - the draft seems to be focusing on hierarchical IGPs and
> is clearly driven by liveness propagation discussion.
>
>
> The main problem in networking is scale. If you haven’t dealt with scale,
> you haven’t solved the problem. The way that we deal with scale is to
> install hierarchy. Thus, if you don’t have hierarchy, you don’t have a
> scalable network.
>
> Single level networks are simply degenerate cases of hierarchical networks
> and should be dealt with as such.  Pick two routers.  Make them L1L2.
> Poof, all done.
>
>
> > But the motivation of offloading non routing information from IGPs
> (and/or BGP) is also fully applicable to non hierarchical IGPs where there
> are no ABRs. Do you plan to rewrite section 3 accordingly ?
>
>
> No, since it’s not necessary.  :)
>
> If you would like to see alternate text, please feel free to propose.
>
>
> > Also putting liveness aside wouldn't it be feasible to also relax the
> attachment to each area/level such that truly opaque information can be
> exchanged even if we use as broker DROID cluster sitting only in core area
> and listening to data or liveness from all clients ?
>
>
> I’m not sure that I parse this correctly.  Yes, you’re welcome to use
> DROID without worrying about liveness.  That’s one use case and it’s not
> mandatory.
>
>
> > The DROID discovery in the latter case could be as simple as one line
> cfg. Networks could also use well known anycast address to connect network
> elements to DROID cluster.
>
>
> Yes, but TCP/QUIC anycast is not quite as reliable as I would like.  I
> prefer simple redundancy. :-)
>
> Tony
>
>
>


Re: [Lsr] Fwd: New Version Notification for draft-li-lsr-droid-00.txt

2022-04-04 Thread Robert Raszuk
Hi Tony,

Very happy to see this draft.

First question - the draft seems to be focusing on hierarchical IGPs and is
clearly driven by liveness propagation discussion.

But the motivation of offloading non routing information from IGPs (and/or
BGP) is also fully applicable to non hierarchical IGPs where there are no
ABRs. Do you plan to rewrite section 3 accordingly ?

Also putting liveness aside wouldn't it be feasible to also relax the
attachment to each area/level such that truly opaque information can be
exchanged even if we use as broker DROID cluster sitting only in core area
and listening to data or liveness from all clients ?

The DROID discovery in the latter case could be as simple as one line cfg.
Networks could also use well known anycast address to connect network
elements to DROID cluster.

Thx,
Robert


On Mon, Apr 4, 2022 at 6:48 PM Tony Li  wrote:

>
> Hi all,
>
> As discussed during our last meeting, the Node Liveness Protocol could be
> generalized to support arbitrary data.
>
> I’ve done that work, turning it into a distributed object store.
>
> In particular, node capabilities are now a generic example of the use of
> this mechanism.  Node Liveness is still included as a custom mechanism.
>
> Comments?
>
> Tony
>
>
> Begin forwarded message:
>
> *From: *internet-dra...@ietf.org
> *Subject: **New Version Notification for draft-li-lsr-droid-00.txt*
> *Date: *April 4, 2022 at 9:43:57 AM PDT
> *To: *"Tony Li" 
>
>
> A new version of I-D, draft-li-lsr-droid-00.txt
> has been successfully submitted by Tony Li and posted to the
> IETF repository.
>
> Name: draft-li-lsr-droid
> Revision: 00
> Title: Distributed Routing Object Information Database (DROID)
> Document date: 2022-04-04
> Group: Individual Submission
> Pages: 17
> URL:https://www.ietf.org/archive/id/draft-li-lsr-droid-00.txt
> Status: https://datatracker.ietf.org/doc/draft-li-lsr-droid/
> Htmlized:   https://datatracker.ietf.org/doc/html/draft-li-lsr-droid
>
>
> Abstract:
>   Over time, the routing protocols have been burdened with the
>   responsibility of carrying a variety of information that is not
>   directly relevant to their mission.  This includes VPN parameters,
>   configuration information, and capability data.  All of the
>   additional data impacts the performance and stability of the routing
>   protocols negatively.
>
>   This has been convenient since the backbone of a routing protocol is
>   a small distributed database of routing information.  Any service
>   needing a distributed database has considered injecting its data into
>   a routing protocol so that it can leverage the protocol's database
>   service.  Architecturally, this is a mistake that puts the protocol
>   at risk from undue complexity and overhead.
>
>   To avoid this, DROID is a subsystem that is tangential to, but
>   independent of the routing protocols, and provides distributed
>   database services for other routing services.  It is based on the
>   publish-subscribe (pub/sub) architecture and is intentionally crafted
>   to be an open mechanism for the transport of ancillary data.
>
>
>
>
> The IETF Secretariat
>
>
>


Re: [Lsr] Is it necessary to define new PUB/SUB model to monitor the node live?

2022-03-31 Thread Robert Raszuk
Aijun,

Hmm, so you want an ephemeral indication to trigger SPF and affect topology
computation ?

I do not think this is a sound idea.

At least PULSE folks (Peter & Les pls confirm) never assumed PULSE will
trigger SPF and will be used as topology change input.

Thx,
R.



On Thu, Mar 31, 2022 at 10:46 AM Aijun Wang 
wrote:

> Hi, Robert:
>
>
>
> There are possibilities that only one of the ABRs is detached from other
> nodes in the same area, the receivers should select other ABRs to reach the
> destination announced by PUA message.
>
> Such scenario is described in
> https://datatracker.ietf.org/doc/html/draft-wang-lsr-prefix-unreachable-annoucement-08#section-3.2
> .
>
> The corresponding solution will be updated later.
>
>
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* lsr-boun...@ietf.org  *On Behalf Of *Robert
> Raszuk
> *Sent:* Wednesday, March 30, 2022 4:58 PM
> *To:* Aijun Wang 
> *Cc:* Aijun Wang ; Tony Li ;
> lsr 
> *Subject:* Re: [Lsr] Is it necessary to define new PUB/SUB model to
> monitor the node live?
>
>
>
> Hi Aijun,
>
>
>
> *" Incremental SPF or other mechanism can be used to parse such
> unreachable information on the receiver to decrease Tony’s worry for the
> stability of “vital truck”.*
>
>
>
> Hmm - could you kindly elaborate a bit more on what *incremental SPF* has
> to do with parsing such unreachable information ?
>
>
>
> Thx a lot,
>
> Robert
>
>
>
>
>
>
>
> On Wed, Mar 30, 2022 at 10:48 AM Aijun Wang 
> wrote:
>
> Hi, Robert:
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Tuesday, March 29, 2022 3:53 PM
> *To:* Aijun Wang 
> *Cc:* Tony Li ; Aijun Wang ;
> lsr 
> *Subject:* Re: [Lsr] Is it necessary to define new PUB/SUB model to
> monitor the node live?
>
>
>
> Aijun,
>
>
>
> Your email is written proof that my question from the other day, which
> remains unanswered, is valid.
>
>
>
> I asked is the scope of PUA/PULSE to only signal service endpoints or is
> this to also carry any to any liveness across all areas/levels in the link
> state IGP ?
>
>
>
> It seems clear that you say is the latter. Not sure if PULSE authors are
> of the same opinion.
>
> *[WAJ] The scope of PUA can cover and aim to solve both scenarios.*
>
>
>
> If every node is interested in every other node's liveness then we are
> redefining the scope of the work here, but I may still argue that not every
> node in the network will have segment endpoints terminating on every
> other node.
>
> *[WAJ] Yes, such full mesh any-to-any connection may not happen at the
> same time, but the possibility of any to any segment list exists, the
> overall effect is that the any to any notification is needed *
>
>  So registration model handled outside of active link state nodes IMO
> still is far superior to flood and forget (via timeout) type of model.
>
> *[WAJ] Invent the new truck will alleviate the burden of the
> station(Router).  Utilize the existing flood mechanism to meet the above
> scenarios are the most efficient way.  Incremental SPF or other mechanism
> can be used to parse such unreachable information on the receiver to
> decrease Tony’s worry for the stability of “vital truck”.*
>
>
>
> Best,
>
> R.
>
>
>
>


Re: [Lsr] Is it necessary to define new PUB/SUB model to monitor the node live?

2022-03-30 Thread Robert Raszuk
Hi Aijun,


*" Incremental SPF or other mechanism can be used to parse such unreachable
information on the receiver to decrease Tony’s worry for the stability of
“vital truck”.*

Hmm - could you kindly elaborate a bit more on what *incremental SPF* has
to do with parsing such unreachable information ?

Thx a lot,
Robert



On Wed, Mar 30, 2022 at 10:48 AM Aijun Wang 
wrote:

> Hi, Robert:
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Tuesday, March 29, 2022 3:53 PM
> *To:* Aijun Wang 
> *Cc:* Tony Li ; Aijun Wang ;
> lsr 
> *Subject:* Re: [Lsr] Is it necessary to define new PUB/SUB model to
> monitor the node live?
>
>
>
> Aijun,
>
>
>
> Your email is written proof that my question from the other day, which
> remains unanswered, is valid.
>
>
>
> I asked is the scope of PUA/PULSE to only signal service endpoints or is
> this to also carry any to any liveness across all areas/levels in the link
> state IGP ?
>
>
>
> It seems clear that you say is the latter. Not sure if PULSE authors are
> of the same opinion.
>
> *[WAJ] The scope of PUA can cover and aim to solve both scenarios.*
>
>
>
> If every node is interested in every other node's liveness then we are
> redefining the scope of the work here, but I may still argue that not every
> node in the network will have segment endpoints terminating on every
> other node.
>
> *[WAJ] Yes, such full mesh any-to-any connection may not happen at the
> same time, but the possibility of any to any segment list exists, the
> overall effect is that the any to any notification is needed *
>
>  So registration model handled outside of active link state nodes IMO
> still is far superior to flood and forget (via timeout) type of model.
>
> *[WAJ] Invent the new truck will alleviate the burden of the
> station(Router).  Utilize the existing flood mechanism to meet the above
> scenarios are the most efficient way.  Incremental SPF or other mechanism
> can be used to parse such unreachable information on the receiver to
> decrease Tony’s worry for the stability of “vital truck”.*
>
>
>
> Best,
>
> R.
>
>
>


Re: [Lsr] Is it necessary to define new PUB/SUB model to monitor the node live?

2022-03-29 Thread Robert Raszuk
Aijun,

Your email is written proof that my question from the other day, which
remains unanswered, is valid.

I asked is the scope of PUA/PULSE to only signal service endpoints or is
this to also carry any to any liveness across all areas/levels in the link
state IGP ?

It seems clear that you say is the latter. Not sure if PULSE authors are of
the same opinion.

If every node is interested in every other node's liveness then we are
redefining the scope of the work here, but I may still argue that not every
node in the network will have segment endpoints terminating on every
other node. So a registration model handled outside of active link state
nodes IMO still is far superior to a flood and forget (via timeout) type of
model.
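
To make the difference concrete, here is a rough sketch of who receives a
node-down event under each model. The node names and the registration table
are hypothetical, and the two functions only count recipients; they do not
model any real PUA/PULSE or NLP encoding:

```python
def flood_recipients(nodes, failed):
    """Flooding: every other node in the IGP domain receives the event."""
    return {n for n in nodes if n != failed}


def registered_recipients(registrations, failed):
    """Registration: only nodes that subscribed to `failed` are notified."""
    return {
        node
        for node, interests in registrations.items()
        if failed in interests and node != failed
    }


# Hypothetical domain: three PEs, two P routers and one ABR.
NODES = {"PE1", "PE2", "PE3", "P1", "P2", "ABR1"}

# Hypothetical registrations: only nodes with service endpoints on a peer
# register interest in that peer's liveness.
REGISTRATIONS = {
    "PE1": {"PE3"},
    "PE2": {"PE1", "PE3"},
}
```

If PE3 fails, flood_recipients(NODES, "PE3") returns all five remaining
nodes, while registered_recipients(REGISTRATIONS, "PE3") returns only PE1
and PE2. The P routers and the ABR never see the event in the second model.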

Best,
R.

On Tue, Mar 29, 2022 at 3:40 AM Aijun Wang 
wrote:

> Hi, Robert:
>
> Let’s not jump to conclusions.
>
> I think you should know the application scenarios for such unreachable
> information is not only for BGP services, but also for the tunnel
> services(for example, SRv6 loose-path routing).
>
> For the latter scenario, the P node on the path should know the status of
> other P node on the path, which is located in other areas.
>
> Then, an NLP-like approach will also result in ALL NODES within the areas
> needing to register such information, and the failure of one node will be
> sent to all the registrants.
>
> What’s the difference with the IGP flooding mechanism then?
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Monday, March 28, 2022 5:01 PM
> *To:* Aijun Wang ; Tony Li 
> *Cc:* Aijun Wang ; lsr 
> *Subject:* Re: [Lsr] Is it necessary to define new PUB/SUB model to
> monitor the node live?
>
>
>
> Aijun,
>
>
>
> > For PUAM, the receiver NEED NOT register anything.
>
> > Once the node fails, all the receivers(normally the nodes within one
> area) will be notified.
>
>
>
> That's a spec bug not a feature.
>
>
>
> Not only those egress nodes which would have otherwise registered will get
> it with PUAM, but also all P nodes in the area which do not have any
> interest whatsoever will also get it.
>
>
>
> Worse - EVERY IGP NODE - in all areas/levels will get it.
>
>
>
> Can't you see how bad architecturally that is ? And I do not buy the
> justification - oh this is so little or - oh this is likely to never
> happen ... If that is so why bother when you can just either do it with
> pub-sub model or simply withdraw your service routes (either one by one or
> in bulk mode) ?
>
>
>
> Thx,
> R.
>
>
>
> PS. And if you like analogies - We are here about speed to service
> restoration - correct ? So what is better - to signal node failure using as
> a carrier a local train which requires to change trains at each of say 30
> stations or put the information into a RAPID one which only stops at two
> exchange stations ?
>
>
>
>
>
>
>
> On Mon, Mar 28, 2022 at 3:15 AM Aijun Wang 
> wrote:
>
> Hi, Tony:
>
> Let’s focus on the comparison of NLP and PUAM(Prefix Unreachable
> Announcement Mechanism):
>
> For NLP, the receiver should register the interested prefixes first. Once
> the node fails, all the receivers(normally the nodes within one area) that
> register such interested prefixes will be notified.
>
> For PUAM, the receiver NEED NOT register anything.   Once
> the node fails, all the receivers(normally the nodes within one area) will
> be notified.
>
>
>
> From the POV of the receiver, if it gets the same results, why don’t
> select the approach that need less work or nothing to do?
>
>
>
> And regarding to your arguments of “Dump Truck” worry about IGP protocol,
> I think defining one new protocol does not solve the ultimate pressure on
> Router. Let’s explain this via one analogy:
>
> The customer(Operator) want the truck(IGP Protocol) to piggyback(via some
> Tag) some information, the driver(Vendor) said he can’t, because the
> truck may crush the station(Router) when it pass. The operator need another
> truck(New Protocol) to carry it.
>
>
>
> Here is the problem then, from the POV of station(Router), if it can’t
> endure the pass of one truck, why can it can stand to pass the two trucks
> at the same time?
>
> Wish you can explain the above paradox in reasonable/logical manner.
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* lsr-boun...@ietf.org  *On Behalf Of *Tony Li
> *Sent:* Friday, March 25, 2022 7:20 PM
> *To:* Aijun Wang 
> *Cc:* lsr@ietf.org
> *Subject:* Re: [Lsr] Is it necessary to define new PUB/SUB model to
> monitor the node live?
>

Re: [Lsr] Is it necessary to define new PUB/SUB model to monitor the node live?

2022-03-28 Thread Robert Raszuk
Aijun,

> For PUAM, the receiver NEED NOT register anything.
> Once the node fails, all the receivers(normally the nodes within one
area) will be notified.

That's a spec bug not a feature.

Not only those egress nodes which would have otherwise registered will get it
with PUAM, but also all P nodes in the area which do not have any interest
whatsoever will also get it.

Worse - EVERY IGP NODE - in all areas/levels will get it.

Can't you see how bad architecturally that is ? And I do not buy the
justification - oh this is so little or - oh this is likely to never
happen ... If that is so why bother when you can just either do it with
pub-sub model or simply withdraw your service routes (either one by one or
in bulk mode) ?

Thx,
R.

PS. And if you like analogies - We are here about speed to service
restoration - correct ? So what is better - to signal node failure using as
a carrier a local train which requires to change trains at each of say 30
stations or put the information into a RAPID one which only stops at two
exchange stations ?



On Mon, Mar 28, 2022 at 3:15 AM Aijun Wang 
wrote:

> Hi, Tony:
>
> Let’s focus on the comparison of NLP and PUAM(Prefix Unreachable
> Announcement Mechanism):
>
> For NLP, the receiver should register the interested prefixes first. Once
> the node fails, all the receivers(normally the nodes within one area) that
> register such interested prefixes will be notified.
>
> For PUAM, the receiver NEED NOT register anything.   Once
> the node fails, all the receivers(normally the nodes within one area) will
> be notified.
>
>
>
> From the POV of the receiver, if it gets the same results, why don’t
> select the approach that need less work or nothing to do?
>
>
>
> And regarding to your arguments of “Dump Truck” worry about IGP protocol,
> I think defining one new protocol does not solve the ultimate pressure on
> Router. Let’s explain this via one analogy:
>
> The customer(Operator) want the truck(IGP Protocol) to piggyback(via some
> Tag) some information, the driver(Vendor) said he can’t, because the truck
> may crush the station(Router) when it pass. The operator need another
> truck(New Protocol) to carry it.
>
>
>
> Here is the problem then, from the POV of station(Router), if it can’t
> endure the pass of one truck, why can it can stand to pass the two trucks
> at the same time?
>
> Wish you can explain the above paradox in reasonable/logical manner.
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* lsr-boun...@ietf.org  *On Behalf Of *Tony Li
> *Sent:* Friday, March 25, 2022 7:20 PM
> *To:* Aijun Wang 
> *Cc:* lsr@ietf.org
> *Subject:* Re: [Lsr] Is it necessary to define new PUB/SUB model to
> monitor the node live?
>
>
>
>
>
> Hi Aijun,
>
>
>
> Thanks for your clarification of the NLP mechanism during the meeting.
>
> 1. Regarding to the PUB/SUB model within IETF, there are already some
> of them:
>
> 1) https://datatracker.ietf.org/doc/html/rfc8641 (Subscription to
> YANG Notifications for Datastore Updates)
>
> 2) https://datatracker.ietf.org/doc/html/draft-ietf-lisp-pubsub-09
>
> 3) https://datatracker.ietf.org/doc/html/draft-ietf-ace-pubsub-profile
>
> 4)
> https://datatracker.ietf.org/doc/html/draft-ietf-core-coap-pubsub-09
>
> Why do we need to invent the new one again?
>
>
>
>
>
> Thank you, I was unaware of these.  If the WG is interested, I’m certainly
> willing to pursue using one of these.
>
> As far as I can tell from a quick perusal, none of these is intended to be
> generic.  That is to say, none of them
>
> is a dump truck either.
>
>
>
>
>
> And, if we prefer to the PUB/SUB model to solve the discussed problem, why
> RFC8641 can’t be used?
>
>
>
>
>
> YANG is evil. :-)
>
>
>
>
>
> 2. Regarding to the NLP mechanism itself:
>
> As you explained, current NLP adopt the “Subscribe Summary Prefixes,
> Notify the failure of Specific Address” to reduces the pressures on ABRs.
> Such approach has the following drawbacks again:
>
> 1) The register should know in advance the summary prefixes that the
> peers‘ loopback address belong to. Once the summary prefixes are changed,
> such information should be updated, which will be headache for the operators
>
>
>
>
>
> Not at all. Loopback address configuration is already handled by the
> management plane. A prefix or multiple prefixes for loopback addresses will
> also be incorporated into the management plane.
>
>
>
> Modern networking assumes automation. Yes, we didn’t have it back when I
> started, but it’s there today. Admittedly, it’s not perfect and it has a
> way to go, but there are MANY organizations now that are fully automated.
> Anyone that wants to be, can be.
>
>
>
>
>
> 2) If the register subscribe the “summary prefixes”, then it will
> receives all the nodes fail notifications within this summary prefixes,
> which should be avoided when you argue that IGP flooding has such side
> effect. The result is that NLP can’t avoid it either, 

Re: [Lsr] New Subject: Is Flex-Algo One App or Many (was “RE: IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit”)

2022-03-28 Thread Robert Raszuk
Hi,

> I do see new drafts that mistakenly assume ASLA can be used to advertise
application specific values.

Exactly ! And I do not blame any authors of those new drafts, as for anyone
with reading skills "Application Specific Link Attribute" means link
attributes which correspond to either a user application (voice, video, web
etc ...) or at best a class of service (gold, silver, bronze, grey ...)

No one would think those *applications* are just some forwarding
paradigms or constructs for how to compute paths from A to Z.

And I also agree that RFC8919/8920 need revision, perhaps a -bis, to rename
the misleading terminology as well as perhaps add ANY flags which can be used
for "application agnostic link attributes - AALA".

Thx,
Robert.


On Mon, Mar 28, 2022 at 7:02 AM Shraddha Hegde  wrote:

>
>
> >Calling RSVP-TE, SR, LFA or Flex-Algo as "applications" is confusing as
> >those are network forwarding paradigms and not applications.
>
> I totally agree with that. It is very confusing to call them applications.
> I do see new drafts that mistakenly assume ASLA can be used to advertise
> application specific values. What it also implies is that the way industry
> is evolving, it is required to advertise “User application” specific values
> and use them to build paths no-matter what network forwarding paradigms are
> used.
>
> Having a protocol extension that allows for wildcarding the attributes for
> these forwarding paradigms is useful.
>
>
>
> Looks like the other side of the argument is, it would have been useful if
> it was done in RFC 8919/RFC 8920 but since it's an RFC now, we should carry
> that debt forever. I don’t agree with that argument and believe we still
> have opportunity to improve.
>
>
>
> Rgds
>
> Shraddha
>
>
>
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Sunday, March 27, 2022 5:22 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* lsr ; Christian Hopps ; Shraddha
> Hegde ; Martin Horneffer 
> *Subject:* Re: [Lsr] New Subject: Is Flex-Algo One App or Many (was “RE:
> IETF13: Comments on The Application Specific Link Attribute (ASLA) Any
> Application Bit”)
>
>
>
>
>
>
> Hi Les,
>
>
>
> Nope the abbreviation is not confusing.
>
>
>
> Calling RSVP-TE, SR, LFA or Flex-Algo as "applications" is confusing as
> those are network forwarding paradigms and not applications.
>
>
>
> Applications (read user applications which samples I provided) are running
> on top of them. What you call applications are merely different types of
> pipes to carry user applications.
>
>
>
> And that alone if you just stay focused on IGP may be all fine. But the
> moment you need to carry user applications over your (network) applications
> each with set of different colors the picture becomes very confusing.
>
>
>
> - - -
>
>
>
> In any case - aside from the above - even considering your terminology,
> physical properties of the links are not application dependent.
> Irrespective of what encapsulation you use for your traffic for example the
> value of propagation delay of the link will always be application
> independent. Hence it does make sense to advertise it with ANY wildcard
> notion.
>
>
>
> Especially that you always have the ability within each such "application"
> algorithm definition or with use of link affinities to further select which
> specific links and link attributes to use to compute an instance of a
> forwarding paradigm.
>
>
>
> Thx,
>
> R.
>
>
>
>
>
> On Sun, Mar 27, 2022 at 12:21 AM Les Ginsberg (ginsberg) <
> ginsb...@cisco.com> wrote:
>
> Robert –
>
>
>
> The complete name (as reflected in the referenced registry name) is:  Link
> Attribute Application Identifiers
>
>
>
> In the context of ASLA we tend to abbreviate that as “Application”. If you
> find that confusing, we can all try to use the more complete name.
>
> But whatever name we use, that is what is being referenced when we discuss
> the use of ASLA.
>
>
>
>Les
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Saturday, March 26, 2022 3:16 PM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* lsr ; Christian Hopps ; Shraddha
> Hegde ; Martin Horneffer 
> *Subject:* Re: [Lsr] New Subject: Is Flex-Algo One App or Many (was “RE:
> IETF13: Comments on The Application Specific Link Attribute (ASLA) Any
> Application Bit”)
>
>
>
> Hi Les,
>
>
>
> What you call "an application" is simply counter intuitive and not what
> 99.9% of people understand by this term. Application to me is a web server
>

Re: [Lsr] New Subject: Is Flex-Algo One App or Many (was “RE: IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit”)

2022-03-26 Thread Robert Raszuk
Hi Les,

Nope the abbreviation is not confusing.

Calling RSVP-TE, SR, LFA or Flex-Algo as "applications" is confusing as
those are network forwarding paradigms and not applications.

Applications (read: user applications, examples of which I provided) run
on top of them. What you call applications are merely different types of
pipes to carry user applications.

And that alone, if you just stay focused on the IGP, may be all fine. But the
moment you need to carry user applications over your (network) applications,
each with a set of different colors, the picture becomes very confusing.

- - -

In any case - aside from the above - even considering your terminology,
physical properties of the links are not application dependent.
Irrespective of what encapsulation you use for your traffic, the value of
the propagation delay of a link, for example, will always be application
independent. Hence it does make sense to advertise it with an ANY wildcard
notion.

Especially since you always have the ability, within each such "application"
algorithm definition or with the use of link affinities, to further select
which specific links and link attributes to use to compute an instance of a
forwarding paradigm.
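
A small sketch of what such a lookup could look like. Note that the ANY
wildcard here is purely hypothetical (it is the notion being argued for, not
something RFC 8919/8920 define), and the attribute names, app names and
values are made up for illustration:

```python
ANY = "ANY"  # hypothetical wildcard, NOT defined in RFC 8919/8920


def lookup(advertisements, attribute, app):
    """Return the value of `attribute` for `app` on one link.

    An app-specific advertisement wins; otherwise fall back to a value
    advertised with the (hypothetical) ANY wildcard, if there is one.
    """
    fallback = None
    for adv in advertisements:
        if adv["attribute"] != attribute:
            continue
        if app in adv["apps"]:
            return adv["value"]
        if ANY in adv["apps"]:
            fallback = adv["value"]
    return fallback


# Hypothetical advertisements for a single link.
LINK_ADVS = [
    # Physical property: advertised once, usable by every forwarding paradigm.
    {"attribute": "delay_us", "apps": {ANY}, "value": 1200},
    # Policy-like values: still advertised per "application".
    {"attribute": "te_metric", "apps": {"rsvp-te"}, "value": 100},
    {"attribute": "te_metric", "apps": {"sr-policy", "flex-algo"}, "value": 40},
]
```

With this shape, a forwarding paradigm invented later (say "new-app") gets
the link delay from lookup(LINK_ADVS, "delay_us", "new-app") without anyone
touching the link advertisements, while lookup(LINK_ADVS, "te_metric",
"new-app") returns None because nothing was advertised for it.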

Thx,
R.


On Sun, Mar 27, 2022 at 12:21 AM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> The complete name (as reflected in the referenced registry name) is:  Link
> Attribute Application Identifiers
>
>
>
> In the context of ASLA we tend to abbreviate that as “Application”. If you
> find that confusing, we can all try to use the more complete name.
>
> But whatever name we use, that is what is being referenced when we discuss
> the use of ASLA.
>
>
>
>Les
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Saturday, March 26, 2022 3:16 PM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* lsr ; Christian Hopps ; Shraddha
> Hegde ; Martin Horneffer 
> *Subject:* Re: [Lsr] New Subject: Is Flex-Algo One App or Many (was “RE:
> IETF13: Comments on The Application Specific Link Attribute (ASLA) Any
> Application Bit”)
>
>
>
> Hi Les,
>
>
>
> What you call "an application" is simply counter intuitive and not what
> 99.9% of people understand by this term. Application to me is a web server
> running on the host waiting for user requests, SIP gateway providing VoIP
> connections, database instance running on some specific port and responding
> to SQL queries, multicast streaming etc ...
>
>
>
> Each of these real applications may benefit from different network
> transport/forwarding class.
>
>
>
> Calling a network forwarding class as "an application" only generates huge
> confusion. Networks are servants to the user applications. Networks are not
> the applications itself.
>
>
>
> As each user application may benefit from different treatment it can be
> mapped to different network transport or network color. So again network
> color could be seen as a network behaviour constructed in an optimal way to
> the user application it is designed to carry.
>
>
>
> When you say "You also can – using the algo specific FAD – specify which
> colors are to be used by a given algo." to me means that you are also
> overloading the term "color" - at least from the notion of how CAR or CT
> proposal are defining it. And what CAR/CT proposals call a transport color
> is actually in line with network forwarding class and very intuitive.
>
>
>
> As you recall a few months back I defined an IP TE solution where we can
> steer packets between nodes without any stack of labels or segments imposed
> on them upfront on ingress. That would be in your terminology new
> application - as it does not use SPF, but constructs end to end
> waypoints using its own heuristic. Then when someone would like to reuse
> link metrics already advertised for flex-algo it would need to touch all
> the links in the network to add this new app.
>
>
>
> And this would continue every time someone invents a new network
> forwarding model while reusing physical metrics already advertised with
> each link.
>
>
>
> You mentioned that one of the reasons was to clearly separate RSVP-TE from
> SR running on a link. But as we discussed you could do that with the
> "include" notion using affinity bits - not with separate R vs S bits.
>
>
>
> So while I doubt that you will adjust the terminology used in flex-algo
> draft I continue to believe that there is value with advertising link
> attributes which can be used by any current and future network forwarding
> class or color.
>
>
>
> Best,
>
> Robert.
>
>
>
>
>
>
>
>
>
> On Sat, Mar 26, 2022 at 10:27 PM Les Ginsberg (ginsberg)  40cisco@dmarc.ietf.org>

Re: [Lsr] New Subject: Is Flex-Algo One App or Many (was “RE: IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit”)

2022-03-26 Thread Robert Raszuk
Hi Les,

What you call "an application" is simply counter intuitive and not what
99.9% of people understand by this term. Application to me is a web server
running on the host waiting for user requests, SIP gateway providing VoIP
connections, database instance running on some specific port and responding
to SQL queries, multicast streaming etc ...

Each of these real applications may benefit from different network
transport/forwarding class.

Calling a network forwarding class "an application" only generates huge
confusion. Networks are servants to the user applications. Networks are not
the applications themselves.

As each user application may benefit from different treatment it can be
mapped to different network transport or network color. So again network
color could be seen as a network behaviour constructed in an optimal way to
the user application it is designed to carry.

When you say "You also can – using the algo specific FAD – specify which
colors are to be used by a given algo.", to me that means you are also
overloading the term "color" - at least relative to how the CAR or CT
proposals define it. And what the CAR/CT proposals call a transport color
is actually in line with a network forwarding class and very intuitive.

As you recall a few months back I defined an IP TE solution where we can
steer packets between nodes without any stack of labels or segments imposed
on them upfront on ingress. That would be in your terminology new
application - as it does not use SPF, but constructs end to end
waypoints using its own heuristic. Then when someone would like to reuse
link metrics already advertised for flex-algo it would need to touch all
the links in the network to add this new app.

And this would continue every time someone invents a new network forwarding
model while reusing physical metrics already advertised with each link.

You mentioned that one of the reasons was to clearly separate RSVP-TE from
SR running on a link. But as we discussed you could do that with the
"include" notion using affinity bits - not with separate R vs S bits.

So while I doubt that you will adjust the terminology used in flex-algo
draft I continue to believe that there is value with advertising link
attributes which can be used by any current and future network forwarding
class or color.

Best,
Robert.




On Sat, Mar 26, 2022 at 10:27 PM Les Ginsberg (ginsberg)  wrote:

> Robert –
>
>
>
> The defined set of APPs can be seen here
> <https://www.iana.org/assignments/igp-parameters/igp-parameters.xhtml#link-attribute-application-identifiers>
> :
>
>
>
> Bit  Name                            Reference
> 0    RSVP-TE (R-bit)                 [RFC8919]
> 1    Segment Routing Policy (S-bit)  [RFC8919]
> 2    Loop Free Alternate (F-bit)     [RFC8919]
>
>
>
> Note one additional APP – Flex-Algo – is not yet reflected in this
> registry.
>
>
>
> Now, you can advertise delay and extended admin groups (EAG) for Flex-Algo.
>
> You also can – using the algo specific FAD – specify which colors are to
> be used by a given algo.
>
>
>
> I don’t know of any SPF algorithm that supports specifying a range of
> metric values as part of its constraints. It is possible to advertise a
> number of user defined metrics using the Generic Metric sub-TLV defined in
> https://datatracker.ietf.org/doc/draft-ietf-lsr-flex-algo-bw-con/ and use
> the algo specific FAD to specify which of those metrics is to be used for
> that algo. But in doing so, the App associated with each of the Generic
> Metric advertisements will be Flex-Algo.
>
>
>
> Les
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Saturday, March 26, 2022 9:10 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Christian Hopps ; Shraddha Hegde <
> shrad...@juniper.net>; Martin Horneffer ; lsr <
> lsr@ietf.org>
> *Subject:* Re: New Subject: Is Flex-Algo One App or Many (was “RE: [Lsr]
> IETF13: Comments on The Application Specific Link Attribute (ASLA) Any
> Application Bit”)
>
>
>
> Les,
>
>
>
> To me, what corresponds to a network application is effectively a forwarding
> paradigm/topology built and used to forward the corresponding traffic class.
> You can overload the application name the way you like but it does not change
> anything.
>
>
>
> So if this is going to make you happier, let's rename my
> example accordingly and let's not get hung up on the flex-algo name itself.
>
>
>
> Example:
>
>
>
> link attribute:  delay
>
>
>
> applications:
>
>
>
> app_1 - build topology using SPF_algo_1 where max delay does not exceed 10
> ms - color: premium best effort
>
> app_2 - build topology using SPF_algo_2 where max delay does not exceed 8
> ms -  color: black
>

Re: [Lsr] New Subject: Is Flex-Algo One App or Many (was “RE: IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit”)

2022-03-26 Thread Robert Raszuk
Les,

To me, what corresponds to a network application is effectively a forwarding
paradigm/topology built and used to forward the corresponding traffic class.
You can overload the application name the way you like but it does not change
anything.

So if this is going to make you happier, let's rename my
example accordingly and let's not get hung up on the flex-algo name itself.

Example:

link attribute:  delay

applications:

app_1 - build topology using SPF_algo_1 where max delay does not exceed 10
ms - color: premium best effort
app_2 - build topology using SPF_algo_2 where max delay does not exceed 8
ms -  color: black
app_3 - build topology using SPF_algo_3 where max delay does not exceed 6
ms - color: bronze
app_4 - build topology using SPF_algo_4 where max delay does not exceed 4
ms - color: blue
app_5 - build topology using SPF_algo_5 where max delay does not exceed 3
ms - color: silver
app_6 - build topology using SPF_algo_6 where max delay does not exceed 1
ms - color: gold

etc ...

Now tell me how does it make sense to enable each app on the link delay
attribute ?
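
[Editorial illustration] The example above can be sketched in a few lines of
Python. This is only a rough sketch under stated assumptions: the link names
and delay values are invented, and "max delay does not exceed N ms" is read
here as a per-link bound (a path-level bound would need a different
computation). The point it shows is that every app derives its topology from
the same single advertised delay value per link.

```python
# Hypothetical data: one advertised delay value (in ms) per link.
# Link names and delays are invented for illustration only.
LINK_DELAY_MS = {
    ("A", "B"): 0.8,
    ("B", "C"): 2.5,
    ("A", "C"): 5.0,
    ("C", "D"): 9.0,
}

# Thresholds copied from the example in the message: app name -> max delay (ms).
APP_MAX_DELAY_MS = {
    "app_1": 10, "app_2": 8, "app_3": 6,
    "app_4": 4, "app_5": 3, "app_6": 1,
}


def topology_for(max_delay_ms, link_delay_ms=LINK_DELAY_MS):
    """Return the set of links whose advertised delay is within the bound.

    Assumption: the bound is applied to each link individually.
    """
    return {link for link, delay in link_delay_ms.items() if delay <= max_delay_ms}


# Every app reuses the same per-link delay attribute; only the bound differs.
topologies = {app: topology_for(bound) for app, bound in APP_MAX_DELAY_MS.items()}
```

Adding a seventh app in this sketch means adding one entry to
APP_MAX_DELAY_MS; nothing in LINK_DELAY_MS has to be touched.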

Thx,
R.


On Sat, Mar 26, 2022 at 4:56 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> I changed the subject – because what you are talking about has nothing to
> do w the discussion of ANY bit.
>
>
>
> You have mentioned this before – and been corrected before – but it seems
> that did not alter your thinking.
>
>
>
> Flex-Algo is ONE APP.
>
> There is not a bit in the SABM assigned per algo – nor is there any intent
> to do so.
>
> Nor does ANY bit introduce the notion of one bit per algo.
>
>
>
> All link attributes advertised with the Flex-Algo (X-bit) in the SABM are
> usable by any algo. Same would be true if you used ALL encoding. Same would
> be true if you used ANY encoding.
>
> Which ones are used and how they are used (e.g., which affinity bits apply
> to a given algorithm) is determined by the Algorithm Specific FAD.
>
>
>
> Either you don’t understand Flex-Algo – or you do understand it and want
> to do a major rewrite of it – I am not sure which.
>
> But either way, none of this has anything to do with the original thread.
>
>
>
>Les
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Saturday, March 26, 2022 3:43 AM
> *To:* Christian Hopps 
> *Cc:* Les Ginsberg (ginsberg) ; Shraddha Hegde <
> shrad...@juniper.net>; Martin Horneffer ; lsr <
> lsr@ietf.org>
> *Subject:* Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute (ASLA) Any Application Bit
>
>
>
> Hi Chris,
>
>
>
> It seems that there is a subtle but important element on which we may have
> different opinion.
>
>
>
> You said: "has to deploy new software that contains the new Wizbang
> feature, right?"
>
>
>
> IMO however we are dealing with case where software already supports all
> required functions on a box. It is just not using it from day one. You buy
> a router with OS features which allow you to build zoo of different
> forwarding paradigms.
>
>
>
> Say day one you see a need to enable flex-algo_1 You enable day one links
> to distribute link attributes required for this.
>
>
>
> Day two you want to define new FAD and flood this enabling new
> flex-algo_2. You reuse already present link attributes entirely or
> partially in flex-algo_2 computation. You do not need to touch 10s of
> links each time you enable new flex_algo.
>
>
>
> That's a selling point to me.
>
>
>
> If we would expect that folks will limit flex-algo to just a few maybe
> this all does not matter. But if we see proposals with rainbow of colors
> each mapped to different flex-algo and perhaps subtle forwarding difference
> (same metric but just a different min threshold per each flex-algo) it
> seems pretty bad idea to explicitly enable each app each time such new
> threshold used to build different topology.
>
>
>
> Example:
>
>
>
> link attribute:  delay
>
>
>
> applications:
>
>
>
> flex-algo_1 - build topology where max delay does not exceed 10 ms -
> color: premium best effort
>
> flex-algo_2 - build topology where max delay does not exceed 8 ms -
> color: black
>
> flex-algo_3 - build topology where max delay does not exceed 6 ms - color:
> bronze
>
> flex-algo_4 - build topology where max delay does not exceed 4 ms - color:
> blue
>
> flex-algo_5 - build topology where max delay does not exceed 3 ms - color:
> silver
>
> flex-algo_6 - build topology where max delay does not exceed 3 ms - color:
> gold
>
>
>
> etc ...
>
>
>
> Now tell me how does it make sense to enable each app on the link delay
> attribute ?
>
>
>
> Cheers,

Re: [Lsr] IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit

2022-03-26 Thread Robert Raszuk
Hi Chris,

It seems that there is a subtle but important element on which we may have
different opinion.

You said: "has to deploy new software that contains the new Wizbang
feature, right?"

IMO however we are dealing with case where software already supports all
required functions on a box. It is just not using it from day one. You buy
a router with OS features which allow you to build zoo of different
forwarding paradigms.

Say on day one you see a need to enable flex-algo_1. On day one you enable
the links to distribute the link attributes required for this.

Day two you want to define new FAD and flood this enabling new flex-algo_2.
You reuse already present link attributes entirely or partially in
flex-algo_2 computation. You do not need to touch 10s of links each
time you enable new flex_algo.

That's a selling point to me.

If we expected that folks would limit flex-algo to just a few, maybe this
would not matter. But if we see proposals with a rainbow of colors, each
mapped to a different flex-algo and perhaps a subtle forwarding difference
(same metric but just a different min threshold per flex-algo), it
seems a pretty bad idea to explicitly enable each app each time such a new
threshold is used to build a different topology.

Example:

link attribute:  delay

applications:

flex-algo_1 - build topology where max delay does not exceed 10 ms - color:
premium best effort
flex-algo_2 - build topology where max delay does not exceed 8 ms -  color:
black
flex-algo_3 - build topology where max delay does not exceed 6 ms - color:
bronze
flex-algo_4 - build topology where max delay does not exceed 4 ms - color:
blue
flex-algo_5 - build topology where max delay does not exceed 3 ms - color:
silver
flex-algo_6 - build topology where max delay does not exceed 3 ms - color:
gold

etc ...

Now tell me how does it make sense to enable each app on the link delay
attribute ?

Cheers,
Robert



On Sat, Mar 26, 2022 at 6:42 AM Christian Hopps  wrote:

>
> Robert Raszuk  writes:
>
> > Les,
> >
> > I don't think this is noise.
> >
> > Your examples are missing key operational consideration .. Link
> > attribute applicable to ANY application may be advertised well ahead
> > of enabling such application in a network.
> >
> > So requesting operator to always advertise tuple of app + attr is not
> > looking forward and makes unnecessary operational burden.
>
> [as wg-member]
>
> Hi Robert,
>
> Originally this was the argument that sort of put wind in the sails (for
> me) for this any bit, but some more thinking about how things would really
> work changed my mind.
>
> In order for some new feature, let's call it Wizbang, to take advantage of
> existing any bit marked attributes, the operator still has to deploy new
> software that contains the new Wizbang feature, right? So the addition of a
> new Wizbang bit pretty much comes free for the operator.
>
> So, this draft really is just about making the encoding a bit more
> efficient.
>
> I think if we were defining a new encoding, having this functionality
> makes sense, but we aren't defining a new encoding. The proposal is to
> change an existing published encoding, and the bar has to be higher for
> that I think.
>
> Thanks,
> Chris.
> [as wg member]
>
>
> >
> > Thx.
> > R.
> >
>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit

2022-03-25 Thread Robert Raszuk
Les,

I don't think this is noise.

Your examples are missing a key operational consideration: a link attribute
applicable to ANY application may be advertised well ahead of enabling such
an application in a network.

So requiring the operator to always advertise a tuple of app + attr is not
forward looking and creates an unnecessary operational burden.
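
[Editorial illustration] For readers less familiar with the encoding being
discussed, here is a minimal Python sketch of the "app + attr" association.
It is a simplification under stated assumptions, not the RFC 8919 procedure
in full: only the Standard Application Identifier Bit Mask (SABM) is modeled,
the user-defined mask is ignored, and bit 0 is assumed to be the most
significant bit of the first octet. Bit positions come from the registry
quoted elsewhere in this thread; the function and dictionary names are
hypothetical. The proposed ANY encoding is deliberately not modeled.

```python
# Bit positions taken from the registry quoted in this thread (RFC 8919).
# Assumption: bit 0 is the most significant bit of the first octet.
STANDARD_APPS = {"RSVP-TE": 0, "SR-Policy": 1, "LFA": 2}


def encode_sabm(apps):
    """Build a one-octet SABM for the given application names.

    An empty collection yields a zero-length mask.
    """
    if not apps:
        return b""
    octet = 0
    for app in apps:
        octet |= 0x80 >> STANDARD_APPS[app]
    return bytes([octet])


def may_use(sabm, app):
    """Return True if `app` may use attributes advertised with this SABM.

    Simplification: a zero-length mask is treated as "all applications";
    the interaction with other, more specific advertisements is ignored.
    """
    if len(sabm) == 0:
        return True
    return bool(sabm[0] & (0x80 >> STANDARD_APPS[app]))
```

With an explicit mask, enabling a further application later means changing
the advertised mask on each link; with the zero-length form the attributes
are already usable by it.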

Thx.
R.




On Fri, Mar 25, 2022 at 1:11 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> I have documented several different scenarios and how each approach
> encodes them and their comparable byte usage.
>
> Your characterization isn’t accurate.
>
>
>
> I don’t think this sub-thread is useful – you are adding noise to the
> conversation.
>
>
>
> If you want to discuss offline – please contact me directly.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Friday, March 25, 2022 5:06 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Shraddha Hegde ; Martin Horneffer <
> m...@lab.dtag.de>; lsr@ietf.org
> *Subject:* Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute (ASLA) Any Application Bit
>
>
>
> Les,
>
>
>
> There is clearly a valid case for having attributes common to all
> applications. Duplicating those on a per application is perhaps a way, but
> is this the most efficient way encoding wise ?
>
>
>
> Thx,
> R.
>
>
>
> On Fri, Mar 25, 2022 at 1:02 PM Les Ginsberg (ginsberg) <
> ginsb...@cisco.com> wrote:
>
> Robert –
>
>
>
> As I have stated in my original comments:
>
>
>
> 
>
> Existing encoding defined in RFCs 8919/8920 is fully functional i.e., the
> introduction of ANY application does not add support for deployment
> scenarios that could not otherwise be supported.
>
> 
>
>
>
> I don’t think there is anything else that needs to be said.
>
>
>
>Les
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Friday, March 25, 2022 4:53 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Shraddha Hegde ; Martin Horneffer <
> m...@lab.dtag.de>; lsr@ietf.org
> *Subject:* Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute (ASLA) Any Application Bit
>
>
>
>
>
> I am afraid not .. I am exploring how to solve requirement. Proposed
> method is natural and reuses ASLA encoding. There is apparently a wall this
> proposal is facing.
>
>
>
> So trying to see how to bypass that wall (or jump over it).
>
>
>
> Cheers,
>
> R.
>
>
>
>
>
>
>
> On Fri, Mar 25, 2022 at 12:50 PM Les Ginsberg (ginsberg) <
> ginsb...@cisco.com> wrote:
>
> Robert –
>
>
>
> I find your post a non sequitur.
>
>
>
> ASLA ANY is NOT discussing advertising a new attribute. It is discussing a
> new format for the ASLA sub-TLV container inside of which link attribute
> values are advertised.
>
>
>
> You are off-topic with your post.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Friday, March 25, 2022 4:41 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Shraddha Hegde ; Martin Horneffer <
> m...@lab.dtag.de>; lsr@ietf.org
> *Subject:* Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute (ASLA) Any Application Bit
>
>
>
> Hi Les,
>
>
>
> Let me clarify if I read you correctly ...
>
>
>
> Are you saying that because quoted RFCs have been published in Oct 2020 no
> one has a right to define any new standard link attributes any more as
> implementations are closed ? Including those which would be common to all
> applications on a link ?
>
>
>
> So if anyone would like to add their own new attributes one needs to
> define new sub-TLV codepoint to encode it outside of ASLA ? If so please
> state it then we could update the draft accordingly.
>
>
>
> Yes deployment of those is a challenge, but I am afraid this is a
> challenge with every protocol extensions unless we use mechanism like
> capabilities which will assist in an automated way.
>
>
>
> Thx,
>
> Robert.
>
>
>
>
>
> On Fri, Mar 25, 2022 at 11:39 AM Les Ginsberg (ginsberg)  40cisco@dmarc.ietf.org> wrote:
>
> Shraddha -
>
> RFCs 8919/8920 were published in October 2020 - and implementations based
> on draft versions of those document date back as much as two years before
> that. They all are written to support the encoding formats defined in the
> RFCs.
>
> The introduction of an additional way of encoding the same information is
> anything but "simplifying". As I have described below, you are imposing a
> requirement for ALL ASLA implementations to support an additional enc

Re: [Lsr] IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit

2022-03-25 Thread Robert Raszuk
Les,

There is clearly a valid case for having attributes common to all
applications. Duplicating those on a per application is perhaps a way, but
is this the most efficient way encoding wise ?

Thx,
R.

On Fri, Mar 25, 2022 at 1:02 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> As I have stated in my original comments:
>
>
>
> 
>
> Existing encoding defined in RFCs 8919/8920 is fully functional i.e., the
> introduction of ANY application does not add support for deployment
> scenarios that could not otherwise be supported.
>
> 
>
>
>
> I don’t think there is anything else that needs to be said.
>
>
>
>Les
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Friday, March 25, 2022 4:53 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Shraddha Hegde ; Martin Horneffer <
> m...@lab.dtag.de>; lsr@ietf.org
> *Subject:* Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute (ASLA) Any Application Bit
>
>
>
>
>
> I am afraid not .. I am exploring how to solve requirement. Proposed
> method is natural and reuses ASLA encoding. There is apparently a wall this
> proposal is facing.
>
>
>
> So trying to see how to bypass that wall (or jump over it).
>
>
>
> Cheers,
>
> R.
>
>
>
>
>
>
>
> On Fri, Mar 25, 2022 at 12:50 PM Les Ginsberg (ginsberg) <
> ginsb...@cisco.com> wrote:
>
> Robert –
>
>
>
> I find your post a non sequitur.
>
>
>
> ASLA ANY is NOT discussing advertising a new attribute. It is discussing a
> new format for the ASLA sub-TLV container inside of which link attribute
> values are advertised.
>
>
>
> You are off-topic with your post.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Friday, March 25, 2022 4:41 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Shraddha Hegde ; Martin Horneffer <
> m...@lab.dtag.de>; lsr@ietf.org
> *Subject:* Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute (ASLA) Any Application Bit
>
>
>
> Hi Les,
>
>
>
> Let me clarify if I read you correctly ...
>
>
>
> Are you saying that because quoted RFCs have been published in Oct 2020 no
> one has a right to define any new standard link attributes any more as
> implementations are closed ? Including those which would be common to all
> applications on a link ?
>
>
>
> So if anyone would like to add their own new attributes one needs to
> define new sub-TLV codepoint to encode it outside of ASLA ? If so please
> state it then we could update the draft accordingly.
>
>
>
> Yes deployment of those is a challenge, but I am afraid this is a
> challenge with every protocol extensions unless we use mechanism like
> capabilities which will assist in an automated way.
>
>
>
> Thx,
>
> Robert.
>
>
>
>
>
> On Fri, Mar 25, 2022 at 11:39 AM Les Ginsberg (ginsberg)  40cisco@dmarc.ietf.org> wrote:
>
> Shraddha -
>
> RFCs 8919/8920 were published in October 2020 - and implementations based
> on draft versions of those document date back as much as two years before
> that. They all are written to support the encoding formats defined in the
> RFCs.
>
> The introduction of an additional way of encoding the same information is
> anything but "simplifying". As I have described below, you are imposing a
> requirement for ALL ASLA implementations to support an additional encoding
> format - at least for receiving.
> This does not simplify implementations - it further complicates them. And
> it increases the possibility of interoperability issues.
>
> It also does not simplify deployments. You impose requirements on
> operators to track which of the multiple formats are supported by each of
> the router versions deployed in their network so they can decide when it is
> safe to enable sending of the new format.
>
> For any protocol extension, there are almost always multiple possible
> syntaxes which could have been defined to advertise the necessary
> information. The point of having a standard is to define an agreed upon
> format so that interoperability can be achieved.
>
> Please do not complicate implementations/deployments for a feature which
> already has a fully functional standard and multiple implementations
> deployed based on that standard.
>
>Les
>
>
> > -Original Message-
> > From: Lsr  On Behalf Of Shraddha Hegde
> > Sent: Friday, March 25, 2022 2:22 AM
> > To: Les Ginsberg (ginsberg) ;
> Martin
> > Horneffer ; lsr@ietf.org
> > Subject: Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute
> > (ASLA)

Re: [Lsr] IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit

2022-03-25 Thread Robert Raszuk
I am afraid not. I am exploring how to solve the requirement. The proposed
method is natural and reuses the ASLA encoding. There is apparently a wall
this proposal is facing.

So I am trying to see how to bypass that wall (or jump over it).

Cheers,
R.



On Fri, Mar 25, 2022 at 12:50 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> I find your post a non sequitur.
>
>
>
> ASLA ANY is NOT discussing advertising a new attribute. It is discussing a
> new format for the ASLA sub-TLV container inside of which link attribute
> values are advertised.
>
>
>
> You are off-topic with your post.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Friday, March 25, 2022 4:41 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Shraddha Hegde ; Martin Horneffer <
> m...@lab.dtag.de>; lsr@ietf.org
> *Subject:* Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute (ASLA) Any Application Bit
>
>
>
> Hi Les,
>
>
>
> Let me clarify if I read you correctly ...
>
>
>
> Are you saying that because quoted RFCs have been published in Oct 2020 no
> one has a right to define any new standard link attributes any more as
> implementations are closed ? Including those which would be common to all
> applications on a link ?
>
>
>
> So if anyone would like to add their own new attributes one needs to
> define new sub-TLV codepoint to encode it outside of ASLA ? If so please
> state it then we could update the draft accordingly.
>
>
>
> Yes deployment of those is a challenge, but I am afraid this is a
> challenge with every protocol extensions unless we use mechanism like
> capabilities which will assist in an automated way.
>
>
>
> Thx,
>
> Robert.
>
>
>
>
>
> On Fri, Mar 25, 2022 at 11:39 AM Les Ginsberg (ginsberg)  40cisco@dmarc.ietf.org> wrote:
>
> Shraddha -
>
> RFCs 8919/8920 were published in October 2020 - and implementations based
> on draft versions of those document date back as much as two years before
> that. They all are written to support the encoding formats defined in the
> RFCs.
>
> The introduction of an additional way of encoding the same information is
> anything but "simplifying". As I have described below, you are imposing a
> requirement for ALL ASLA implementations to support an additional encoding
> format - at least for receiving.
> This does not simplify implementations - it further complicates them. And
> it increases the possibility of interoperability issues.
>
> It also does not simplify deployments. You impose requirements on
> operators to track which of the multiple formats are supported by each of
> the router versions deployed in their network so they can decide when it is
> safe to enable sending of the new format.
>
> For any protocol extension, there are almost always multiple possible
> syntaxes which could have been defined to advertise the necessary
> information. The point of having a standard is to define an agreed upon
> format so that interoperability can be achieved.
>
> Please do not complicate implementations/deployments for a feature which
> already has a fully functional standard and multiple implementations
> deployed based on that standard.
>
>Les
>
>
> > -Original Message-
> > From: Lsr  On Behalf Of Shraddha Hegde
> > Sent: Friday, March 25, 2022 2:22 AM
> > To: Les Ginsberg (ginsberg) ;
> Martin
> > Horneffer ; lsr@ietf.org
> > Subject: Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute
> > (ASLA) Any Application Bit
> >
> >
> > I believe we still have an opportunity to simplify ASLA as it is not
> that widely
> > deployed.
> > The inter-operability issues are almost always due to unclear and
> ambiguous
> > documentation
> > of standards. All we need is to ensure is that the protocol  extensions
> have
> > unambiguous documentation.
> >
> > The two main advantages of Any app are efficiency and simplicity.
> > The encoding efficiency of any app for common cases has been
> > demonstrated in the presentation.
> > The one byte overhead that Les brings about is a distraction. It's a
> always a
> > fixed additional one byte
> > for all vs any ,which is negligible whereas the benefits demonstrated
> for any
> > can be more if
> > more attributes fall in the same category.
> >
> > Rgds
> > Shraddha
> >
> >
> >
> > Juniper Business Use Only
> >
> > -Original Message-
> > From: Lsr  On Behalf Of Les Ginsberg (ginsberg)
> > Sent: Thursday, March 24, 2022 9:28 PM
> > To: Martin H

Re: [Lsr] IETF13: Comments on The Application Specific Link Attribute (ASLA) Any Application Bit

2022-03-25 Thread Robert Raszuk
Hi Les,

Let me clarify if I read you correctly ...

Are you saying that because quoted RFCs have been published in Oct 2020 no
one has a right to define any new standard link attributes any more as
implementations are closed ? Including those which would be common to all
applications on a link ?

So if anyone would like to add their own new attributes one needs to define
new sub-TLV codepoint to encode it outside of ASLA ? If so please state it
then we could update the draft accordingly.

Yes, deployment of those is a challenge, but I am afraid this is a
challenge with every protocol extension unless we use a mechanism like
capabilities which will assist in an automated way.

Thx,
Robert.


On Fri, Mar 25, 2022 at 11:39 AM Les Ginsberg (ginsberg)  wrote:

> Shraddha -
>
> RFCs 8919/8920 were published in October 2020 - and implementations based
> on draft versions of those document date back as much as two years before
> that. They all are written to support the encoding formats defined in the
> RFCs.
>
> The introduction of an additional way of encoding the same information is
> anything but "simplifying". As I have described below, you are imposing a
> requirement for ALL ASLA implementations to support an additional encoding
> format - at least for receiving.
> This does not simplify implementations - it further complicates them. And
> it increases the possibility of interoperability issues.
>
> It also does not simplify deployments. You impose requirements on
> operators to track which of the multiple formats are supported by each of
> the router versions deployed in their network so they can decide when it is
> safe to enable sending of the new format.
>
> For any protocol extension, there are almost always multiple possible
> syntaxes which could have been defined to advertise the necessary
> information. The point of having a standard is to define an agreed upon
> format so that interoperability can be achieved.
>
> Please do not complicate implementations/deployments for a feature which
> already has a fully functional standard and multiple implementations
> deployed based on that standard.
>
>Les
>
>
> > -Original Message-
> > From: Lsr  On Behalf Of Shraddha Hegde
> > Sent: Friday, March 25, 2022 2:22 AM
> > To: Les Ginsberg (ginsberg) ;
> Martin
> > Horneffer ; lsr@ietf.org
> > Subject: Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute
> > (ASLA) Any Application Bit
> >
> >
> > I believe we still have an opportunity to simplify ASLA as it is not
> that widely
> > deployed.
> > The inter-operability issues are almost always due to unclear and
> ambiguous
> > documentation
> > of standards. All we need is to ensure is that the protocol  extensions
> have
> > unambiguous documentation.
> >
> > The two main advantages of Any app are efficiency and simplicity.
> > The encoding efficiency of any app for common cases has been
> > demonstrated in the presentation.
> > The one byte overhead that Les brings about is a distraction. It's a
> always a
> > fixed additional one byte
> > for all vs any ,which is negligible whereas the benefits demonstrated
> for any
> > can be more if
> > more attributes fall in the same category.
> >
> > Rgds
> > Shraddha
> >
> >
> >
> > Juniper Business Use Only
> >
> > -Original Message-
> > From: Lsr  On Behalf Of Les Ginsberg (ginsberg)
> > Sent: Thursday, March 24, 2022 9:28 PM
> > To: Martin Horneffer ; lsr@ietf.org
> > Subject: Re: [Lsr] IETF13: Comments on The Application Specific Link
> Attribute
> > (ASLA) Any Application Bit
> >
> > [External Email. Be cautious of content]
> >
> >
> > Martin -
> >
> > I hear you.
> >
> > The reality is that ASLA need not be that complex.
> >
> > In many deployments life is simple. There are a small number of
> applications
> > using the same set of values on each link.
> > From an encoding standpoint, all that needs to be done is to send a
> single
> > ASLA sub-TLV that lists the applications and the link attributes.
> >
> > The use of ALL applications is only an encoding optimization - it isn't
> required.
> > In hindsight, maybe we should never have defined it - but it seemed like
> a
> > nice optimization at the time.
> > But certainly,  we should not further complicate things - both for
> > implementations and deployments - by defining yet another encoding
> > option. As you suggest below, this increases the possibility of
> interoperability
> > issues w/o providing significant benefit.
> >
> > ASLA is needed. There are real world examples where it is necessary to
> > identify on each link which applications are using the advertised link
> attribute
> > values.
> >
> >Les
> >
> >
> > > -Original Message-
> > > From: Lsr  On Behalf Of Martin Horneffer
> > > Sent: Thursday, March 24, 2022 8:18 AM
> > > To: lsr@ietf.org
> > > Subject: Re: [Lsr] IETF13: Comments on The Application Specific Link
> > > Attribute
> > > (ASLA) Any Application Bit
> > >
> > > Dear WG,
> > >
> > > as 

Re: [Lsr] OSPF Monitor Node (draft-retana-lsr-ospf-monitor-node)

2022-03-08 Thread Robert Raszuk
Hi Acee,

Thank you for forwarding this. Yes I personally missed RFC8770 and
discussions on the list about it. It went smooth and quiet during fall 2019
so it was hard to notice :-)

That was exactly what I was looking for. Is there an implementation report
documented anywhere? I checked the LSR WG wiki page but there is not much
content there ...

Best,
Robert.



On Tue, Mar 8, 2022 at 3:11 PM Acee Lindem (acee)  wrote:

> Hi Robert,
>
>
>
> *From: *Robert Raszuk 
> *Date: *Tuesday, March 8, 2022 at 7:00 AM
> *To: *Acee Lindem 
> *Cc: *Aijun Wang , Alvaro Retana <
> alvaro.ret...@futurewei.com>, Lin Han , "
> lsr@ietf.org" 
> *Subject: *Re: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Can you please list those standards ?
>
>
>
> OSPFv3 -- RFC 5340 (Router-LSA R-Bit)
>
> OSPFv2 – RFC 8770
>
>RFC 6870 – Hiding Transit-Only Networks (could be used
> for monitoring link(s))
>
>
>
> Another option is to simply not advertise a Router-LSA, this would not
> prevent the adjacency from coming up and the bi-directional check in the
> OSPF SPF would prevent the router from being added to the OSPF topology.
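
[Editorial illustration] The bi-directional check mentioned above can be
sketched as follows. This is a rough Python sketch, not an OSPF
implementation: it computes reachability only (no costs, no network LSAs),
and the LSDB layout, router names and function name are hypothetical. "M"
stands for a monitor node that has an adjacency with R1 but originates no
Router-LSA.

```python
# Hypothetical LSDB: router ID -> set of neighbors listed in its Router-LSA.
# "M" appears as a neighbor of R1 but originates no Router-LSA of its own.
LSDB = {
    "R1": {"R2", "M"},
    "R2": {"R1", "R3"},
    "R3": {"R2"},
}


def reachable(lsdb, root):
    """Walk the LSDB from `root`, following only links advertised by both ends."""
    seen, stack = {root}, [root]
    while stack:
        node = stack.pop()
        for nbr in lsdb.get(node, set()):
            # Bidirectional check: nbr must have an LSA that links back to node.
            if nbr in seen or node not in lsdb.get(nbr, set()):
                continue
            seen.add(nbr)
            stack.append(nbr)
    return seen
```

In this sketch reachable(LSDB, "R1") contains R1, R2 and R3 but not M,
because M has no LSA linking back to R1.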
>
>
>
> So, the only gaps we have here are in the understanding of the OSPF
> protocol and reading of the previous Email thread (hopefully, neither of
> those will require standardization).
>
>
>
> Thanks,
>
> Acee
>
>
>
>
>
> Thank you,
>
> R.
>
>
>
> On Tue, Mar 8, 2022 at 12:36 PM Acee Lindem (acee)  wrote:
>
> Hi Robert,
>
>
>
> *From: *Robert Raszuk 
> *Date: *Tuesday, March 8, 2022 at 4:09 AM
> *To: *Acee Lindem 
> *Cc: *Aijun Wang , Alvaro Retana <
> alvaro.ret...@futurewei.com>, Lin Han , "
> lsr@ietf.org" 
> *Subject: *Re: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Hi Acee,
>
>
>
> Imagine that I would like to place bunch of IGP nodes as anchors just for
> the purpose of network testing ... Never to include them in topology for
> transit.
>
>
>
> There are already standards to do this in both OSPFv2 and OSPFv3. No gaps…
>
>
>
> Thanks,
> Acee
>
>
>
> How would I advertise SR segment endpoint (say using SR-MPLS) from such
> nodes to construct paths ? Sure we could play with max-metric,  but as we
> discussed recently those nodes marked as such are still part of full
> topology graph - just being discouraged to be used.
>
>
>
> That is why I asked for extension to be a controller. IMO there is gap
> between passive node and active node which would be cool to fill.
>
>
>
> Thx,
> R.
>
>
>
>
>
>
>
>
>
>
>
> On Tue, Mar 8, 2022 at 4:02 AM Acee Lindem (acee)  wrote:
>
> Hi Aijun,
>
>
>
>
>
>
>
> *From: *Aijun Wang 
> *Date: *Monday, March 7, 2022 at 9:41 PM
> *To: *Acee Lindem , Robert Raszuk ,
> 'Alvaro Retana' 
> *Cc: *'Lin Han' , "lsr@ietf.org" 
> *Subject: *RE: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Hi, Acee:
>
>
>
> The R-bit/H-bit is used to divert the transit traffic, but there will still
> be traffic to the advertising node itself.
>
> It seems that the monitor node just wants the topology information from
> the network, but not any forwarding traffic?
>
> In my POV, these special nodes are all connected by a “Stub Link”; we
> can unify them under different “Stub Link” Types:
>
> For example:
>
> For an R-bit(Clear)/H-bit(Set) Node, the “Stub Link” Type should be “Passive
> Only Mode”, that is, the interface in such a mode will only receive
> LSAs from the other end, but does not advertise any LSA to the other end.
>
> For a Monitor Node, the “Stub Link” Type should be “Active Only Mode”, that
> is, the interface in such a mode will only send LSAs to the other end, but
> does not receive any LSA from the other end.
>
>
>
> If you reread my recommendation you’ll note that to avoid local traffic,
> you simply don’t advertise the stub links. Why would you advertise them
> with an option not to use them?  All the machinery for passive
> monitoring exists, no need to invent anything.
>
>
>
> Thanks,
> Acee
>
>
>
>
>
> Should we unify such requirements in this way, then?
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* lsr-boun...@ietf.org  *On Behalf Of *Acee
> Lindem (acee)
> *Sent:* Monday, March 7, 2022 11:57 PM
> *To:* Robert Raszuk ; Alvaro Retana <
> alvaro.ret...@futurewei.com>
> *Cc:* Lin Han ; lsr@ietf.org
> *Subject:* Re: [Lsr] OSPF Monitor Node
> (draft-retana-l

Re: [Lsr] OSPF Monitor Node (draft-retana-lsr-ospf-monitor-node)

2022-03-08 Thread Robert Raszuk
Can you please list those standards ?

Thank you,
R.

On Tue, Mar 8, 2022 at 12:36 PM Acee Lindem (acee)  wrote:

> Hi Robert,
>
>
>
> *From: *Robert Raszuk 
> *Date: *Tuesday, March 8, 2022 at 4:09 AM
> *To: *Acee Lindem 
> *Cc: *Aijun Wang , Alvaro Retana <
> alvaro.ret...@futurewei.com>, Lin Han , "
> lsr@ietf.org" 
> *Subject: *Re: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Hi Acee,
>
>
>
> Imagine that I would like to place bunch of IGP nodes as anchors just for
> the purpose of network testing ... Never to include them in topology for
> transit.
>
>
>
> There are already standards to do this in both OSPFv2 and OSPFv3. No gaps…
>
>
>
> Thanks,
> Acee
>
>
>
> How would I advertise SR segment endpoint (say using SR-MPLS) from such
> nodes to construct paths ? Sure we could play with max-metric,  but as we
> discussed recently those nodes marked as such are still part of full
> topology graph - just being discouraged to be used.
>
>
>
> That is why I asked for extension to be a controller. IMO there is gap
> between passive node and active node which would be cool to fill.
>
>
>
> Thx,
> R.
>
>
>
>
>
>
>
>
>
>
>
> On Tue, Mar 8, 2022 at 4:02 AM Acee Lindem (acee)  wrote:
>
> Hi Aijun,
>
>
>
>
>
>
>
> *From: *Aijun Wang 
> *Date: *Monday, March 7, 2022 at 9:41 PM
> *To: *Acee Lindem , Robert Raszuk ,
> 'Alvaro Retana' 
> *Cc: *'Lin Han' , "lsr@ietf.org" 
> *Subject: *RE: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Hi, Acee:
>
>
>
> The R-bit/H-bit is used to divert the transit traffic, but there will still
> be traffic to the advertising node itself.
>
> It seems that the monitor node just wants the topology information from
> the network, but not any forwarding traffic?
>
> In my POV, these special nodes are all connected by a “Stub Link”; we
> can unify them under different “Stub Link” Types:
>
> For example:
>
> For an R-bit(Clear)/H-bit(Set) Node, the “Stub Link” Type should be “Passive
> Only Mode”, that is, the interface in such a mode will only receive
> LSAs from the other end, but does not advertise any LSA to the other end.
>
> For a Monitor Node, the “Stub Link” Type should be “Active Only Mode”, that
> is, the interface in such a mode will only send LSAs to the other end, but
> does not receive any LSA from the other end.
>
>
>
> If you reread my recommendation you’ll note that to avoid local traffic,
> you simply don’t advertise the stub links. Why would you advertise them
> with an option not to use them?  All the machinery for passive
> monitoring exists, no need to invent anything.
>
>
>
> Thanks,
> Acee
>
>
>
>
>
> Should we unify such requirements in this way, then?
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* lsr-boun...@ietf.org  *On Behalf Of *Acee
> Lindem (acee)
> *Sent:* Monday, March 7, 2022 11:57 PM
> *To:* Robert Raszuk ; Alvaro Retana <
> alvaro.ret...@futurewei.com>
> *Cc:* Lin Han ; lsr@ietf.org
> *Subject:* Re: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Speaking as WG member:
>
>
>
> I was going to wait to comment on this due to more important tasks but it
> appears the discussion is under way. This requirement surfaced about 25-30
> years back. In fact, there was one SP (who will remain anonymous) that
> actually had an OSPF monitoring function that kept OSPF neighbors in
> Exchange state indefinitely just to learn the topology w/o participating in
> it. This wreaked havoc with implementations trying to recover sessions that
> weren’t making progress in transition to Full state.
>
>
>
> For OSPFv3, we already have and have always had the Router-LSA R-bit to
> prevent a router from being used in the topology.
>
>
>
> In OSPFv2, we have RFC 8770 which prevents an OSPFv2 router from being
> used for transit traffic. Now you can argue the stub links are still being.
> However, for these you could either use an unnumbered link or simply omit
> the stub-links from your router LSA. Or use RFC 6860 to hide them.
>
>
>
> Now one could argue that you still have these links in your topology.
> However, they are essentially “bridges to nowhere”. If you really don’t
> want them, then just don’t advertise them in the monitoring node’s
> Router-LSA.
>
>
>
> After 30 years of this requirement already being satisfied, I see no
> reason to introduce new machinery into the protocols. To me, this seems
> like a draft that the OSPF protocol(

Re: [Lsr] OSPF Monitor Node (draft-retana-lsr-ospf-monitor-node)

2022-03-08 Thread Robert Raszuk
Hi Acee,

Imagine that I would like to place a bunch of IGP nodes as anchors just for
the purpose of network testing ... never to include them in the topology for
transit.

How would I advertise an SR segment endpoint (say using SR-MPLS) from such
nodes to construct paths? Sure, we could play with max-metric, but as we
discussed recently, nodes marked as such are still part of the full
topology graph - they are just discouraged from being used.
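
To illustrate the max-metric point, here is a minimal sketch (not taken from any draft or implementation). It assumes a toy four-node topology, a hypothetical test node "T" that advertises the maximum link metric on its own links, and a value of 0xFFFF for that metric. The function names are made up for illustration. A plain shortest-path run avoids "T" while an alternative exists, yet "T" remains in the graph and is used for transit as soon as the alternative disappears.

```python
import heapq

# Assumed value: the maximum OSPF link metric used for stub-router style advertisement.
MAX_METRIC = 0xFFFF


def shortest_paths(graph, source):
    """Plain Dijkstra over {node: {neighbor: cost}}; returns {node: (cost, path)}."""
    best = {source: (0, [source])}
    heap = [(0, source, [source])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale heap entry, a cheaper path was already found
        for nbr, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if nbr not in best or new_cost < best[nbr][0]:
                best[nbr] = (new_cost, path + [nbr])
                heapq.heappush(heap, (new_cost, nbr, path + [nbr]))
    return best


def without_node(graph, removed):
    """Return a copy of the graph with one node and all links to it removed."""
    return {
        node: {n: c for n, c in nbrs.items() if n != removed}
        for node, nbrs in graph.items()
        if node != removed
    }


# Hypothetical topology: A and B are connected via C (normal) and via T (test node).
# T advertises MAX_METRIC on its outgoing links, so transit through T is discouraged.
TOPOLOGY = {
    "A": {"T": 10, "C": 10},
    "B": {"T": 10, "C": 10},
    "C": {"A": 10, "B": 10},
    "T": {"A": MAX_METRIC, "B": MAX_METRIC},
}

if __name__ == "__main__":
    print(shortest_paths(TOPOLOGY, "A")["B"])                     # path via C
    print(shortest_paths(without_node(TOPOLOGY, "C"), "A")["B"])  # falls back to T
```

With C present the path from A to B avoids T; once C is removed the only remaining path runs through T, which is the "discouraged but still in the graph" behavior described above.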

That is why I asked for the extension to be a controller. IMO there is a gap
between a passive node and an active node which would be cool to fill.

Thx,
R.





On Tue, Mar 8, 2022 at 4:02 AM Acee Lindem (acee)  wrote:

> Hi Aijun,
>
>
>
>
>
>
>
> *From: *Aijun Wang 
> *Date: *Monday, March 7, 2022 at 9:41 PM
> *To: *Acee Lindem , Robert Raszuk ,
> 'Alvaro Retana' 
> *Cc: *'Lin Han' , "lsr@ietf.org" 
> *Subject: *RE: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Hi, Acee:
>
>
>
> The R-bit/H-bit is used to divert the transit traffic, but there will still
> be traffic to the advertising node itself.
>
> It seems that the monitor node just wants the topology information from
> the network, but not any forwarding traffic?
>
> In my POV, these special nodes are all connected by a “Stub Link”; we
> can unify them under different “Stub Link” Types:
>
> For example:
>
> For an R-bit(Clear)/H-bit(Set) Node, the “Stub Link” Type should be “Passive
> Only Mode”, that is, the interface in such a mode will only receive
> LSAs from the other end, but does not advertise any LSA to the other end.
>
> For a Monitor Node, the “Stub Link” Type should be “Active Only Mode”, that
> is, the interface in such a mode will only send LSAs to the other end, but
> does not receive any LSA from the other end.
>
>
>
> If you reread my recommendation you’ll note that to avoid local traffic,
> you simply don’t advertise the stub links. Why would you advertise them
> with an option not to use them?  All the machinery for passive
> monitoring exists, no need to invent anything.
>
>
>
> Thanks,
> Acee
>
>
>
>
>
> Should we unify such requirements in this way, then?
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* lsr-boun...@ietf.org  *On Behalf Of *Acee
> Lindem (acee)
> *Sent:* Monday, March 7, 2022 11:57 PM
> *To:* Robert Raszuk ; Alvaro Retana <
> alvaro.ret...@futurewei.com>
> *Cc:* Lin Han ; lsr@ietf.org
> *Subject:* Re: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Speaking as WG member:
>
>
>
> I was going to wait to comment on this due to more important tasks but it
> appears the discussion is under way. This requirement surfaced about 25-30
> years back. In fact, there was one SP (who will remain anonymous) that
> actually had an OSPF monitoring function that kept OSPF neighbors in
> Exchange state indefinitely just to learn the topology w/o participating in
> it. This wreaked havoc with implementations trying to recover sessions that
> weren’t making progress in transition to Full state.
>
>
>
> For OSPFv3, we already have and have always had the Router-LSA R-bit to
> prevent a router from being used in the topology.
>
>
>
> In OSPFv2, we have RFC 8770 which prevents an OSPFv2 router from being
> used for transit traffic. Now you can argue the stub links are still being.
> However, for these you could either use an unnumbered link or simply omit
> the stub-links from your router LSA. Or use RFC 6860 to hide them.
>
>
>
> Now one could argue that you still have these links in your topology.
> However, they are essentially “bridges to nowhere”. If you really don’t
> want them, then just don’t advertise them in the monitoring node’s
> Router-LSA.
>
>
>
> After 30 years of this requirement already being satisfied, I see no
> reason to introduce new machinery into the protocols. To me, this seems
> like a draft that the OSPF protocol(s) and LSR WG could do better without.
>
>
>
> Thanks,
> Acee
>
>
>
> *From: *Lsr  on behalf of Robert Raszuk <
> rob...@raszuk.net>
> *Date: *Monday, March 7, 2022 at 9:59 AM
> *To: *Alvaro Retana 
> *Cc: *Lin Han , "lsr@ietf.org" 
> *Subject: *Re: [Lsr] OSPF Monitor Node
> (draft-retana-lsr-ospf-monitor-node)
>
>
>
> Hi Alvaro,
>
>
>
> Practically speaking, yes Monitor nodes are cool to have. But so are the
> Controller nodes. The difference would be that in both cases there is no
> topology information being injected by such nodes, however in the latter
> case the additional information could be injected.
>
>
>
> Such information could be related to p

Re: [Lsr] OSPF Monitor Node (draft-retana-lsr-ospf-monitor-node)

2022-03-07 Thread Robert Raszuk
Hi Alvaro,

Practically speaking, yes Monitor nodes are cool to have. But so are the
Controller nodes. The difference would be that in both cases there is no
topology information being injected by such nodes, however in the latter
case the additional information could be injected.

Such information could be related to providing extra data to computation of
topologies by other "Full IGP nodes" or could also be injecting or relaying
discovery information related to IGP or BGP (for example RRs).

Have you considered widening the scope a bit to accomplish this extra
delta ?

Thx
Robert


On Mon, Mar 7, 2022 at 1:17 PM Alvaro Retana 
wrote:

>
>
> Hi!
>
> Lin and I just published a draft that specifies mechanisms for an active
> OSPF monitor: one that can be authenticated into the network but does not
> affect the topology.  This mechanism contrasts to a passive monitor:
> listen-only node on a multiaccess link.
>
> The primary prompt for this work is that we have some applications where
> the monitor node will be on the other end of a p2p interface.  Therefore,
> we have described a mechanism for that case (Section 3: Monitoring
> Interface), and one for the general case where the monitor node can be
> present on any interface (Section 4: The Monitor Node Option).
>
> Please take a look and send comments.
>
>
> https://datatracker.ietf.org/doc/html/draft-retana-lsr-ospf-monitor-node
>
>
> Thanks!
>
> Alvaro.
> ___
> Lsr mailing list
> Lsr@ietf.org
> https://www.ietf.org/mailman/listinfo/lsr
>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-pkaneria-lsr-multi-tlv-00.txt

2022-03-02 Thread Robert Raszuk
Hi Tony,


> If FlexAlgo is adopted, then we should expect that it will get stressed
> and that further conditions will be added to a FAD. The problem will only
> become worse the more success that FlexAlgo has. It sounds to me like you
> want FlexAlgo to fail, which seems strange.


Well I have a bit of a different view on this one.

If we keep networks simple (which is what IMHO we should do) I don't see
that we are anywhere close to exceeding FAD limits as per today's encoding.

Now if on the other hand we load tons of application awareness into the
network and flatten all of this to the IGP transport layer - sure you are
right. That limit and, what's worse, today's RE CPU and RAM capacity will be
exceeded. But do we really want this ? Would you call this a FlexAlgo
success ?

Isn't it better to keep all of the smartness as the overlay ? After all,
wherever you look around everything is transported encapsulated already.

Thx,
Robert


Re: [Lsr] Adoption Question Stub-Link vs RFC5316

2022-02-18 Thread Robert Raszuk
Hi Chris,

> Tony Li (at least) seemed to think that it was useful to be able to attach
> TE attributes to a link, not just to prefixes. Perhaps I've missed this in
> the thread but what current mechanism (rfc?) are you referring to, to
> identify a link and attach TE attributes to it?


I have two questions which are perhaps more for the WG & Chairs than for the
authors of this draft.

1. While it is perhaps a good thing to construct better transport paths in
the network, this draft puts client data into the network. So to me it
looks like client and application awareness injection is being mixed with
the transport layer. I am not sure if there is common agreement that IGPs
should do that now.

2. If we know that the proposed solution may work only on a subset of links
and only in specific flat topologies, do we still proceed ?

Thx,
R.


Re: [Lsr] Adoption Question Stub-Link vs RFC5316

2022-02-18 Thread Robert Raszuk
Hi Aijun,

I do have some sympathy for what you are trying to do here. But I also have
a concern about whether the IGP is the right protocol to load with, and use
to disseminate, information that is completely opaque to its main role.

Imagine your CAN use case and use of areas with summarization. The ingress
nodes/machines may be far away in different areas. How are you planning to
pass the information around ? Leak all stub links along with the new
information attached to them all over the domain ?

Thx,
R.










On Fri, Feb 18, 2022 at 9:39 AM Aijun Wang 
wrote:

> Hi, Peter:
>
> I think you may not have time to follow the previous discussions.
> Please refer to
> https://mailarchive.ietf.org/arch/msg/lsr/molRRoWXOBhaHFc5GPAPmvVISDs/ for
> my summary responses on how to apply the solution in various scenarios,
> including the unnumbered scenario (we have updated the draft during the
> adoption call process).
> We have discussed the drawbacks of using RFC5316 to solve such requirements
> intensely, and I think we need not loop back to it again.
>
> I can also give you (and other LSR experts) another piece of information
> on a potential application of this draft:
> The CAN (Computing-Aware Network) BOF has been established; here are its
> contents:
> https://datatracker.ietf.org/doc/bofreq-liu-computing-aware-networking-can/
>
> At the end of its description section, we can see:
> "This work may then result in an analysis of which protocols/extensions are
> required to discover, advertise the location and status of, dynamically
> share, and load-balance between services using computation resources
> and those resources themselves."
>
> Besides the solution's advantages compared to RFC 5316, the CAN-related
> scenarios are also one driver for moving this draft forward, as also
> described in the reference document
> https://datatracker.ietf.org/doc/html/draft-dunbar-lsr-5g-edge-compute-03
>
>
> Best Regards
>
> Aijun Wang
> China Telecom
>
> -Original Message-
> From: lsr-boun...@ietf.org  On Behalf Of Peter
> Psenak
> Sent: Friday, February 18, 2022 3:52 PM
> To: Christian Hopps ; lsr 
> Subject: Re: [Lsr] Adoption Question Stub-Link vs RFC5316
>
> Chris,
>
> the draft attempts to use the local subnet information for identifying two
> endpoints of the same link. That seems wrong in itself. In addition:
>
> 1) We have link local/remote IDs (and IP addresses) to pair the two
> endpoints of the link in both OSPF and ISIS. We do not need another
> mechanism for the same.
>
> 2) What is proposed does not work for unnumbered links.
>
> thanks,
> Peter
>
>
>
> On 18/02/2022 05:45, Christian Hopps wrote:
> > [As WG Chair]
> >
> > Hi LSR-WG,
> >
> > As my co-chair has joined the draft as a co-author, making the call on
> > whether we have rough consensus to adopt
> > draft-wang-lsr-stub-link-attributes-02 now falls to me alone.
> >
> > I've reread the numerous emails on this adoption call and I see some
> > support, and a few objections, and most of the objections are not that
> > there is no problem to solve here, but they think this draft isn't the
> > right way to do it and a revision of RFC5316 could be done instead.
> >
> > "A bird in the hand is worth two in the bush"
> >
> > While it might be nice that there is another way to accomplish things by
> > re-using an existing TLV, that work has not been done, whereas we have a
> > written draft in front of us -- that has now been beaten up and reviewed a
> > good deal -- that does seem to provide a solution to an actual problem.
> >
> > So I'd like to give the WG a final chance to comment here, is there a
> > strongly compelling reason to reject the work that is done here.
> > Examples of "strongly compelling" would be something like "This will
> > break the (IS-IS) decision process" or "this will badly affect
> > scaling" or "this will significantly complicate a protocol
> > implementation", but not "this can be done differently" as the latter
> > is work not done (i.e., it's two birds "in the bush")
> >
> > I am *not* looking to rehash the entire discussion we've already had so
> please restrict your replies to the above question only.
> >
> > Thanks,
> > Chris.
> > [As WG Chair]
> >


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-13 Thread Robert Raszuk
Gyan,

The OSPF draft you quote makes no assumptions about, and places no
restrictions on, how the BFD session itself is set up.

So yes, the procedures described in draft-ietf-bfd-unsolicited could be used
as a way to bring up the BFD session between peers.

Rgs,
R.


On Sun, Feb 13, 2022 at 9:05 PM Gyan Mishra  wrote:

>
> Hi Robert
>
> Would this BFD strict mode apply to unsolicited BFD, of which you are an
> author?
>
> https://datatracker.ietf.org/doc/html/draft-ietf-bfd-unsolicited-03
>
> If applicable, I think it would be a good idea.
>
> Many Thanks
>
> Gyan
> On Thu, Feb 10, 2022 at 10:59 AM Acee Lindem (acee)  40cisco@dmarc.ietf.org> wrote:
>
>> Hi Robert,
>>
>> This is great to hear – I thought you wanted to make this required for
>> implementation as opposed to a recommendation.
>>
>> Thanks,
>>
>> Acee
>>
>>
>>
>> *From: *Robert Raszuk 
>> *Date: *Thursday, February 10, 2022 at 10:57 AM
>> *To: *Acee Lindem 
>> *Cc: *"lsr@ietf.org" , "
>> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org" <
>> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org>
>> *Subject: *Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
>> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>
>>
>>
>> Hi Acee,
>>
>>
>>
>> > There was debate regarding making the delay timer described in section
>> > 5 a normative requirement.
>>
>>
>>
>> I see the following text added to section 5 in the new version of the
>> draft:
>>
>>
>>
>>    The use of OSPF BFD strict-mode along with mechanisms such as
>>    hold-down *(a delay in the initial OSPF adjacency bringup following
>>    BFD session establishment)* and/or dampening *(a delay in the OSPF
>>    adjacency bringup following failure detected by BFD)* may help
>>    reduce the frequency of adjacency flaps and therefore reduce the
>>    associated routing churn.
>>
>>
>>
>> Not sure if this is normative or informative, but it addresses my point.
>>
>>
>>
>> Thx,
>>
>> Robert.
>>
>>
>>
>> On Thu, Feb 10, 2022 at 4:50 PM Acee Lindem (acee) > 40cisco@dmarc.ietf.org> wrote:
>>
>> The WG last call has all but ended and we’ve had a lot of support, two
>> implementations, and some good discussion. Please review the -05 version of
>> the draft, which includes changes reflecting this discussion. There
>> was debate regarding making the delay timer described in section 5 a
>> normative requirement. The consensus was to not make this a normative part
>> of the specification. I feel this is the right decision – especially given
>> that this is new functionality being requested at Working Group Last Call
>> and implementations accomplish the dampening in various ways.
>>
>>
>>
>> https://datatracker.ietf.org/doc/draft-ietf-lsr-ospf-bfd-strict-mode/
>>
>>
>>
>> Thanks,
>>
>> Acee
>>
>>
>>
>> *From: *Lsr  on behalf of "Acee Lindem (acee)"
>> 
>> *Date: *Thursday, January 27, 2022 at 12:09 PM
>> *To: *"lsr@ietf.org" 
>> *Cc: *"draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org" <
>> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org>
>> *Subject: *[Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
>> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>
>>
>>
>> LSR WG,
>>
>>
>>
>> This begins a two-week last call for the subject draft. Please indicate
>> your support or objection on this list prior to 12:00 AM UTC on February
>> 11th, 2022. Also, review comments are certainly welcome.
>>
>> Thanks,
>> Acee
>>
>>
>>
>>
> --
>
> <http://www.verizon.com/>
>
> *Gyan Mishra*
>
> *Network Solutions A**rchitect *
>
> *Email gyan.s.mis...@verizon.com *
>
>
>
> *M 301 502-1347*
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-10 Thread Robert Raszuk
Hi Acee,

> There was debate regarding making the delay timer described in section 5
> a normative requirement.

I see the following text added to section 5 in the new version of the draft:

   The use of OSPF BFD strict-mode along with mechanisms such as
   hold-down *(a delay in the initial OSPF adjacency bringup following
   BFD session establishment)* and/or dampening *(a delay in the OSPF
   adjacency bringup following failure detected by BFD)* may help
   reduce the frequency of adjacency flaps and therefore reduce the
   associated routing churn.

Not sure if this is normative or informative, but it addresses my point.
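
As a rough illustration of the two mechanisms named in that text, the sketch below models them as client-side timers. It is only a sketch under stated assumptions: the class name, the `hold_down` and `dampening` knobs, and the method names are hypothetical and do not come from the draft or from any implementation, which (as noted on the list) handle this in different ways.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrictModeGate:
    """Decides when OSPF may start adjacency bringup in a strict-mode-like setup."""

    hold_down: float  # assumed knob: seconds BFD must stay Up before bringup
    dampening: float  # assumed knob: seconds to wait after a BFD-detected failure
    bfd_up_since: Optional[float] = None
    last_failure: Optional[float] = None

    def on_bfd_up(self, now: float) -> None:
        self.bfd_up_since = now

    def on_bfd_down(self, now: float) -> None:
        self.bfd_up_since = None
        self.last_failure = now

    def may_bring_up_adjacency(self, now: float) -> bool:
        if self.bfd_up_since is None:
            return False  # strict mode: no BFD session, no adjacency
        if now - self.bfd_up_since < self.hold_down:
            return False  # hold-down: BFD has not been Up long enough yet
        if self.last_failure is not None and now - self.last_failure < self.dampening:
            return False  # dampening: too soon after the last BFD failure
        return True


if __name__ == "__main__":
    gate = StrictModeGate(hold_down=2.0, dampening=10.0)
    gate.on_bfd_up(0.0)
    print(gate.may_bring_up_adjacency(1.0))  # still inside hold-down
    print(gate.may_bring_up_adjacency(2.0))  # hold-down satisfied
```

Times are plain floats here so that the logic can be exercised without a real clock.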

Thx,
Robert.

On Thu, Feb 10, 2022 at 4:50 PM Acee Lindem (acee)  wrote:

> The WG last call has all but ended and we’ve had a lot of support, two
> implementations, and some good discussion. Please review the -05 version of
> the draft, which includes changes reflecting this discussion. There
> was debate regarding making the delay timer described in section 5 a
> normative requirement. The consensus was to not make this a normative part
> of the specification. I feel this is the right decision – especially given
> that this is new functionality being requested at Working Group Last Call
> and implementations accomplish the dampening in various ways.
>
>
>
> https://datatracker.ietf.org/doc/draft-ietf-lsr-ospf-bfd-strict-mode/
>
>
>
> Thanks,
>
> Acee
>
>
>
> *From: *Lsr  on behalf of "Acee Lindem (acee)"
> 
> *Date: *Thursday, January 27, 2022 at 12:09 PM
> *To: *"lsr@ietf.org" 
> *Cc: *"draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org" <
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org>
> *Subject: *[Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" -
> draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> LSR WG,
>
>
>
> This begins a two-week last call for the subject draft. Please indicate
> your support or objection on this list prior to 12:00 AM UTC on February
> 11th, 2022. Also, review comments are certainly welcome.
>
> Thanks,
> Acee
>
>


Re: [Lsr] Lsr Digest, Vol 49, Issue 24

2022-02-07 Thread Robert Raszuk
Hi Albert,

> This is precisely the issue that this draft intends to address,
> making sure that OSPF is not established, until BFD is UP.

What this draft tries to address is obvious. The real issue, however, is
the true meaning of the "BFD UP" trigger.

Some folks, perhaps including yourself, naturally and intuitively think
that BFD UP means that the BFD session has been established and the data
plane has been confirmed to work between the local and remote node over the
subject link.

That is unfortunately not the case, modulo local implementation hacks. BFD
RFC5880 says that BFD UP only means that BFD session signaling completed.
It says nothing about the actual data plane being tested at least once
between two peers on the very link on which you are going to bring the OSPF
adj UP.

RFC5882 recommends implementation of BFD dampening, but as we established,
this does not cover the point of the discussion. It also discusses the
interaction between BFD and its client(s) when a session goes DOWN. It is
pretty silent on the session UP event, leaving this pretty open to the
implementations.

*Proposal: *

If you do not want to add any text describing any recommended behavior on
the client, why not instead add a single sentence defining that, in the
context of this draft, the "BFD UP" trigger as received by the client means
not only BFD session UP but also that at least one full test cycle has passed
successfully ? Is there any harm associated with adding such a statement ?
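
One possible reading of that sentence can be sketched as follows. This is an assumption-laden illustration, not text from RFC5880 or the draft: "one full test cycle" is interpreted here as receiving `detect_mult` packets while the session is already Up, and the class and method names are hypothetical.

```python
class BfdUpNotifier:
    """Reports UP to the client only after the session is Up AND has been exercised."""

    def __init__(self, detect_mult: int):
        self.detect_mult = detect_mult  # packets that make up one assumed test cycle
        self.session_up = False
        self.received_while_up = 0

    def on_session_state(self, up: bool) -> None:
        # Any state change restarts the count, so a flap needs a fresh cycle.
        self.session_up = up
        self.received_while_up = 0

    def on_packet_received(self) -> None:
        # Only packets seen after signaling completed count toward the cycle.
        if self.session_up:
            self.received_while_up += 1

    def client_sees_up(self) -> bool:
        return self.session_up and self.received_while_up >= self.detect_mult


if __name__ == "__main__":
    notifier = BfdUpNotifier(detect_mult=3)
    notifier.on_session_state(True)
    print(notifier.client_sees_up())  # signaling done, but no test cycle yet
    for _ in range(3):
        notifier.on_packet_received()
    print(notifier.client_sees_up())  # one assumed test cycle completed
```

Whether such a gate belongs in BFD, in the client, or in between is exactly the open question in this thread; the sketch only shows the condition itself.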

Regards,
R.

PS,

> We know that BFD UP, as it stands today, does not mean that the link is
> 100% good (e.g. MTU-sized packets might not get through). IMO, link
> quality issue is outside the scope of this draft.

That is not the point. The point is to make sure the link really works in
the data plane. The extra testing is indeed out of scope and, as I mentioned,
it was just an example.





On Mon, Feb 7, 2022 at 3:51 PM Albert Fu (BLOOMBERG/ 120 PARK) <
af...@bloomberg.net> wrote:

> Hi Robert,
>
> This is precisely the issue that this draft intends to address, making
> sure that OSPF is not established, until BFD is UP.
>
> This ensures that there is a mechanism to quickly detect BFD failures, and
> avoid having to rely on lengthy OSPF protocol timer for failure detection
> (where OSPF is UP without BFD).
>
> If there's an issue where BFD packets can not get through after OSPF is
> UP, the fast detect mechanism will kick in as per the configured
> timer/multiplier, and bring OSPF down, diverting traffic away from the link.
>
> We know that BFD UP, as it stands today, does not mean that the link is
> 100% good (e.g. MTU-sized packets might not get through). IMO, link quality
> issue is outside the scope of this draft.
>
> Thanks
> Albert
>
> From: Robert Raszuk  To: Les Ginsberg <
> ginsb...@cisco.com>
> Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
> - draft-ietf-lsr-ospf-bfd-strict-mode-04
> Date: Mon, 7 Feb 2022 00:31:04 +0100
>
> ..
>
> When the interface goes UP OSPF will not bring adj UP till BFD comes UP.
> But then we are back in square one as OSPF adj comes UP and BFD after a
> full cycle of testing brings it back down. So what have we accomplished
> with this draft/RFC - nothing.
>
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-06 Thread Robert Raszuk
Les,

Please kindly present the facts.

The facts are that equivalent functionality in OSPF which has been approved
for years uses a configurable timer which allows both: to wait for BFD as
well as to make sure that BFD stays UP till that timer expires. The reason I
even started this discussion was your threat *that this timer will be
removed* once this draft goes to RFC.

So today, without this timer, when the interface goes UP both OSPF and BFD
proceed in parallel and OSPF can win. That is bad, as BFD coming UP and
shortly afterwards going down causes routing churn and packet drops. Note
that this can happen in the vast majority of cases when either links have
problems with unicast (possible but pretty rare) or when BFD came up some
time (say 100 ms) after OSPF brought the adj. UP and then BFD declared
failure during Echo packet exchange.

So how does the situation above change with this draft ...

When the interface goes UP OSPF will not bring adj UP till BFD comes UP.
But then we are back in square one as OSPF adj comes UP and BFD after a
full cycle of testing brings it back down. So what have we accomplished
with this draft/RFC - nothing.

In both cases you have to be pretty unlucky to get a link failure
between OSPF adj UP and a full BFD cycle of Echo packets. But the entire
purpose of this draft is to address that very unfortunate sequence of
events.

We all seem to agree we need to wait a bit longer. We have no agreement
whatsoever across WGs on who should take this on their shoulders. Should it
be BFD that delays the UP notification to the client, should it be the client
that takes UP but only moves on if no DOWN was seen within a timer, or maybe
it should be a middleman like the RIB, which is likely acting as a postman
here ?

I still believe all of this should be at least reflected in the draft even
if the conclusion is to leave it up to implementation choice.

IMO refusing to even mention this is equal to proceeding with an
architecturally broken specification.

Cheers,
Robert.

On Mon, Feb 7, 2022 at 12:14 AM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
> I have brought this in the context of the wait-for-bfd OSPF proposal. This
> is the first time LSR WG is facing such a requirement so IMO it would be
> proper to at least discuss this in the draft.
>
>
>
> *[LES:] Well – no – that statement isn’t true.*
>
> *The strict-mode drafts (OSPF and BGP) are specifying behavior which has
> long been deployed.*
>
> *IS-IS specified this in RFC 6213 many years ago.*
>
> *Proprietary implementations of the equivalent functionality in OSPF have
> been deployed for many years – but they lack a means to successfully
> interoperate with implementations which do not have the functionality
> and/or are not configured to enable it.*
>
>
>
> *All this draft is doing is defining protocol extensions for OSPF to
> support strict-mode as it has been deployed for many years.*
>
> *As such, most of the discussion is out of scope and we should simply
> approve the document.*
>
>
>
> *It is both understandable and potentially useful that the context here
> has revived other concerns that you may have had for a long time. But
> addressing those concerns is new work, outside the scope of this draft, and
> likely demands a broader audience than LSR WG provides.*
>
>
>
> *Let’s move on with this draft as is.*
>
> *If you or others want to pursue new work related to this functionality,
> please do so – but NOT in the context of this draft.*
>
>
>
> *   Les*
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-06 Thread Robert Raszuk
Hi Chris,

> but I don't see how it is OSPF specific

I have brought this in the context of the wait-for-bfd OSPF proposal. This
is the first time LSR WG is facing such a requirement so IMO it would be
proper to at least discuss this in the draft.

And if so, all I suggested was to mention this in the draft and make
sure readers understand what this draft should wait for before triggering
the OSPF adj. to come up. But in retrospect, after this 100-mail thread, I
am drawing the conclusion that maybe I am asking for too much perfection in
the spec. Maybe not many folks will even notice this. They will just spend
hours troubleshooting why packets got dropped and will blame the telco for
providing such a crappy circuit, physical or emulated :)

And you have seen a clear recommendation from Jeff that such change or
delay is not likely going to happen on the BFD side and it is up to
the client to change its FSM in that respect. I do not see why this should
not be at least described in the draft.

Anyhow enough time spent on this ...

Many thx,
Robert







On Sun, Feb 6, 2022 at 11:29 PM Christian Hopps  wrote:

>
> Robert Raszuk  writes:
>
> > Hi Les,
> >
> >
> >
> > There is nothing in RFC 5880 (nor in, what I consider to be even
> > more relevant, RFC 5882) that requires a BFD implementation to
> > signal UP state to a BFD client within a specific time following
> > transition of the BFD state machine to UP . An implementation is
> > free to introduce a delay (as you suggest) before such signaling.
> >
> >
> > My reading of section 6.2 of RFC5880 clearly indicates that BFD is
> > signalling UP state when BFD session has been established without any
> > delay.
> >
> > I am not sure if BFD implementation is free to introduce any delay
> > there yet still to claim full compliance to RFC5880 (even if
> > technically it would make sense to have such delay).
>
> If you publish an RFC that adds to or extends the BFD "UP" concept then it
> can simply "updates 5880", if required.
>
> In any case, the delay concept you are talking about is not without merit,
> but I don't see how it is OSPF specific; it would also benefit IS-IS and
> other BFD clients as well, right? To me that says do this in BFD so
> everyone can benefit.
>
> Thanks,
> Chris.
> [as wg member]
>
> >
> > Quote:
> >
> >
> >Up state means that the BFD session has successfully been
> >established, and implies that connectivity between the systems is
> >working.  The session will remain in the Up state until either
> >connectivity fails or the session is taken down administratively.
> >
> >
> > Rgs,
> > Robert.
> >
> >
> >
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-06 Thread Robert Raszuk
Hi Les,

BFD dampening as per some documentation is applicable or is triggered by
flapping BFD session(s). And indeed it has its own very valid use case. But
IMHO it is only partially a solution for what we need in the light of this
thread.

Here in this context assume I am looking at a new interface being
provisioned.

So the DOWN transition happened infinitely long before the first UP, and
the session then stays UP. I see nowhere in the BFD Dampening description
that it will also suppress for time T even the first UP notification after
a long enough DOWN event. When bringing a new interface UP, all dampening
parameters have expired.

To me dampening may kick in when you go DOWN and suppress excessive UP
events before presumed link stability is achieved as expressed in dampening
[ half-life-period reuse-threshold suppress-threshold max-suppress-time]
cli. But I see nowhere in the docs any indication that BFD Dampening can be
used as an unconditional UP transition suppression buffer.

The same applies to interfaces which went down and came UP after
max-suppress-time had already expired.
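
To illustrate why a penalty-based scheme does not hold back the first UP,
here is a minimal Python sketch of a generic exponential-decay dampening
model. It is an assumption modelled on the usual half-life / reuse /
suppress parameters, not any vendor's actual BFD Dampening code; the
function names and threshold values are made up for the example.

```python
def decayed_penalty(penalty: float, elapsed_s: float, half_life_s: float) -> float:
    """Penalty halves every half_life_s seconds (generic decay model)."""
    return penalty * 0.5 ** (elapsed_s / half_life_s)


def is_suppressed(penalty: float, was_suppressed: bool,
                  suppress_threshold: float, reuse_threshold: float) -> bool:
    """Suppression starts at suppress_threshold and, once started,
    lasts until the penalty decays below reuse_threshold."""
    if was_suppressed:
        return penalty >= reuse_threshold
    return penalty >= suppress_threshold


# A freshly provisioned interface has accumulated no penalty at all,
# so its first UP notification passes straight through.
print(is_suppressed(0.0, False, suppress_threshold=2000, reuse_threshold=1000))  # False

# An old flap penalty of 3000 with a 2 s half-life, looked at a minute later,
# has decayed far below the reuse threshold.
old = decayed_penalty(3000.0, elapsed_s=60.0, half_life_s=2.0)
print(is_suppressed(old, True, suppress_threshold=2000, reuse_threshold=1000))  # False
```

In both cases nothing delays the UP notification, which is the gap
described above.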

Thx,
R.


On Sun, Feb 6, 2022 at 11:05 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Sunday, February 6, 2022 10:42 AM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* lsr@ietf.org
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Les,
>
>
>
> There is nothing in RFC 5880 (nor in, what I consider to be even more
> relevant, RFC 5882) that requires a BFD implementation to signal UP state
> to a BFD client within a specific time following transition of the BFD
> state machine to UP . An implementation is free to introduce a delay (as
> you suggest) before such signaling.
>
>
>
> My reading of section 6.2 of RFC5880 clearly indicates that BFD is
> signalling UP state when BFD session has been established without any
> delay.
>


> *[LES:] This is specifying the BFD State Machine and signaling between BFD
> peers – not signaling between BFD and its local clients.*
>
> *RFC 5882 has some discussion of the latter – particularly
> https://www.rfc-editor.org/rfc/rfc5882.html#section-3 *
>
>
>
> *It is worth quoting this sentence:*
>
>
>
> *“The interaction between a BFD session and its associated client*
>
> *   applications is, for the most part, an implementation issue that is*
>
> *   outside the scope of this specification.”*
>
>
>
> *Indeed, one way of implementing “BFD Dampening” (as some vendors have
> done) is to delay notification of BFD UP state to BFD clients.*
>
>
>
> *The obvious benefits of implementing such a delay (if desired) before BFD
> signals UP to clients are that it is client agnostic and does not require
> any knowledge on the part of the clients as to when BFD has completed any
> additional procedures. The client simply operates as if BFD session is DOWN
> until it gets an UP indication from BFD.*
>
>
>
> *   Les*
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-06 Thread Robert Raszuk
Hi Les,


> There is nothing in RFC 5880 (nor in, what I consider to be even more
> relevant, RFC 5882) that requires a BFD implementation to signal UP state
> to a BFD client within a specific time following transition of the BFD
> state machine to UP . An implementation is free to introduce a delay (as
> you suggest) before such signaling.
>

My reading of section 6.2 of RFC5880 clearly indicates that BFD is
signalling UP state when BFD session has been established without any
delay.

I am not sure if BFD implementation is free to introduce any delay there
yet still to claim full compliance to RFC5880 (even if technically it would
make sense to have such delay).

Quote:

   Up state means that the BFD session has successfully been
   established, and implies that connectivity between the systems is
   working.  The session will remain in the Up state until either
   connectivity fails or the session is taken down administratively.


Rgs,
Robert.



Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-06 Thread Robert Raszuk
Gyan,

Exchanging BFD control packets does not guarantee data path liveness, nor
does it guarantee that subsequent BFD Echo packets will succeed.

BFD UDP control packets can use a different IP address (src or dst)
than the one used for data path probing. Both UDP ports are also different
(3784 vs 3785). Please also observe that BFD control packets are handled by
RE/RP while BFD Echo packet processing is very often offloaded to the line
card(s).
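
As a small illustration of the port split, the Python sketch below
classifies a packet by its UDP destination port. The port numbers are the
ones assigned in RFC 5881 (single-hop control and Echo) and RFC 5883
(multihop control); the function name and return strings are assumptions
made for the example only.

```python
BFD_CONTROL_PORT = 3784    # single-hop BFD control packets (RFC 5881)
BFD_ECHO_PORT = 3785       # BFD Echo packets (RFC 5881)
BFD_MULTIHOP_PORT = 4784   # multihop BFD control packets (RFC 5883)


def classify_bfd_packet(udp_dst_port: int) -> str:
    """Return a coarse label for a packet based on its UDP destination port."""
    if udp_dst_port == BFD_CONTROL_PORT:
        return "control"
    if udp_dst_port == BFD_ECHO_PORT:
        return "echo"
    if udp_dst_port == BFD_MULTIHOP_PORT:
        return "multihop-control"
    return "not-bfd"


print(classify_bfd_packet(3784))  # control
print(classify_bfd_packet(3785))  # echo
```

Because the two packet types are distinguishable this early, a platform can
steer them to different processing paths, which is why success of one says
little about the other.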

So to me, bringing up the OSPF adj. immediately after the BFD session
transitions to UP state is not a good thing, nor should the subject draft
present it as a way to bring up the OSPF adj. without risk of it shortly
going down.

Thx,
R



On Sun, Feb 6, 2022 at 6:24 PM Gyan Mishra  wrote:

> Hi Robert
>
> On Sun, Feb 6, 2022 at 6:11 AM Robert Raszuk  wrote:
>
>> Gyan,
>>
>> > Overall I believe the goal of the strict mode BFD “wait for BFD” is
>> > accomplished and solve all problems
>>
>> I do not agree with this statement.
>>
>
>
>> As currently defined in the posted version of the draft "wait for BFD"
>> means wait for BFD control packets to bring the session up.
>>
>> The session comes up - yet no BFD Echo packets have been exchanged. That
>> in turn triggers OSPF adj. to come up.
>>
>
> Gyan> My understanding of RFC 5880 is that BFD control packets have
> been sent in asynchronous mode per the interval and multiplier period
> specified, with the 3 way handshake being completed, which tests the
> bidirectional path between the client endpoints before the session BFD FSM
> transitions to the “UP” state.  We can get confirmation from Greg on the
> behavior.
>
> Key Excerpts  from RFC 5880 below related to this topic.  BFD control
> packets are sent during init and 3 way handshake in async mode until BFD
> FSM transitions to UP state and only in UP state is Echo if configured is
> sent.  So the 3 way handshake from reading below verifies the bi
> directional communication between the endpoints which according to the
> draft would all occur prior to client coming UP.  However if Echo is
> configured for looping the packets for testing that would happen after OSPF
> FSM has started.
>
> Section 1
>
>The goal of Bidirectional Forwarding Detection (BFD) is to provide
>low-overhead, short-duration detection of failures in the path
>between adjacent forwarding engines, including the interfaces, data
>link(s), and, to the extent possible, the forwarding engines
>themselves.
>
>An additional goal is to provide a single mechanism that can be used
>for liveness detection over any media, at any protocol layer, with a
>wide range of Detection Times and overhead, to avoid a proliferation
>of different methods.
>
>
> BFD had a per link concept “BFD over bundle member”
>
>
>BFD can provide failure detection on any kind of path between
>systems, including direct physical links, virtual circuits, tunnels,
>MPLS Label Switched Paths (LSPs), multihop routed paths, and
>unidirectional links (so long as there is some return path, of
>course).  Multiple BFD sessions can be established between the same
>pair of systems when multiple paths between them are present in at
>least one direction, even if a lesser number of paths are available
>in the other direction (multiple parallel unidirectional links or
>MPLS LSPs, for example).
>
>
> Section 2
>
>
>The BFD state machine implements a three-way handshake, both when
>establishing a BFD session and when tearing it down for any reason,
>to ensure that both systems are aware of the state change.
>
>
>BFD can be abstracted as a simple service.  The service primitives
>provided by BFD are to create, destroy, and modify a session, given
>the destination address and other parameters.  BFD in return provides
>a signal to its clients indicating when the BFD session goes up or
>down.
>
>
> Section 3
>
>A path is only declared to be operational when two-way communication
>has been established between systems, though this does not preclude
>the use of unidirectional links.
>
>A separate BFD session is created for each communications path and
>data protocol in use between two systems.
>
>Each system estimates how quickly it can send and receive BFD packets
>in order to come to an agreement with its neighbor about how rapidly
>detecti

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-06 Thread Robert Raszuk
Hello Acee,

I am afraid you completely missed my point. Perhaps this is my fault as in
this way too looong email thread I indeed brought additional testing
requirements - but never said those need to be part of this draft nor
specified in LSR WG. Those were just examples on what can occur in this
delta time we are talking about.

I am not asking for any additional BFD capabilities at all in respect to
this draft. I 100% agree those are out of scope of LSR WG.

*I am asking to let at least vanilla BFD probing cycle** to occur (at least
once) before doing any action on the client side. *

Doing any action on the client/protocol  just because BFD control packets
to setup the session got exchanged is a wrong thing to do. When BFD
control packets brought the session UP BFD probing did not even occur yet.

That's it. Subtle yet extremely important point.

** Cycle == probing frequency x multiplier - basic BFD parameters.

Many thx,
R.


On Sun, Feb 6, 2022 at 1:51 PM Acee Lindem (acee)  wrote:

> Hi Robert,
>
> I think that much of the additional functionality you are proposing is
> beyond the scope of the draft and IGP BFD usage today. You could propose
> all these additional capabilities (e.g., MTU testing and link quality
> determination beyond what is already in BFD) in a separate draft.
>
> Thanks,
>
> Acee
>
>
>
> *From: *Robert Raszuk 
> *Date: *Sunday, February 6, 2022 at 6:11 AM
> *To: *Gyan Mishra 
> *Cc: *Acee Lindem , Ketan Talaulikar <
> ketant.i...@gmail.com>, "lsr@ietf.org" 
> *Subject: *Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Gyan,
>
>
>
> > Overall I believe the goal of the strict mode BFD “wait for BFD” is
> > accomplished and solve all problems
>
>
>
> I do not agree with this statement.
>
>
>
> As currently defined in the posted version of the draft "wait for BFD"
> means wait for BFD control packets to bring the session up.
>
>
>
> The session comes up - yet no BFD Echo packets have been exchanged. That
> in turn triggers OSPF adj. to come up.
>
>
>
> So we bring OSPF adj UP knowing literally nothing about BFD test results
> over subject link. If the BFD timer is set to 2 seconds and the multiplier
> is 3 only in 6 seconds the BFD session could go down and take OSPF adj.
> down with it which means nothing what this draft aims to accomplish has
> been achieved.
>
>
>
> Sure one can argue if this is proper for BFD to signal UP state without at
> least once exchanging a set of Echo packets - but this is currently not the
> case in BFD FSM. If you want to "fix" BFD go for it, but for now the delay
> associated with any client action should be based on how BFD works
> per definition in RFC5880 and therefore should be specified on the client
> side.
>
>
>
> Rgs,
> Robert.
>
>
>
>
>
>
>
> On Sun, Feb 6, 2022 at 8:16 AM Gyan Mishra  wrote:
>
>
>
> All
>
>
>
> I have finally caught up with this thread and I agree with  Les, Ketan and
> Albert that the “wait for BFD” goal is accomplished with both the OSPF and
> BGP draft.  There is extra verbiage in BGP draft in case BFD does not come
> up for BGP to wait.  Agreed not applicable to OSPF.
>
>
>
> I agree with the spirit of Robert's idea of a delay, as it would help
> stability to have a “pause” button for degraded links and quality issues.
> However, I do agree with the WG that this is outside of LSR's scope and
> should really be with BFD or, better yet, IPPM for link quality monitoring.
>
>
>
> Overall I believe the goal of the strict mode BFD “wait for BFD” is
> accomplished and solves all problems except those related to poor link
> quality.
>
>
>
> I support both the OSPF and BGP strict mode drafts and I think this
> will be a big gain in itself for operators.
>
>
>
> As mentioned in the OSPF draft section 5 on use of hold down timers, BFD
> dampening and on ML use of  carrier delay and interface dampening can help
> operators with link quality issues until we are able to make some headway
> in BFD and IPPM WG.
>
>
>
> I would be happy to work with Greg and IPPM WGs as a follow up to this
> thread related to link quality issues.
>
>
>
> Kind Regards
>
> Gyan
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-06 Thread Robert Raszuk
Gyan,

> Overall I believe the goal of the strict mode BFD “wait for BFD” is
> accomplished and solve all problems

I do not agree with this statement.

As currently defined in the posted version of the draft "wait for BFD"
means wait for BFD control packets to bring the session up.

The session comes up - yet no BFD Echo packets have been exchanged. That in
turn triggers OSPF adj. to come up.

So we bring OSPF adj UP knowing literally nothing about BFD test results
over the subject link. If the BFD timer is set to 2 seconds and the
multiplier is 3, then only after 6 seconds could the BFD session go down
and take the OSPF adj. down with it, which means nothing of what this draft
aims to accomplish has been achieved.

Sure, one can argue whether it is proper for BFD to signal UP state without
exchanging a set of Echo packets at least once, but that is how the BFD FSM
currently works. If you want to "fix" BFD go for it, but for now the delay
associated with any client action should be based on how BFD works
per its definition in RFC5880 and therefore should be specified on the
client side.

Rgs,
Robert.



On Sun, Feb 6, 2022 at 8:16 AM Gyan Mishra  wrote:

>
> All
>
> I have finally caught up with this thread and I agree with  Les, Ketan and
> Albert that the “wait for BFD” goal is accomplished with both the OSPF and
> BGP draft.  There is extra verbiage in BGP draft in case BFD does not come
> up for BGP to wait.  Agreed not applicable to OSPF.
>
> I agree with the spirit of Robert's idea of a delay, as it would help
> stability to have a “pause” button for degraded links and quality issues.
> However, I do agree with the WG that this is outside of LSR's scope and
> should really be with BFD or, better yet, IPPM for link quality monitoring.
>
> Overall I believe the goal of the strict mode BFD “wait for BFD” is
> accomplished and solves all problems except those related to poor link
> quality.
>
> I support both the OSPF and BGP strict mode drafts and I think this
> will be a big gain in itself for operators.
>
> As mentioned in the OSPF draft section 5 on use of hold down timers, BFD
> dampening and on ML use of  carrier delay and interface dampening can help
> operators with link quality issues until we are able to make some headway
> in BFD and IPPM WG.
>
> I would be happy to work with Greg and IPPM WGs as a follow up to this
> thread related to link quality issues.
>
> Kind Regards
> Gyan
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-04 Thread Robert Raszuk
Ahh ok .. this is "OSPF virtual link" not an emulated "virtual link" seen
as p2p to any routing protocol.

Thx,
R.



On Fri, Feb 4, 2022 at 7:14 PM Acee Lindem (acee)  wrote:

> Hi Robert,
>
> There is no tunnel for an OSPF virtual link, the transit area will require
> leaking of backbone routes without summarization. Also note that the
> virtual link endpoint could be reachable in the transit area but may not be
> up. Multi-hop BFD would still be useful for a virtual link.
>
> Thanks,
>
> Acee
>
>
>
> *From: *Robert Raszuk 
> *Date: *Friday, February 4, 2022 at 12:48 PM
> *To: *Muthu Arul Mozhi Perumal 
> *Cc: *Ketan Talaulikar , "lsr@ietf.org" <
> lsr@ietf.org>, "draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org" <
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org>, Acee Lindem  >
> *Subject: *Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Muthu,
>
>
>
> If you are using virtual link why is this still multihop BFD ?
>
>
>
> Thx,
> R.
>
>
>
> On Fri, Feb 4, 2022 at 6:22 PM Muthu Arul Mozhi Perumal <
> muthu.a...@gmail.com> wrote:
>
> Hi Ketan,
>
>
>
> Sure, looking forward to the clarification in the draft on multi-hop BFD..
>
>
>
> Just curious, are there interoperable implementations for OSPF multi-hop
> BFD strict mode for virtual links or p2p unnumbered interfaces?
>
>
>
> Regards,
>
> Muthu
>
>
>
> On Fri, Feb 4, 2022 at 5:36 PM Ketan Talaulikar 
> wrote:
>
> Hi Muthu,
>
>
>
> When we say a "link" here, it is in the context of the OSPF interface and
> neighbor FSM. My understanding is that this term includes virtual links as
> well. As such, we can add some text in the introduction section to clarify
> the same and also put a reference to RFC5883 for BFD multi-hop use for
> VLINKs.
>
>
>
> I hope that works for you.
>
>
>
> Thanks,
>
> Ketan
>
>
>
>
>
> On Wed, Feb 2, 2022 at 11:05 AM Muthu Arul Mozhi Perumal <
> muthu.a...@gmail.com> wrote:
>
> Hi Ketan,
>
>
>
> Thanks for your response..
>
>
>
> The draft says:
>
> 
>
>This document defines the B-bit in the LLS Type 1 Extended Options
>and Flags field.  This bit is defined for the LLS block included in
>Hello and Database Description (DD) packets and
> *indicates that BFD is enabled on the link* and that the router
> requests strict-mode for BFD.
>
> 
>
>
>
> You don't enable multi-hop BFD on a link, instead you enable it b/w two
> (multi-hop) routers, right?
>
>
>
> How about replacing it with:
>
> indicates that (1) single-hop BFD [RFC5881] is enabled on the link in case
> of point-to-point (numbered) and LAN interfaces, and (2) multi-hop BFD
> [RFC5883] is enabled between the neighbors in case of virtual links and
> point-to-point unnumbered interfaces.
>
>
>
> Also, add a note at the beginning of the draft that BFD refers to both
> single-hop and multi-hop BFD when not explicitly specified..
>
>
>
> Regards,
>
> Muthu
>
>
>
> On Sun, Jan 30, 2022 at 10:40 PM Ketan Talaulikar 
> wrote:
>
> Hi Muthu,
>
>
>
> Thanks for your review and your support.
>
>
>
> Regarding your question, I would like to clarify that this document
> doesn't specify BFD operations with OSPF. That was done by RFC5882. Indeed
> for virtual links, there would need to be a BFD multi-hop session and the
> same would apply to p-t-p unnumbered.
>
>
>
> However, I am not sure what specific applicability or operations need to
> be called out for Strict Mode of operations for those links.
>
>
>
> Thanks,
>
> Ketan
>
>
>
>
>
> On Sun, Jan 30, 2022 at 12:52 PM Muthu Arul Mozhi Perumal <
> muthu.a...@gmail.com> wrote:
>
> Hi,
>
>
>
> I support the draft. A quick question:
>
> Should it describe the applicability of the mechanism over OSPF virtual
> links and unnumbered interfaces? With virtual links, one would have to
> establish a multi-hop BFD session, so it is slightly different from a BFD
> operational standpoint. For e.g, capability to support single-hop BFD may
> not translate to the capability to support multi-hop BFD..
>
>
>
> Regards,
>
> Muthu
>
>
>
> On Thu, Jan 27, 2022 at 10:38 PM Acee Lindem (acee) wrote:
>
> LSR WG,
>
>
>
> This begins a two week last call for the subject draft. Please indicate
> your support or objection on this list prior to 12:00 AM UTC on February
> 11th, 2022. Also, review comments are certainly welcome.
>
> Thanks,
> Acee
>
>
>
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-04 Thread Robert Raszuk
Muthu,

If you are using virtual link why is this still multihop BFD ?

Thx,
R.

On Fri, Feb 4, 2022 at 6:22 PM Muthu Arul Mozhi Perumal <
muthu.a...@gmail.com> wrote:

> Hi Ketan,
>
> Sure, looking forward to the clarification in the draft on multi-hop BFD..
>
> Just curious, are there interoperable implementations for OSPF multi-hop
> BFD strict mode for virtual links or p2p unnumbered interfaces?
>
> Regards,
> Muthu
>
> On Fri, Feb 4, 2022 at 5:36 PM Ketan Talaulikar 
> wrote:
>
>> Hi Muthu,
>>
>> When we say a "link" here, it is in the context of the OSPF interface and
>> neighbor FSM. My understanding is that this term includes virtual links as
>> well. As such, we can add some text in the introduction section to clarify
>> the same and also put a reference to RFC5883 for BFD multi-hop use for
>> VLINKs.
>>
>> I hope that works for you.
>>
>> Thanks,
>> Ketan
>>
>>
>> On Wed, Feb 2, 2022 at 11:05 AM Muthu Arul Mozhi Perumal <
>> muthu.a...@gmail.com> wrote:
>>
>>> Hi Ketan,
>>>
>>> Thanks for your response..
>>>
>>> The draft says:
>>> 
>>>This document defines the B-bit in the LLS Type 1 Extended Options
>>>and Flags field.  This bit is defined for the LLS block included in
>>>Hello and Database Description (DD) packets and
>>> *indicates that BFD is enabled on the link* and that the router
>>> requests strict-mode for BFD.
>>> 
>>>
>>> You don't enable multi-hop BFD on a link, instead you enable it b/w two
>>> (multi-hop) routers, right?
>>>
>>> How about replacing it with:
>>> indicates that (1) single-hop BFD [RFC5881] is enabled on the link in
>>> case of point-to-point (numbered) and LAN interfaces, and (2) multi-hop BFD
>>> [RFC5883] is enabled between the neighbors in case of virtual links and
>>> point-to-point unnumbered interfaces.
>>>
>>> Also, add a note at the beginning of the draft that BFD refers to both
>>> single-hop and multi-hop BFD when not explicitly specified..
>>>
>>> Regards,
>>> Muthu
>>>
>>> On Sun, Jan 30, 2022 at 10:40 PM Ketan Talaulikar 
>>> wrote:
>>>
>>>> Hi Muthu,
>>>>
>>>> Thanks for your review and your support.
>>>>
>>>> Regarding your question, I would like to clarify that this document
>>>> doesn't specify BFD operations with OSPF. That was done by RFC5882. Indeed
>>>> for virtual links, there would need to be a BFD multi-hop session and the
>>>> same would apply to p-t-p unnumbered.
>>>>
>>>> However, I am not sure what specific applicability or operations need
>>>> to be called out for Strict Mode of operations for those links.
>>>>
>>>> Thanks,
>>>> Ketan
>>>>
>>>>
>>>> On Sun, Jan 30, 2022 at 12:52 PM Muthu Arul Mozhi Perumal <
>>>> muthu.a...@gmail.com> wrote:

> Hi,
>
> I support the draft. A quick question:
> Should it describe the applicability of the mechanism over OSPF
> virtual links and unnumbered interfaces? With virtual links, one would
> have to establish a multi-hop BFD session, so it is slightly different
> from a BFD operational standpoint. For e.g, capability to support
> single-hop BFD may not translate to the capability to support multi-hop
> BFD..
>
> Regards,
> Muthu
>
> On Thu, Jan 27, 2022 at 10:38 PM Acee Lindem (acee) wrote:
>
>> LSR WG,
>>
>>
>>
>> This begins a two week last call for the subject draft. Please
>> indicate your support or objection on this list prior to 12:00 AM UTC on
>> February 11th, 2022. Also, review comments are certainly welcome.
>>
>> Thanks,
>> Acee
>>
>>
>> ___
>> Lsr mailing list
>> Lsr@ietf.org
>> https://www.ietf.org/mailman/listinfo/lsr
>>
> ___
> Lsr mailing list
> Lsr@ietf.org
> https://www.ietf.org/mailman/listinfo/lsr
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-02-01 Thread Robert Raszuk
Ketan,


> What the OSPF draft discusses in Sec 5 is a "hold-down" wait period where
> even though the BFD session is established the protocol FSM does not
> proceed further until a period of time has passed to ensure the stability
> of the BFD session.
>

Which protocol FSM ? BFD FSM or OSPF FSM ?

If BFD FSM then I think this is a false assumption or perhaps based on
specific implementation. If OSPF FSM then we are all in sync.

The point is that the BFD session UP, which the draft refers to as
a trigger for OSPF adj to come UP, does not mean anything yet about
path liveness (except proving that BFD control packets made it to a peer,
depending on the BFD mode of operation). So reacting to it immediately by
any client would be the wrong thing to do. I see nothing in section 6.2 of
RFC5880 which would indicate any hold time or which would block BFD state
transition to UP waiting for even a single BFD Echo packet to be exchanged.

BFD probing interval can be set to 2 sec and multiplier set to 3, which
would mean that only 6 sec after BFD UP state would you get some
meaningful data about the link, once BFD Echo packets have been exchanged.
If you bring OSPF adj. UP immediately after seeing BFD session UP you have
not accomplished anything of what wait-for-bfd is trying to do. With that I
actually think the draft in its current form, as stated in section 4, is
harmful: it only mentions waiting for the BFD session to get established.

All along I was trying to highlight that point. And let me correct one
thing I said earlier ... In one of the emails to Albert I mentioned that
such a timer could be 0. Well, not really: the min amount of time between
BFD UP and OSPF adj. UP should be (BFD probing interval x multiplier) + the
time it takes on a given platform to send messages between LCs and RE/RP.
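
To make the arithmetic concrete, here is a minimal Python sketch of that
lower bound. The function and parameter names are illustrative assumptions,
and the 50 ms platform delay in the second call is an invented figure, not
a measured one.

```python
def min_hold_down_ms(probe_interval_ms: int, multiplier: int,
                     platform_delay_ms: int = 0) -> int:
    """Lower bound on the wait between BFD UP and OSPF adj. UP:
    one full probing cycle plus the platform's internal signalling time."""
    if probe_interval_ms <= 0 or multiplier <= 0 or platform_delay_ms < 0:
        raise ValueError("interval and multiplier must be positive, delay non-negative")
    return probe_interval_ms * multiplier + platform_delay_ms


# The example from this thread: 2 s probing interval, multiplier 3.
print(min_hold_down_ms(2000, 3))                        # 6000
# Same link with a hypothetical 50 ms LC to RE/RP messaging delay.
print(min_hold_down_ms(2000, 3, platform_delay_ms=50))  # 6050
```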

Regards,
Robert.


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-31 Thread Robert Raszuk
Hey Albert,

Ok now we are in sync as far as what is the topic.

I think such a delay is very useful and completely in the spirit of OSPF or
ISIS strict-mode operation.

So I do recommend that the draft should discuss it.

The default can be 0 if you think that is the proper value. But the
operational section of that draft IMO is exactly the place where such a
paragraph should be placed. Maybe I would not be so persistent in this
little thread if Les hadn't indicated that the current timer will be
removed (the current timer acting as an artificial delay and being much
longer than the time needed for BFD to come UP).

Author's choice. My mission is accomplished here for the WG mailing list
records :)

Cheers,
Robert.



On Mon, Jan 31, 2022 at 9:04 PM Albert Fu (BLOOMBERG/ 120 PARK) <
af...@bloomberg.net> wrote:

> Hi Robert,
>
> As mentioned in my previous email, I feel it is better not to specify in
> the draft the timer for when OSPF should come up after BFD is up.
>
> The current implementation is for OSPF to come up as soon as BFD is up. A
> user can change this behaviour via configuration, to delay when OSPF can
> come up after BFD is up. Different customers may have different delay
> requirements, and there may also be platform dependent limitation.
>
> Thanks
>
> Albert
>
> From: rob...@raszuk.net At: 01/31/22 14:52:43 UTC-5:00
> To: Albert Fu (BLOOMBERG/ 120 PARK ) 
> Cc: a...@cisco.com, draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org,
> ginsb...@cisco.com, ketant.i...@gmail.com, lsr@ietf.org
> Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
> Hi Albert,
>
> On Mon, Jan 31, 2022 at 8:38 PM Albert Fu (BLOOMBERG/ 120 PARK) <
> af...@bloomberg.net> wrote:
>
>> Hi Robert,
>>
>> Do you mean we should make it mandatory in the draft to stipulate a delay
>> time between when OSPF should wait for BFD to come up?
>>
>
> No.
>
> The timer is for OSPF to bring adj up only after X timer expires from the
> moment BFD session came up and stayed up (never went down).
>
> No changes to BFD needed at all.
>
> Trivial to implement on the client side and very useful operationally.
>
> Thx,
> Robert
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-31 Thread Robert Raszuk
> I have stated that “IF” additional functionality is required from BFD

No one says so. It would not be realistic to require BFD to come up in
a hidden mode, operate for timer X, and then signal that to clients when
timer X expires. And this is precisely what you are suggesting as a pushback.

It is up to the clients to delay their action according to their operational
needs.

Many thx,
R.

On Mon, Jan 31, 2022 at 8:48 PM Les Ginsberg (ginsberg) 
wrote:

> Jeff –
>
>
>
> I appreciate that you have been pulled into reading a very lengthy thread
> and then commenting  on it – which is a difficult/time consuming  thing to
> do accurately.
>
> And I certainly welcome your input and agree with your input.
>
>
>
> I have not asked for BFD extensions.
>
> I have stated that “IF” additional functionality is required from BFD that
> the proper place to discuss that is in the BFD WG – and such discussions
> are definitely not in scope of this draft.
>
>
>
> The main content of this lengthy thread is Robert asking for additional
> specification in this draft and other folks (myself, Albert, Ketan) saying
> it doesn’t belong in this draft. Which is why I agree with everything you
> say below except for your perception that you are agreeing with Robert. You
> are actually agreeing with myself, Albert, Ketan. 
>
>
>
> Thanx for your participation.
>
>
>
>     Les
>
>
>
> *From:* Jeffrey Haas 
> *Sent:* Monday, January 31, 2022 11:28 AM
> *To:* Robert Raszuk 
> *Cc:* Ketan Talaulikar ; Les Ginsberg (ginsberg) <
> ginsb...@cisco.com>; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Acee
> Lindem (acee) ; Albert Fu ; lsr <
> lsr@ietf.org>
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> [Note that I read the LSR mailing list infrequently, but this thread was
> brought to my attention.]
>
>
>
> I wish to largely support Robert's point here.  BFD is not intended as a
> link quality protocol.  It's a very simple hello protocol that can operate
> quite quickly and provide simple edge transition events of Up and Down.
>
>
>
> There has been work in the BFD Working Group over the years to attempt to
> bring more of "link quality" behaviors to the protocol.  One, of interest
> to this thread, is the BFD for Large Packets work, which can support MTU
> probes as part of BFD operation.
>
>
>
> draft-ietf-bfd-stability discusses leveraging BFD internal state to help
> look at link instability issues as BFD sees them.
>
>
>
> And, of course, Greg Mirsky had several times he wanted to get BFD to do
> more active behaviors.  He was encouraged to leverage the BFD machinery in
> his own non-BFD draft if he found it helpful.  I suspect he'll respond to
> this thread with comments on his thinking here.
>
>
>
> That said, the BFD strict work is about removing control-plane protocol
> ambiguity with regards to how it uses BFD and how the state machines
> interact with each other.  I think that work has been reasonably done.
>
>
>
> The thing that BFD isn't about in such contexts is being more than a
> simple proxy for the link being of bad enough quality for BFD to go down
> taking the client protocols down with it.  It's important for those client
> protocols and the operators to set the timers and Detection Multiplier
> (number of lost packets) to speeds they think support their needs.  If you
> have a noisy link that can drop several packets in succession and that's
> what you want to be your trigger, BFD is your protocol.  If you want it to
> take an apparently continuous loss over most of a second, BFD can do that
> too if you tune your timers appropriately.
>
>
>
> But, as you say Robert, it's not intended to be a general IPPM style
> tool.  I don't believe the BFD strict drafts should try to treat BFD as if
> it is one.
>
>
>
> -- Jeff
>
>
>
>
>
>
>
> On Jan 31, 2022, at 5:31 AM, Robert Raszuk  wrote:
>
>
>
> HI Ketan & Les,
>
>
>
> To finish this topic I would like to observe that IMHO you have it quite
> backwards.
>
>
>
> *Comment #1*
>
>
>
> The tone of your expressions is trying to illustrate that there can be
> many clients for given link probing tool (here BFD). In reality the
> situation is vastly different. There is usually one link state IGP running
> on the node and given set of probing protocols are associated with it.
> Moreover, the world does not end on BFD. BFD is just one possible tool, but
> more and more path probing tools are emerging or are already deployed.
> Asking for each of them to introduce into their

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-31 Thread Robert Raszuk
sed after a time
>>of X.  This time X is referred as "BGP BFD Hold time".  The proposed
>>default BGP BFD Hold time value is 30 seconds.  The BGP BFD Hold time
>>value is configurable.
>>
>> To me it is clear that BGP BFD Hold time is on the client side and here
>> affects BGP FSM.
>>
>> Thx,
>> Robert.
>>
>> From: ginsb...@cisco.com At: 01/30/22 14:38:37 UTC-5:00
>>> To: rob...@raszuk.net, ketant.i...@gmail.com
>>> Cc: Albert Fu (BLOOMBERG/ 120 PARK ) ,
>>> a...@cisco.com, draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org,
>>> lsr@ietf.org
>>> Subject: RE: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
>>> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>>
>>> Robert –
>>>
>>>
>>>
>>> Here is what you said (emphasis added):
>>>
>>>
>>>
>>> 
>>>
>>> But the timer I am suggesting is not related to BFD operation, but to
>>> OSPF (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is
>>> about *allowing BFD for more testing (with various parameters (for
>>> example increasing test packet size in some discrete steps)* before
>>> OSPF is happy to bring the adj. up.
>>>
>>> 
>>>
>>>
>>>
>>> Point #1: If you want BFD to do more testing (such as MTU testing) then
>>> clearly you need extensions to BFD (such as
>>> https://datatracker.ietf.org/doc/draft-ietf-bfd-large-packets/ )
>>>
>>>
>>>
>>> Point #2: The existing timers (as Ketan points out are mentioned in
>>> Section 5) are applied today at the OSPF level precisely because OSPF does
>>> not currently have strict-mode operation. So in a flapping scenario you
>>> could see the following behavior:
>>>
>>>
>>>
>>> a)BFD goes down
>>>
>>> b)OSPF goes down in response to BFD
>>>
>>> c)OSPF comes back up
>>>
>>> d)Link is still unstable – so traffic is being dropped some of the time
>>> – but perhaps OSPF adjacency stays up (i.e., OSPF hellos get through often
>>> enough to keep the OSPF adjacency up)
>>>
>>>
>>>
>>> So some implementations have chosen to insert a delay following “b”.
>>> This doesn’t guarantee stability, but hopefully makes it less likely. And
>>> because OSPF today does NOT wait for BFD to come up, the delay has to be
>>> implemented at the OSPF level.
>>>
>>>
>>>
>>> Once you have strict mode support, the sequence becomes:
>>>
>>>
>>>
>>> a)BFD goes down
>>>
>>> b)OSPF goes down in response to BFD
>>>
>>> c)BFD comes back up
>>>
>>> d)OSPF comes back up
>>>
>>>
>>>
>>> Now, if the concern is that BFD comes back up while the link is still
>>> unstable, the way to address that is to put a delay either before BFD
>>> attempts to bring up a new session or a delay after achieving UP state
>>> before it signals UP to its clients – such as OSPF. This is a better
>>> solution because all BFD clients benefit from this. And if the link is still
>>> unstable, it is more likely that the BFD session will go down during the
>>> delay period than it would be for OSPF because the BFD timers are
>>> significantly more aggressive.
>>>
>>> (BTW, this behavior can be done w/o a BFD protocol extension – it is
>>> purely an implementation choice.)
>>>
>>>
>>>
>>> From a design perspective, dampening is always best done at the lowest
>>> layer possible. In most cases, interface layer dampening is best. If that
>>> is not reliable for some reason, then move one layer up – not two layers up.
>>>
>>>
>>>
>>>Les
>>>
>>>
>>>
>>>
>>>
>>> *From:* Robert Raszuk 
>>> *Sent:* Sunday, January 30, 2022 10:05 AM
>>> *To:* Ketan Talaulikar 
>>> *Cc:* Les Ginsberg (ginsberg) ; Acee Lindem (acee) <
>>> a...@cisco.com>; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert
>>> Fu ; lsr 
>>> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
>>> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>>
>>>
>>>
>>> Hi Ketan,
>>>
>>>
>>>
>>> I would like to point out that the draft discusses the BFD "dampening"
>>> or "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
>>> include such mechanisms in a protocol-agnostic manner.
>>>
>>>
>>>
>>> BFD dampening or hold-time are completely orthogonal to my point. Both
>>> have nothing to do with it.
>>>
>>>
>>>
>>> Those timers only fire when BFD goes down. In my example BFD does not go
>>> down. But we want to bring up the client adj. only after X ms/sec/min etc
>>> ...of normal BFD operation if no failure is detected during that timer.
>>>
>>>
>>>
>>> This draft indicates that OSPF adjacency will "advance" in the neighbor
>>> FSM only after BFD reports UP.
>>>
>>>
>>>
>>> And that is exactly too soon. In fact if you do that today
>>> without waiting some time (if you retire the current OSPF timer) you will
>>> not help at all in the case you are trying to address.
>>>
>>>
>>>
>>> Reason being that perhaps 200 ms after BFD UP it will go down, but OSPF
>>> adj. will get already established. It is really pretty simple.
>>>
>>>
>>>
>>> Thx,
>>>
>>> Robert.
>>>
>>>
>>>
>>> PS. And yes I think ISIS should also get fixed in that respect.
>>>
>>>
>>>
>>
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-31 Thread Robert Raszuk
> what new signal should BFD send to OSPF when this is done?

None. Lack of DOWN signal is enough for OSPF to proceed.

Thx,
R.

On Mon, Jan 31, 2022 at 6:50 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> The paragraph you quote below has to do with BGP behavior in the event
> “BFD session does not transition to the Up state”.
>
> There is no disagreement about what the protocol (BGP or OSPF) should do
> in this case. The point of strict-mode is to “wait-for-BFD”.
>
>
>
> You, however, are trying to introduce some additional requirements. To
> this end you said:
>
>
>
> *“What I find missing in the draft is a mutually (between OSPF peers)
> timer fired after BFD session is up which in OSPF could hold on allowing
> BFD to do some more testing before declaring adj to be established. I think
> just bringing OSPF adj immediately after the BFD session is up is not a
> good thing.”*
>
>
>
> So apparently you want BFD to signal UP – but have the protocol do nothing
> until BFD completes some additional testing. What then was the point of BFD
> signaling UP to OSPF? And since you want the additional testing to be done
> by BFD, what new signal should BFD send to OSPF when this is done?
>
> The point of BFD sending UP to its clients is to indicate that BFD thinks
> the link has been verified from the BFD perspective. I do not see the point
> of sending two such signals. If you think current BFD testing is inadequate
> please ask for extensions to BFD (in the BFD WG).
>
>
>
> You also said:
>
>
>
> “BFD is a great tool to tell you if the end to end path is UP or DOWN. It
> was not designed to give you any characteristics or metrics for the path
> quality.”
>
>
>
> I agree. But if you are now proposing that protocol adjacencies should not
> come up until certain link quality metrics are met (e.g., link loss, delay)
> – you are moving into an area that is completely out of scope of this draft.
>
> I won’t dig deeper into what could be a very lengthy discussion. If you
> really want to pursue this idea, I suggest you write a new draft.
>
>
>
>Les
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Monday, January 31, 2022 6:59 AM
> *To:* Albert Fu ; Les Ginsberg (ginsberg) <
> ginsb...@cisco.com>; Ketan Talaulikar 
> *Cc:* Acee Lindem (acee) ;
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; lsr 
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Les & Ketan
>
>
>
> Nowadays, it is also common to see the "break-in-middle" failures. we use
> BFD to detect this sort of failure within sub-second. And to dampen this
> sort of break-in-middle failures, we will need to use BFD
> holdtime/dampening.
>
>
>
> Another data point to the above and this discussion which Albert is
> co-author of.
>
>
>
> Ref:
> https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-bfd-strict-mode
>
>
>
> Please see the below paragraph which clearly says *BGP BFD Hold time*:
>
>
>
>If the BFD session does not transition to the Up state, and the
>HoldTimer has been negotiated to a non-zero value, the BGP FSM will
>close the session appropriately.  If the HoldTimer has been
>negotiated to a zero value, the session should be closed after a time
>of X.  This time X is referred as "BGP BFD Hold time".  The proposed
>default BGP BFD Hold time value is 30 seconds.  The BGP BFD Hold time
>value is configurable.
>
>
>
> To me it is clear that BGP BFD Hold time is on the client side and here
> affects BGP FSM.
>
>
>
> Thx,
>
> Robert.
>
> From: ginsb...@cisco.com At: 01/30/22 14:38:37 UTC-5:00
>
> To: rob...@raszuk.net, ketant.i...@gmail.com
> Cc: Albert Fu (BLOOMBERG/ 120 PARK ) , a...@cisco.com,
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org, lsr@ietf.org
> Subject: RE: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Robert –
>
>
>
> Here is what you said (emphasis added):
>
>
>
> 
>
> But the timer I am suggesting is not related to BFD operation, but to OSPF
> (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about 
> *allowing
> BFD for more testing (with various parameters (for example increasing test
> packet size in some discrete steps)* before OSPF is happy to bring the
> adj. up.
>
> 
>
>
>
> Point #1: If you want BFD to do more testing (such as MTU testing) then
> clearly you need extensions to BFD (suc

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-31 Thread Robert Raszuk
ic is being dropped some of the time –
> but perhaps OSPF adjacency stays up (i.e., OSPF hellos get through often
> enough to keep the OSPF adjacency up)
>
>
>
> So some implementations have chosen to insert a delay following “b”. This
> doesn’t guarantee stability, but hopefully makes it less likely. And
> because OSPF today does NOT wait for BFD to come up, the delay has to be
> implemented at the OSPF level.
>
>
>
> Once you have strict mode support, the sequence becomes:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)BFD comes back up
>
> d)OSPF comes back up
>
>
>
> Now, if the concern is that BFD comes back up while the link is still
> unstable, the way to address that is to put a delay either before BFD
> attempts to bring up a new session or a delay after achieving UP state
> before it signals UP to its clients – such as OSPF. This is a better
> solution because all BFD clients benefit from this. And if the link is still
> unstable, it is more likely that the BFD session will go down during the
> delay period than it would be for OSPF because the BFD timers are
> significantly more aggressive.
>
> (BTW, this behavior can be done w/o a BFD protocol extension – it is
> purely an implementation choice.)
>
>
>
> From a design perspective, dampening is always best done at the lowest
> layer possible. In most cases, interface layer dampening is best. If that
> is not reliable for some reason, then move one layer up – not two layers up.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Sunday, January 30, 2022 10:05 AM
> *To:* Ketan Talaulikar 
> *Cc:* Les Ginsberg (ginsberg) ; Acee Lindem (acee) <
> a...@cisco.com>; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu <
> af...@bloomberg.net>; lsr 
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Ketan,
>
>
>
> I would like to point out that the draft discusses the BFD "dampening" or
> "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
> include such mechanisms in a protocol-agnostic manner.
>
>
>
> BFD dampening or hold-time are completely orthogonal to my point. Both
> have nothing to do with it.
>
>
>
> Those timers only fire when BFD goes down. In my example BFD does not go
> down. But we want to bring up the client adj. only after X ms/sec/min etc
> ...of normal BFD operation if no failure is detected during that timer.
>
>
>
> This draft indicates that OSPF adjacency will "advance" in the neighbor
> FSM only after BFD reports UP.
>
>
>
> And that is exactly too soon. In fact if you do that today without waiting
> some time (if you retire the current OSPF timer) you will not help at all
> in the case you are trying to address.
>
>
>
> Reason being that perhaps 200 ms after BFD UP it will go down, but OSPF
> adj. will get already established. It is really pretty simple.
>
>
>
> Thx,
>
> Robert.
>
>
>
> PS. And yes I think ISIS should also get fixed in that respect.
>
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-31 Thread Robert Raszuk
Les & Ketan


> Nowadays, it is also common to see the "break-in-middle" failures. we use
> BFD to detect this sort of failure within sub-second. And to dampen this
> sort of break-in-middle failures, we will need to use BFD
> holdtime/dampening.
>

Another data point for the above and for this discussion is a draft which
Albert co-authors.

Ref:
https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-bfd-strict-mode

Please see the below paragraph which clearly says *BGP BFD Hold time*:

   If the BFD session does not transition to the Up state, and the
   HoldTimer has been negotiated to a non-zero value, the BGP FSM will
   close the session appropriately.  If the HoldTimer has been
   negotiated to a zero value, the session should be closed after a time
   of X.  This time X is referred as "BGP BFD Hold time".  The proposed
   default BGP BFD Hold time value is 30 seconds.  The BGP BFD Hold time
   value is configurable.

To me it is clear that BGP BFD Hold time is on the client side and here
affects BGP FSM.

Thx,
Robert.







From: ginsb...@cisco.com At: 01/30/22 14:38:37 UTC-5:00
> To: rob...@raszuk.net, ketant.i...@gmail.com
> Cc: Albert Fu (BLOOMBERG/ 120 PARK ) , a...@cisco.com,
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org, lsr@ietf.org
> Subject: RE: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
> Robert –
>
>
>
> Here is what you said (emphasis added):
>
>
>
> 
>
> But the timer I am suggesting is not related to BFD operation, but to OSPF
> (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about 
> *allowing
> BFD for more testing (with various parameters (for example increasing test
> packet size in some discrete steps)* before OSPF is happy to bring the
> adj. up.
>
> 
>
>
>
> Point #1: If you want BFD to do more testing (such as MTU testing) then
> clearly you need extensions to BFD (such as
> https://datatracker.ietf.org/doc/draft-ietf-bfd-large-packets/ )
>
>
>
> Point #2: The existing timers (as Ketan points out are mentioned in
> Section 5) are applied today at the OSPF level precisely because OSPF does
> not currently have strict-mode operation. So in a flapping scenario you
> could see the following behavior:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)OSPF comes back up
>
> d)Link is still unstable – so traffic is being dropped some of the time –
> but perhaps OSPF adjacency stays up (i.e., OSPF hellos get through often
> enough to keep the OSPF adjacency up)
>
>
>
> So some implementations have chosen to insert a delay following “b”. This
> doesn’t guarantee stability, but hopefully makes it less likely. And
> because OSPF today does NOT wait for BFD to come up, the delay has to be
> implemented at the OSPF level.
>
>
>
> Once you have strict mode support, the sequence becomes:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)BFD comes back up
>
> d)OSPF comes back up
>
>
>
> Now, if the concern is that BFD comes back up while the link is still
> unstable, the way to address that is to put a delay either before BFD
> attempts to bring up a new session or a delay after achieving UP state
> before it signals UP to its clients – such as OSPF. This is a better
> solution because all BFD clients benefit from this. And if the link is still
> unstable, it is more likely that the BFD session will go down during the
> delay period than it would be for OSPF because the BFD timers are
> significantly more aggressive.
>
> (BTW, this behavior can be done w/o a BFD protocol extension – it is
> purely an implementation choice.)
>
>
>
> From a design perspective, dampening is always best done at the lowest
> layer possible. In most cases, interface layer dampening is best. If that
> is not reliable for some reason, then move one layer up – not two layers up.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Sunday, January 30, 2022 10:05 AM
> *To:* Ketan Talaulikar 
> *Cc:* Les Ginsberg (ginsberg) ; Acee Lindem (acee) <
> a...@cisco.com>; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu <
> af...@bloomberg.net>; lsr 
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Ketan,
>
>
>
> I would like to point out that the draft discusses the BFD "dampening" or
> "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
> include such mechanisms in a protocol-agnostic manner.
>
>
>
> BFD dampening or hold-time are completely orth

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Robert Raszuk
Hi Les,

> the way to address that is to put a delay either before BFD attempts to
> bring up a new session

No, this will not work. The BFD session must be fully up and BFD has to have
a chance to operate normally for X units of time. (By normal I mean with
existing or new BFD extensions, which are out of scope of this discussion).

> or a delay after achieving UP state before it signals UP to its clients –
> such as OSPF.

This is exactly what I am describing. Except you think that BFD should now
hold off on a per-client or per-OSPF-neighbor basis, and I think that it is
the clients who should hold off on reacting to the signaled UP state.

What you are suggesting puts an unnecessary burden on BFD, where from the BFD
POV the link went up at t0 and never went down. It is the client who may need
to delay its action, depending on the nature of the client.
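
A minimal sketch of such a client-side hold, assuming the client simply
receives UP and DOWN notifications from BFD. The class and method names are
hypothetical and the 6 sec hold is just the 2 sec x 3 example; nothing here
is taken from the draft or from a real implementation.

```python
from typing import Optional


class AdjacencyGate:
    """Client-side hold: allow the adjacency only after BFD stayed UP for hold_s."""

    def __init__(self, hold_s: float) -> None:
        self.hold_s = hold_s
        self.up_since: Optional[float] = None  # time of BFD UP, None while DOWN

    def on_bfd_up(self, now: float) -> None:
        # Remember only the first UP; a repeated UP does not restart the hold.
        if self.up_since is None:
            self.up_since = now

    def on_bfd_down(self) -> None:
        # Any DOWN cancels the pending hold; BFD itself is left untouched.
        self.up_since = None

    def may_bring_up(self, now: float) -> bool:
        # No new BFD signal is needed: the absence of DOWN for hold_s is enough.
        return self.up_since is not None and now - self.up_since >= self.hold_s


gate = AdjacencyGate(hold_s=6.0)
gate.on_bfd_up(now=0.0)
print(gate.may_bring_up(now=1.0))  # False
print(gate.may_bring_up(now=6.0))  # True
```

All state lives in the client, so BFD only ever reports that the session went
up at t0, and each client is free to pick its own hold value, including 0.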

At least we got to the point where both of us are clear on the topic.
Earlier, seeing dampening or hold times brought up only indicated to me that
there was a mismatch in understanding. As for your examples, imagine that
this is a new interface and BFD was never up on it before. The
behavior should be identical.

Thx,
R.

On Sun, Jan 30, 2022 at 8:38 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> Here is what you said (emphasis added):
>
>
>
> 
>
> But the timer I am suggesting is not related to BFD operation, but to OSPF
> (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about 
> *allowing
> BFD for more testing (with various parameters (for example increasing test
> packet size in some discrete steps)* before OSPF is happy to bring the
> adj. up.
>
> 
>
>
>
> Point #1: If you want BFD to do more testing (such as MTU testing) then
> clearly you need extensions to BFD (such as
> https://datatracker.ietf.org/doc/draft-ietf-bfd-large-packets/ )
>
>
>
> Point #2: The existing timers (as Ketan points out are mentioned in
> Section 5) are applied today at the OSPF level precisely because OSPF does
> not currently have strict-mode operation. So in a flapping scenario you
> could see the following behavior:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)OSPF comes back up
>
> d)Link is still unstable – so traffic is being dropped some of the time –
> but perhaps OSPF adjacency stays up (i.e., OSPF hellos get through often
> enough to keep the OSPF adjacency up)
>
>
>
> So some implementations have chosen to insert a delay following “b”. This
> doesn’t guarantee stability, but hopefully makes it less likely. And
> because OSPF today does NOT wait for BFD to come up, the delay has to be
> implemented at the OSPF level.
>
>
>
> Once you have strict mode support, the sequence becomes:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)BFD comes back up
>
> d)OSPF comes back up
>
>
>
> Now, if the concern is that BFD comes back up while the link is still
> unstable, the way to address that is to put a delay either before BFD
> attempts to bring up a new session or a delay after achieving UP state
> before it signals UP to its clients – such as OSPF. This is a better
> solution because all BFD clients benefit from this. And if the link is still
> unstable, it is more likely that the BFD session will go down during the
> delay period than it would be for OSPF because the BFD timers are
> significantly more aggressive.
>
> (BTW, this behavior can be done w/o a BFD protocol extension – it is
> purely an implementation choice.)
>
>
>
> From a design perspective, dampening is always best done at the lowest
> layer possible. In most cases, interface layer dampening is best. If that
> is not reliable for some reason, then move one layer up – not two layers up.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Sunday, January 30, 2022 10:05 AM
> *To:* Ketan Talaulikar 
> *Cc:* Les Ginsberg (ginsberg) ; Acee Lindem (acee) <
> a...@cisco.com>; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu <
> af...@bloomberg.net>; lsr 
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Ketan,
>
>
>
> I would like to point out that the draft discusses the BFD "dampening" or
> "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
> include such mechanisms in a protocol-agnostic manner.
>
>
>
> BFD dampening or hold-time are completely orthogonal to my point. Both
> have nothing to do with it.
>
>
>
> Those timers only fire when BFD goes down. In my example BFD does not go
> down. But we want to bring up the client adj. only after X ms/se

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Robert Raszuk
Hi Ketan,

> It explains the scenario of a noisy link that experiences traffic drops.

The point is that BFD may or may not detect noisy links or links with
"degraded or poor quality". There are many failure scenarios - especially
brownouts - where BFD will continue to run just fine over a link and where
at the same time user data will experience very poor performance.
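
A rough back-of-the-envelope illustration, under the simplifying assumption
that packet losses are independent (real brownouts are often bursty, so treat
this only as a sketch). The function name is made up for the example.

```python
def p_all_lost(loss_rate: float, detect_mult: int) -> float:
    """Probability that detect_mult consecutive BFD packets are all lost,
    assuming each packet is lost independently with probability loss_rate."""
    if not 0.0 <= loss_rate <= 1.0 or detect_mult <= 0:
        raise ValueError("loss_rate must be in [0, 1] and detect_mult positive")
    return loss_rate ** detect_mult


# With 10% random loss and a multiplier of 3, a given run of three packets
# is lost entirely only about 0.1% of the time.
print(p_all_lost(0.10, 3))
```

So under this assumption a link that drops one in ten user packets can keep a
BFD session up for long stretches.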

So stating in the RFC that BFD may help to detect such cases is simply very
misleading (to put it gently :).

And you are stating so exactly in the below sentence:

*"In certain other scenarios, a degraded or poor quality link will allow
OSPF adjacency formation to succeed*
*but the BFD session establishment will fail or the BFD session will flap.*

Thx,
R.


On Sun, Jan 30, 2022 at 6:03 PM Ketan Talaulikar 
wrote:

> Hi Robert,
>
> Thanks for your review and comments.
>
> This email is in response to your first point "overpromise".
>
> First, there is no text in the draft that "overpromises" that the strict
> mode of operation detects "all forwarding" issues. We are talking about BFD
> and its capabilities are well-known. It is not in the scope of this
> document to discuss BFD capabilities and shortcomings (e.g. the MTU issue
> you describe).
>
> The draft text that you have asked to remove is important. It explains the
> scenario of a noisy link that experiences traffic drops. I am aware of
> issues in production networks, where we've had OSPF adjacency flaps
> continuously or sporadically due to OSPF adjacency coming up somehow but
> then BFD bringing it down. This causes routing churn and service
> degradation. This is one of the key drivers for this draft.
>
> However, welcome any text clarifications/suggestions for improving the
> document.
>
> Thanks,
> Ketan
>
>>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Robert Raszuk
Hi Ketan,

> I would like to point out that the draft discusses the BFD "dampening" or
> "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
> include such mechanisms in a protocol-agnostic manner.
>

BFD dampening or hold-time are completely orthogonal to my point. Both have
nothing to do with it.

Those timers only fire when BFD goes down. In my example BFD does not go
down. But we want to bring up the client adj. only after X ms/sec/min etc
...of normal BFD operation if no failure is detected during that timer.

> This draft indicates that OSPF adjacency will "advance" in the neighbor FSM
> only after BFD reports UP.
>

And that is exactly too soon. In fact, if you do that today without waiting
some time (if you retire the current OSPF timer), you will not help at all
in the case you are trying to address.

The reason is that BFD may go down perhaps 200 ms after BFD UP, but by then
the OSPF adj. will already be established. It is really pretty simple.
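
The 200 ms case can be walked through with a toy timeline. This is a sketch
under assumptions: the event list, the 6 sec hold and the function name are
invented for illustration, and no real OSPF or BFD code works from such a list.

```python
from typing import List, Optional, Tuple


def adjacency_up_time(events: List[Tuple[float, str]],
                      hold_s: float) -> Optional[float]:
    """Return when the adjacency would first come up, given time-ordered
    BFD events and a client-side hold of hold_s after BFD UP."""
    up_since: Optional[float] = None
    for t, state in events:
        # The hold expired before this event: adjacency is already up.
        if up_since is not None and t - up_since >= hold_s:
            return up_since + hold_s
        if state == "UP":
            if up_since is None:
                up_since = t
        elif state == "DOWN":
            up_since = None  # the flap cancels the pending hold
    return None if up_since is None else up_since + hold_s


# BFD comes up, flaps 200 ms later, then comes back and stays up.
flap = [(0.0, "UP"), (0.2, "DOWN"), (1.0, "UP")]
print(adjacency_up_time(flap, hold_s=0.0))  # 0.0
print(adjacency_up_time(flap, hold_s=6.0))  # 7.0
```

With no hold the adjacency is formed at t=0 and then torn down by the flap;
with the hold it is formed once, 6 sec after the session that stayed up.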

Thx,
Robert.

PS. And yes I think ISIS should also get fixed in that respect.

>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Robert Raszuk
an interest in such
>> problems e.g., MTU.
>>
>>
>>
>> In regards to “dampening” = which I think is the relevant term for the
>> timer related suggestions you are making - this also does not belong in the
>> IGP. If you do not want the BFD session to come back up too quickly after a
>> failure, the proper place to put timers is either at the interface layer or
>> in the BFD implementation.
>>
>> I am familiar with implementations which apply this timer at the protocol
>> level (AKA BFD client in this context) and this is done precisely because
>> the protocol does NOT have the functionality being defined in this draft.
>> Once you have implemented “wait-for-BFD” logic as defined in this draft you
>> do not need additional delay timers in the protocol.
>>
>>
>>
>> I don’t think the suggestions you are making belong in this document.
>>
>>
>>
>> Les
>>
>>
>>
>>
>>
>> *From:* Lsr  *On Behalf Of * Robert Raszuk
>> *Sent:* Saturday, January 29, 2022 11:25 AM
>> *To:* Acee Lindem (acee) 
>> *Cc:* Ketan Talaulikar ;
>> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu <
>> af...@bloomberg.net>; lsr 
>> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
>> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>
>>
>>
>> Hi Acee,
>>
>>
>>
>> Can you suggest text with which you’d be happy? I’m sure the authors
>> would add you to the acknowledgements.
>>
>>
>>
>> Actually instead of suggesting any new text I would suggest to delete the
>> two below sentences and it will be fine:
>>
>>
>>
>> *"In certain other scenarios, a degraded or poor quality link will allow
>> OSPF adjacency formation to succeed*
>>
>> *but the BFD session establishment will fail or the BFD session
>> will flap.  In this case, traffic that gets *
>>
>> *forwarded over such a link may experience packet drops while the failure
>> of the BFD session establishment *
>>
>> *would not enable fast routing convergence if the link were to go down or
>> flap."*
>>
>>
>>
>> This could be described but I don’t think it should be normative. This
>> begs the question as to why a hold down timer is not a part of the BFD
>> protocol itself.
>>
>>
>>
>> There is one - BFD calls it multiplier.
>>
>>
>>
>> But the timer I am suggesting is not related to BFD operation, but to
>> OSPF (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is
>> about allowing BFD for more testing (with various parameters (for example
>> increasing test packet size in some discrete steps) before OSPF is happy to
>> bring the adj. up.
>>
>>
>>
>> Thx,
>>
>> R.
>>
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-29 Thread Robert Raszuk
Hi Les,

> That timer and its consistency on both ends clearly belongs to OSPF not to
> BFD.
>


> *[LES:] I disagree. The definition of UP state belongs to the BFD
> protocol/implementation.*
>
> *If you don’t want BFD clients (like OSPF) to react “too quickly” then
> build additional config/logic into your BFD implementation so it does not
> signal UP state before additional criteria is met – do not make each BFD
> client (and there could be multiple for a given session) configure its own
> definition of BFD UP.*
>

I think we are looking at this from different perspectives.

I am saying bring BFD UP and allow X seconds/minutes/hours to run a
sequence of tests before bringing the OSPF adj up.

You are saying do not declare BFD as UP before all of those tests pass.
That test sequence could be just running vanilla normal BFD for X
seconds/minutes/hours.

That would require introducing a completely new BFD state. Worse, that
timer may be very different on a per interface type basis, as each
interface type has completely different characteristics. Also such a timer
would need to have a different value on a per BFD client basis. (For
example OSPF adj UP could be very different than PE-PE BFD for BGP as a PULSE
alternative :)

Sorry I really do not think this belongs to BFD at all. It is a local
client thing how long from t0 = BFD UP it will wait before proceeding
further.

And last but not least - such extended testing does not need to kick in
every time an interface flaps. Maybe the operator only wants to run it during
maintenance windows, once per day? Or once per week?

But I am not going to even remotely hope I can convince you :) So let's
forget it.

Cheers,
R/


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-29 Thread Robert Raszuk
Hi Les,

> Discussion of how to make BFD failure detection more robust belongs in
> the BFD WG
> If you do not want the BFD session to come back up too quickly after a
> failure

Nothing I suggested is related to any of the above.

Let me perhaps provide a very simple example.

BFD being used is *AS*IS*.

All the operator wants is to run it for say X sec without ever going
down before bringing OSPF adj up.

That timer and its consistency on both ends clearly belongs to OSPF not to
BFD.

Now what happens within those X sec, what BFD packets are formed and how
they are exchanged is all BFD business - but I am not suggesting to include
any of that in this draft.

Do we have a common understanding so far ?

Hint: Albert already stated that he needs that timer and that both vendors
provide it via cfg. All that confirms is that the timer is needed. All I am
suggesting (even before being aware of the manual cfg for it) is to
synchronize the value, or pick the lower configured one, between the two peers.

Kind regards,
R.

On Sat, Jan 29, 2022 at 9:08 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> It is good that you take an active interest in this technology – but I
> think the suggestions you are making should not be targeted at IGP use of
> BFD.
>
>
>
> Discussion of how to make BFD failure detection more robust belongs in the
> BFD WG – and – as you know – that WG has taken an interest in such problems
> e.g., MTU.
>
>
>
> In regards to “dampening” = which I think is the relevant term for the
> timer related suggestions you are making - this also does not belong in the
> IGP. If you do not want the BFD session to come back up too quickly after a
> failure, the proper place to put timers is either at the interface layer or
> in the BFD implementation.
>
> I am familiar with implementations which apply this timer at the protocol
> level (AKA BFD client in this context) and this is done precisely because
> the protocol does NOT have the functionality being defined in this draft.
> Once you have implemented “wait-for-BFD” logic as defined in this draft you
> do not need additional delay timers in the protocol.
>
>
>
> I don’t think the suggestions you are making belong in this document.
>
>
>
> Les
>
>
>
>
>
> *From:* Lsr  *On Behalf Of * Robert Raszuk
> *Sent:* Saturday, January 29, 2022 11:25 AM
> *To:* Acee Lindem (acee) 
> *Cc:* Ketan Talaulikar ;
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu <
> af...@bloomberg.net>; lsr 
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Acee,
>
>
>
> Can you suggest text with which you’d be happy? I’m sure the authors would
> add you to the acknowledgements.
>
>
>
> Actually instead of suggesting any new text I would suggest to delete the
> two below sentences and it will be fine:
>
>
>
> *"In certain other scenarios, a degraded or poor quality link will allow
> OSPF adjacency formation to succeed*
>
> *but the BFD session establishment will fail or the BFD session
> will flap.  In this case, traffic that gets *
>
> *forwarded over such a link may experience packet drops while the failure
> of the BFD session establishment *
>
> *would not enable fast routing convergence if the link were to go down or
> flap."*
>
>
>
> This could be described but I don’t think it should be normative. This
> begs the question as to why a hold down timer is not a part of the BFD
> protocol itself.
>
>
>
> There is one - BFD calls it multiplier.
>
>
>
> But the timer I am suggesting is not related to BFD operation, but to OSPF
> (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about
> allowing BFD for more testing (with various parameters (for example
> increasing test packet size in some discrete steps) before OSPF is happy to
> bring the adj. up.
>
>
>
> Thx,
>
> R.
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-29 Thread Robert Raszuk
Acee,


> I don’t think anyone has implemented the latter capability. This MTU test
> extension could be added in a separate draft if there were a strong
> requirement.
>

I think you are mixing up two things: an example of what BFD could be doing
to make sure the link is fine, and the delay timer allowing it to do whatever
it needs.

The former is a local operator's decision. The latter IMO could be added to
the draft to make the interoperability seamless. But this is just a
suggestion/hint. Nothing more.

Many thx,
R.




Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-29 Thread Robert Raszuk
Hi Acee,

> Can you suggest text with which you’d be happy? I’m sure the authors would
> add you to the acknowledgements.
>

Actually instead of suggesting any new text I would suggest to delete the
two below sentences and it will be fine:

*"In certain other scenarios, a degraded or poor quality link will allow
OSPF adjacency formation to succeed*
*but the BFD session establishment will fail or the BFD session will flap.
In this case, traffic that gets *
*forwarded over such a link may experience packet drops while the failure
of the BFD session establishment *
*would not enable fast routing convergence if the link were to go down or
flap."*

> This could be described but I don’t think it should be normative. This begs
> the question as to why a hold down timer is not a part of the BFD protocol
> itself.
>

There is one - BFD calls it multiplier.
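
For reference, a small worked example in Python of how the multiplier turns
into a detection time in BFD asynchronous mode, as I read RFC 5880: the
remote Detect Mult times the agreed transmit interval of the remote system
(the greater of the local Required Min RX and the remote Desired Min TX).
The function name and the sample values are only illustrative.

```python
def detection_time_ms(remote_detect_mult, local_required_min_rx_ms, remote_desired_min_tx_ms):
    """Time without received BFD packets after which the local system declares the session down."""
    # The remote end may not send faster than the local end is willing to receive.
    agreed_tx_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_tx_ms


print(detection_time_ms(3, 2000, 2000))  # 6000 ms on a slow, noisy link
print(detection_time_ms(3, 50, 50))      # 150 ms on a clean link
```

Note that this only bounds how long it takes to declare the session down.
It says nothing about how long BFD should run cleanly before a client such
as OSPF acts on UP.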

But the timer I am suggesting is not related to BFD operation, but to OSPF
(and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about
allowing BFD more testing (with various parameters, for example
increasing test packet size in some discrete steps) before OSPF is happy to
bring the adj. up.

Thx,
R.



Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-29 Thread Robert Raszuk
Hi Albert,

> [AF] This draft ensures that BFD can be used to detect failure quickly
> when there is a complete path failure between the nodes. You are right that
> there are many other types of failure that BFD cannot detect.
>
Indeed, but the draft says otherwise. I think that needs to be adjusted
before publication. If you say it detects complete path failure then
perfect. It could detect more than that ... say path failure at some MTUs
if more time is allowed to test the link before the OSPF adj comes up.

> [AF] This is a good point you brought up. Both router vendors that I have
> tested (Cisco & Juniper) do indeed have timer mechanism to delay when OSPF
> would be allowed to come up, which we have tested (useful to guard against
> flapping links). I am not sure if this "hold down" mechanism needs to be
> included in the draft.
>
>
Sure, one option is to keep it as a cfg knob.

But if you do not exchange this timer, with an agreement to choose the lower
one between peers, you are both risking misconfiguration and adding a bit
more operational complexity.

Even if this is not exchanged, the draft/RFC should still mention it and
recommend some wise default timer - say 5 sec. Maybe more. But I see no
harm in signaling it explicitly between peers.
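
As a sketch of what "choose the lower one" could look like, in Python. The
function name, the use of None for "not configured / not signaled" and the
fallback behaviour are my assumptions for illustration; the 5 sec default is
just the example value from above, not anything the draft defines.

```python
from typing import Optional

DEFAULT_HOLD_SECS = 5.0  # the "say 5 sec" default suggested above; illustrative only


def effective_hold_time(local: Optional[float], peer: Optional[float]) -> float:
    """Pick the hold time (seconds) to apply after BFD UP.

    local is the locally configured value, peer is the value signaled by the
    neighbor; None means nothing configured (local) or nothing signaled (peer).
    """
    if local is None:
        local = DEFAULT_HOLD_SECS
    if peer is None:
        # Peer value unknown: nothing to agree on, so use the local value.
        return local
    # Both values known: both ends converge on the lower one.
    return min(local, peer)


print(effective_hold_time(30.0, 5.0))   # 5.0
print(effective_hold_time(None, None))  # 5.0
```

With signaling both ends compute the same value; without it each side only
knows its own knob, which is where the misconfiguration risk comes from.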

Moreover there can be implementations which will not support that timer, and
each of them would then need to be asked separately to add it.

Thx,
R.


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-29 Thread Robert Raszuk
Hi Ketan and all,

I support this draft - it is a useful addition.

There are two elements which I would suggest to adjust in the text before
publication.

*#1 Overpromise*

Even below you say:

> Since there is an issue with forwarding *(which is what BFD detects)*

and in the text we see:

"In certain other scenarios, a degraded or poor quality link will allow
OSPF adjacency formation to succeed
   but the *BFD session establishment will fail or the BFD session
will flap*."

A reader may get the impression that by enabling strict mode he is 100%
safe. Sure, he is safer than before, but not 100% safe.

Real networks prove that there are classes of failures which BFD cannot
detect. And Albert knows them too :)

For example some emulated circuits can experience periodic drops only at
some MTUs and only when link utilization reaches X %. While there is an
ongoing extension to BFD to fill it with payload, I don't think that BFD can
also be used to saturate, say, 80% of a 10G link with probes to test it well
before allowing OSPF to be established.

*#2 Timer *

What I find missing in the draft is a mutually agreed (between OSPF peers)
timer, started after the BFD session is up, during which OSPF would hold off
and allow BFD to do some more testing before declaring the adj established.
I think just bringing the OSPF adj up immediately after the BFD session is up
is not a good thing. Keep in mind that we are bringing the interface up, so by
applying such a timer we are not dropping packets ... in fact quite the
reverse: we are making sure user packets will not be dropped.

Cheers,
Robert


Re: [Lsr] How to forward the solutions for "Prefixes Unreachable Notification" problem

2022-01-26 Thread Robert Raszuk
Hi Les,

Yes you are correct. It is a classic pull vs push model.

Push gives you notification about state. That's it.

Pull gives you much more as it includes e2e elements of the data plane - of
course for a bit higher cost.

I would not disregard any of the above.

We have been having similar discussions in the past - I am sure some of us
still remember MARP proposal from Alvaro and Russ inspired by David Oran's
idea. While it was L2 switch centric I mention it here to highlight how a
cool idea went nowhere ... ref:
https://tools.ietf.org/id/draft-retana-marp-02.txt and how the end to end
pull model overwhelmed it.

At least we could learn from it for the WAN now. Perhaps fully
coincidentally it is also very similar to the pub-sub model proposed by Tony.

Thx,
R.

On Thu, Jan 27, 2022 at 1:49 AM Les Ginsberg (ginsberg)  wrote:

> Chris -
>
> The scale request comes from real customers. So, it is understandable for
> you to be "aghast" - but it is a real request.
>
> As far as BFD goes, my opinion is this won’t scale. There is a significant
> difference between operating sessions which continuously monitor liveness
> in a full mesh versus using some approach which only triggers network-wide
> traffic when some topology change is locally detected. There are multiple
> approaches being discussed which do the latter - but BFD is not one of them.
>
> You can disagree - or - as Greg has done - say we don’t really have to
> consider this scale. I am not going to try to convince you otherwise.
> But if so you aren’t solving the problem we have been asked to solve.
>
>Les
>
>
> > -Original Message-
> > From: Christian Hopps 
> > Sent: Wednesday, January 26, 2022 2:15 PM
> > To: Les Ginsberg (ginsberg) 
> > Cc: Greg Mirsky ; Aijun Wang
> > ; lsr@ietf.org
> > Subject: Re: [Lsr] How to forward the solutions for "Prefixes Unreachable
> > Notification" problem
> >
> >
> > "Les Ginsberg (ginsberg)"  writes:
> >
> > > Greg –
> > >
> > > With 100K PE scale, we are talking about 100K BFD sessions/PE and
> > > close to 5 million BFD sessions network-wide.
> > >
> > > Eliminating one of the options we are discussing is admittedly a
> > > small step, but still worthwhile.
> >
> > Hang on a sec. :)
> >
> > We are starting off with this GINORMOUS network with 100,000 PE routers!
> > Why would 5 million sessions of anything over this gigantic network of
> > routers be a reason to disregard it as a solution? (How many total
> > routers are there BTW?)
> >
> > If you build something gigantic *everything* is going to scale way up.
> > To use an oldie but a goodie: TANSTAAFL.
> >
> > Thanks,
> > Chris.
> >
> >
> > >
> > >
> > > However, If you still want to continue to advocate for BFD, I will
> > > say no more.
> > >
> > >
> > >
> > >Les
> > >
> > >
> > >
> > > From: Lsr  On Behalf Of Greg Mirsky
> > > Sent: Tuesday, January 25, 2022 7:06 PM
> > > To: Aijun Wang 
> > > Cc: lsr 
> > > Subject: Re: [Lsr] How to forward the solutions for "Prefixes
> > > Unreachable Notification" problem
> > >
> > >
> > >
> > > Hi Aijun,
> > >
> > > I believe that under Option D you can add multihop BFD per RFC 5883.
> > > No new protocols needed.
> > >
> > >
> > >
> > > Regards,
> > >
> > > Greg
> > >
> > >
> > >
> > > On Tue, Jan 25, 2022, 18:17 Aijun Wang 
> > > wrote:
> > >
> > > Hi, All:
> > >
> > >
> > >
> > > As Peter’s example and Acee’s suggestions, let’s focus on the
> > > following problem to think how to solve it efficiently and
> > > reasonably:
> > >
> > > Scenario: 100 areas each with 1000 PEs (100K total PEs) with 2
> > > ABRs per area
> > >
> > > Problem: Overlay services(BGP or Tunnel) that rely on the IGP
> > > needs to be notified immediately when the remote Peer failed, to
> > > assist such overlay service accomplish fast switchover(how to
> > > switchover is out of the discussion)
> > >
> > > Potential Solutions:
> > >
> > >There are now mainly four categories of the solutions, as
> > > described below and their brief analysis:
> > >
> > >Category A: PUA/PULSE. Utilizes the existing IGP mechanism to
> > > transport/flooding the notification message.
> > >
> > >Category B: Detail/Important Prefixes Leaks. Bypass the
> > > summary side-effect for some detailed/important prefixes by
> > > leaking/not summarize them into each area.
> > >
> > >Category C: BGP based solution: Utilize the existing BGP
> > > infrastructure to transport the notification message
> > >
> > >Category D: OOB Solution. Design some new OOB protocol to
> > > transport the notification message.
> > >
> > >
> > >
> > > Because we are in LSR WG, and people are all IGP experts. After
> > > the intense discussion, can we now focus on the Category A/B?
> > >
> > > It is very curious that LSR WG will and should produce some BGP
> > > or OOB based solution. I think they may be feasible, but should
> > > be evaluated/discussed by 

Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-26 Thread Robert Raszuk
> The pulse solution does not suffer from the scale issues.

It shifts that "suffering" to flood the entire domain with information
which is not needed on P routers and selectively useful on the remote PEs.

Also fast signaling the fact that PE may have been disconnected from the
network for a few seconds may be actually more harmful to the actual
applications running behind it.

For single homed sites this is a disaster, as after next hop invalidation you
are stuck for the timeout (as discussed, about 200 sec) before you connect
again.

For dual homed sites such a switchover to a backup PE may result in a
switchover to a backup CE (where PE-CE signaling is dynamic), and lots of
networks use outbound NAT there. While all cool from the perspective of the
WAN side - the NAT pool switchover means that application TCP sessions are
reset, which may mean a really long service disruption for the customers' apps
(especially those running long lived sessions).

The reason I mention this here is that whatever we do we should always take
end to end user application analysis into account.

Thx,
R.

On Wed, Jan 26, 2022 at 10:20 AM Peter Psenak  wrote:

> Tony,
>
> On 25/01/2022 17:11, Tony Li wrote:
> >
> >
> > Peter,
> >
> >> we just moved the problem from IGPs to some "other" application.
> >
> >
> > That was the entire point. Hopefully, you see that as a good thing.
>
> actually I don't. I want to solve the problem, not to move it to other
> app running on the same nodes.
>
>
> The pulse solution does not suffer from the scale issues. With the limit
> of number of concurrent pulses on ABR it also address the catastrophic
> failure scenario you were worried about.
>
> thanks,
> Peter
> >
> > Tony
> >
>


Re: [Lsr] BGP vs PUA/PULSE

2022-01-25 Thread Robert Raszuk
I think we have to accept that we have a very different understanding of what
out-of-band means. But let's not get hung up on this here.

Because to do it efficiently and in a scalable manner, close cooperation with
the LSDB is required. A management system is completely orthogonal to that.

IMO Tony's proposal is a new IGP message type with a new transport. While it
is a separate subsystem (which can in fact run on a different core and have
independent memory space, i.e. be a different thread or process), it is both
functionally and operationally an IGP service. Just like PUA or PULSE.

Flooding is not a must-have paradigm for an IGP, where IGP == Interior Gateway
Protocol. As you know some folks use BGP as IGP for MSDCs. Clearly BGP does
not use too much flooding.

Best,
Robert.

On Wed, Jan 26, 2022 at 12:48 AM Aijun Wang 
wrote:

> Hi, Robert:
> Then why not let all of these out of band messages delivered via the
> management system?
>
> Aijun Wang
> China Telecom
>
> On Jan 25, 2022, at 23:28, Robert Raszuk  wrote:
>
> 
>
> Auto discovery is described in the draft.
>
> You may also provision this session by your management plane just like you
> push 1000s of configuration lines anyway to each network element.
>
> Those are commonly used techniques to run a network.
>
> On Tue, Jan 25, 2022 at 4:07 PM Aijun Wang 
> wrote:
>
>> Or, I guess you still need the ABR to act as the server. But, how these
>> RRs know which router is ABR?
>>
>> Aijun Wang
>> China Telecom
>>
>> On Jan 25, 2022, at 23:01, Aijun Wang  wrote:
>>
>> Hi, Robert:
>>
>> You mean make every PE as the register server?
>>
>> Aijun Wang
>> China Telecom
>>
>> On Jan 25, 2022, at 21:21, Robert Raszuk  wrote:
>>
>> 
>> Aijun,
>>
>>> No, I think you are misunderstanding our purpose.
>>>
>>
>> You are using this argument towards a number of people ... I recommend
>> you reconsider.
>>
>>
>>> The proposed solution can fit in small network, or large network and RR
>>> can locate anywhere the operator want to place. We have no assumption about
>>> the location of RR and PEs.
>>>
>>
>> Please observe that if you really want to put RRs outside of your local
>> area for whatever reason (maybe you run RR as a service in the cloud) then
>> actually we can combine X from my additional point with Tony's proposal. It
>> just occurred to me as a really interesting deployment model so let me
>> describe it to the WG. Maybe Tony can add this model to his draft in the possible
>> deployment section.
>>
>> - - -
>>
>> When network elements residing outside of the local area are interested
>> in node liveness of selected nodes in the area (for example BGP Route
>> Reflectors running in the cloud) they can register with node
>> liveness servers in an area to receive targeted notifications for
>> interested addresses.
>>
>> Such notifications can be used to invalidate service next hops or tunnel
>> endpoints. Upon such action service information will be immediately
>> withdrawn.
>>
>> That deployment model offers full flexibility with just a handful of
>> additional TCP or QUIC sessions needed and very little to no extra state
>> injected in the network.
>>
>> - - -
>>
>> That model also addresses some concerns associated with any to any
>> registrations. No longer PEs need to register anything with ABRs nor ABRs
>> need to pass that information around.
>>
>> Best regards,
>> R.
>>
>>


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-25 Thread Robert Raszuk
Hi Tony,

> If a given PE needs to get all notifications from all other PEs it is
> sufficient that it sends to local ABRs a single registration in a form of
> 0.0.0.0/0 and be done.
>
>
> If you look a bit more carefully, you will find that registering for 0/0
> doesn’t work without a bit more smartness in the ABR. It’s doable, but not
> yet in the text.
>

I took it as an implementation detail.


> As it stands right now, the PE’s COULD register for the summaries of the
> other areas.  That would decrease the number of registrations. How the PE
> learns what those summaries are is currently magic.
>

Well, pretty easy magic ... a local service route next hop lookup in the RIB
should easily yield the answer.
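
Roughly like this, as a Python sketch. The function name covering_route and
the toy RIB contents are invented for illustration; a real RIB lookup is of
course not a linear scan.

```python
import ipaddress


def covering_route(rib, next_hop):
    """Longest-prefix match: the most specific RIB prefix containing next_hop, or None."""
    addr = ipaddress.ip_address(next_hop)
    matches = [prefix for prefix in rib if addr in prefix]
    return max(matches, key=lambda p: p.prefixlen) if matches else None


# Toy RIB of a PE: a default route plus the summaries of two remote areas.
rib = [ipaddress.ip_network(p) for p in ("0.0.0.0/0", "10.1.0.0/16", "10.2.0.0/16")]

# The next hop of a service route resolves via the summary of its area.
print(covering_route(rib, "10.2.0.7"))  # 10.2.0.0/16
```

The summary that comes back is what the PE would register for.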


> At the end of the day, I don’t think that scale discussions will resolve
> this. In fact, if the cost of deployment was actually zero, I doubt that we
> would still see any progress in this conversation.
>

Unfortunately I do agree with this assessment.

Thx,
R.


> How do we depolarize this?
>
> Tony
>
>


Re: [Lsr] Fwd: New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-25 Thread Robert Raszuk
Peter,

If a given PE needs to get all notifications from all other PEs it is
sufficient that it sends to the local ABRs a single registration in the form
of 0.0.0.0/0 and is done.

But realistically speaking the case where services offered by a PE also
exist on all 100K other PEs would be pretty rare.
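
To put numbers on it, a trivial Python calculation using the figures already
quoted in this thread (1000 PEs per area, 100 remote destinations per PE).
The function name is made up, and it only counts PE to ABR registrations
held on one ABR.

```python
def registrations_per_abr(pes_in_area, registrations_per_pe):
    """Number of PE-to-ABR registrations one ABR has to hold."""
    return pes_in_area * registrations_per_pe


PES_PER_AREA = 1000        # scenario from this thread
REMOTE_DESTS_PER_PE = 100  # average VPN size used in Peter's math

# Every PE registers each remote address individually.
print(registrations_per_abr(PES_PER_AREA, REMOTE_DESTS_PER_PE))  # 100000

# Every PE sends a single 0.0.0.0/0 registration instead.
print(registrations_per_abr(PES_PER_AREA, 1))  # 1000
```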

Thx,
R.

On Tue, Jan 25, 2022 at 4:42 PM Peter Psenak  wrote:

> On 25/01/2022 15:18, Robert Raszuk wrote:
> > Peter,
> >
> > You clearly missed the added new sentence to section 4.3 in version -01
> >
> > It is RECOMMENDED that the ABR register for the
> > most specific prefix that is less specific than the original prefix.
>
> I don't think so. I'm talking about PE to ABR registrations only. I have
> not even counted the inter-ABR ones.
>
> Peter
>


Re: [Lsr] BGP vs PUA/PULSE

2022-01-25 Thread Robert Raszuk
Auto discovery is described in the draft.

You may also provision this session by your management plane just like you
push 1000s of configuration lines anyway to each network element.

Those are commonly used techniques to run a network.

On Tue, Jan 25, 2022 at 4:07 PM Aijun Wang 
wrote:

> Or, I guess you still need the ABR to act as the server. But, how these
> RRs know which router is ABR?
>
> Aijun Wang
> China Telecom
>
> On Jan 25, 2022, at 23:01, Aijun Wang  wrote:
>
> Hi, Robert:
>
> You mean make every PE as the register server?
>
> Aijun Wang
> China Telecom
>
> On Jan 25, 2022, at 21:21, Robert Raszuk  wrote:
>
> 
> Aijun,
>
>> No, I think you are misunderstanding our purpose.
>>
>
> You are using this argument towards a number of people ... I recommend you
> reconsider.
>
>
>> The proposed solution can fit in small network, or large network and RR
>> can locate anywhere the operator want to place. We have no assumption about
>> the location of RR and PEs.
>>
>
> Please observe that if you really want to put RRs outside of your local
> area for whatever reason (maybe you run RR as a service in the cloud) then
> actually we can combine X from my additional point with Tony's proposal. It
> just occurred to me as a really interesting deployment model so let me
> describe it to the WG. Maybe Tony can add this model to his draft in the possible
> deployment section.
>
> - - -
>
> When network elements residing outside of the local area are interested in
> node liveness of selected nodes in the area (for example BGP Route
> Reflectors running in the cloud) they can register with node
> liveness servers in an area to receive targeted notifications for
> interested addresses.
>
> Such notifications can be used to invalidate service next hops or tunnel
> endpoints. Upon such action service information will be immediately
> withdrawn.
>
> That deployment model offers full flexibility with just a handful of
> additional TCP or QUIC sessions needed and very little to no extra state
> injected in the network.
>
> - - -
>
> That model also addresses some concerns associated with any to any
> registrations. No longer PEs need to register anything with ABRs nor ABRs
> need to pass that information around.
>
> Best regards,
> R.
>
>


Re: [Lsr] BGP vs PUA/PULSE

2022-01-25 Thread Robert Raszuk
No. Run this node liveness service on ABR or on any other IGP node in an
area.

On Tue, Jan 25, 2022 at 4:01 PM Aijun Wang 
wrote:

> Hi, Robert:
>
> You mean make every PE as the register server?
>
> Aijun Wang
> China Telecom
>
> On Jan 25, 2022, at 21:21, Robert Raszuk  wrote:
>
> 
> Aijun,
>
>> No, I think you are misunderstanding our purpose.
>>
>
> You are using this argument towards a number of people ... I recommend you
> reconsider.
>
>
>> The proposed solution can fit in small network, or large network and RR
>> can locate anywhere the operator want to place. We have no assumption about
>> the location of RR and PEs.
>>
>
> Please observe that if you really want to put RRs outside of your local
> area for whatever reason (maybe you run RR as a service in the cloud) then
> actually we can combine X from my additional point with Tony's proposal. It
> just occurred to me as a really interesting deployment model so let me
> describe it to the WG. Maybe Tony can add this model to his draft in the possible
> deployment section.
>
> - - -
>
> When network elements residing outside of the local area are interested in
> node liveness of selected nodes in the area (for example BGP Route
> Reflectors running in the cloud) they can register with node
> liveness servers in an area to receive targeted notifications for
> interested addresses.
>
> Such notifications can be used to invalidate service next hops or tunnel
> endpoints. Upon such action service information will be immediately
> withdrawn.
>
> That deployment model offers full flexibility with just a handful of
> additional TCP or QUIC sessions needed and very little to no extra state
> injected in the network.
>
> - - -
>
> That model also addresses some concerns associated with any to any
> registrations. No longer PEs need to register anything with ABRs nor ABRs
> need to pass that information around.
>
> Best regards,
> R.
>
>
>


Re: [Lsr] Fwd: New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-25 Thread Robert Raszuk
Peter,

You clearly missed the new sentence added to section 4.3 in version -01:

It is RECOMMENDED that the ABR register for the
most specific prefix that is less specific than the original prefix.
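
One way to read that sentence, as a Python sketch: among the prefixes the
ABR knows, pick the longest one that strictly covers the prefix a PE
registered for. The function name and the sample prefixes are mine, and this
is my interpretation of the text, not something the draft spells out.

```python
import ipaddress


def upstream_registration(original, known_prefixes):
    """Most specific known prefix that is less specific than original, or None."""
    orig = ipaddress.ip_network(original)
    candidates = [
        p for p in map(ipaddress.ip_network, known_prefixes)
        # Strictly shorter mask, and it must actually contain the original prefix.
        if p.prefixlen < orig.prefixlen and orig.subnet_of(p)
    ]
    return max(candidates, key=lambda p: p.prefixlen, default=None)


known = ["0.0.0.0/0", "10.0.0.0/8", "10.2.0.0/16", "10.2.0.7/32", "10.3.0.0/16"]
print(upstream_registration("10.2.0.7/32", known))  # 10.2.0.0/16
```

So many atomic /32 registrations from the PEs collapse into one registration
per summary between ABRs.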


Thx,
R


On Tue, Jan 25, 2022 at 2:57 PM Peter Psenak  wrote:

> On 25/01/2022 14:07, Robert Raszuk wrote:
> > Peter,
> >
> > Your math is off.
>
>
> no, it's right. Every local PE in an area registers 100 different
> remote addresses with the local ABRs. 1k PEs in an area * 100 destinations equals
> 100k registrations.
>
> Peter
>
>
> >
> >  > 1. 100k registrations in each ABR, 10 million network wide
> >
> > ABRs will "aggregate" atomic registrations to summaries when passing
> > them to other ABRs. Please recalculate,
> >
>
> > Thx,
> > R.
> >
> >
> > On Tue, Jan 25, 2022 at 12:25 PM Peter Psenak wrote:
> >
> > Tony,
> >
> > I'm going to use my target scale of 100k PEs, split in 100 areas, 2
> > ABRs
> > per area, with the average VPN size of 100.
> >
> > What you propose would result in:
> >
> > 1. 100k registrations in each ABR, 10 million network wide
> > 2. 1200 TCP sessions in each ABR, 220k TCP sessions network wide
> >
> > Above is present in the network constantly, without any PE failure
> > happening.
> >
> > We use summarization to avoid keeping states associated with 100k PE
> > prefixes. With your proposed solution, on the ABRs, we replaced the
> > state associated 100k prefixes with the state associated with 100k
> > registrations. What did we solve? Not much, we just moved the problem
> > from IGPs to some "other" application.
> >
> >
> > Using the same scale, with the Pulse proposal, we will have no extra
> > state in a stable condition.
> >
> > If one PE fails we will have extra 2 pulses.
> >
> > If one PE fails in every area at the same time we will have 200
> > extra pulses.
> >
> > If we limit the number of pulses per ABR at any given time to 10
> (which
> > is far enough to address any realistic PE failure scenario), in the
> > worst case we will have extra 2k pulses in the network for a very
> > limited amount of time. Yes, it goes everywhere, but given the
> > amount of
> > data, it's not significant. We have more than 2k prefixes being
> carried
> > in the IGP networks today, and we know it's not an issue.
> >
> > thanks,
> > Peter
> >
> > On 19/01/2022 02:05, Tony Li wrote:
> >  >
> >  > FYI.  This is a better alternative than PUA/Pulse.
> >  >
> >  > Tony
> >  >
> >  >
> >  >> Begin forwarded message:
> >  >>
> >  >> *From: *internet-dra...@ietf.org
> >  >> *Subject: **New Version Notification for
> > draft-li-lsr-liveness-00.txt*
> >  >> *Date: *January 18, 2022 at 5:04:22 PM PST
> >  >> *To: *<t...@ietfa.amsl.com>, "Tony Li" <tony...@tony.li>
> >  >>
> >  >>
> >  >> A new version of I-D, draft-li-lsr-liveness-00.txt
> >  >> has been successfully submitted by Tony Li, and posted to the
> >  >> IETF repository.
> >  >>
> >  >> Name:draft-li-lsr-liveness
> >  >> Revision:00
> >  >> Title:Node Liveness Protocol
> >  >> Document date:2022-01-18
> >  >> Group:Individual Submission
> >  >> Pages:9
> >  >> URL: https://www.ietf.org/archive/id/draft-li-lsr-liveness-00.txt
> >  >> Status: https://datatracker.ietf.org/doc/draft-li-lsr-liveness/
> >  >> <https://datatracker.ie

Re: [Lsr] BGP vs PUA/PULSE

2022-01-25 Thread Robert Raszuk
Aijun,

No, I think you are misunderstanding our purpose.
>

You are using this argument towards a number of people ... I recommend you
reconsider.


> The proposed solution can fit in a small network or a large network, and the RR
> can be located anywhere the operator wants to place it. We make no assumption about
> the location of RRs and PEs.
>

Please observe that if you really want to put RRs outside of your local
area for whatever reason (maybe you run RR as a service in the cloud) then
actually we can combine X from my additional point with Tony's proposal. It
just occurred to me as a really interesting deployment model, so let me
describe it to the WG. Maybe Tony can add this model to his draft in the possible
deployment section.

- - -

When network elements residing outside of the local area are interested in
node liveness of selected nodes in the area (for example BGP Route
Reflectors running in the cloud) they can register with node
liveness servers in an area to receive targeted notifications for
interested addresses.

Such notifications can be used to invalidate service next hops or tunnel
endpoints. Upon such action service information will be immediately
withdrawn.

That deployment model offers full flexibility with just a handful of
additional TCP or QUIC sessions needed and very little to no extra state
injected in the network.

- - -

That model also addresses some concerns associated with any to any
registrations. PEs no longer need to register anything with ABRs, nor do ABRs
need to pass that information around.

Best regards,
R.
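
A minimal sketch of the registration model described between the "- - -"
markers above, assuming an in-memory liveness server. The class and method
names (LivenessServer, register, node_down) and the client labels are
hypothetical; the real service would run over the TCP or QUIC sessions
mentioned above, which are not modelled here.

```python
from collections import defaultdict

class LivenessServer:
    """Toy node liveness server for one area (illustration only)."""

    def __init__(self):
        # address of interest -> set of registered client ids
        self.subscribers = defaultdict(set)

    def register(self, client, address):
        """A client (for example a cloud RR) registers interest in an address."""
        self.subscribers[address].add(client)

    def node_down(self, address):
        """Return the targeted notifications to send when `address` is lost.
        Clients that did not register for the address receive nothing."""
        return [(client, address)
                for client in sorted(self.subscribers.get(address, ()))]

server = LivenessServer()
server.register("rr-cloud-1", "192.0.2.1")
server.register("rr-cloud-2", "192.0.2.1")
server.register("rr-cloud-2", "192.0.2.2")

for client, address in server.node_down("192.0.2.1"):
    print(f"notify {client}: {address} is down")
```

On receipt of such a notification the RR would invalidate the service next hop
and withdraw the affected service routes.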


Re: [Lsr] Fwd: New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-25 Thread Robert Raszuk
Peter,

Your math is off.

> 1. 100k registrations in each ABR, 10 million network wide

ABRs will "aggregate" atomic registrations to summaries when passing them
to other ABRs. Please recalculate,

Thx,
R.
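
For reference, a short sketch of the arithmetic behind the figures quoted
below. The inputs (100k PEs, 100 areas, 2 ABRs per area, VPN size 100, pulse
limit 10) come from the quoted message. The session formulas are an assumption
that is not stated in the thread: local PE sessions plus a full mesh among
ABRs, which roughly reproduces the quoted 1200 and 220k. The network-wide
registration count is assumed to be counted once per area. The effect of ABR
aggregation is not modelled.

```python
from math import comb

def scale_numbers(pes=100_000, areas=100, abrs_per_area=2,
                  vpn_size=100, pulse_limit_per_abr=10):
    """Recompute the registration, session and pulse counts (sketch only)."""
    pes_per_area = pes // areas            # 1000 PEs in each area
    abrs = areas * abrs_per_area           # 200 ABRs network wide
    return {
        # every local PE registers vpn_size remote addresses with its ABRs
        "registrations_per_abr": pes_per_area * vpn_size,
        # assumption: counted once per area, not once per ABR
        "registrations_network": pes_per_area * vpn_size * areas,
        # assumption: local PE sessions + sessions to all other ABRs
        "sessions_per_abr": pes_per_area + (abrs - 1),
        # assumption: PE-to-ABR sessions + ABR full mesh
        "sessions_network": abrs * pes_per_area + comb(abrs, 2),
        # one pulse per ABR of the area that lost the PE
        "pulses_one_pe": abrs_per_area,
        "pulses_one_pe_per_area": areas * abrs_per_area,
        # every ABR sending at its configured limit
        "pulses_worst_case": abrs * pulse_limit_per_abr,
    }

for name, value in scale_numbers().items():
    print(f"{name}: {value}")
```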


On Tue, Jan 25, 2022 at 12:25 PM Peter Psenak  wrote:

> Tony,
>
> I'm going to use my target scale of 100k PEs, split in 100 areas, 2 ABRs
> per area, with the average VPN size of 100.
>
> What you propose would result in:
>
> 1. 100k registrations in each ABR, 10 million network wide
> 2. 1200 TCP sessions in each ABR, 220k TCP sessions network wide
>
> Above is present in the network constantly, without any PE failure
> happening.
>
> We use summarization to avoid keeping states associated with 100k PE
> prefixes. With your proposed solution, on the ABRs, we replaced the
> state associated 100k prefixes with the state associated with 100k
> registrations. What did we solve? Not much, we just moved the problem
> from IGPs to some "other" application.
>
>
> Using the same scale, with the Pulse proposal, we will have no extra
> state in a stable condition.
>
> If one PE fails we will have extra 2 pulses.
>
> If one PE fails in every area at the same time we will have 200
> extra pulses.
>
> If we limit the number of pulses per ABR at any given time to 10 (which
> is far enough to address any realistic PE failure scenario), in the
> worst case we will have extra 2k pulses in the network for a very
> limited amount of time. Yes, it goes everywhere, but given the amount of
> data, it's not significant. We have more than 2k prefixes being carried
> in the IGP networks today, and we know it's not an issue.
>
> thanks,
> Peter
>
> On 19/01/2022 02:05, Tony Li wrote:
> >
> > FYI.  This is a better alternative than PUA/Pulse.
> >
> > Tony
> >
> >
> >> Begin forwarded message:
> >>
> >> *From: *internet-dra...@ietf.org 
> >> *Subject: **New Version Notification for draft-li-lsr-liveness-00.txt*
> >> *Date: *January 18, 2022 at 5:04:22 PM PST
> >> *To: *<t...@ietfa.amsl.com>, "Tony Li"
> >> <tony...@tony.li>
> >>
> >>
> >> A new version of I-D, draft-li-lsr-liveness-00.txt
> >> has been successfully submitted by Tony Li, and posted to the
> >> IETF repository.
> >>
> >> Name:draft-li-lsr-liveness
> >> Revision:00
> >> Title:Node Liveness Protocol
> >> Document date:2022-01-18
> >> Group:Individual Submission
> >> Pages:9
> >> URL: https://www.ietf.org/archive/id/draft-li-lsr-liveness-00.txt
> >> 
> >> Status: https://datatracker.ietf.org/doc/draft-li-lsr-liveness/
> >> 
> >> Htmlized: https://datatracker.ietf.org/doc/html/draft-li-lsr-liveness
> >> 
> >>
> >>
> >> Abstract:
> >>   Prompt notification of the loss of node liveness or reachability is
> >>   useful for restoring services in tunneled topologies.  IGP
> >>   summarization precludes remote nodes from directly observing the
> >>   status of remote nodes.  This document proposes a service that, in
> >>   conjunction with the IGP, provides prompt notifications without
> >>   impacting IGP summarization.
> >>
> >>
> >>
> >>
> >> The IETF Secretariat
> >>
> >>
> >
>


Re: [Lsr] BGP vs PUA/PULSE

2022-01-25 Thread Robert Raszuk
Aijun,

 The solution can’t rely only on the limited assumption.
>

And the protocol extensions can not be justified based on the limited
corner cases of broken network designs.

And, actually, the PEs are Provider Edge Routers; we always locate them in
> the non-backbone area in a large network, that is, close to the customer.
> There may be some small networks that satisfy your assumption.
>

Thank you for educating me what "PE" stands for.


> We will not deploy the RR on every IGP area.
>

Great ! It's your network where you can do whatever you like there. But I
am so happy not to be one of your customers.

If you are given a recommendation to design the network in a way which
solves your requirement, and you then dismiss it "just because", you can't
request any WG to standardize your solution to fit your network. It's not
how the IETF operates.

Regards,
Robert.


Re: [Lsr] BGP vs PUA/PULSE

2022-01-25 Thread Robert Raszuk
Aijun,

[WAJ] X aims to how to withdraw the VPN prefixes with the mentioned
> extended communities, right?
>

Extended communities have nothing to do with this discussion at all.


> Y aims at how to assist the RR in getting the prefix cost from a node other
> than the RR itself. Right?
>

No.


> I think neither of them answers the question of how to detect the failure of
> a BGP peer. Right? For this requirement, you can only depend on the BGP hello
> timers, or BFD for BGP. Right?
>

Wrong. Let me explain.

PEs peer with iBGP sessions to a pair of RR in a local area. In the vast
majority of cases those RRs are IGP nodes. If so in exactly the same way as
ABRs, also RRs will be receiving local area link state flooding about PEs
going down.

That along with next hop tracking (local feature) will trigger event driven
service path invalidation followed by immediate withdrawal. Note that those
withdraws will be propagated via one or max two RR hops before reaching the
destination PE(s). That propagation speed is in milliseconds when measured
with a solid BGP implementation. It can be much faster than flooding via 10s
of IGP nodes across two areas.

If RRs are running on x86 blades they do not need to be IGP nodes, but
nothing stops them from being passive IGP listeners.

That is how you detect the failures local to your areas on the BGP RRs. The
trigger is exactly the same for RRs as is for ABRs. Has been done and
deployed for decades now and works perfectly fine.
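
A minimal sketch of that trigger, assuming a toy RR that keeps, per service
prefix, the set of PE next hops it has learned. The class and method names are
hypothetical and do not correspond to any vendor implementation; real next hop
tracking is driven by the RIB, not by a direct method call.

```python
class RouteReflector:
    """Toy RR with event-driven next hop tracking (illustration only)."""

    def __init__(self):
        self.paths = {}  # service prefix -> set of PE next hops

    def learn(self, prefix, next_hop):
        """Record a service path received over iBGP from a PE."""
        self.paths.setdefault(prefix, set()).add(next_hop)

    def on_next_hop_unreachable(self, next_hop):
        """Called when local-area IGP flooding shows the PE is gone.
        Invalidates every path via that next hop and returns the
        (prefix, next_hop) withdraws to send to clients."""
        withdraws = []
        for prefix in sorted(self.paths):
            if next_hop in self.paths[prefix]:
                self.paths[prefix].discard(next_hop)
                withdraws.append((prefix, next_hop))
        # Forget prefixes that have no remaining path.
        self.paths = {p: hops for p, hops in self.paths.items() if hops}
        return withdraws

rr = RouteReflector()
rr.learn("vpn-a:10.0.0.0/24", "PE1")
rr.learn("vpn-a:10.0.0.0/24", "PE2")   # redundant egress
rr.learn("vpn-b:10.9.0.0/16", "PE1")
print(rr.on_next_hop_unreachable("PE1"))
print(rr.paths)
```

No BGP session timer or BFD session is involved: the only input is the IGP
event for the next hop, and prefixes with a redundant egress keep their
remaining path.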



> [WAJ] For SRv6 tunnel services, which layer control? How?
>

Something runs over those tunnels, right? If not, we have no issue anyway, as
there is no service to be protected :).

Thx,
R.


Re: [Lsr] BGP vs PUA/PULSE

2022-01-25 Thread Robert Raszuk
Aijun,

On Tue, Jan 25, 2022 at 10:30 AM Aijun Wang 
wrote:

> Hi, Robert:
>
>
>
> So the main point here is that yes it is highly recommended to use
> summaries across areas. But what's not clear (at least to me) is if we
> really need to signal node liveness in IGP to accomplish the ultimate goal
> of few sec connectivity restoration upon PE failure in the cases of
> redundant egress connectivity.
>
> *[WAJ] I think the goal is same as that we invented the BFD for BGP, or
> BFD for other protocol.  We have discussed several rounds that why we don’t
> want to rely on BFD for the two previously mentioned categories of scenarios.*
>


You are completely missing the point. I never mentioned BFD. BFD is not
needed in either the X or the Y option listed in my note.


Especially as some folks apparently still believe that "BGP is slow" and
> that iBGP def timers of 180 sec are even relevant to the topic. They are
> clearly not.
>
> *[WAJ] What we are discussing is that the “BGP Peer Status Detection(BGP’s
> hello timer)” is slow.  For tunnel services, there is no timers at all.*
>


And the fundamental observation is that “BGP Peer Status Detection(BGP’s
hello timer)” is absolutely irrelevant to both X and Y.

If you and/or others do not understand this basic premise then we have an
issue.

And for tunnel services with no BGP there is another layer controlling the
service. A tunnel all by itself is pretty useless.

Kind regards,
R.


Re: [Lsr] BGP vs PUA/PULSE

2022-01-25 Thread Robert Raszuk
>
> As we’ve been saying for months now, the ordering is:
>
> 1) Leak PE loopbacks
> 2) Pub/Sub
> 3) Carry loopbacks in BGP and recurse
> 4) Multi-hop BFD
> 5) Pulse
> 6) PUA


I would like to actually add to this list two alternatives which some
vendors have been shipping for decades:

X) Withdraw service routes based on the local next hop tracking trigger
(from local RR)
Y) Even Further Aggregate the withdraws as described in X

X alone gives you today at most a few seconds of connectivity disruption - if
that is the overall goal we are heading to, none of 1-6 is required.

Y as described in
https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt   and/or
 https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-nh-cost   can
even further reduce that. #3 above is sped up even further via recursion
native to BGP.

If BGP is not running there, some service does, and that service hopefully
has better convergence characteristics than BGP.

So the main point here is that yes it is highly recommended to use
summaries across areas. But what's not clear (at least to me) is if we
really need to signal node liveness in IGP to accomplish the ultimate goal
of few sec connectivity restoration upon PE failure in the cases of
redundant egress connectivity.

I am perhaps restating the above but trying to look holistically at the
problem. Especially as some folks apparently still believe that "BGP is
slow" and that iBGP def timers of 180 sec are even relevant to the topic.
They are clearly not.

Many thx,
Robert


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Robert Raszuk
Hi Les,

I think using IGP to *discover* some services is perfectly fine.

For example many years ago I proposed to use IGP to automatically discover
BGP route reflectors for the sole purpose of BGP auto discovery. After that,
BGP friends suggested that we would move faster if we did not touch the
IGP, so I moved the proposal to be fully BGP based. However very recently I
see new requirements popping up to also support it with IGP.

I guess for you this would be a "service or an application" and would meet
resistance - understand that opinion.

Here however we are talking about such a tiny piece of information to be added
to the IGP to help with seamless operation that frankly I do not understand your
resistance at all. Modulo such a heavy commitment to PULSE of course,
which this proposal could freeze.

> The service itself doesn’t even need to be running on a router at all.

That is true.

But for the service to be efficient it should at least listen to the local
area IGP to receive LSAs/LSPs.

Now of course discovery of this "server" can be realized out of band (say
by CLI). But it beats me why we would not support auto discovery of such a
service if it happens to run on an IGP node.

What harm would such additional discovery do to the LSDB, CPU, memory,
traffic ???

Kind regards,
Robert

On Mon, Jan 24, 2022 at 10:11 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> Please read more carefully.
>
>
>
> The draft introduces “a protocol(service) that will provide prompt
> notification of changes in node liveness…”
>
> What I am talking about here is NOT the information being sent by the
> service – but rather the service itself. Advertisement of the
> existence/location of that service is not within the purview of the IGP.
>
> That’s all I am saying…
>
>
>
> If you don’t like my use of the word “application” feel free to replace it
> with “service”. Whatever it is, it is not the IGP itself. The IGP hasn’t
> been extended to do anything – in fact that is one of the points of Tony’s
> proposal since he doesn’t think the IGP should be in the business of
> sending node liveness information.
>
> The service itself doesn’t even need to be running on a router at all.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Monday, January 24, 2022 12:33 PM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Tony Li ; lsr 
> *Subject:* Re: [Lsr] New Version Notification for
> draft-li-lsr-liveness-00.txt
>
>
>
> Hi Les,
>
>
>
> > Advertisement of the availability of an application is not within the
> scope of an IGP
>
>
>
> Who proposes that ?
>
>
>
> AFAIK the protocol Tony proposed indicates liveness of an IGP node and
> specifically not any application on that node.
>
>
>
> Thx,
> R.
>
>
>
>
>
>
>
>
>
> On Mon, Jan 24, 2022 at 9:24 PM Les Ginsberg (ginsberg) wrote:
>
> Tony –
>
>
>
> Advertisement of the availability of an application is not within the
> scope of an IGP no matter what level of TLV you use to do so.
>
>
>
> Existing capability advertisements (e.g., flex-algo participation, SR )
> are indicators of what the IGP implementation supports and/or is configured
> to support. Not the same thing as what you are proposing here.
>
>
>
>Les
>
>
>
>
>
> *From:* Tony Li  *On Behalf Of *Tony Li
> *Sent:* Monday, January 24, 2022 12:12 PM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* lsr 
> *Subject:* Re: [Lsr] New Version Notification for
> draft-li-lsr-liveness-00.txt
>
>
>
>
>
> Les,
>
>
>
>
>
> My precedent is the use of Router Capability for advertising FlexAlgo
> definitions.  This is a service being provided by the area and it seems
> equally relevant. Would you prefer a top level TLV?
>
>
>
> *[LES:] Flex Algo is a routing calculation being performed by the IGPs who
> also advertise the algorithm specific attributes and algorithm specific
> forwarding identifiers.*
>
> *I don’t see what you are doing as analogous.*
>
>
>
>
>
> Well, IMHO, I can understand the participation of the router in an algo as
> a capability. The definition of the algo seems to be somewhat orthogonal.
> But it’s there anyway. Similarly, the capability of node liveness is pretty
> clear. Yes, the service access point information is orthogonal.
>
>
>
> You didn’t respond: Would you prefer a top level TLV?  That would be the
> logical alternative.
>
>
>
> Tony
>
>
>

