Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-03-26 Thread Christian Hopps



Peter Psenak  writes:


On 26/01/2022 10:40, Robert Raszuk wrote:

 > The pulse solution does not suffer from the scale issues.
It shifts that "suffering" to flood the entire domain with information which
is not needed on P routers and selectively useful on the remote PEs.


yes, but how much data? Minimal. It's not an issue, no matter how many times you
keep repeating it.


Anything that invokes the update process on every router in an area is not 
"Minimal" impact, no matter how much data is being synchronized as a result. 
I'm prepared to keep repeating this too as I think it will be true each time. :)

Thanks,
Chris.
[as wg member]

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-26 Thread Peter Psenak

Tony,

On 26/01/2022 16:46, Tony Li wrote:


Peter,


The pulse solution does not suffer from the scale issues.

It shifts that "suffering" to flood the entire domain with information which is 
not needed on P routers and selectively useful on the remote PEs.


yes, but how much data? Minimal. It's not an issue, no matter how many times 
you keep repeating it.



You say minimal, but then you have to have a mechanism in place to limit the 
amount that you flood. And as I mentioned previously, you know that some 
customer will turn that up to 11 so that they get more pulses and they will end 
up imploding.


we can put a hard limit there. I don't see the above as a real issue. Any 
realistic case will be covered by a single digit number of concurrent pulses.
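
As a rough illustration of the hard limit mentioned above, here is a minimal sketch. It assumes a hypothetical ABR-side `PulseLimiter` that caps the number of simultaneously active pulses and suppresses anything beyond the cap. The class name, the default of 8, the pulse lifetime, and the suppress-on-overflow behavior are all assumptions made for illustration; none of them come from the drafts under discussion.

```python
import time


class PulseLimiter:
    """Hypothetical ABR-side cap on concurrently active pulses."""

    def __init__(self, max_concurrent=8, lifetime=5.0, clock=time.monotonic):
        self.max_concurrent = max_concurrent  # assumed single-digit hard limit
        self.lifetime = lifetime              # assumed seconds a pulse stays active
        self.clock = clock                    # injectable clock, handy for testing
        self.active = {}                      # prefix -> expiry timestamp

    def _expire(self):
        # Drop pulses whose lifetime has elapsed.
        now = self.clock()
        self.active = {p: t for p, t in self.active.items() if t > now}

    def try_pulse(self, prefix):
        """Return True if a pulse for prefix may be flooded, False if suppressed."""
        self._expire()
        if prefix in self.active:
            return True   # already active, nothing new to flood
        if len(self.active) >= self.max_concurrent:
            return False  # over the hard limit: suppress (assumed behavior)
        self.active[prefix] = self.clock() + self.lifetime
        return True
```

With a cap like this, a mass failure produces at most `max_concurrent` pulses at a time, and the remaining unreachable prefixes fall back to whatever slower mechanism already exists.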




Meanwhile 100k registrations is not a scale issue.  As a former BGP engineer, 
we call that ‘Monday’.  If you still don’t like it, we can go down the path 
that Robert suggested and register for 0/0.  We could even go half way down 
that path and aggregate PEs into their own prefix and register for just that 
prefix.



I don't like the registration idea because it has serious scale 
implications. I don't see a need to put all the registration burden to 
the network (and operator). We can solve the problem without it.






I feel this discussion has reached a point where we keep repeating what has 
been said already several times. No point continuing, unless some new data are 
on the table.



Correct.  Shall we agree to disagree?


we disagree between ourselves.
We need to listen to what others have to say.

Peter




Tony







Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-26 Thread Tony Li

Peter,

>> > The pulse solution does not suffer from the scale issues.
>> It shifts that "suffering" to flood the entire domain with information which 
>> is not needed on P routers and selectively useful on the remote PEs.
> 
> yes, but how much data? Minimal. It's not an issue, no matter how many times 
> you keep repeating it.


You say minimal, but then you have to have a mechanism in place to limit the 
amount that you flood. And as I mentioned previously, you know that some 
customer will turn that up to 11 so that they get more pulses and they will end 
up imploding.

Meanwhile 100k registrations is not a scale issue.  As a former BGP engineer, 
we call that ‘Monday’.  If you still don’t like it, we can go down the path 
that Robert suggested and register for 0/0.  We could even go half way down 
that path and aggregate PEs into their own prefix and register for just that 
prefix.
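
To make the aggregation options above concrete, here is a minimal sketch. It assumes a hypothetical ABR that stores registrations keyed by prefix and notifies every client whose registered prefix covers the address of the node that went down. `RegistrationTable` and its methods are illustrative names only, and this is not the mechanism specified in the draft.

```python
import ipaddress
from collections import defaultdict


class RegistrationTable:
    """Hypothetical ABR-side table mapping registered prefixes to clients."""

    def __init__(self):
        self._by_prefix = defaultdict(set)  # ip_network -> set of client names

    def register(self, client, prefix):
        # One entry per (client, prefix); 0.0.0.0/0 covers every IPv4 node.
        self._by_prefix[ipaddress.ip_network(prefix)].add(client)

    def clients_for(self, node_addr):
        """Return the clients to notify about a liveness change of node_addr."""
        addr = ipaddress.ip_address(node_addr)
        interested = set()
        for net, clients in self._by_prefix.items():
            if addr.version == net.version and addr in net:
                interested |= clients
        return interested


table = RegistrationTable()
table.register("pe1", "0.0.0.0/0")      # register for everything
table.register("pe2", "10.1.0.0/16")    # register for an aggregated PE prefix
table.register("pe3", "10.1.0.7/32")    # register for one specific PE
print(sorted(table.clients_for("10.1.0.7")))
```

In this sketch a single 0/0 or aggregate registration replaces many per-PE registrations, at the cost of the client receiving notifications it may not need.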


> I feel this discussion has reached a point where we keep repeating what has 
> been said already several times. No point continuing, unless some new data 
> are on the table.


Correct.  Shall we agree to disagree?

Tony




Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-26 Thread Peter Psenak

On 26/01/2022 10:40, Robert Raszuk wrote:


 > The pulse solution does not suffer from the scale issues.

It shifts that "suffering" to flood the entire domain with information 
which is not needed on P routers and selectively useful on the remote PEs.


yes, but how much data? Minimal. It's not an issue, no matter how many 
times you keep repeating it.




Also fast signaling the fact that PE may have been disconnected from the 
network for a few seconds may be actually more harmful to the actual 
applications running behind it.


For single homed sites this is disaster as after next hop invalidation 
you are stuck for the timeout (as discussed about 200 sec) before we 
connect again.


above is not the case. For single home services, pulse will not result 
in any action.





For dual homed sites such switchover to a backup PE may result with 
switchover to a backup CE (where PE-CE signaling is dynamic) where lots 
of networks uses outbound NAT. While all cool from the perspective of 
the WAN side - the NAT pool switchover means that application TCP 
sessions are reset. What may mean real long service disruption for the 
customers apps (especially those running long lived sessions).


above applies to regular BGP PIC used without summarization. There's 
nothing specific to Pulse.


I feel this discussion has reached a point where we keep repeating what 
has been said already several times. No point continuing, unless some 
new data are on the table.


Peter



The reason I mention this here is that whatever we do we should always 
take end to end user application analysis into account.


Thx,
R.










On Wed, Jan 26, 2022 at 10:20 AM Peter Psenak wrote:


Tony,

On 25/01/2022 17:11, Tony Li wrote:
 >
 >
 > Peter,
 >
 >> we just moved the problem from IGPs to some "other" application.
 >
 >
 > That was the entire point. Hopefully, you see that as a good thing.

actually I don't. I want to solve the problem, not to move it to other
app running on the same nodes.


The pulse solution does not suffer from the scale issues. With the limit
of number of concurrent pulses on ABR it also addresses the catastrophic
failure scenario you were worried about.

thanks,
Peter
 >
 > Tony
 >



Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-26 Thread Robert Raszuk
> The pulse solution does not suffer from the scale issues.

It shifts that "suffering" to flood the entire domain with information
which is not needed on P routers and selectively useful on the remote PEs.

Also fast signaling the fact that PE may have been disconnected from the
network for a few seconds may be actually more harmful to the actual
applications running behind it.

For single homed sites this is disaster as after next hop invalidation you
are stuck for the timeout (as discussed about 200 sec) before we connect
again.

For dual homed sites such switchover to a backup PE may result with
switchover to a backup CE (where PE-CE signaling is dynamic) where lots of
networks uses outbound NAT. While all cool from the perspective of the WAN
side - the NAT pool switchover means that application TCP sessions are
reset. What may mean real long service disruption for the customers apps
(especially those running long lived sessions).

The reason I mention this here is that whatever we do we should always take
end to end user application analysis into account.

Thx,
R.










On Wed, Jan 26, 2022 at 10:20 AM Peter Psenak  wrote:

> Tony,
>
> On 25/01/2022 17:11, Tony Li wrote:
> >
> >
> > Peter,
> >
> >> we just moved the problem from IGPs to some "other" application.
> >
> >
> > That was the entire point. Hopefully, you see that as a good thing.
>
> actually I don't. I want to solve the problem, not to move it to other
> app running on the same nodes.
>
>
> The pulse solution does not suffer from the scale issues. With the limit
> of number of concurrent pulses on ABR it also addresses the catastrophic
> failure scenario you were worried about.
>
> thanks,
> Peter
> >
> > Tony
> >
>


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-26 Thread Peter Psenak

Tony,

On 25/01/2022 17:11, Tony Li wrote:



Peter,


we just moved the problem from IGPs to some "other" application.



That was the entire point. Hopefully, you see that as a good thing.


actually I don't. I want to solve the problem, not to move it to other 
app running on the same nodes.



The pulse solution does not suffer from the scale issues. With the limit 
of number of concurrent pulses on ABR it also addresses the catastrophic 
failure scenario you were worried about.


thanks,
Peter


Tony





Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-25 Thread Robert Raszuk
Hi Tony,

> If a given PE needs to get all notifications from all other PEs it is
> sufficient that it sends to local ABRs a single registration in a form of
> 0.0.0.0/0 and be done.
>
>
> If you look a bit more carefully, you will find that registering for 0/0
> doesn’t work without a bit more smartness in the ABR. It’s doable, but not
> yet in the text.
>

I took it as an implementation detail.


> As it stands right now, the PEs COULD register for the summaries of the
> other areas.  That would decrease the number of registrations. How the PE
> learns what those summaries are is currently magic.
>

Well, pretty easy magic ... a local service route next-hop lookup in the RIB
should easily yield the answer.
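
A minimal sketch of that lookup, assuming the RIB can be treated as a plain list of prefix strings: take the next hop of a service route and find the longest covering prefix, which would be the summary to register for. The function name `covering_route` and the flat-list RIB are assumptions for illustration; a real RIB would use a trie and carry far more state.

```python
import ipaddress


def covering_route(rib, next_hop):
    """Return the longest prefix in rib that covers next_hop, or None."""
    addr = ipaddress.ip_address(next_hop)
    candidates = []
    for prefix in rib:
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net:
            candidates.append(net)
    # Longest-prefix match: the most specific covering route wins.
    return max(candidates, key=lambda n: n.prefixlen, default=None)


# Hypothetical RIB of a PE: a default route plus summaries from two remote areas.
rib = ["0.0.0.0/0", "10.1.0.0/16", "10.2.0.0/16"]
print(covering_route(rib, "10.2.3.4"))
```

If the only covering route is the default, the lookup degenerates to the 0/0 registration discussed above.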


> At the end of the day, I don’t think that scale discussions will resolve
> this. In fact, if the cost of deployment was actually zero, I doubt that we
> would still see any progress in this conversation.
>

Unfortunately I do agree with this assessment.

Thx,
R.


> How do we depolarize this?
>
> Tony
>
>


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-25 Thread Tony Li

Robert,

> If a given PE needs to get all notifications from all other PEs it is 
> sufficient that it sends to local ABRs a single registration in a form of 
> 0.0.0.0/0 and be done.

If you look a bit more carefully, you will find that registering for 0/0 
doesn’t work without a bit more smartness in the ABR. It’s doable, but not yet 
in the text.

As it stands right now, the PEs COULD register for the summaries of the other 
areas.  That would decrease the number of registrations. How the PE learns what 
those summaries are is currently magic.

At the end of the day, I don’t think that scale discussions will resolve this. 
In fact, if the cost of deployment was actually zero, I doubt that we would 
still see any progress in this conversation.

How do we depolarize this?

Tony




Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-25 Thread Tony Li


Peter,

> we just moved the problem from IGPs to some "other" application.


That was the entire point. Hopefully, you see that as a good thing.

Tony



Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Robert Raszuk
Hi Les,

I think using IGP to *discover* some services is perfectly fine.

For example, many years ago I proposed to use IGP to automatically discover
BGP route reflectors for the sole purpose of BGP auto discovery. After that,
BGP friends suggested that we would move faster if we did not touch the
IGP, so I moved the proposal to be fully BGP based. However, very recently I
see new requirements popping up to also support it with IGP.

I guess for you this would be a "service or an application" and would meet
resistance - understand that opinion.

Here, however, we are talking about such a tiny piece of information to be
added to the IGP to help with seamless operation that frankly I do not
understand your resistance at all. Modulo the heavy commitment to PULSE, of
course, which this proposal could freeze.

> The service itself doesn’t even need to be running on a router at all.

That is true.

But for the service to be efficient it should at least listen to the local
area IGP to listen to LSAs/LSPs.

Now of course discovery of this "server" can be realized out of band (say
by CLI). But it beats me why we would not support auto discovery of such a
service if it happens to run on an IGP node.

What harm would such additional discovery do to the LSDB, CPU, memory,
traffic ???

Kind regards,
Robert

On Mon, Jan 24, 2022 at 10:11 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> Please read more carefully.
>
>
>
> The draft introduces “a protocol(service) that will provide prompt
> notification of changes in node liveness…”
>
> What I am talking about here is NOT the information being sent by the
> service – but rather the service itself. Advertisement of the
> existence/location of that service is not within the purview of the IGP.
>
> That’s all I am saying…
>
>
>
> If you don’t like my use of the word “application” feel free to replace it
> with “service”. Whatever it is, it is not the IGP itself. The IGP hasn’t
> been extended to do anything – in fact that is one of the points of Tony’s
> proposal since he doesn’t think the IGP should be in the business of
> sending node liveness information.
>
> The service itself doesn’t even need to be running on a router at all.
>
>
>
>    Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Monday, January 24, 2022 12:33 PM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* Tony Li ; lsr 
> *Subject:* Re: [Lsr] New Version Notification for
> draft-li-lsr-liveness-00.txt
>
>
>
> Hi Les,
>
>
>
> > Advertisement of the availability of an application is not within the
> scope of an IGP
>
>
>
> Who proposes that ?
>
>
>
> AFAIK the protocol Tony proposed indicates liveness of an IGP node and
> specifically not any application on that node.
>
>
>
> Thx,
> R.
>
>
>
>
>
>
>
>
>
> On Mon, Jan 24, 2022 at 9:24 PM Les Ginsberg (ginsberg) wrote:
>
> Tony –
>
>
>
> Advertisement of the availability of an application is not within the
> scope of an IGP no matter what level of TLV you use to do so.
>
>
>
> Existing capability advertisements (e.g., flex-algo participation, SR )
> are indicators of what the IGP implementation supports and/or is configured
> to support. Not the same thing as what you are proposing here.
>
>
>
>    Les
>
>
>
>
>
> *From:* Tony Li  *On Behalf Of *Tony Li
> *Sent:* Monday, January 24, 2022 12:12 PM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* lsr 
> *Subject:* Re: [Lsr] New Version Notification for
> draft-li-lsr-liveness-00.txt
>
>
>
>
>
> Les,
>
>
>
>
>
> My precedent is the use of Router Capability for advertising FlexAlgo
> definitions.  This is a service being provided by the area and it seems
> equally relevant. Would you prefer a top level TLV?
>
>
>
> *[LES:] Flex Algo is a routing calculation being performed by the IGPs who
> also advertise the algorithm specific attributes and algorithm specific
> forwarding identifiers.*
>
> *I don’t see what you are doing as analogous.*
>
>
>
>
>
> Well, IMHO, I can understand the participation of the router in an algo as
> a capability. The definition of the algo seems to be somewhat orthogonal.
> But it’s there anyway. Similarly, the capability of node liveness is pretty
> clear. Yes, the service access point information is orthogonal.
>
>
>
> You didn’t respond: Would you prefer a top level TLV?  That would be the
> logical alternative.
>
>
>
> Tony
>
>
>


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Les Ginsberg (ginsberg)
Robert –

Please read more carefully.

The draft introduces “a protocol(service) that will provide prompt notification 
of changes in node liveness…”
What I am talking about here is NOT the information being sent by the service – 
but rather the service itself. Advertisement of the existence/location of that 
service is not within the purview of the IGP.
That’s all I am saying…

If you don’t like my use of the word “application” feel free to replace it with 
“service”. Whatever it is, it is not the IGP itself. The IGP hasn’t been 
extended to do anything – in fact that is one of the points of Tony’s proposal 
since he doesn’t think the IGP should be in the business of sending node 
liveness information.
The service itself doesn’t even need to be running on a router at all.

   Les


From: Robert Raszuk 
Sent: Monday, January 24, 2022 12:33 PM
To: Les Ginsberg (ginsberg) 
Cc: Tony Li ; lsr 
Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

Hi Les,

> Advertisement of the availability of an application is not within the scope 
> of an IGP

Who proposes that ?

AFAIK the protocol Tony proposed indicates liveness of an IGP node and specifically 
not any application on that node.

Thx,
R.




On Mon, Jan 24, 2022 at 9:24 PM Les Ginsberg (ginsberg) wrote:
Tony –

Advertisement of the availability of an application is not within the scope of 
an IGP no matter what level of TLV you use to do so.

Existing capability advertisements (e.g., flex-algo participation, SR ) are 
indicators of what the IGP implementation supports and/or is configured to 
support. Not the same thing as what you are proposing here.

   Les


From: Tony Li On Behalf Of Tony Li
Sent: Monday, January 24, 2022 12:12 PM
To: Les Ginsberg (ginsberg)
Cc: lsr
Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt


Les,


My precedent is the use of Router Capability for advertising FlexAlgo definitions. 
 This is a service being provided by the area and it seems equally relevant. 
Would you prefer a top level TLV?

[LES:] Flex Algo is a routing calculation being performed by the IGPs who also 
advertise the algorithm specific attributes and algorithm specific forwarding 
identifiers.
I don’t see what you are doing as analogous.


Well, IMHO, I can understand the participation of the router in an algo as a 
capability. The definition of the algo seems to be somewhat orthogonal. But 
it’s there anyway. Similarly, the capability of node liveness is pretty clear. 
Yes, the service access point information is orthogonal.

You didn’t respond: Would you prefer a top level TLV?  That would be the logical 
alternative.

Tony



Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Robert Raszuk
Hi Les,

> Advertisement of the availability of an application is not within the
> scope of an IGP

Who proposes that ?

AFAIK the protocol Tony proposed indicates liveness of an IGP node and
specifically not any application on that node.

Thx,
R.




On Mon, Jan 24, 2022 at 9:24 PM Les Ginsberg (ginsberg)  wrote:

> Tony –
>
>
>
> Advertisement of the availability of an application is not within the
> scope of an IGP no matter what level of TLV you use to do so.
>
>
>
> Existing capability advertisements (e.g., flex-algo participation, SR )
> are indicators of what the IGP implementation supports and/or is configured
> to support. Not the same thing as what you are proposing here.
>
>
>
>    Les
>
>
>
>
>
> *From:* Tony Li  *On Behalf Of *Tony Li
> *Sent:* Monday, January 24, 2022 12:12 PM
> *To:* Les Ginsberg (ginsberg) 
> *Cc:* lsr 
> *Subject:* Re: [Lsr] New Version Notification for
> draft-li-lsr-liveness-00.txt
>
>
>
>
>
> Les,
>
>
>
>
>
> My precedent is the use of Router Capability for advertising FlexAlgo
> definitions.  This is a service being provided by the area and it seems
> equally relevant. Would you prefer a top level TLV?
>
>
>
> *[LES:] Flex Algo is a routing calculation being performed by the IGPs who
> also advertise the algorithm specific attributes and algorithm specific
> forwarding identifiers.*
>
> *I don’t see what you are doing as analogous.*
>
>
>
>
>
> Well, IMHO, I can understand the participation of the router in an algo as
> a capability. The definition of the algo seems to be somewhat orthogonal.
> But it’s there anyway. Similarly, the capability of node liveness is pretty
> clear. Yes, the service access point information is orthogonal.
>
>
>
> You didn’t respond: Would you prefer a top level TLV?  That would be the
> logical alternative.
>
>
>
> Tony
>
>


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Les Ginsberg (ginsberg)
Tony –

Advertisement of the availability of an application is not within the scope of 
an IGP no matter what level of TLV you use to do so.

Existing capability advertisements (e.g., flex-algo participation, SR ) are 
indicators of what the IGP implementation supports and/or is configured to 
support. Not the same thing as what you are proposing here.

   Les


From: Tony Li  On Behalf Of Tony Li
Sent: Monday, January 24, 2022 12:12 PM
To: Les Ginsberg (ginsberg) 
Cc: lsr 
Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt


Les,



My precedent is the use of Router Capability for advertising FlexAlgo definitions. 
 This is a service being provided by the area and it seems equally relevant. 
Would you prefer a top level TLV?

[LES:] Flex Algo is a routing calculation being performed by the IGPs who also 
advertise the algorithm specific attributes and algorithm specific forwarding 
identifiers.
I don’t see what you are doing as analogous.


Well, IMHO, I can understand the participation of the router in an algo as a 
capability. The definition of the algo seems to be somewhat orthogonal. But 
it’s there anyway. Similarly, the capability of node liveness is pretty clear. 
Yes, the service access point information is orthogonal.

You didn’t respond: Would you prefer a top level TLV?  That would be the logical 
alternative.

Tony



Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Tony Li

Les,


> My precedent is the use of Router Capability for advertising FlexAlgo 
> definitions.  This is a service being provided by the area and it seems 
> equally relevant. Would you prefer a top level TLV?
>  
> [LES:] Flex Algo is a routing calculation being performed by the IGPs who 
> also advertise the algorithm specific attributes and algorithm specific 
> forwarding identifiers.
> I don’t see what you are doing as analogous.



Well, IMHO, I can understand the participation of the router in an algo as a 
capability. The definition of the algo seems to be somewhat orthogonal. But 
it’s there anyway. Similarly, the capability of node liveness is pretty clear. 
Yes, the service access point information is orthogonal.

You didn’t respond: Would you prefer a top level TLV?  That would be the logical 
alternative.

Tony



Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Les Ginsberg (ginsberg)
Tony –

Inline.

From: Tony Li  On Behalf Of Tony Li
Sent: Monday, January 24, 2022 10:15 AM
To: Les Ginsberg (ginsberg) 
Cc: lsr 
Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt


Hi Les,

Thank you for commenting.


I am not enthused about this solution. Full mesh isn’t appealing at scale. But 
I recognize this as an alternative which some might find useful in some 
deployments.


Is it clear that the full mesh is only at the ABR level?


[LES:] Yes


I also understand why and find it appropriate that you have introduced 
discussion of this alternative in LSR. But ultimately – as others have pointed 
out – this work does not belong in LSR.


As soon as the other solutions are withdrawn, I will be happy to go elsewhere. 
Tho I know not where as the reliance on the IGP will cause all others to 
disavow this as well. :)



Finally, I object to the use of IGP Router Capability advertisements as the 
vehicle for advertising the availability of the service. There are examples 
today of applications which monitor the IGP LSDB in order to provide value add 
– and they often execute on nodes not actively participating in IGP routing. 
While running such a service on ABRs is certainly one alternative, it is not 
the only one.  I do not want – nor do I find it appropriate – for Router 
Capability to be used as a form of DNS for such applications. Please find 
another means to advertise the availability of the service.


My precedent is the use of Router Capability for advertising FlexAlgo definitions. 
 This is a service being provided by the area and it seems equally relevant. 
Would you prefer a top level TLV?

[LES:] Flex Algo is a routing calculation being performed by the IGPs who also 
advertise the algorithm specific attributes and algorithm specific forwarding 
identifiers.
I don’t see what you are doing as analogous.

   Les

The service is inexorably tied to the IGP to determine node liveness, so at 
least monitoring the IGP is a necessity. You’re absolutely correct, that this 
need not happen directly on ABRs. Certainly another IGP listener could provide 
this service.

Tony









Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Tony Li

Hi Les,

Thank you for commenting.


> I am not enthused about this solution. Full mesh isn’t appealing at scale. 
> But I recognize this as an alternative which some might find useful in some 
> deployments.


Is it clear that the full mesh is only at the ABR level?


> I also understand why and find it appropriate that you have introduced 
> discussion of this alternative in LSR. But ultimately – as others have 
> pointed out – this work does not belong in LSR.


As soon as the other solutions are withdrawn, I will be happy to go elsewhere. 
Tho I know not where as the reliance on the IGP will cause all others to 
disavow this as well. :)


> Finally, I object to the use of IGP Router Capability advertisements as the 
> vehicle for advertising the availability of the service. There are examples 
> today of applications which monitor the IGP LSDB in order to provide value 
> add – and they often execute on nodes not actively participating in IGP 
> routing. While running such a service on ABRs is certainly one alternative, 
> it is not the only one.  I do not want – nor do I find it appropriate – for 
> Router Capability to be used as a form of DNS for such applications. Please 
> find another means to advertise the availability of the service.


My precedent is the use of Router Capability for advertising FlexAlgo definitions. 
 This is a service being provided by the area and it seems equally relevant. 
Would you prefer a top level TLV?

The service is inexorably tied to the IGP to determine node liveness, so at 
least monitoring the IGP is a necessity. You’re absolutely correct, that this 
need not happen directly on ABRs. Certainly another IGP listener could provide 
this service.

Tony









Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Tony Li

Indeed. If you have unreliable registrations, and a registration is lost, then 
subsequent notifications would be lost as well. The entire service could yield 
no results.  What is the use case?
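
The difference can be sketched as follows. This assumes a hypothetical lossy channel modeled as an iterator of booleans, where `True` means the registration arrived and was acknowledged. Both function names and the retry budget are assumptions for illustration only; they do not describe any transport from the draft.

```python
def fire_and_forget(deliveries):
    """Unreliable registration: send once, with no acknowledgement or retry.

    Returns whether the registration actually arrived. The sender itself
    has no way to learn this value.
    """
    return next(deliveries)


def retry_until_acked(deliveries, max_attempts=5):
    """Reliable registration: resend until acknowledged.

    Returns the number of attempts used, or None if the retry budget ran out,
    in which case the sender at least knows the registration failed.
    """
    for attempt in range(1, max_attempts + 1):
        if next(deliveries):
            return attempt
    return None


# First datagram lost, second one delivered.
print(fire_and_forget(iter([False, True])))    # registration silently lost
print(retry_until_acked(iter([False, True])))  # succeeds on the second attempt
```

In the unreliable case every later notification for that client is lost too, and neither side is aware of it.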

On the bright side, you can always claim that the service is already deployed 
and you just happened to lose all registrations. :)

T


> On Jan 24, 2022, at 9:30 AM, Robert Raszuk  wrote:
> 
> 
> Are you sure this is operationally a good idea ? 
> 
> It's cool when registrations and up notifications will not get lost. But I 
> would not like to be the one troubleshooting a network when some 
> registrations or up notifications packets get lost ... It sounds like a 
> nightmare to me. 
> 
> Best,
> R.
> 
> On Mon, Jan 24, 2022 at 6:25 PM Greg Mirsky wrote:
> Frankly, I don't see that registrations, at least for the node liveness use 
> case, require using reliable transport. If the registration is lost, the 
> faster convergence doesn't work for that node but the existing slower 
> mechanism still gets the job done. I want to note that I'm not proposing 
> replacing any of the transport options listed in the document but adding 
> optional unreliable transport.
> 
> Regards,
> Greg
> 
> On Mon, Jan 24, 2022 at 9:16 AM Tony Li wrote:
> Hi Greg,
> 
> 
> > thank you for your responses to my notes. I should have been more clear in 
> > explaining the rationale for adding the UDP transport option to the list. 
> > Reliability comes at a cost. If the system already has a mechanism that 
> > guarantees convergence a faster, lightweight though not reliable mechanism 
> > seems like a reasonable model. 
> 
> 
> Ok, yes, theoretically, notifications do not strictly require reliable 
> delivery and thus could be done with another mechanism. However, 
> registrations MUST be done reliably. Supporting two separate simultaneous 
> transports seems expensive and painful.
> 
> Tony
> 



Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Greg Mirsky
Yours is a good operational point, Robert. I wonder what others might think.

Regards,
Greg

On Mon, Jan 24, 2022 at 9:30 AM Robert Raszuk  wrote:

>
> Are you sure this is operationally a good idea ?
>
> It's cool when registrations and up notifications will not get lost. But I
> would not like to be the one troubleshooting a network when some
> registrations or up notifications packets get lost ... It sounds like a
> nightmare to me.
>
> Best,
> R.
>
> On Mon, Jan 24, 2022 at 6:25 PM Greg Mirsky  wrote:
>
>> Frankly, I don't see that registrations, at least for the node liveness
>> use case, require using reliable transport. If the registration is lost,
>> the faster convergence doesn't work for that node but the existing slower
>> mechanism still gets the job done. I want to note that I'm not proposing
>> replacing any of the transport options listed in the document but adding
>> optional unreliable transport.
>>
>> Regards,
>> Greg
>>
>> On Mon, Jan 24, 2022 at 9:16 AM Tony Li  wrote:
>>
>>> Hi Greg,
>>>
>>>
>>> > thank you for your responses to my notes. I should have been more
>>> clear in explaining the rationale for adding the UDP transport option to
>>> the list. Reliability comes at a cost. If the system already has a
>>> mechanism that guarantees convergence a faster, lightweight though not
>>> reliable mechanism seems like a reasonable model.
>>>
>>>
>>> Ok, yes, theoretically, notifications do not strictly require reliable
>>> delivery and thus could be done with another mechanism. However,
>>> registrations MUST be done reliably. Supporting two separate simultaneous
>>> transports seems expensive and painful.
>>>
>>> Tony
>>>
>>>


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Robert Raszuk
Are you sure this is operationally a good idea ?

It's cool when registrations and up notifications will not get lost. But I
would not like to be the one troubleshooting a network when some
registrations or up notifications packets get lost ... It sounds like a
nightmare to me.

Best,
R.

On Mon, Jan 24, 2022 at 6:25 PM Greg Mirsky  wrote:

> Frankly, I don't see that registrations, at least for the node liveness
> use case, require using reliable transport. If the registration is lost,
> the faster convergence doesn't work for that node but the existing slower
> mechanism still does the job. I want to note that I'm not proposing
> replacing any of the transport options listed in the document but adding
> optional unreliable transport.
>
> Regards,
> Greg
>
> On Mon, Jan 24, 2022 at 9:16 AM Tony Li  wrote:
>
>> Hi Greg,
>>
>>
>> > thank you for your responses to my notes. I should have been more clear
>> in explaining the rationale for adding the UDP transport option to the
>> list. Reliability comes at a cost. If the system already has a mechanism
>> that guarantees convergence a faster, lightweight though not reliable
>> mechanism seems like a reasonable model.
>>
>>
>> Ok, yes, theoretically, notifications do not strictly require reliable
>> delivery and thus could be done with another mechanism. However,
>> registrations MUST be done reliably. Supporting two separate simultaneous
>> transports seems expensive and painful.
>>
>> Tony
>>
>>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Greg Mirsky
Frankly, I don't see that registrations, at least for the node liveness use
case, require using reliable transport. If the registration is lost, the
faster convergence doesn't work for that node but the existing slower
mechanism still does the job. I want to note that I'm not proposing
replacing any of the transport options listed in the document but adding
optional unreliable transport.

Regards,
Greg

On Mon, Jan 24, 2022 at 9:16 AM Tony Li  wrote:

> Hi Greg,
>
>
> > thank you for your responses to my notes. I should have been more clear
> in explaining the rationale for adding the UDP transport option to the
> list. Reliability comes at a cost. If the system already has a mechanism
> that guarantees convergence a faster, lightweight though not reliable
> mechanism seems like a reasonable model.
>
>
> Ok, yes, theoretically, notifications do not strictly require reliable
> delivery and thus could be done with another mechanism. However,
> registrations MUST be done reliably. Supporting two separate simultaneous
> transports seems expensive and painful.
>
> Tony
>
>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Greg Mirsky
Hi Robert,
thank you for clearly stating the question I was implying - Is the reliable
distribution of notifications a requirement or recommendation? I think it
is the latter. In some scenarios and use cases, for example, when the
pub-sub provides the critical service that doesn't have a backup mechanism,
using the reliable distribution is required.

Regards,
Greg

On Mon, Jan 24, 2022 at 9:09 AM Robert Raszuk  wrote:

> Hi Greg,
>
> Granted you are correct, but only if we consider p2p distribution and low
> level triggers. You are also correct if we would just continue to use
> PULSE style. But Tony's proposal is fundamentally different. It does not
> send PULSE and forget. It distributes node liveness both down and up
> state.
>
> So if we however consider a real network with multi hop distribution and
> we consider that the use case here is to distribute this extra state on a
> pub-sub basis then reliable delivery is a must.
>
> So why not reuse what's out there already instead of building our own ?
>
> Thx a lot,
> R.
>
> On Mon, Jan 24, 2022 at 6:01 PM Greg Mirsky  wrote:
>
>> Hi Tony and Robert,
>> thank you for your responses to my notes. I should have been more clear
>> in explaining the rationale for adding the UDP transport option to the
>> list. Reliability comes at a cost. If the system already has a mechanism
>> that guarantees convergence a faster, lightweight though not reliable
>> mechanism seems like a reasonable model.
>>
>> Regards,
>> Greg
>>
>> On Mon, Jan 24, 2022 at 7:38 AM Tony Li  wrote:
>>
>>>
>>> Hi Greg,
>>>
>>> > I got to think about the benefits of using reliable transport for
>>> notifications. If I understand the use case correctly, the proposed
>>> mechanism allows for a faster convergence but is supplemental to slower BGP
>>> convergence. If that is the case, it seems that if a subscriber to the
>>> service missed a notification, the situation would not be worse than it is
>>> today as that node will catch up with "luckier" neighbors eventually. I
>>> think that a discussion of adding the UDP transport to the list of
>>> transport options is a reasonable proposition and might add a useful model
>>> to the proposed mechanism.
>>>
>>>
>>> The great benefit of using a reliable transport is that there is no need
>>> to build reliability into the protocol. Thus, there is no such thing as a
>>> ‘missed’ notification. If a client has registered for a prefix, it will get
>>> a notification. Yes, it may be delayed. Moving the transport to a reliable
>>> one will not change that. Delay happens for a reason. Those reasons will
>>> not change with a different transport. Going to a purely UDP transport and
>>> building reliability on top of it would be simply recreating the transport
>>> protocol, which would be redundant.
>>>
>>> That said, if you have a burning need to send UDP packets, it should be
>>> noted that QUIC is already providing a reliable transport on top of UDP, so
>>> you could opt for that in your implementation.
>>>
>>> Tony
>>>
>>>
>>>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Tony Li
Hi Greg,


> thank you for your responses to my notes. I should have been more clear in 
> explaining the rationale for adding the UDP transport option to the list. 
> Reliability comes at a cost. If the system already has a mechanism that 
> guarantees convergence a faster, lightweight though not reliable mechanism 
> seems like a reasonable model. 


Ok, yes, theoretically, notifications do not strictly require reliable delivery 
and thus could be done with another mechanism. However, registrations MUST be 
done reliably. Supporting two separate simultaneous transports seems expensive 
and painful.

Tony

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Robert Raszuk
Hi Greg,

Granted you are correct, but only if we consider p2p distribution and low
level triggers. You are also correct if we would just continue to use
PULSE style. But Tony's proposal is fundamentally different. It does not
send PULSE and forget. It distributes node liveness both down and up
state.

So if we however consider a real network with multi hop distribution and we
consider that the use case here is to distribute this extra state on a
pub-sub basis then reliable delivery is a must.

So why not reuse what's out there already instead of building our own ?

Thx a lot,
R.

On Mon, Jan 24, 2022 at 6:01 PM Greg Mirsky  wrote:

> Hi Tony and Robert,
> thank you for your responses to my notes. I should have been more clear in
> explaining the rationale for adding the UDP transport option to the list.
> Reliability comes at a cost. If the system already has a mechanism that
> guarantees convergence a faster, lightweight though not reliable mechanism
> seems like a reasonable model.
>
> Regards,
> Greg
>
> On Mon, Jan 24, 2022 at 7:38 AM Tony Li  wrote:
>
>>
>> Hi Greg,
>>
>> > I got to think about the benefits of using reliable transport for
>> notifications. If I understand the use case correctly, the proposed
>> mechanism allows for a faster convergence but is supplemental to slower BGP
>> convergence. If that is the case, it seems that if a subscriber to the
>> service missed a notification, the situation would not be worse than it is
>> today as that node will catch up with "luckier" neighbors eventually. I
>> think that a discussion of adding the UDP transport to the list of
>> transport options is a reasonable proposition and might add a useful model
>> to the proposed mechanism.
>>
>>
>> The great benefit of using a reliable transport is that there is no need
>> to build reliability into the protocol. Thus, there is no such thing as a
>> ‘missed’ notification. If a client has registered for a prefix, it will get
>> a notification. Yes, it may be delayed. Moving the transport to a reliable
>> one will not change that. Delay happens for a reason. Those reasons will
>> not change with a different transport. Going to a purely UDP transport and
>> building reliability on top of it would be simply recreating the transport
>> protocol, which would be redundant.
>>
>> That said, if you have a burning need to send UDP packets, it should be
>> noted that QUIC is already providing a reliable transport on top of UDP, so
>> you could opt for that in your implementation.
>>
>> Tony
>>
>>
>>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Greg Mirsky
Hi Tony and Robert,
thank you for your responses to my notes. I should have been more clear in
explaining the rationale for adding the UDP transport option to the list.
Reliability comes at a cost. If the system already has a mechanism that
guarantees convergence, a faster, lightweight though not reliable mechanism
seems like a reasonable model.

Regards,
Greg

On Mon, Jan 24, 2022 at 7:38 AM Tony Li  wrote:

>
> Hi Greg,
>
> > I got to think about the benefits of using reliable transport for
> notifications. If I understand the use case correctly, the proposed
> mechanism allows for a faster convergence but is supplemental to slower BGP
> convergence. If that is the case, it seems that if a subscriber to the
> service missed a notification, the situation would not be worse than it is
> today as that node will catch up with "luckier" neighbors eventually. I
> think that a discussion of adding the UDP transport to the list of
> transport options is a reasonable proposition and might add a useful model
> to the proposed mechanism.
>
>
> The great benefit of using a reliable transport is that there is no need
> to build reliability into the protocol. Thus, there is no such thing as a
> ‘missed’ notification. If a client has registered for a prefix, it will get
> a notification. Yes, it may be delayed. Moving the transport to a reliable
> one will not change that. Delay happens for a reason. Those reasons will
> not change with a different transport. Going to a purely UDP transport and
> building reliability on top of it would be simply recreating the transport
> protocol, which would be redundant.
>
> That said, if you have a burning need to send UDP packets, it should be
> noted that QUIC is already providing a reliable transport on top of UDP, so
> you could opt for that in your implementation.
>
> Tony
>
>
>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Tony Li

Hi Greg,

> I got to think about the benefits of using reliable transport for 
> notifications. If I understand the use case correctly, the proposed mechanism 
> allows for a faster convergence but is supplemental to slower BGP 
> convergence. If that is the case, it seems that if a subscriber to the 
> service missed a notification, the situation would not be worse than it is 
> today as that node will catch up with "luckier" neighbors eventually. I think 
> that a discussion of adding the UDP transport to the list of transport 
> options is a reasonable proposition and might add a useful model to the 
> proposed mechanism.


The great benefit of using a reliable transport is that there is no need to 
build reliability into the protocol. Thus, there is no such thing as a ‘missed’ 
notification. If a client has registered for a prefix, it will get a 
notification. Yes, it may be delayed. Moving the transport to a reliable one 
will not change that. Delay happens for a reason. Those reasons will not change 
with a different transport. Going to a purely UDP transport and building 
reliability on top of it would be simply recreating the transport protocol, 
which would be redundant.

That said, if you have a burning need to send UDP packets, it should be noted 
that QUIC is already providing a reliable transport on top of UDP, so you could 
opt for that in your implementation.

Tony


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Tony Li

Hi Gyan,

> So basically the event notification process done in PUAM / Pulse inband, here 
> we are using an out-of-band transport protocol QUIC / TCP for the Publisher 
> RDB registration DB service on the ABRs which advertises the node liveliness 
> service into the L1 and L2 areas for subscriber client internal routers 
> within the area to use the service. 


“Out-of-band” implies that the signaling is being done using a separate 
communications channel, such as a different physical link. That would be 
inaccurate. It would be more accurate to say that it’s carrying information 
derived from the IGP in another protocol.


> Each subscriber internal router creates an active/active QUIC/TCP session to 
> two ABRs minimum for redundancy for the liveliness service.  Since the 
> clients have two Active / Active sessions, there would be duplicate 
> copies of each prefix. 


More accurately, there will be multiple registrations for a prefix. In a sane 
implementation, the prefixes are stored in something like a radix trie, with a 
structure below the prefix to hold the registrations. That could itself be a 
tree or hash table. But these are implementation nits...
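A minimal sketch of that layout, assuming a plain binary trie over IPv4 prefix bits rather than a compressed radix trie, and a Python set as the per-prefix registration holder. The names RegistrationTrie, register and registrations_for are hypothetical and do not come from the draft.

```python
import ipaddress


class TrieNode:
    __slots__ = ("children", "registrations")

    def __init__(self):
        self.children = [None, None]   # child for bit 0 and bit 1
        self.registrations = set()     # connection ids registered for this exact prefix


class RegistrationTrie:
    """Hypothetical IPv4-only prefix trie; one node per prefix bit."""

    def __init__(self):
        self.root = TrieNode()

    @staticmethod
    def _bits(prefix):
        net = ipaddress.ip_network(prefix)
        value = int(net.network_address)
        width = net.max_prefixlen
        # Most significant bit first, only as many bits as the prefix length.
        return [(value >> (width - 1 - i)) & 1 for i in range(net.prefixlen)]

    def register(self, prefix, conn_id):
        node = self.root
        for bit in self._bits(prefix):
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.registrations.add(conn_id)

    def registrations_for(self, prefix):
        node = self.root
        for bit in self._bits(prefix):
            node = node.children[bit]
            if node is None:
                return set()
        return set(node.registrations)
```

Two sessions registering for the same /32 land in the same node, so the prefix is stored once and only the registration set grows.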


> Do you have a sequence # to keep track as well as age of prefixes. 


No, this is not necessary. You’ll note that BGP’s BRIB has no sequence 
numbering nor is age necessary for the protocol (tho it’s sometimes included 
for management purposes).


> This is almost like building another LSDB like DB on the ABRs. 


It’s certainly another data store. It is likely to be far smaller than an LSDB 
as entries are far smaller. Each entry can be a pointer back to its connection, 
plus the overhead for data structures. I would suggest a singly-linked thread 
for the connection (for registration removal) and pointers for the prefix tree. 
That’s about 5 pointers. On a 64-bit architecture, that’s 40 bytes per 
registration. In relative terms, that’s tiny.  A memory conservative 
implementation could do far better, possibly at the cost of some time-expensive 
operations.
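The arithmetic can be checked in a few lines. The 5-pointer and 8-byte figures are the ones given above; the 100k registration count is an assumption borrowed from the scale figure used elsewhere in this thread, not a measurement.

```python
POINTER_BYTES = 8               # 64-bit architecture
POINTERS_PER_REGISTRATION = 5   # figure quoted above


def rdb_bytes(registrations):
    """Rough RDB memory estimate, ignoring allocator overhead."""
    return registrations * POINTERS_PER_REGISTRATION * POINTER_BYTES


print(rdb_bytes(1))        # 40
print(rdb_bytes(100_000))  # 4000000
```

Under those assumptions, 100k registrations come to about 4 MB.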


> For the service all the ABRs providing the PUB service need to be meshed with 
> QUIC/TCP session so they all stay synchronized as well have some type of 
> sequence number and age and maybe a max age value to refresh the database. 


There is no need to refresh the database. There is no need for sequence 
numbering.


> Maybe also prefix pacing similar to LSA group pacing to keep the prefixes 
> refreshed so you don’t have stale entries. 


There are no stale entries.


> Also maybe a method to MAXAGE  the prefix similar to LSA MAXAGE to flush the 
> DB when prefixes go down. 


There is no need. There are only two cases to remove a registration from the 
RDB: when the connection closes or when the client chooses to unregister.
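A minimal sketch of those two removal paths, assuming a dictionary-based RDB with an index per connection. The class and method names are hypothetical; note that nothing here uses a timer, a sequence number or an age.

```python
from collections import defaultdict


class RegistrationDB:
    """Hypothetical per-ABR RDB: entries leave only via unregister or close."""

    def __init__(self):
        self.by_prefix = defaultdict(set)  # prefix -> connection ids
        self.by_conn = defaultdict(set)    # connection id -> prefixes

    def register(self, conn_id, prefix):
        self.by_prefix[prefix].add(conn_id)
        self.by_conn[conn_id].add(prefix)

    def unregister(self, conn_id, prefix):
        # Case 1: the client chooses to unregister.
        self.by_prefix[prefix].discard(conn_id)
        if not self.by_prefix[prefix]:
            del self.by_prefix[prefix]
        self.by_conn[conn_id].discard(prefix)
        if not self.by_conn[conn_id]:
            del self.by_conn[conn_id]

    def connection_closed(self, conn_id):
        # Case 2: the transport connection closes; drop everything it held.
        for prefix in list(self.by_conn.get(conn_id, ())):
            self.unregister(conn_id, prefix)
```

The per-connection index plays the role of the singly-linked thread mentioned earlier in this message: it makes cleanup on connection close a walk over that connection's own registrations.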


> As this is a distributed database that has to be synchronized between all the 
> ABRs similar to IGP synchronized database all the similar mechanisms I would 
> think need to be employed here as well to ensure the prefixes in the database 
> are “valid” prefixes so you don’t have a false positive or negative 
> notification. 



This is an important point: This is NOT a distributed database. This is a 
per-ABR database. Yes, the content of the RDB should be similar between two 
ABRs that serve the same area(s), but that is not necessarily true all the time.


> Also possible race conditions based on network triggers and timers.


There are no timers. There are no race conditions (that I can see — if I missed 
something, please call it out).


> It does seem as though we are building an IGP layer on top of the existing 
> IGP.


We are building a service adjunct to the IGP. Yes, it has some additional state.


> I think the maintenance of the RDB seems it would be quite complex.


It is intentionally trivial.


> I do wonder as this has a tremendous out-of-band framework maybe this is out 
> of scope of LSR and maybe belongs in the transport area with regards to the 
> PUB / SUB framework.


Acee has already given us a provisional nod to continue. Given the dependency 
on the IGP, plus the fact that the problem is being debated here in LSR, this 
only seems sane.


>  I wonder about the scalability for larger domains as there is a lot of 
> control plane overhead that is added with this service. 


This is not a ‘lot’ of overhead. This is basically BGP-lite and has about the 
same pain as another 100k prefixes in BGP. In other words, it’s almost free.


> If you can imagine you have 1000+ routers in an area which is typical for 
> core, aggregation layers in a transport domain.  Just to show here how the 
> scale factor may be a significant issue.  To keep the math simple you have 2 
> ABR publishers  and 1000 internal router subscribers.  So the ABRs would have 
> a single session between them.  However now each ABR would maintain 1000 
> QUIC/TCP sessions with each ABR.  


I’m not following your math here. If you have 1000 routers in an area, then 

Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-24 Thread Tony Li

Hi Greg,

Thank you for your suggestions.

> It seems that referencing the multi-hop BFD [RFC5883] in the Introduction 
> section as the existing mechanism detecting the node liveness can make the 
> document more thorough.

While I have no objection to being thorough, being that thorough would require 
a discussion of each of the alternatives, which seems like overkill at this 
point.

> along with the term "node liveness," the document mentions "node's 
> reachability". Do you think that the latter might be further clarified by 
> pointing out that that is reachability from an IGP perspective? Or, perhaps 
> use only the "node liveness" in the document.


Reachability is a well-understood graph theoretic term. I realize that some 
people still don’t understand it, but I’m not going to spend everyone’s 
valuable time reproducing Wikipedia (https://en.wikipedia.org/wiki/Reachability).
I specifically used “node liveness” to emphasize that this is NOT dealing with
“service liveness”.

Tony


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Tony Li

Hi Aijun,


> [WAJ] Then I think it is not the subject that the LSR should discuss.  There 
> are many modules on the routers. They may all be relevant to the IGP modules.
> You are trying to accomplish the work that has been done via the management 
> system, for example https://datatracker.ietf.org/doc/html/rfc8639 
> (Subscription to YANG Notifications)


I’m sorry that you’re feeling threatened. Acee has already provisionally said 
that the subject is on topic. That’s logical since it is a replacement for 
PUA/Pulse.  It has nothing to do with a YANG model. 

 
> [WAJ] What was said in your “Security Considerations” 
> (https://datatracker.ietf.org/doc/html/draft-li-lsr-liveness-01#section-6) 
> is the following:
> “This document creates no new security issues.  Security of transport
>protocol connections are addressed by the use of conventional
>transport protocol security techniques, such as TLS.  IGP
>advertisements are not expected to have privacy, so the advertisement
>of the service is not a security issue.”
>What I think is that you introduce new security issues, or one new issue 
> on your mentioned “long list”.  And, you should configure on all the clients 
> and ABRs the communication key, or authenticate each other.


You’re welcome to think that. You would be wrong.  The rest of us run TCP 
applications without doing that. Strong crypto like TLS is both necessary and 
sufficient. 


> [WAJ] The key point to stress the router is that the tasks it should be 
> executed at the same time. You just divide the work into separate modules, I 
> think it is same challenge to the router.


Routers are (ignoring recent developments in multi-core CPUs) completely 
serial. They execute processes one at a time. Yes, there is concurrency. Yes, 
there is stress when the workload exceeds the available CPU. The secret to 
surviving the stress is to prioritize tasks. The IGP is a higher priority task 
than notification. This is clear. Thus, this creates zero stress for the IGP 
itself.
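As a rough illustration of that prioritization, here is a toy run queue in which IGP work is always dequeued before notification work. This is a sketch under the assumption of a simple two-level strict-priority scheduler; real router operating systems schedule differently, and the names RunQueue, submit and run_next are hypothetical.

```python
import heapq
import itertools

IGP = 0            # lower number means higher priority
NOTIFICATION = 1


class RunQueue:
    """Toy strict-priority queue: FIFO within a priority level."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def run_next(self):
        if not self._heap:
            return None
        _, _, name = heapq.heappop(self._heap)
        return name
```

With this ordering, a backlog of notification work delays other notifications but never sits in front of pending IGP work.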


> The pub-sub proposal is an architecturally clean way of solving the problem, 
> as I understand it. It does not have the step function scale nightmare.
> [WAJ] There is existing such clean way, why you invent another? And why in 
> IGP/LSR WG?


Because there is no alternate clean way. Only ugly, ugly, ugly ways. Being 
discussed in LSR.

You didn’t listen to me then.  I don’t expect you to listen to me now.  Is 
there a point to this? You seem unwilling to consider alternatives to your 
thinking and risk learning.

If you don’t have anything constructive, we can just stop the conversation 
right here.

Tony


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Aijun Wang
Hi, Tony:

From: tony1ath...@gmail.com  On Behalf Of Tony Li
Sent: Thursday, January 20, 2022 11:23 PM
To: Aijun Wang 
Cc: lsr 
Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

Hi Aijun,

You are proposing to use the Out of Band channel to solve the IGP problem. 
Such channels already exist, so why do we bother the IGP to establish a new one?

You’re missing the point completely. I’m proposing an entirely new subsystem. 
That’s architecturally necessary.

One of the things that we’ve learned over the years is that modularity is a 
necessity. If you don’t modularize, then you keep adding things onto the main 
structure and it grows in weird ways and becomes unmaintainable and fragile. A 
fine example from early operating systems was the monolithic monitor.  We would 
not dream of architecting things that way today.

To avoid this, as we add functionality, we need to respect the architectural 
boundaries of the systems that we’ve created and create alternate subsystems 
that interface with other subsystems but are largely independent. In this case, 
liveness information can be extracted from the IGP completely locally and flow 
into the pub-sub subsystem for distribution.

[WAJ] Then I think it is not the subject that the LSR should discuss.  There 
are many modules on the routers. They may all be relevant to the IGP modules.

You are trying to accomplish the work that has been done via the management 
system, for example https://datatracker.ietf.org/doc/html/rfc8639 
(Subscription to YANG Notifications)

And, don’t you think you open the gate for DDoS attack of the ABR, or all of 
the ABRs within the network? You need to consider various methods to mitigate 
it.

Every L2 point of attachment to the network is a DoS target. We know how to 
deal with those things. Management ports are DoS able. SSH ports are DoSable. 
This is not news. This is just one more in a very long list. We have techniques 
for DoS mitigation. They are well known. This was mentioned in the security 
considerations section.

[WAJ] What was said in your “Security Considerations” 
(https://datatracker.ietf.org/doc/html/draft-li-lsr-liveness-01#section-6) 
is the following:

“This document creates no new security issues.  Security of transport
   protocol connections are addressed by the use of conventional
   transport protocol security techniques, such as TLS.  IGP
   advertisements are not expected to have privacy, so the advertisement
   of the service is not a security issue.”
What I think is that you introduce new security issues, or one new issue on 
your mentioned “long list”.  And, you should configure on all the clients and 
ABRs the communication key, or authenticate each other.

And for the massive failure scenario, as you argued for other proposed 
solutions, all the registered clients will also receive the massive 
notification information unless you do some filter action on the ABRs.

Each client will receive exactly the notifications that they registered for 
(modulo the discussion on registering for less specifics). If there is a 
massive outage, it does not affect the IGP.  At all. The LSDB is unchanged. 
Only the notification subsystem will be affected by the scale. And all of the 
notifications should be delivered, eventually. Under stress, it will 
undoubtedly delay, but not drop information. That’s what we want.

[WAJ] The key point to stress the router is that the tasks it should be 
executed at the same time. You just divide the work into separate modules, I 
think it is the same challenge to the router.

Then what are the advantages of your proposal when compared to the PUA/PULSE 
solution?

The pub-sub proposal is an architecturally clean way of solving the problem, as 
I understand it. It does not have the step function scale nightmare.

[WAJ] There is an existing clean way, why do you invent another? And why in 
IGP/LSR WG?

Tony

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Robert Raszuk
Sounds good.

Thx,
R.

On Thu, Jan 20, 2022 at 6:23 PM Tony Li  wrote:

>
> Hi Robert,
>
> While perhaps pretty obvious, it would be good to also highlight what
> implementation should do in the event of covering prefix changes in the
> LSDB if those are to happen after registrations have already gone out.
> Maybe as simple as de-register and re-register for the new covering prefix
> ?
>
>
>
> Ok, how about:
>
>  If the ABR has registered for a prefix and that prefix is no
>  longer advertised by another ABR then an ABR MAY unregister,
>  re-evaluate its registration and register for a different
>  prefix. In this way, if a summary prefix changes, the ABR
>  can shift to the new summary prefix.
>
> Tony
>
>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Tony Li

Hi Robert,

> While perhaps pretty obvious, it would be good to also highlight what 
> implementation should do in the event of covering prefix changes in the LSDB 
> if those are to happen after registrations have already gone out. Maybe as 
> simple as de-register and re-register for the new covering prefix ? 


Ok, how about:

  If the ABR has registered for a prefix and that prefix is no
  longer advertised by another ABR then an ABR MAY unregister,
  re-evaluate its registration and register for a different
  prefix. In this way, if a summary prefix changes, the ABR
  can shift to the new summary prefix.
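A minimal sketch of that re-evaluation step, assuming the ABR knows the target prefix its client asked about, the prefix it is currently registered for, and the set of prefixes now advertised by the remote ABR. The function name reevaluate and the (action, prefix) step format are hypothetical; since the text says MAY, an implementation could also choose to do nothing.

```python
import ipaddress


def reevaluate(target, current, advertised):
    """Return a list of (action, prefix) steps for one registered target."""
    t = ipaddress.ip_network(target)
    adv = [ipaddress.ip_network(p) for p in advertised]
    cur = ipaddress.ip_network(current)
    if cur in adv:
        return []  # current registration still matches an advertised prefix
    # Advertised prefixes that still cover the target (same family only).
    covers = [n for n in adv if n.version == t.version and t.subnet_of(n)]
    steps = [("unregister", str(cur))]
    if covers:
        best = max(covers, key=lambda n: n.prefixlen)  # longest covering prefix
        steps.append(("register", str(best)))
    return steps
```

If the summary changes from 192.0.2.0/24 to 192.0.2.0/25, the sketch yields an unregister for the /24 followed by a register for the /25.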

Tony

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Robert Raszuk
Hi Tony,

> You mean to show up in registration ? I guess there could be a triggering
> threshold with some wisely chosen % of the min. number of host routes in
> the summaries to avoid too much noise.
>
> That seems like a difficult condition to explain. How do you feel about
> just always selecting the most specific prefix that is less specific than
> the prefix?  This gets you the /24 but avoids 0/0.
>

I think this is easiest and easiest is good. As this is actually a local
matter perhaps we do not need to spend much time on that in the spec other
than describing the general functionality of auto selecting less specific
covering prefix for registration between ABRs.

Then each vendor could add some extra logic or even ML tweaks on what to
register to differentiate themselves :)

> Or we could perhaps solve it very neatly and define ability to register for
> all prefixes (within given mask range) with some form of a wildcard or
> reserved prefix 0.0.0.0/32 or 0.0.0.0/24-32 in IPv4 case ?
>
> You can already register for 0/0, tho you would get everything. I don’t
> recommend it, tho I can see that it might be useful for network monitoring.
>

Not sure if we should add ability to register for any prefix with a given
mask or mask range (irrespective of the prefix itself). I think that could
be pretty useful especially between ABRs to avoid chattiness if someone
just wants from remote ABRs all notifications for /32s or /64s or /128s etc
...


> I added explicit statements saying that an ABR should ignore unexpected
> notifications that have no matching registration. That should always hold.
>

Cool !


> I then added a section that says that an ABR may create global
> registrations for prefixes learned from the management plane.  That should
> cover the optional behavior that you’re thinking of.  I’ve called this
> “Autonomous Notification Mode”.
>

Ok.

- - -

While perhaps pretty obvious, it would be good to also highlight what
implementation should do in the event of covering prefix changes in the
LSDB if those are to happen after registrations have already gone out.
Maybe as simple as de-register and re-register for the new covering prefix
?

Kind regards,
Robert
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Tony Li
Hi Robert,


> Ok, this was the intent already. The current text says:
> 
>If an ABR receives a Registration Message for a prefix that is being
>injected by a non-attached area, then it should determine the set of
>ABRs that are advertising that prefix or less specifics and register
>with those ABRs for that prefix.
> 
> How about s/with those ABRs for that prefix./with only those ABRs for that 
> prefix./


Sure.


>> *B*   The same ABR as in *A* instead of registering 100 /32s could observe 
>> that all of them are part of /24 subnet X and instead of asking ABRs serving 
>> egress area to get notification for all those next hops could just ask those 
>> ABRs to send all events pertaining to entire /24 subnet hosts. Then upon 
>> receiving such events will filter according to local registrations in his 
>> own area. 
> 
> Yup, thought of this case and I like it. Is it sufficient if the ABR doesn’t 
> wait for the 100 /32s to show up?  
> 
> 
> You mean to show up in registration ? I guess there could be a triggering 
> threshold with some wisely chosen % of the min. number of host routes in  the 
> summaries to avoid too much noise. 


That seems like a difficult condition to explain. How do you feel about just 
always selecting the most specific prefix that is less specific than the 
prefix?  This gets you the /24 but avoids 0/0.
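That selection rule could look roughly like this, using Python's standard ipaddress module. The function name covering_prefix and the idea of passing the advertised prefixes as a list are assumptions made for illustration only.

```python
import ipaddress


def covering_prefix(target, advertised):
    """Pick the longest advertised prefix that is strictly shorter than target
    and contains it; return None when nothing covers the target."""
    target_net = ipaddress.ip_network(target)
    candidates = [
        net for net in map(ipaddress.ip_network, advertised)
        if net.version == target_net.version
        and net.prefixlen < target_net.prefixlen
        and target_net.subnet_of(net)
    ]
    best = max(candidates, key=lambda n: n.prefixlen, default=None)
    return str(best) if best is not None else None
```

With 0.0.0.0/0 and 192.0.2.0/24 both advertised, a registration for 192.0.2.7/32 resolves to the /24. In this sketch 0/0 is only chosen when it is the sole covering prefix, so an implementation that never wants 0/0 would need to exclude it explicitly.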


>> Also what I had in mind was single filtering deployment model where ABRs 
>> react on all host routes (within set ranges) going up and down and unicast 
>> it to all other ABRs without any registration. Only the ingress area ABRs 
>> would do local filtering from local ingress PE registrations. 
> 
> 
> It seems to me that this can already be done with the existing protocol and 
> is just undocumented ABR behavior.  If folks feel this is worthwhile, we can 
> certainly add that.  I’m guessing it would add more configuration, which I 
> was trying to avoid.
> 
> I think that can be a valid deployment model so spelling it out may not hurt 
> IMHO. The receiving ABR needs to be prepared that it is receiving something 
> without registering for it. 
> 
> Or we could perhaps solve it very neatly and define ability to register for 
> all prefixes (within given mask range) with some form of a wildcard or 
> reserved prefix 0.0.0.0/32 or 0.0.0.0/24-32 in IPv4 case ?


You can already register for 0/0, tho you would get everything. I don’t 
recommend it, tho I can see that it might be useful for network monitoring.

I added explicit statements saying that an ABR should ignore unexpected 
notifications that have no matching registration. That should always hold.

I then added a section that says that an ABR may create global registrations 
for prefixes learned from the management plane.  That should cover the optional 
behavior that you’re thinking of.  I’ve called this “Autonomous Notification 
Mode”.

Tony




Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Acee Lindem (acee)
Hi Robert,

From: Robert Raszuk 
Date: Thursday, January 20, 2022 at 10:23 AM
To: Acee Lindem 
Cc: Aijun Wang , lsr , Tony Li 

Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

Hi Acee,

> I’d consider this out-of-band signaling with respect to the IGP as well.

Are you saying this as co-chair ? If so does this indicate that LSR WG charter 
would prohibit this type of work from proceeding in this WG with the exception 
of the autodiscovery piece ?

There’s nothing in the charter that says an IGP can’t use out-of-band 
signaling. I wouldn’t break this up since the reachability detection is done in 
the IGP.

Speaking as someone who doesn’t want to waste time on needless semantics 
discussions,
Acee

Thx,
R.


On Thu, Jan 20, 2022 at 4:01 PM Acee Lindem (acee) <a...@cisco.com> wrote:
Hi Robert,


From: Lsr <lsr-boun...@ietf.org> on behalf of Robert Raszuk <rob...@raszuk.net>
Date: Thursday, January 20, 2022 at 4:59 AM
To: Aijun Wang <wangai...@tsinghua.org.cn>
Cc: lsr <lsr@ietf.org>, Tony Li <tony...@tony.li>
Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

[WAJ] The exact description should be “It proposes to use IGP establishing the 
out of band channel to deliver the PUB/SUB information”

Sorry - no.  You seem to be locking IGPs to flooding based transport only. 
Anything else you call "out of band" - I do not subscribe to this.

The IGP is used solely to advertise support for the registration/notification 
protocol which operates over TCP or QUIC. I’d consider this out-of-band 
signaling with respect to the IGP as well. I will comment on the draft at a 
later time as I have some more exigent matters to attend to.

Thanks,
Acee

Many thx,
R.


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Tony Li

Hi Aijun,


> You are proposing to use the Out of Band channel to solve the IGP problem. 
> There are already existing such channel, why we bother IGP to establish new 
> one?


You’re missing the point completely. I’m proposing an entirely new subsystem. 
That’s architecturally necessary.

One of the things that we’ve learned over the years is that modularity is a 
necessity. If you don’t modularize, then you keep adding things onto the main 
structure and it grows in weird ways and becomes unmaintainable and fragile. A 
fine example from early operating systems was the monolithic monitor.  We would 
not dream of architecting things that way today.

To avoid this, as we add functionality, we need to respect the architectural 
boundaries of the systems that we’ve created and create alternate subsystems 
that interface with other subsystems but are largely independent. In this case, 
liveness information can be extracted from the IGP completely locally and flow 
into the pub-sub subsystem for distribution.


> And, don’t you think you open the gate for DDoS attack of the ABR, or all of 
> the ABRs within the network? You need to consider various methods to mitigate 
> it.


Every L2 point of attachment to the network is a DoS target. We know how to 
deal with those things. Management ports are DoSable. SSH ports are DoSable. 
This is not news. This is just one more in a very long list. We have techniques 
for DoS mitigation. They are well known. This was mentioned in the security 
considerations section.


> And for the massive failures scenario, as that you argued for other proposed 
> solutions, all the registered clients will also receive the massive 
> notification information unless you do some filter action on the ABRs.


Each client will receive exactly the notifications that they registered for 
(modulo the discussion on registering for less specifics). If there is a 
massive outage, it does not affect the IGP.  At all. The LSDB is unchanged. 
Only the notification subsystem will be affected by the scale. And all of the 
notifications should be delivered, eventually. Under stress, it will 
undoubtedly delay information, but not drop it. That’s what we want.


> Then what the advantages that your proposal when compared to the PUA/PULSE 
> solution?


The pub-sub proposal is an architecturally clean way of solving the problem, as 
I understand it. It does not have the step function scale nightmare.

Tony




Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Robert Raszuk
Hi Acee,

> I’d consider this out-of-band signaling with respect to the IGP as well.

Are you saying this as co-chair ? If so does this indicate that LSR WG
charter would prohibit this type of work from proceeding in this WG with
the exception of the autodiscovery piece ?

Thx,
R.


On Thu, Jan 20, 2022 at 4:01 PM Acee Lindem (acee)  wrote:

> Hi Robert,
>
>
>
>
>
> *From: *Lsr  on behalf of Robert Raszuk <
> rob...@raszuk.net>
> *Date: *Thursday, January 20, 2022 at 4:59 AM
> *To: *Aijun Wang 
> *Cc: *lsr , Tony Li 
> *Subject: *Re: [Lsr] New Version Notification for
> draft-li-lsr-liveness-00.txt
>
>
>
> [WAJ] The exact description should be “It proposes to use IGP establishing
> the out of band channel to deliver the PUB/SUB information”
>
>
>
> Sorry - no.  You seem to be locking IGPs to flooding based transport only.
> Anything else you call "out of band" - I do not subscribe to this.
>
>
>
> The IGP is used solely to advertise support for the
> registration/notification protocol which operates over TCP or QUIC. I’d
> consider this out-of-band signaling with respect to the IGP as well. I will
> comment on the draft at a later time as I have some more exigent matters to
> attend to.
>
>
>
> Thanks,
>
> Acee
>
>
>
> Many thx,
>
> R.
>


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Acee Lindem (acee)
Hi Robert,


From: Lsr  on behalf of Robert Raszuk 
Date: Thursday, January 20, 2022 at 4:59 AM
To: Aijun Wang 
Cc: lsr , Tony Li 
Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

[WAJ] The exact description should be “It proposes to use IGP establishing the 
out of band channel to deliver the PUB/SUB information”

Sorry - no.  You seem to be locking IGPs to flooding based transport only. 
Anything else you call "out of band" - I do not subscribe to this.

The IGP is used solely to advertise support for the registration/notification 
protocol which operates over TCP or QUIC. I’d consider this out-of-band 
signaling with respect to the IGP as well. I will comment on the draft at a 
later time as I have some more exigent matters to attend to.

Thanks,
Acee

Many thx,
R.


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Robert Raszuk
>
> [WAJ] The exact description should be “It proposes to use IGP establishing
> the out of band channel to deliver the PUB/SUB information”
>

Sorry - no.  You seem to be locking IGPs to flooding based transport only.
Anything else you call "out of band" - I do not subscribe to this.

Many thx,
R.


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Aijun Wang
Hi, Robert:


Aijun Wang
China Telecom

> On Jan 20, 2022, at 17:20, Robert Raszuk  wrote:
> 
> 
> Aijun, 
> 
> > You are proposing to use the Out of Band channel to solve the IGP problem.
> 
> I am not sure if you noticed but Tony's proposal is an IGP extension not out 
> of band channel. 

[WAJ] The exact description should be “It proposes to use IGP establishing the 
out of band channel to deliver the PUB/SUB information”

> 
> > all the registered clients will also receive the massive notification 
> > information unless you do some filter action on the ABRs.
> 
> Did you have a chance to read the draft yet ? Hint: the registration is all 
> about filtering. 

[WAJ] It depends on the amount of registration prefixes.

> 
> Cheers,
> R.
> 
> 
>> On Thu, Jan 20, 2022 at 4:42 AM Aijun Wang  wrote:
>> HI, Tony:
>> 
>>  
>> 
>> You are proposing to use the Out of Band channel to solve the IGP problem. 
>> There are already existing such channel, why we bother IGP to establish new 
>> one?
>> 
>> And, don’t you think you open the gate for DDoS attack of the ABR, or all 
>> of the ABRs within the network? You need to consider various methods to 
>> mitigate it.
>> 
>> And for the massive failures scenario, as that you argued for other proposed 
>> solutions, all the registered clients will also receive the massive 
>> notification information unless you do some filter action on the ABRs.
>> 
>> Then what the advantages that your proposal when compared to the PUA/PULSE 
>> solution?
>> 
>>  
>> 
>>  
>> 
>> Best Regards
>> 
>>  
>> 
>> Aijun Wang
>> 
>> China Telecom
>> 
>>  
>> 
>> From: tony1ath...@gmail.com  On Behalf Of Tony Li
>> Sent: Thursday, January 20, 2022 12:19 AM
>> To: Aijun Wang 
>> Cc: lsr 
>> Subject: Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt
>> 
>>  
>> 
>> Hi Aijun,
>> 
>>  
>> 
>>  
>> 
>> If we use pub/sub mechanism, why don’t we accomplish it via the management 
>> system, or controller? 
>> 
>> No IGP extension needed then.
>> 
>>  
>> 
>>  
>> 
>> As I recall, you are the one who posed the original problem. If a 
>> centralized solution works for you, then that’s certainly fine by me.
>> 
>>  
>> 
>> 
>> 
>> 
>> Also no pressure for the ABR to keep the RDB(Registration Database) and the 
>> TCP/QUIC server connections
>> 
>>  
>> 
>>  
>> 
>> I’m sorry, I don’t understand you. Is that a question?
>> 
>>  
>> 
>>  
>> 
>> Tony
>> 
>>  
>> 


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-20 Thread Robert Raszuk
Aijun,

> You are proposing to use the Out of Band channel to solve the IGP problem.

I am not sure if you noticed but Tony's proposal is an IGP extension not
out of band channel.

> all the registered clients will also receive the massive notification
> information unless you do some filter action on the ABRs.

Did you have a chance to read the draft yet ? Hint: the registration is all
about filtering.

Cheers,
R.


On Thu, Jan 20, 2022 at 4:42 AM Aijun Wang 
wrote:

> HI, Tony:
>
>
>
> You are proposing to use the Out of Band channel to solve the IGP problem.
> There are already existing such channel, why we bother IGP to establish new
> one?
>
> And, don’t you think you open the gate for DDoS attack of the ABR, or all
> of the ABRs within the network? You need to consider various methods to
> mitigate it.
>
> And for the massive failures scenario, as that you argued for other
> proposed solutions, all the registered clients will also receive the
> massive notification information unless you do some filter action on the
> ABRs.
>
> Then what the advantages that your proposal when compared to the PUA/PULSE
> solution?
>
>
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* tony1ath...@gmail.com  *On Behalf Of *Tony
> Li
> *Sent:* Thursday, January 20, 2022 12:19 AM
> *To:* Aijun Wang 
> *Cc:* lsr 
> *Subject:* Re: [Lsr] New Version Notification for
> draft-li-lsr-liveness-00.txt
>
>
>
> Hi Aijun,
>
>
>
>
>
> If we use pub/sub mechanism, why don’t we accomplish it via the
> management system, or controller?
>
> No IGP extension needed then.
>
>
>
>
>
> As I recall, you are the one who posed the original problem. If a
> centralized solution works for you, then that’s certainly fine by me.
>
>
>
>
>
> Also no pressure for the ABR to keep the RDB(Registration Database) and
> the TCP/QUIC server connections
>
>
>
>
>
> I’m sorry, I don’t understand you. Is that a question?
>
>
>
>
>
> Tony
>
>


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-19 Thread Robert Raszuk
Hi Tony,

Ok, this was the intent already. The current text says:
>
>If an ABR receives a Registration Message for a prefix that is being
>injected by a non-attached area, then it should determine the set of
>ABRs that are advertising that prefix or less specifics and register
>with those ABRs for that prefix.
>
>
How about s/with those ABRs for that prefix./with only those ABRs for that
prefix./

*B*   The same ABR as in *A* instead of registering 100 /32s could observe
> that all of them are part of /24 subnet X and instead of asking ABRs
> serving egress area to get notification for all those next hops could just
> ask those ABRs to send all events pertaining to entire /24 subnet hosts.
> Then upon receiving such events will filter according to local
> registrations in his own area.
>
>
> Yup, thought of this case and I like it. Is it sufficient if the ABR
> doesn’t wait for the 100 /32s to show up?
>


You mean to show up in registration ? I guess there could be a triggering
threshold with some wisely chosen % of the min. number of host routes in
the summaries to avoid too much noise.

Also what I had in mind was single filtering deployment model where ABRs
> react on all host routes (within set ranges) going up and down and unicast
> it to all other ABRs without any registration. Only the ingress area ABRs
> would do local filtering from local ingress PE registrations.
>
>
> It seems to me that this can already be done with the existing protocol
> and is just undocumented ABR behavior.  If folks feel this is worthwhile,
> we can certainly add that.  I’m guessing it would add more configuration,
> which I was trying to avoid.
>

I think that can be a valid deployment model so spelling it out may not
hurt IMHO. The receiving ABR needs to be prepared that it is receiving
something without registering for it.

Or we could perhaps solve it very neatly and define ability to register for
all prefixes (within given mask range) with some form of a wildcard or
reserved prefix 0.0.0.0/32 or 0.0.0.0/24-32 in IPv4 case ?

Thx,
Robert


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-19 Thread Tony Li

Hi Robert,


> Let me perhaps illustrate in two examples what I had in mind. I actually do 
> see immediate benefit to include it day one both in the spec and in the 
> protocol: 
> 
> *A*   ABR locally (from its own area) received registration requests for 100 
> /32s next hops. It knows from reachability LSDB that summary for all of them 
> came from a single area. So it seems that sending (or rather relaying) such 
> registration to all ABRs instead of only to those serving said area would be 
> very unnecessary and suboptimal and the only effect would be to grow their 
> RDB for no reason. 


Ok, this was the intent already. The current text says:

   If an ABR receives a Registration Message for a prefix that is being
   injected by a non-attached area, then it should determine the set of
   ABRs that are advertising that prefix or less specifics and register
   with those ABRs for that prefix.

Apparently that’s not clear.  What can I do to make it more obvious?  I did a 
clarification about not duplicating registrations.
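
One way to read the quoted paragraph is as a simple selection over the LSDB. The Python sketch below is only an interpretation under stated assumptions: the function name, the ABR identifiers, and the dictionary layout (ABR id mapped to the prefixes it advertises) are all hypothetical and are not defined by the draft.

```python
import ipaddress

def abrs_to_register_with(prefix, advertisements):
    """Return the set of ABRs advertising `prefix` or a less specific.

    `advertisements` maps a (hypothetical) ABR identifier to the
    prefixes that ABR injects into the local area.
    """
    target = ipaddress.ip_network(prefix)
    selected = set()
    for abr, prefixes in advertisements.items():
        for advertised in prefixes:
            net = ipaddress.ip_network(advertised)
            # subnet_of() is also true for an exact match, which covers
            # both "that prefix" and "less specifics" in the quoted text.
            if net.version == target.version and target.subnet_of(net):
                selected.add(abr)
                break  # one matching advertisement per ABR is enough
    return selected

# Hypothetical topology: only abr1 and abr2 cover the registered /32.
ads = {
    "abr1": ["10.1.0.0/16"],
    "abr2": ["10.1.2.3/32"],
    "abr3": ["10.2.0.0/16"],
}
print(sorted(abrs_to_register_with("10.1.2.3/32", ads)))
```

In this reading abr3 never sees the registration, so its RDB does not grow.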


> *B*   The same ABR as in *A* instead of registering 100 /32s could observe 
> that all of them are part of /24 subnet X and instead of asking ABRs serving 
> egress area to get notification for all those next hops could just ask those 
> ABRs to send all events pertaining to entire /24 subnet hosts. Then upon 
> receiving such events will filter according to local registrations in his own 
> area. 


Yup, thought of this case and I like it. Is it sufficient if the ABR doesn’t 
wait for the 100 /32s to show up?  When it receives the first one, it 
determines that the covering prefix is a /24 and just registers for that. The 
downside is that there could be many unnecessary notifications delivered.  

If the network operator does as Chris suggested and creates a single summary 
for PE loopbacks, then this all works very well.  If not, there would be noise.

I think we should go for it.  I’m adding text saying that the ABR should 
register for the most specific covering prefix.
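
The noise mentioned above would have to be absorbed by the ingress ABR. As a minimal sketch, assuming the ABR keeps its local registrations keyed by prefix, the filtering step could look like the following Python. The class name, method names, and client labels are hypothetical and are not taken from the draft.

```python
import ipaddress
from collections import defaultdict

class LocalRegistrations:
    """Tracks which local clients registered for which prefixes."""

    def __init__(self):
        self._clients_by_prefix = defaultdict(set)

    def register(self, client, prefix):
        self._clients_by_prefix[ipaddress.ip_network(prefix)].add(client)

    def recipients(self, notified_prefix):
        """Clients that should receive a notification for this prefix.

        An empty result means no local client asked for it, so the
        notification is extra noise from the covering registration
        and can be dropped.
        """
        notified = ipaddress.ip_network(notified_prefix)
        matched = set()
        for registered, clients in self._clients_by_prefix.items():
            if registered.version == notified.version and notified.subnet_of(registered):
                matched |= clients
        return matched

regs = LocalRegistrations()
regs.register("pe1", "192.0.2.1/32")
regs.register("pe2", "192.0.2.1/32")
regs.register("pe2", "192.0.2.9/32")

print(sorted(regs.recipients("192.0.2.1/32")))    # relayed to pe1 and pe2
print(sorted(regs.recipients("192.0.2.200/32")))  # nobody registered: dropped
```

A linear scan is used here for brevity; a real RDB would more likely use a prefix trie.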


> Obviously there are few more low hanging fruits like those two ... 


Please call them out.


> Also what I had in mind was single filtering deployment model where ABRs 
> react on all host routes (within set ranges) going up and down and unicast it 
> to all other ABRs without any registration. Only the ingress area ABRs would 
> do local filtering from local ingress PE registrations. 


It seems to me that this can already be done with the existing protocol and is 
just undocumented ABR behavior.  If folks feel this is worthwhile, we can 
certainly add that.  I’m guessing it would add more configuration, which I was 
trying to avoid.

Tony




Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-19 Thread Robert Raszuk
Hi Tony,

> Do you envision any form of aggregation to happen in the messaging
> between ABRs (for both registrations and notifications) ?
>
> Well, as always, I try to generalize mechanisms and solutions. So while I
> don’t see an immediate need or benefit to it, I did write the protocol so
> that it is possible. I wasn’t comfortable with some of the implications of
> non-host prefix liveness events, so I didn’t include those yet, but it is
> certainly possible.


Let me perhaps illustrate in two examples what I had in mind. I actually do
see immediate benefit to include it day one both in the spec and in the
protocol:

*A*   ABR locally (from its own area) received registration requests for
100 /32s next hops. It knows from reachability LSDB that summary for all of
them came from a single area. So it seems that sending (or rather relaying)
such registration to all ABRs instead of only to those serving said area
would be very unnecessary and suboptimal and the only effect would be to
grow their RDB for no reason.

*B*   The same ABR as in *A* instead of registering 100 /32s could observe
that all of them are part of /24 subnet X and instead of asking ABRs
serving egress area to get notification for all those next hops could just
ask those ABRs to send all events pertaining to entire /24 subnet hosts.
Then upon receiving such events will filter according to local
registrations in his own area.

Obviously there are few more low hanging fruits like those two ...

> I think the pub-sub model is really cool, but I am not clear what are the
> advantages to do it in the IGP from ABRs vs BGP from area RRs (note that in
> the latter case no new protocol is required).
>
> It doesn’t add more burden to BGP. It also doesn’t require BGP for those
> that aren’t using it. It might also be a bit faster than BGP as there’s
> less overhead.
>

Well let's not forget that BGP will do it anyway :) Next hop tracking
kicking in on the RRs will immediately trigger withdrawals. So clearly this
is not like we are saving BGP in any way here . As discussed, what is
getting withdrawn is a different discussion.

/* I would be really curious to read some of those cases where BGP free
network (to carry services) needs such speed-up */



> > Also if we do it from local area RRs we do not need registrations -
> local RRs know which next hops are attached to local service routes. And it
> would go only where the service route goes.
>
> I think you’re asking if ABRs can be co-located with a client.  Yes,
> certainly.  An ABR could initiate a registration for its own purposes. I’ll
> add a clarification.
>

Great !

Also what I had in mind was single filtering deployment model where ABRs
react on all host routes (within set ranges) going up and down and unicast
it to all other ABRs without any registration. Only the ingress area ABRs
would do local filtering from local ingress PE registrations.

Could be an optional operational model.

Kind regards,
R.


Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-19 Thread Tony Li


> On Jan 19, 2022, at 11:38 AM, Tony Przygienda  wrote:
> 
> As only observation, if we do this service as suggested I would like to see 
> not only port & protocol but also the according address of the endpoint 
> (since it's really  that is the SSAP here. 
> Operationally it's very often desirable to know what address stuff shows up 
> to especially if there are filtering/reachability considerations … 

Tony,

The router LSA has numerous IP addresses already for the ABR.  Is this not 
sufficient?

Tony




Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-19 Thread Tony Li

Hi Robert,


> Do you envision any form of aggregation to happen in the messaging between 
> ABRs (for both registrations and notifications) ? 


Well, as always, I try to generalize mechanisms and solutions. So while I don’t 
see an immediate need or benefit to it, I did write the protocol so that it is 
possible. I wasn’t comfortable with some of the implications of non-host 
prefix liveness events, so I didn’t include those yet, but it is certainly 
possible. 


> I think the pub-sub model is really cool, but I am not clear what are the 
> advantages to do it in the IGP from ABRs vs BGP from area RRs (note that in 
> the latter case no new protocol is required). 


It doesn’t add more burden to BGP. It also doesn’t require BGP for those that 
aren’t using it. It might also be a bit faster than BGP as there’s less 
overhead.


> Also if we do it from local area RRs we do not need registrations - local RRs 
> know which next hops are attached to local service routes. And it would go 
> only where the service route goes. 


I think you’re asking if ABRs can be co-located with a client.  Yes, certainly. 
An ABR could initiate a registration for its own purposes. I’ll add a 
clarification.


> Last I am a bit concerned with the scale here. If we keep registrations and 
> notifications at the atomic level the ABRs may need to keep pretty large RDB 
> and efficiently generate *targeted* notifications upon each local area node 
> transition or reception of notification from other ABR(s). 


Understood. You could also register for summaries, which would reduce the scale 
by an order of magnitude. Even without summaries, the RDB is pretty 
constrained. You have (# of local PEs) * (# of other area ABRs) + (# of remote 
PEs) * (# of local PEs).
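
To put numbers on that expression, here is a small Python sketch. The function name and the sample counts are purely illustrative assumptions, not measurements from any network.

```python
def rdb_entries(local_pes, other_area_abrs, remote_pes):
    """Upper bound on RDB entries at one ABR, per the expression above:
    (# local PEs) * (# other area ABRs) + (# remote PEs) * (# local PEs).
    """
    return local_pes * other_area_abrs + remote_pes * local_pes

# Hypothetical network: 100 local PEs, 20 ABRs in other areas,
# 1000 remote PEs that the local PEs might register for.
print(rdb_entries(local_pes=100, other_area_abrs=20, remote_pes=1000))
# 100*20 + 1000*100 = 102000
```

The second term dominates, which is why registering for summaries instead of individual remote PEs shrinks the RDB so sharply.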


> Last what protection would be in place to suppress the network wide meltdown 
> when client would (say by mistake or a bug) inject 1 million of registrations 
> ? Would that not result in a bit of load to all ABRs ? 


What protection exists if a BGP speaker injects a million prefixes? What 
protection exists if an IS-IS speaker injects a million LSPs? We can certainly 
add some protection mechanisms if people feel that it’s warranted. However, 
getting the right functionality first seems like a good starting point. Baby 
steps.

Tony




Re: [Lsr] New Version Notification for draft-li-lsr-liveness-00.txt

2022-01-19 Thread Tony Li
Hi Aijun,


> If we use pub/sub mechanism, why don’t we accomplish it via the management 
> system, or controller? 
> No IGP extension needed then.


As I recall, you are the one who posed the original problem. If a centralized 
solution works for you, then that’s certainly fine by me.


> Also no pressure for the ABR to keep the RDB(Registration Database) and the 
> TCP/QUIC server connections


I’m sorry, I don’t understand you. Is that a question?


Tony
