Tony –

What is important to me here is a common understanding and providing a complete 
solution.

Hopefully you at least understand that the point I am making is valid, 
i.e., traffic loss can occur even with better-idbx in place.
I would also argue that you are underestimating the effect of scale.

As to your argument below,  it could also be used to argue against doing 
anything – after all we know that current OSPF does converge in a modest amount 
of time.

Since you have decided to make things better (which I support) I do not see why 
we should not define a complete solution.
If you, as a vendor, choose not to implement SA because you consider the 
cost/benefit ratio unappealing – that is your choice. So long as you and your 
customers are satisfied …

But our mission here is to define a solution – and I am simply arguing for a 
more complete solution.

  Les

From: Tony Przygienda <[email protected]>
Sent: Friday, July 12, 2024 11:23 AM
To: Acee Lindem <[email protected]>
Cc: Les Ginsberg (ginsberg) <[email protected]>; Liyan Gong 
<[email protected]>; Aijun Wang <[email protected]>; Peter 
Psenak (ppsenak) <[email protected]>; Yingzhen Qu <[email protected]>; 
lsr <[email protected]>; lsr-chairs <[email protected]>; shraddha 
<[email protected]>
Subject: Re: [Lsr] About Premature aging of LSA and Purge LSA

Les, whatever you try to suggest here, you slide into direction of trying to 
guarantee common knowledge closure (that's the technical term for what you try) 
and based on distributed systems theory you end up ultimately with virtual 
clock synchronization of the network in some form if you _really_ want to solve 
the problem rather than "hey, my stuff may work 2 hops away rather than 1 hop 
so it's much better and let's not talk about 3 hops" (look up Lamport's 
clock vectors/matrices for proper theoretical underpinnings of such 
undertakings if you'd like to take this discussion further) and this will slow 
things to a crawl. Worse, you will discover pretty soon that going down this 
path you will have to learn consistent cuts and basically transaction 
scheduling most likely ;-)

IGPs are just IGPs, i.e. they do guarantee "eventual consistency" (in proper 
technical terms epsilon consistency) and that makes them fast and reacting fast 
to failures and that's the base of their success. This also means you have 
transients and this here is just one, relatively simple fix of a local 
transient and that's about the best you can do to preserve the desirable 
properties (i.e. fastest possible eventual consistency with maximum resiliency 
[that's the CAP paradigm part which is another way to see IGPs as _AP type of 
solution]).

Without this kind of underlying understanding/language we are talking about "me 
likes my stuff with me bells and whistles better than ye' thing" and it's going 
in circles AFAIS.

so I'm with Acee here in short (and I left out the fact that, as I say, a 
flavor of this stuff has been deployed for a long time and works fine at any 
scale in our experience, and it's damn simple to implement comparatively 
speaking and doesn't need any big rollouts on the network compared to all the 
signalling machinery suggested)

-- tony

On Fri, Jul 12, 2024 at 6:44 PM Acee Lindem <[email protected]> wrote:
So, I don’t think the case you are suggesting is plausible. Let’s say you have 
a hypothetical router somewhere in the same area that has the restarting 
router’s stale LSAs.

    1. The restarting router’s neighbors will only advertise an adjacency once 
the stale LSAs have been updated or purged from their local databases.
    2. Only then will the adjacency be advertised - so the update or purge 
precedes the adjacency advertisement.
    3. How is the neighbor router’s LSA going to pass the restarting router’s 
LSA update or purge? It will take the same or possibly even better flooding 
path. Will it be flooded at warp speed?
    4. Are you suggesting that the restarting router’s LSAs are dropped but the 
neighbor’s advertisement is not? If so, how would the restarting router know 
this and delay removing the adjacency suppression? Are you relying on the 
inherent inefficiencies and convergence delays with LLS signaling handshake 
between the two routers?

In any case, trying to prevent transient problems due to selective loss of 
updates is an exercise in futility.

Thanks,
Acee




On Jul 12, 2024, at 12:13, Les Ginsberg (ginsberg) <[email protected]> wrote:

Acee –

When the restarting router goes down, the state of the LSDB in the network 
becomes:

Restart Router LSA: All neighbors advertised
Neighbor Routers: Neighbor to Restarting Router is removed

When the restarting router comes back up, two changes will occur:

1) Restarting Router updates its LSAs
2) Neighbors update their LSAs to indicate that they once again have a neighbor 
to the restarting router

You cannot guarantee the flooding order network-wide.
Because the stale LSAs from the Restarting Router are present in all nodes, as 
soon as a neighbor readvertises the adjacency to the restarting router, it is 
now possible that on some nodes in the network you will temporarily have an 
LSDB which has:

Stale LSA from restarting router + Updated LSA from neighbor

Whether the restarting router sends an updated LSA with neighbors or without 
neighbors (as you suggest) you cannot prevent the above transient condition 
from occurring because doing so requires guaranteeing that the update to the 
Neighbor LSA and the update to the restarting router LSA are done atomically 
network-wide.
That is why the restarting router cannot do this without help from the 
neighbors.
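
The transient condition described above can be illustrated with a minimal sketch. It assumes a toy model in which an LSDB is a mapping from router ID to the set of neighbors that router's LSA advertises; the names `bidirectional_links`, `R`, `N`, and `X` are hypothetical and not taken from either draft.

```python
# Toy model (an assumption for illustration): an LSDB maps a router ID to
# the set of neighbors its router-LSA currently advertises.
def bidirectional_links(lsdb):
    """Return the links that both endpoints advertise in this LSDB."""
    links = set()
    for router, neighbors in lsdb.items():
        for nbr in neighbors:
            if router in lsdb.get(nbr, set()):
                links.add(frozenset((router, nbr)))
    return links

# View of a remote node while restarting router R is down: R's LSA is
# stale and still lists N, but N has removed its link to R.
while_down = {"R": {"N"}, "N": {"X"}, "X": {"N"}}

# Transient view: N's updated LSA (adjacency restored) has arrived, but
# R's refreshed LSA has not, so the stale copy is still in the database.
transient = {"R": {"N"}, "N": {"R", "X"}, "X": {"N"}}

r_n = frozenset(("R", "N"))
print(r_n in bidirectional_links(while_down))  # False
print(r_n in bidirectional_links(transient))   # True
```

In this model the remote node treats the R-N link as usable as soon as N's update arrives, regardless of what R's refreshed LSA will eventually say.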

Hope this is clear.

    Les


From: Acee Lindem <[email protected]>
Sent: Friday, July 12, 2024 7:55 AM
To: Les Ginsberg (ginsberg) <[email protected]>
Cc: Liyan Gong <[email protected]>; Aijun Wang <[email protected]>; Peter 
Psenak (ppsenak) <[email protected]>; Yingzhen Qu <[email protected]>; 
lsr <[email protected]>; lsr-chairs <[email protected]>; tony Przygienda 
<[email protected]>; shraddha <[email protected]>
Subject: Re: [Lsr] About Premature aging of LSA and Purge LSA



On Jul 12, 2024, at 10:49, Les Ginsberg (ginsberg) <[email protected]> wrote:

Acee –

The neighbors do not control when the flooding of the purge/update reaches all 
routers in the network.
The neighbors have direct control of the exchange between themselves and their 
immediate neighbors – nothing else.

The restarting router has no better idea. If you’re suggesting suppressing 
advertising adjacencies until all neighbors of the restarting router are 
adjacent (which is a bad idea), the restarting router can do this as well by 
suppressing its link advertisements. There is NOTHING additional that can be 
accomplished by adding LLS signaling.

Acee





   Les

From: Acee Lindem <[email protected]>
Sent: Friday, July 12, 2024 7:44 AM
To: Les Ginsberg (ginsberg) <[email protected]>
Cc: Liyan Gong <[email protected]>; Aijun Wang <[email protected]>; Peter 
Psenak (ppsenak) <[email protected]>; Yingzhen Qu <[email protected]>; 
lsr <[email protected]>; lsr-chairs <[email protected]>; tony Przygienda 
<[email protected]>; shraddha <[email protected]>
Subject: Re: [Lsr] About Premature aging of LSA and Purge LSA




On Jul 12, 2024, at 10:40, Les Ginsberg (ginsberg) <[email protected]> wrote:

Acee –

Having the restarting router suppress advertisement of its adjacencies does not 
address the transient state where routers in the network have received the 
updated LSA from the neighbor with the reestablished adjacency to the 
restarting router but still have the stale LSA from the restarting router that 
has the pre-restart adjacency advertisements. (point #1 I made below).

The neighbors of the restarting router will not advertise the adjacency until 
the stale LSAs are purged or updated - this is the whole point of 
https://datatracker.ietf.org/doc/draft-hegde-lsr-ospf-better-idbx/


Thanks,
Acee






So this is not a robust solution.

   Les

From: Acee Lindem <[email protected]>
Sent: Friday, July 12, 2024 7:21 AM
To: Les Ginsberg (ginsberg) <[email protected]>
Cc: Liyan Gong <[email protected]>; Aijun Wang <[email protected]>; Peter 
Psenak (ppsenak) <[email protected]>; Yingzhen Qu <[email protected]>; 
lsr <[email protected]>; lsr-chairs <[email protected]>; tony Przygienda 
<[email protected]>; shraddha <[email protected]>
Subject: Re: [Lsr] About Premature aging of LSA and Purge LSA

Hi Les,



On Jul 12, 2024, at 02:57, Les Ginsberg (ginsberg) <[email protected]> wrote:

I am happy that work on this problem has begun.
I believe the most robust way forward is to implement the mechanisms defined in 
BOTH drafts.

I think the mechanism defined in draft-hegde-lsr-ospf-better-idbx is sound and 
not overly complex (sorry Liyan 😊) and should be done.
But it does not solve all aspects of the problem.
It does make LSDB synchronization more robust – which addresses the control 
plane aspects of the problem.
It also has the advantage that it does not require any support on the 
neighboring routers – and so the benefits can be realized simply by upgrading 
one router at a time.

However,  draft-hegde-lsr-ospf-better-idbx does not address forwarding plane 
aspects of the problem – which become more significant at scale.
There are two aspects of this problem:

1) You do not have control over the order in which the updated LSAs are flooded 
to the rest of the network – so it is still possible for transient forwarding 
issues to occur multiple hops away from the restarting router.
2) The restarting router requires additional time – after full LSDB sync – to 
program the forwarding plane. It is well known that update of the forwarding 
plane takes much longer than protocol SPF calculation.
If only a few hundred routes are supported, this may not be of significant 
concern, but if thousands of routes are supported the time it takes to program 
the forwarding plane becomes a significant contributor.

I fail to see how suppressing neighbor adjacency advertisement solves any 
additional problems that are not solved by avoiding usage of the restarting 
router’s stale LSAs.

Note that the OSPF SPF has a check for bi-directional connectivity,  excerpted 
from section 16.1 of RFC2328:



            (b) Otherwise, W is a transit vertex (router or transit
                network).  Look up the vertex W's LSA (router-LSA or
                network-LSA) in Area A's link state database.  If the
                LSA does not exist, or its LS age is equal to MaxAge, or
                it does not have a link back to vertex V, examine the
                next link in V's LSA.[23]
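
The three conditions in the excerpt above can be sketched as follows. This is a simplified illustration, not an implementation of the full SPF: the `Lsa` class and `passes_step_b` function are hypothetical names, and an LSA is reduced to an age and a set of advertised neighbors.

```python
from dataclasses import dataclass, field

MAX_AGE = 3600  # MaxAge in seconds (RFC 2328 architectural constant)

@dataclass
class Lsa:
    """Simplified stand-in for a router-LSA (illustrative only)."""
    age: int = 0
    links: set = field(default_factory=set)

def passes_step_b(lsdb, v, w):
    """Apply the step (b) checks to the link from vertex v to vertex w."""
    lsa = lsdb.get(w)
    if lsa is None:            # the LSA does not exist
        return False
    if lsa.age == MAX_AGE:     # its LS age is equal to MaxAge
        return False
    return v in lsa.links      # it must have a link back to vertex V

# Neighbor N advertises restarting router R, but R advertises no links yet.
lsdb = {"N": Lsa(links={"R"}), "R": Lsa(links=set())}
print(passes_step_b(lsdb, "N", "R"))  # False

# Once R advertises its link to N, the same check succeeds.
lsdb["R"].links.add("N")
print(passes_step_b(lsdb, "N", "R"))  # True
```

Under these assumptions, the link toward R is skipped for as long as R's own LSA omits the link back, whatever the neighbor advertises.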



Consequently, the restarting router can simply suppress its own link 
advertisement until such time that is required to solve the above problems. You 
should be familiar with this quote:


              “If you want a thing done well, do it yourself.”
              ― Napoleon Bonaparte


Thanks,
Acee








draft-cheng-lsr-ospf-adjacency-suppress  provides a way to address the above 
two aspects by providing a means for the neighbors of the restarting router to 
delay advertisement of the restored adjacency to the restarting router. (SA 
signaling)

It could be argued that using SA signaling eliminates the need to do anything 
else – but given that this mechanism depends upon support by all the neighbors 
of the restarting router I believe there is still good reason to implement both 
mechanisms.

NOTE: I would prefer that the two drafts be combined into a single draft – but 
that is optional and up to the authors. But from the WG perspective I would 
like to see both solutions progress.

   Les



From: Liyan Gong <[email protected]>
Sent: Thursday, July 11, 2024 8:22 PM
To: Acee Lindem <[email protected]>; Aijun Wang <[email protected]>
Cc: Peter Psenak (ppsenak) <[email protected]>; Yingzhen Qu 
<[email protected]>; lsr <[email protected]>; lsr-chairs <[email protected]>; tony 
Przygienda <[email protected]>; shraddha <[email protected]>
Subject: [Lsr] Re: About Premature aging of LSA and Purge LSA

Hi Acee and Aijun,

Thank you very much for your discussion. I would like to share my thoughts on 
the proposed solutions.
In my view, draft-hegde-lsr-ospf-better-idbx may not be as straightforward as 
it initially appears.
Despite its local applicability, it entails a complex neighbor establishment 
process, which is fundamental to the OSPF protocol and typically not altered 
lightly by those familiar with its workings.
On the other hand, draft-cheng-lsr-ospf-adjacency-suppress  presents a more 
focused approach tailored to address the specific issue without unintended 
consequences.
I still believe the key factor in evaluating any approach is whether it impacts 
the current systems negatively.

Regarding our extensive discussions on these drafts, please refer to our 
previous records for more details.
https://mailarchive.ietf.org/arch/search/?q=%22draft-cheng-lsr-ospf-adjacency-suppress%22

Thank you for your attention to this matter.



Best Regards,

Liyan


----Original Message----
From: Acee Lindem <[email protected]>
To: Aijun Wang <[email protected]>
Cc: Peter Psenak <[email protected]>, Yingzhen Qu <[email protected]>, lsr 
<[email protected]>, lsr-chairs <[email protected]>, tony Przygienda 
<[email protected]>, shraddha <[email protected]>
Sent: 2024-07-11 23:26:57
Subject: [Lsr] Re: About Premature aging of LSA and Purge LSA

As WG member:

On Jul 11, 2024, at 05:29, Aijun Wang <[email protected]> wrote:

There is also another draft that aims to solve a similar problem, 
https://datatracker.ietf.org/doc/html/draft-cheng-lsr-ospf-adjacency-suppress-02, 
which it describes as similar to the solution in IS-IS. Why not take this 
approach?

Because this one doesn’t require any signaling and can be accomplished via local 
behavior without requiring support from any other OSPF router. Additionally, it 
is simpler. Well, at least for someone who has a deep understanding of the 
protocol.

Thanks,
Acee





Best Regards

Aijun Wang
China Telecom

From: [email protected] [mailto:[email protected]] On Behalf Of Aijun Wang
Sent: July 11, 2024 17:20
To: 'Acee Lindem' <[email protected]>
Cc: 'Peter Psenak' <[email protected]>; 'Yingzhen Qu' <[email protected]>; 
'lsr' <[email protected]>; 'lsr-chairs' <[email protected]>; 'tony 
Przygienda' <[email protected]>; 'shraddha' <[email protected]>
Subject: [Lsr] Reply: Re: About Premature aging of LSA and Purge LSA

For the neighbors of the restarting router, why can’t they directly delete the 
LSAs originated by the restarting router, instead of putting them into a 
“Stale DB Exchange list”, when they detect that their neighbor is down?

From: [email protected] [mailto:[email protected]] On Behalf Of Acee Lindem
Sent: July 10, 2024 22:14
To: Aijun Wang <[email protected]>
Cc: Peter Psenak <[email protected]>; Yingzhen Qu <[email protected]>; lsr 
<[email protected]>; lsr-chairs <[email protected]>; tony Przygienda 
<[email protected]>; shraddha <[email protected]>
Subject: [Lsr] Re: About Premature aging of LSA and Purge LSA

Yes - but the whole discussion of adjacency suppression and database 
synchronization is based on preventing TEMPORARY usage of stale LSAs leading to 
false bidirectional adjacencies during unplanned restart. RFC 2328 OSPF will 
converge without any modifications - there can just be transient traffic drops 
and/or loops.

Thanks,
Acee
On Jul 9, 2024, at 20:42, Aijun Wang <[email protected]> wrote:

For an unplanned restart, shouldn’t it be the responsibility of the directly 
connected neighbors to send out such LSAs to purge the obsolete LSAs?

Best Regards

Aijun Wang
China Telecom

From: [email protected] [mailto:[email protected]] On Behalf Of Acee Lindem
Sent: July 9, 2024 20:14
To: Peter Psenak <[email protected]>
Cc: Aijun Wang <[email protected]>; Yingzhen Qu <[email protected]>; lsr 
<[email protected]>; lsr-chairs <[email protected]>; tony Przygienda 
<[email protected]>; shraddha <[email protected]>
Subject: [Lsr] Re: About Premature aging of LSA and Purge LSA

Additionally, you certainly don’t need a standards track solution to this 
problem. An implementation could honor MinLSInterval by simply locally keeping 
its own list of self-originated MaxAge LSAs and delaying reorigination.
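
One way to read the local approach above is sketched below. It is a sketch under stated assumptions, not a description of any existing implementation: the `ReoriginationGate` class, its method names, and the LSA key are hypothetical, and the caller supplies timestamps explicitly.

```python
MIN_LS_INTERVAL = 5.0  # MinLSInterval in seconds (RFC 2328 architectural constant)

class ReoriginationGate:
    """Hypothetical local bookkeeping for self-originated MaxAge LSAs."""

    def __init__(self):
        # LSA key -> time at which the MaxAge copy was flooded
        self._flushed_at = {}

    def note_maxage_flush(self, key, now):
        """Record that a self-originated LSA was flushed at time `now`."""
        self._flushed_at[key] = now

    def delay_before_reorigination(self, key, now):
        """Seconds to wait before originating a new instance of `key`."""
        flushed = self._flushed_at.get(key)
        if flushed is None:
            return 0.0
        return max(0.0, flushed + MIN_LS_INTERVAL - now)

    def reoriginated(self, key):
        """Forget the entry once the new instance has been originated."""
        self._flushed_at.pop(key, None)

gate = ReoriginationGate()
gate.note_maxage_flush("router-lsa", now=100.0)
print(gate.delay_before_reorigination("router-lsa", now=102.0))  # 3.0
print(gate.delay_before_reorigination("router-lsa", now=106.0))  # 0.0
```

The list lives entirely on the originating router, so this model needs no protocol change or support from other routers.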

Thanks,
Acee

On Jul 9, 2024, at 04:13, Peter Psenak <[email protected]> wrote:

Aijun,

On 09/07/2024 09:46, Aijun Wang wrote:
Hi, Acee:

Can the proposal in 
https://datatracker.ietf.org/doc/html/draft-dong-ospf-purge-lsa-00, together 
with https://datatracker.ietf.org/doc/html/rfc2328#section-14.1 (Premature aging 
of LSAs), solve the problem you mentioned?
If so, is it simpler than your proposal?
That is, before the router restarts, it need only send out the Purge LSA (when 
the LSA sequence number is not about to wrap) or prematurely age its LSA (when 
the sequence number is about to wrap).

does not work for unplanned restart.

thanks,
Peter

Best Regards

Aijun Wang
China Telecom

From: [email protected] [mailto:[email protected]] On Behalf Of Acee Lindem
Sent: July 9, 2024 3:58
To: Yingzhen Qu <[email protected]>
Cc: lsr <[email protected]>; lsr-chairs <[email protected]>; tony Przygienda 
<[email protected]>; shraddha <[email protected]>
Subject: [Lsr] Re: IETF 120 LSR Slot Requests

Speaking as WG member:

I would like a 10 minute slot to present an update to 
https://datatracker.ietf.org/doc/draft-hegde-lsr-ospf-better-idbx/

Thanks,
Acee




On Jun 25, 2024, at 14:19, Yingzhen Qu <[email protected]> wrote:

Hi,

The draft agenda for IETF 120 has been posted:
IETF 120 Meeting Agenda<https://datatracker.ietf.org/meeting/120/agenda/>

The LSR session is scheduled on Friday Session I1 9:30 - 11:30, July 26, 2024.

Please send slot requests to [email protected] before the end of the day 
Wednesday July 10th.  Please include draft name and link, presenter, desired 
slot length including Q&A.

Please note that having a discussion on the LSR mailing list is a prerequisite 
for a draft presentation in the WG session. If you need any help please reach 
out to the chairs.

Thanks,
Yingzhen
