On 2 Nov 2023, at 05:09, Gyan Mishra <[email protected]> wrote:
Hi Sasha, Bruno & Stewart
Thank you for going over my OPSDIR review in detail.
I am good with the latest updated verbiage that Bruno had given.
Comments in-line
On Mon, Oct 23, 2023 at 8:41 AM Alexander Vainshtein
<[email protected]> wrote:
Bruno,
Lots of thanks for a prompt and very encouraging response!
Your version of the text is definitely better than mine, I am all
for using it.
As for where the clarifying text could be inserted, I see two
options:
* A common “Applicability Statement” section (there is no such
section in the draft)
* A dedicated section on relationship between TI-LFA and
micro-loops.
Gyan> I think this option would be best. It would fix the
existing gap on uLoop. I did mention this before, though I am not
sure it is possible: as TI-LFA and uLoop are tightly coupled as an
overall post-convergence solution, would it be possible to combine
the drafts and issue another WGLC? (Question for authors)
In any case, I defer to you and the rest of the authors to decide
what, if anything should be done for clarifying the relationship
between TI-LFA and micro-loops.
Regards,
Sasha
*From:* [email protected] <[email protected]>
*Sent:* Monday, October 23, 2023 3:27 PM
*To:* Alexander Vainshtein <[email protected]>
*Cc:* [email protected]; rtgwg-chairs <[email protected]>;
[email protected]; Stewart Bryant
<[email protected]>
*Subject:* [EXTERNAL] RE: draft-ietf-rtgwg-segment-routing-ti-lfa
: A simple pathological network fragment
Sasha,
Thanks for the summary and the constructive proposal.
Speaking for myself, this makes sense and I agree.
* TI-LFA is a local operation applied by the PLR when it detects
failure of one of its local links. As such, it does not affect:
o Micro-loops that appear – or do not appear – on the paths to the
destination that do not pass thru TI-LFA paths
As an editorial comment, depending on where such text would be
inserted, I would propose the following change:
OLD: Micro-loops that appear – or do not appear –
NEW: Micro-loops that appear – or do not appear – as part of the
distributed IGP convergence [RFC5715]
Motivation: some reader could wrongly understand that such
micro-loops are caused by TI-LFA
Thanks,
Regards,
--Bruno
Orange Restricted
*From:*Alexander Vainshtein <[email protected]>
*Sent:* Sunday, October 22, 2023 4:21 PM
*To:* DECRAENE Bruno INNOV/NET <[email protected]>;
Stewart Bryant <[email protected]>
*Cc:* [email protected]; rtgwg-chairs <[email protected]>;
[email protected]
*Subject:* RE: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple
pathological network fragment
*Importance:* High
Bruno, Stewart and all,
I think that most of the things about TI-LFA and micro-loops have
been said already (if in a slightly different context) and are
mainly self-evident.
However, I share the feeling that somehow the relationship
between TI-LFA and micro-loop avoidance has become somewhat muddled.
Therefore, I would like to suggest adding some text to the TI-LFA
draft that clarifies this relationship, e.g., along the following
lines:
1. TI-LFA is a local operation applied by the PLR when it detects
failure of one of its local links. As such, it does not affect:
  a. Micro-loops that appear – or do not appear – on the paths to the
  destination that do not pass thru TI-LFA paths
    i. As explained in RFC 5714, such micro-loops may result in the
    traffic not reaching the PLR and therefore not following TI-LFA paths
    ii. Segment Routing may be used for prevention of such micro-loops
    as described in the micro-loop avoidance draft
  b. Micro-loops that appear – or do not appear – when the failed
  link is repaired (aside: the need for this line is based on
  personal experience ☹)
2. TI-LFA paths are loop-free. What’s more, they follow the
post-convergence paths and are therefore not subject to
micro-loops due to differences in the IGP convergence times of the
nodes thru which they pass
3. TI-LFA paths are applied from the moment the PLR detects
failure of a local link and until IGP convergence at the PLR is
completed. Therefore, early (relative to the other nodes) IGP
convergence at the PLR and the consequent “early” release of
TI-LFA paths may cause micro-loops, especially if these paths
have been computed using the methods described in Sections 6.2,
6.3 or 6.4 of the draft. One of the possible ways to prevent such
micro-loops is local convergence delay (RFC 8333).
4. TI-LFA procedures are complementary to application of any
micro-loop avoidance procedures in the case of link or node failure:
  a. Link or node failure requires some urgent action to restore the
  traffic that passed thru the failed resource. TI-LFA paths are
  pre-computed and pre-installed and therefore suitable for urgent
  recovery
  b. The paths used in the micro-loop avoidance procedures typically
  cannot be pre-computed.
Hopefully these notes would be useful.
Regards,
Sasha
*From:* rtgwg <[email protected]> *On Behalf Of* [email protected]
*Sent:* Thursday, October 19, 2023 7:34 PM
*To:* Stewart Bryant <[email protected]>
*Cc:* [email protected]; rtgwg-chairs <[email protected]>;
[email protected]
*Subject:* [EXTERNAL] RE: draft-ietf-rtgwg-segment-routing-ti-lfa
: A simple pathological network fragment
Hi Stewart,
I agree with you on the technical points, so the first part of
your email up to “So I think”.
But I don’t quite follow why you want to mix IGP Convergence
issues with this Fast ReRoute Solution.
To quote RFC 5714, “IP Fast Reroute Framework”:
In order to reduce packet disruption times to a duration commensurate
with the failure detection times, two mechanisms may be required:
a. A mechanism for the router(s) adjacent to the failure to rapidly
invoke a repair path, which is unaffected by any subsequent
re-convergence.
b. In topologies that are susceptible to micro-loops, a micro-loop
control mechanism may be required [RFC5715].
Performing the first task without the second may result in the repair
path being starved of traffic and hence being redundant.
https://datatracker.ietf.org/doc/html/rfc5714#section-4
I would assume that you agree with the above (as you are an
author of this RFC and my guess would be that you wrote that text)
My point is that there are two different mechanisms involved, in
two different time periods:
- Fast ReRoute (“a”): this is the scope of
draft-ietf-rtgwg-segment-routing-ti-lfa
  o Timing: from detection time to start of the IGP convergence
- IGP Micro-loop avoidance (“b”)
  o Timing: from start of IGP convergence to end of IGP convergence
The scope of draft-ietf-rtgwg-segment-routing-ti-lfa is FRR /
“a”. IGP micro-loop is out of scope. Other documents are
proposing solutions for this. (and for those Micro-loop
documents, FRR is similarly out of scope).
Personally I agree with you that both mechanisms are needed. But
I think that this is already highlighted in RFC 5714, and that
this is no different than RFC 7490 (RLFA). Therefore, I don’t see
why the outcome/text should be different. Hence my proposal to
reuse the text from RFC 7490 (RLFA). I find it adequate. You
wrote it, so you probably find it adequate too.
On a side note, RFC5715, that you also wrote, seems to already
cover what you are asking for. Quoting the abstract, it
provides a summary of the causes and consequences of
micro-loops and enables the reader to form a judgement on whether
micro-looping is an issue that needs to be addressed in specific
networks.
Note that this RFC5715 is already cited in the proposed text.
PS: If you were ready to write a 5715bis, I would support this.
Best regards,
--Bruno
*From:* Stewart Bryant <[email protected]>
*Sent:* Tuesday, October 17, 2023 1:48 PM
*To:* DECRAENE Bruno INNOV/NET <[email protected]>
*Cc:* Stewart Bryant <[email protected]>; [email protected];
rtgwg-chairs <[email protected]>;
[email protected]
*Subject:* Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple
pathological network fragment
Hi Bruno
I was thinking about this some more. It is something that was
recognised in the early days, but somewhat swept aside.
The case that Gyan brought up was an ECMP case, but I fear that
the case is more common and I think we should characterise it as
part of the text rather than giving the impression it is unusual.
I think the problem occurs whenever there are two or more nodes
between the point of packet entry and the failure.
CE1 - R1 - R2 - R3 - R4 -/- R5 - CE2
| |
R6 - R7 - R8 - R9 - R10
The normal path CE1-CE2 is via R2
When R4-R5 fails it is trivial to see how the repair works with
R7 as the entry into Q space.
However unless R1, R2, R3 converge in that order there will be
microloops for traffic entering via any of those three nodes.
So I think we can say that unless the PLR receives the traffic to
be protected only directly or from its immediate neighbour, there
may be micro loops that the proposed strategy of aligning the
repair path with the post convergence path cannot address.
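The ordering argument above can be checked with a small sketch. It is only an illustration under stated assumptions, not something taken from the draft: unit link costs, vertical links R1-R6 and R5-R10 (the diagram does not pin these down), and next hops toward CE2 written out by hand. The names OLD, NEW and loops_during are hypothetical. R4 is the PLR and is left out, on the assumption that traffic reaching it is carried on the TI-LFA repair path.

```python
from itertools import permutations

# Next hop toward CE2 before the R4-R5 failure (assumed unit costs).
OLD = {"R1": "R2", "R2": "R3", "R3": "R4"}
# Next hop toward CE2 once a node has converged on the post-failure topology
# (assumes the vertical links are R1-R6 and R5-R10).
NEW = {"R1": "R6", "R2": "R1", "R3": "R2"}


def loops_during(order):
    """Return True if any intermediate state has two nodes pointing at each other."""
    fib = dict(OLD)
    for node in order:
        fib[node] = NEW[node]  # this node updates its forwarding entry
        for a, b in fib.items():
            if fib.get(b) == a:  # a forwards to b while b forwards back to a
                return True
    return False


# Convergence orders of R1, R2, R3 that never produce a two-node micro loop.
safe_orders = [order for order in permutations(OLD) if not loops_during(order)]
print(safe_orders)
```

Under these assumptions only the order R1, R2, R3 survives; every other order leaves some pair of adjacent nodes forwarding to each other for a while.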
Now thinking about the text you have below, I think we need to
write it in terms of: unless the operator is certain that no
micro loops will form over any path the protected traffic will
traverse between entry to the network and arrival at the PLR, a
micro loop avoidance method MUST be deployed. Of course I think
that it would be helpful to the operator community for the text
to provide some guidance on how to ascertain whether there is a
danger of the formation of micro loops.
I would note that the long chains of nodes shown in the example
above were probably not present in the test topologies, which as I
remember were all national scale provider networks, but unless we
provide guidance otherwise TI-LFA could reasonably be deployed in
edge networks, and in the case of cell systems these are often
ring topologies.
So I think we need to agree (as a WG) on the constraints that we
are prepared to specify in the text and the degree of warning we
need to provide to the operator community and then we can polish
the text below.
Best regards
Stewart
On 16 Oct 2023, at 17:25, [email protected] wrote:
Hi Stewart,
Please see inline
*From:* Stewart Bryant <[email protected]>
*Sent:* Monday, October 16, 2023 2:08 PM
*To:* [email protected]; rtgwg-chairs <[email protected]>;
[email protected]
*Cc:* Stewart Bryant <[email protected]>
*Subject:* draft-ietf-rtgwg-segment-routing-ti-lfa : A simple
pathological network fragment
During the operations directorate early review
of draft-ietf-rtgwg-segment-routing-ti-lfa
Gyan Mishra points to a simple pathological network fragment
that I think deserves wider discussion.
https://datatracker.ietf.org/doc/review-ietf-rtgwg-segment-routing-ti-lfa-11-opsdir-early-mishra-2023-08-25/
I am not aware of any response to the RTGWG by the draft
authors concerning the review comment and I cannot see
obvious new text addressing this concern.
The fragment is as follows
CE1 - R1 - R2 -/- R3 - CE2
      |           |
      R4 - R5 --- R6
In the pre-converged network R4 has ECMP to CE2 via R5 (cost 4)
and via R1 (cost also 4).
We can easily build a TI-LFA repair path from R2 under link
failure to CE2 (so long as we remember that R4 is an ECMP
path to CE2), but the problem occurs during convergence. If
R1 converges before R4, R4 may ECMP packets addressed to CE2
back to R1 in a micro loop. Meanwhile since no packets for R3
are reaching R2 the Ti-LFA repair is not doing anything useful.
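The transient described above can be reproduced with a short sketch. It assumes unit link costs and vertical links R1-R4 and R3-R6, which is what the two cost-4 ECMP paths from R4 imply. The helper names next_hops and two_node_loops are hypothetical and only serve the illustration.

```python
from collections import deque


def next_hops(links, dest):
    """Return {node: sorted ECMP next hops toward dest}, assuming unit link costs."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {dest: 0}
    queue = deque([dest])
    while queue:  # breadth-first search gives shortest hop counts
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return {
        n: sorted(m for m in adj[n] if dist.get(m) == dist[n] - 1)
        for n in dist
        if n != dest
    }


def two_node_loops(fib):
    """Return pairs of nodes that list each other as a next hop."""
    return sorted(
        {tuple(sorted((a, b))) for a, hops in fib.items() for b in hops if a in fib.get(b, [])}
    )


LINKS = [("CE1", "R1"), ("R1", "R2"), ("R2", "R3"), ("R3", "CE2"),
         ("R1", "R4"), ("R4", "R5"), ("R5", "R6"), ("R6", "R3")]

before = next_hops(LINKS, "CE2")
after = next_hops([link for link in LINKS if link != ("R2", "R3")], "CE2")

# Transient state: R1 has converged on the new topology, R4 has not yet.
mixed = dict(before)
mixed["R1"] = after["R1"]

print(before["R4"], after["R1"], two_node_loops(mixed))
```

In the mixed state R1 forwards to R4 while R4 still balances over R1 and R5, so the share of traffic hashed toward R1 loops between the two nodes and never reaches R2, where the repair path sits.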
The Ti-LFA text leads the reader to conclude that it is a
loop-free solution, but gives no guidance on how to determine
when this assumption breaks down. There is an informational
reference to
draft-bashandy-rtgwg-segment-routing-uloop, but this short
individual draft does little in the way of helping the reader
determine when loop avoidance strategy needs to be deployed
and the loop-free approach it describes does not seem to be
fully developed.
I am worried that proceeding with the TI-LFA draft without
noting that there is a real risk that simple network
fragments can micro-loop, and without providing a fully formed
mitigation strategy, is a disservice to the operator community,
given the industry interest in TI-LFA and the insidious
nature of unexpected micro loop network transients. I am
wondering what the view of the working group is on how to
proceed.
One approach would be for the Ti-LFA draft to incorporate
detailed guidance on how to determine the risk of a micro
loop in a specific operator network, and to provide specific
mitigation advice. Another approach would be to reference a
developed loop avoidance strategy and recommend its
preemptive deployment. Another approach would be to make
draft-bashandy-rtgwg-segment-routing-uloop a normative
reference and tie the fate of the two drafts. Another
approach would be to elaborate on the risks and their
manifestations but declare it a currently unsolved problem. I
am sure there are other options that the WG may formulate.
What is the opinion of the working group on how we should
proceed with draft-ietf-rtgwg-segment-routing-ti-lfa when
considering the possible formation of micro loops?
FRR takes place between the failure (detection) and the IGP
reconvergence. Those are two consecutive steps that the WG
has so far addressed with different solutions and documents.
That’s not new and that’s not specific to TI-LFA. E.g.,
that’s applicable to RLFA.
Would the below text, taken verbatim from RFC 7490 (RLFA),
work for you? Or would you say that the text is not good enough?
“When the network reconverges, micro-loops [RFC5715] can form due to
transient inconsistencies in the forwarding tables of different
routers. If it is determined that micro-loops are a significant
issue in the deployment, then a suitable loop-free convergence
method, such as one of those described in [RFC5715], [RFC6976], or
[ULOOP-DELAY], should be implemented.”
https://datatracker.ietf.org/doc/html/rfc7490#section-10
Of course, we could update the list of informative references.
E.g., by adding another informative reference to
draft-bashandy-rtgwg-segment-routing-uloop and by removing
informative references to [RFC6976] and [ULOOP-DELAY] which
are probably outdated.
--Bruno
- Stewart
____________________________________________________________________________________________________________
Ce message et ses pieces jointes peuvent contenir des
informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si
vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes.
Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete
altere, deforme ou falsifie. Merci.
This message and its attachments may contain confidential or
privileged information that may be protected by law;
they should not be distributed, used or copied without
authorisation.
If you have received this email in error, please notify the
sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages
that have been modified, changed or falsified.
Thank you.
*Disclaimer*
This e-mail together with any attachments may contain information
of Ribbon Communications Inc. and its Affiliates that is
confidential and/or proprietary for the sole use of the intended
recipient. Any review, disclosure, reliance or distribution by
others or forwarding without express permission is strictly
prohibited. If you are not the intended recipient, please notify
the sender immediately and then delete all copies, including any
attachments.
_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg