Re: [bess] Comments on draft-ietf-bess-ir

thomas.morin Tue, 29 Sep 2015 04:18:01 -0700

Hi Eric,


2015-09-28, Eric C Rosen:

 From the draft:

     "This document does not provide any new protocol elements or
procedures"

I think we can agree that it does not specify any new protocol elements.

 > [Thomas] Sections 3, 4.1.1 and 9, at least, introduce what I think
can fairly be considered new procedures.

I don't see anything in section 3 or 4.1.1 that I would call "new
procedures".

However, your point is well-taken about section 9, as RFC6514 does not
really address the use of timers to achieve "make before break"
functionality.  On the other hand, RFC 6513 section 7 does specify the
use of timers when switching a flow from one P-tunnel to another, so the
use of timers is not a new addition.

When we started implementing ingress replication, we found that it
wasn't always very clear how to apply the procedures of RFC6514 when
ingress replication is being used.  The purpose of this draft is to pull
together into one place all the procedures relevant to ingress
replication, and to explain clearly how ingress replication is done
using the procedures of RFC6514.  The focus is on getting it clear
enough to increase the likelihood of multi-vendor interoperability.  We
really tried hard to avoid creating any new IR-specific procedures,
though section 9 may be an exception.

And I fully agree that the specs do fit this intention, but one
exception is enough to make the assertion wrong.

I would suggest distinguishing intent from strict truth, e.g. by
replacing the quoted sentence with "To bring the required clarifications,
this document updates the behavior specified by RFC6514, but does so
without introducing new protocol elements or any fundamentally new
procedures". Or something along these lines.

 From the draft:

     "4.1. Advertised P-tunnels The procedures in this section apply
when the P-tunnel to be joined has been advertised in an S-PMSI A-D
route, an Inter-AS I-PMSI A-D route, or an Intra-AS I-PMSI A-D route."

 > For sake of clarity and avoid any misinterpretation, can you please
add ", and the PMSI Tunnel Attribute is of type Ingress Replication"

Well, section 4 is called "How to Join an IR P-tunnel", and the entire
draft is exclusively about IR P-tunnels.  If you think that is not
clear, perhaps the sentence above should just say "when the IR P-tunnel
to be joined has been ..."


Yes, that would be just fine.

 From the draft:

     "Note that if a set of IR P-tunnels is joined in this manner, the
"discard from the wrong PE" procedures of [RFC6513] section 9.1.1 cannot
be applied to that P-tunnel.  Thus duplicate prevention on such IR
P-tunnels requires the use of either Single Forwarder Selection
([RFC6513] section 9.1.2) or native PIM procedures ([RFC6513] section
9.1.3).

[Thomas] I would suggest rewording with "Note that, in the general case,
..."  and "...unless the tunneling technique relies on an IP transport,
which may allow the identification of the PE sourcing the traffic".

It is certainly true in theory that one could use an IP encapsulation in
this way, but in practice it creates a couple of complications:

- I think it presupposes that the IP source address field of the
tunneled packets contains the same IP address that the ingress PE puts
in the Global Administrator field of the VRF Route Import EC that it
attaches to the unicast routes that it distributes.

(I guess it could use a different one and be made to advertise which one
to expect in a BGP attribute.)

- All the egress PEs need to implement this IP address check in the data
plane forwarding path.


Yes, and this is already true in RFC6513.
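
For illustration only, the data-plane check discussed above could look
like the following Python sketch. The names TunneledPacket, outer_src
and accept_from_expected_pe are hypothetical, and the sketch assumes
that the expected address is the one the ingress PE places in the Global
Administrator field of its VRF Route Import EC; none of this is taken
from the draft or from RFC6513.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class TunneledPacket:
    """Hypothetical view of a packet received over an IP-encapsulated tunnel."""
    outer_src: str   # source address found in the outer IP header
    payload: bytes   # encapsulated packet, not inspected here


def accept_from_expected_pe(packet: TunneledPacket, expected_pe_addr: str) -> bool:
    """Return True only if the outer source address is the expected PE address.

    Assumption: expected_pe_addr is the address the selected ingress PE
    advertises (e.g. in the Global Administrator field of its VRF Route
    Import EC), and the ingress PE uses that same address as IP source.
    """
    return ipaddress.ip_address(packet.outer_src) == ipaddress.ip_address(expected_pe_addr)
```

Every egress PE would have to run a comparison of this kind in its
forwarding path, which is the cost pointed out in the second item above.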

While using the IP encapsulation in this way is a possible option, it
has never seemed like a very attractive option, and as far as I know, no
one has implemented it.

To avoid the need for an option like this, I always recommend that if
one wants to use IR by default, one should advertise the IR P-tunnels in
a (C-*,C-*) S-PMSI A-D route rather than in an Intra-AS I-PMSI A-D
route.  One can still use IP tunnels if one wants, but the "discard from
the wrong PE" procedures would be based on the MPLS label that is
carried by the IP payload.


I would tend to agree that the choice made makes sense.

It is however better not to make it look like the only possible design
choice (as "'discard from the wrong PE' procedures of [RFC6513] section
9.1.1 cannot be applied to that P-tunnel" does), to avoid misleading
future readers.

I think that at least "[procedure xyz] cannot be applied to that
P-tunnel, in the general case," would be better.

Another problem with using the IP header to apply the "discard from the
wrong PE" procedure is that it will not easily generalize to the case of
extranet.  (Still another problem would be that it is just one more
unnecessary option.)

I could add some text explaining this, and explaining why it is not
recommended to use the IP header to apply the "discard from the wrong
PE" procedure.

Yes, this would be useful to document in one or two paragraphs, in an
Appendix for instance.

Now, regarding the use of timers when switching UMH ...

[Thomas] I understand -- even if that is a bit implicit -- that the NLRI
for the Leaf A-D route to the old UMH is the same as the NLRI for the
Leaf A-D route to the new UMH.

Correct.

See below: a lot is left implicit in the sentence as currently written,
too much for me to understand it correctly on a first reading.


[Thomas] But I don't in fact understand why this has to be the case...

Leaf A-D routes are originated in response to I/S-PMSI A-D routes, and
the rules for creating the NLRI of a Leaf A-D route, as specified in
RFC6514, are independent of the tunnel type.


I agree with that.

[Thomas] One has to ignore the procedures to build a Leaf A-D route of
RFC6514 since this document specifies new ones for IR in section 4.1.1

I don't understand why you say that.  The 4.1.1 rules for generating the
NLRI of a Leaf A-D route follow the RFC6514 procedures.


(see below)


[Thomas] section 4.1.1 says that the Key field of the Leaf A-D route
contains the "tunnel identifier" defined in section 3

Yes; the tunnel identifier defined in section 3 is the NLRI of the
corresponding I/S-PMSI A-D route, which is exactly what RFC6514
specifies for the route key.


(see below)

[Thomas] section 3 says that (when the "Leaf info required" bit is set,
which is the case for section 4.1.1) the tunnel identifier is
RECOMMENDED to be a routable address of the router that built the PTA

No; section 3 says that the "tunnel identifier" field of the PTA is
recommended to be a routable address of the router that built the PTA.
But section 3 also tries to make it clear that the identifier of the IR
P-tunnel does not appear in the tunnel identifier field of the PTA.

I have re-read section 3 and now see why I had initially misunderstood
section 4.1.1. Section 3 does in fact say that ''the identifier of an IR
P-tunnel is not the "Tunnel Identifier" of the PTA'', which is pretty
close to "the tunnel identifier is not the tunnel identifier".

When you read Section 4.1.1, the phrase "MUST contain the tunnel
identifier (as defined in Section 3 above)" might be misunderstood,
especially because this time "IR P-tunnel identifier" has become just
"tunnel identifier" (which might be read as Tunnel Identifier with the
uppercase missing). This is made even more likely by the fact that one
may have in mind that "MANDATORY" wording is most often related to new
things that one has to be careful about, rather than a mere repeat of an
existing spec.


I would suggest the following wording:

Current text:
   Once the UMH is determined, the router joining the IR P-tunnel
   originates a Leaf A-D route.  The NLRI of the Leaf A-D route MUST
   contain the tunnel identifier (as defined in Section 3 above) as its
   "route key".

Proposal:
   Once the UMH is determined, the router joining the IR P-tunnel
   originates a Leaf A-D route following the procedures in RFC6514;
   i.e. the NLRI of the Leaf A-D route MUST be set to the NLRI of
   the route triggering the join (which happens to be the IR P-tunnel
   identifier, as defined in section 3, and distinct from the PTA
   Tunnel Identifier field).
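
As a rough illustration of the wording proposed above, here is a minimal
Python sketch. LeafADRoute, join and change_umh are hypothetical names,
the Route Target is reduced to a plain string standing in for the chosen
UMH, and all other attributes are omitted; this is a sketch under those
assumptions, not the encoding specified by RFC6514.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LeafADRoute:
    """Hypothetical, heavily simplified Leaf A-D route."""
    route_key: bytes    # NLRI of the I/S-PMSI A-D route that triggered the join
    originator: str     # address of the router originating this Leaf A-D route
    route_target: str   # stand-in for the RT that steers the route to the UMH


def join(pmsi_nlri: bytes, originator: str, umh: str) -> LeafADRoute:
    """Build the Leaf A-D route: the route key is the triggering route's NLRI."""
    return LeafADRoute(route_key=pmsi_nlri, originator=originator, route_target=umh)


def change_umh(route: LeafADRoute, new_umh: str) -> LeafADRoute:
    """Switch UMH: only the Route Target changes, the NLRI fields stay the same."""
    return replace(route, route_target=new_umh)
```

In this model the route key never depends on the tunnel type or on the
UMH, which is why the Leaf A-D route sent to the old UMH and the one sent
to the new UMH carry the same NLRI.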

[Thomas] Anyhow, it seems to me that ensuring that the Key changes when
the UMH changes, would simplify the make before break procedure:
everything is at the hand of the downstream PE which can advertise both
routes for as long as it wishes,

That does not seem to me to be a simplification.  The specified
procedure is pretty simple:

- To change parents, only a single control plane operation is needed: a
change in the RT of the Leaf A-D route.

Note that I haven't implied anywhere that re-originating a new route
would be of a problematic complexity.

After a thorough re-reading of section 3, I only now understand why I
had initially totally misunderstood why "only a change in the RT of the
Leaf A-D route is needed".

Let me suggest a rewording that may keep other readers from getting lost
as I was...


Current text:
   Suppose a child node has joined a particular IR P-tunnel via a
   particular UMH, and it now determines (for whatever reason) that it
   needs to change its UMH on that P-tunnel.

A lot is in fact left implicit in this sentence: "joined ... via"
and "a particular P-tunnel"/"that P-tunnel" refer to the particulars in
sections 3 and 4.1.1.


Proposal:
   Suppose a child node has joined a particular IR P-tunnel via a
   particular UMH (following procedures in section 4), and it now
   determines (for whatever reason) that it needs to change its UMH
   on that P-tunnel (same tunnel identifier as defined in Section 3).
   This can for instance arise on a change of UMH for an intermediate
   node in a deployment where segmented trees are used.

- In both the upstream and the downstream node, the to-be-deleted data
plane state is timed out.

- There are no data-driven state changes. (Note that to avoid
data-driven state changes, the downstream node really needs to run a
timer in order to decide when to modify its data plane state.)

- The timers do not need to be very precisely tuned, and certainly do
not need to be tuned on a per-peer basis.

- We retain the RFC6514 principle of keeping the NLRI independent of the
tunnel type.  Thus we minimize the chances of creating unintended
side-effects or new corner cases that need to be thought out.  That is,
we minimize the chances of breaking existing MVPN implementations in
unanticipated ways.

The above is a very precise refutation of issues that I hadn't even
raised. If PETA were taking care of strawmen, I would certainly alert
them at once ;)

You have left uncommented the one reason I had given to illustrate the
complexity of this solution: with the specs as they are, somebody will
have to write code to make these two timers tunable, somebody will have
to test these new settings, somebody will have to map that into a YANG
model (or similar), and somebody will have to support that in an OSS
tool and use it to force consistent values on all PEs/A(S)BRs.

After getting a better understanding of the procedures, I agree they are
useful, under the condition that a reasonable default for each of the
two timers is standardized in the specs (so that they can be implemented
viably even before all the actions described above happen).


I would propose:

   An implementation of these specs SHOULD offer means to configure
   the values of timers 1 and 2. An implementation of these specs MUST
   have a default value for timer 1 of at least [T1] seconds and a
   value of timer 2 of at most [T2] seconds.

T1 and T2 are then left to be determined, with [T2] < [T1].

The target is to have T2 large enough to make it likely that the new UMH
has received and processed the route.


I would offer T2=60s and T1=120s.

Of course, setups that want finer tuning to optimize bandwidth will
typically use the tuning knobs to change the timers.
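
To make the proposed constraint on defaults concrete, here is a minimal
Python sketch. SwitchoverTimers, defaults_are_compliant and the two
constants are hypothetical names, and the 120 s / 60 s figures are only
the values offered above, not anything agreed or standardized.

```python
from dataclasses import dataclass

# Values offered in this message for [T1] and [T2]; placeholders until agreed.
T1_FLOOR_SECONDS = 120    # default for timer 1 must be at least this
T2_CEILING_SECONDS = 60   # default for timer 2 must be at most this

# The proposal requires [T2] < [T1].
assert T2_CEILING_SECONDS < T1_FLOOR_SECONDS


@dataclass(frozen=True)
class SwitchoverTimers:
    """Hypothetical per-implementation defaults for the two timers."""
    timer1_seconds: int = T1_FLOOR_SECONDS
    timer2_seconds: int = T2_CEILING_SECONDS


def defaults_are_compliant(defaults: SwitchoverTimers) -> bool:
    """Check a set of defaults against the proposed MUST."""
    return (defaults.timer1_seconds >= T1_FLOOR_SECONDS
            and defaults.timer2_seconds <= T2_CEILING_SECONDS)
```

The check applies to shipped defaults only; values an operator configures
through the tuning knobs are deliberately left unconstrained here.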


Comments ?

-Thomas

