Re: [trill] Adam Roach's Discuss on draft-ietf-trill-ecn-support-05: (with DISCUSS and COMMENT)

Bob Briscoe Fri, 23 Feb 2018 15:58:38 -0800

@Adam, Thx for your patience. (It turned out I didn't see your DISCUSS cos I had created a bug in my own e-mail filters - sry for that).

@Donald (as editor), Adam's right that we risk implementers doing their own thing if they judge that Table 3 might be wrong, because its apparent illogicality is not explained without reading the references. So, I've suggested one further change to text, inline...



On 20/02/18 22:22, Adam Roach wrote:

Bob --

Thanks for taking the time to explain the history and rationale of Table 3 to me. I'm going to clear my DISCUSS based on this explanation, since it seems that the technical solution in the document is correct, if a bit non-obvious. Some additional comments inline.

[For those of you who did not see Bob's response: it appears that my original DISCUSS email did not reach him, so he replied to a reduced audience. I have restored the traditional IESG review distribution to this email. For this reason, I am eliding less of Bob's response than I usually would]


On 2/20/18 12:51 PM, Bob Briscoe wrote:


        On Thu, Feb 8, 2018 at 2:15 PM, Adam Roach
        <a...@nostrum.com <mailto:a...@nostrum.com>> wrote:

            On 2/8/18 11:53 AM, Donald Eastlake wrote:

            Hi Adam,

            On Wed, Feb 7, 2018 at 8:12 PM, Adam Roach
            <a...@nostrum.com <mailto:a...@nostrum.com>> wrote:
            > Adam Roach has entered the following ballot position for
            > draft-ietf-trill-ecn-support-05: Discuss
            >
            > ...
            >
            > ----------------------------------------------------------------------
            > DISCUSS:
            > ----------------------------------------------------------------------
            >
            > Thanks to the authors, chairs, shepherd, and working group for the effort that
            > has been put into this document.
            >
            > I have concerns about some ambiguity and/or self-contradiction in this
            > specification, but I suspect these should be easy to fix. In particular, the
            > behavior defined in Table 3 doesn't seem to be consistent with the behavior
            > described in the prose.
            >
            > For easy reference, I've copied Table 3 here:
            >
            >>       +---------+----------------------------------------------+
            >>       | Inner   |  Arriving TRILL 3-bit ECN Codepoint Name     |
            >>       | Native  +---------+------------+------------+----------+
            >>       | Header  | Not-ECT | ECT(0)     | ECT(1)     |     CE   |
            >>       +---------+---------+------------+------------+----------+
            >>       | Not-ECT | Not-ECT | Not-ECT(*) | Not-ECT(*) |  <drop>  |
            >>       |  ECT(0) |  ECT(0) |  ECT(0)    |  ECT(1)    |     CE   |
            >>       |  ECT(1) |  ECT(1) |  ECT(1)(*) |  ECT(1)    |     CE   |
            >>       |    CE   |      CE |      CE    |      CE(*) |     CE   |
            >>       +---------+---------+------------+------------+----------+
            >>
            >>  Table 3. Egress ECN Behavior
            >>
            >>  An asterisk in the above table indicates a currently unused
            >>  combination that SHOULD be logged. In contrast to [RFC6040], in TRILL
            >>  the drop condition is the result of a valid combination of events and
            >>  need not be logged.
            >
            > The prose in this document indicates:
            >
            >  1. Ingress gateway either copies the native header value to the TRILL ECN
            >     codepoint (resulting in any of the four values above) or doesn't insert
            >     any ECN information in the TRILL header.
            >
            >  2. Intermediate gateways can set the CCE flag, resulting in "CE" in the
            >     table above.
            >
            > Based on the above, a packet arriving at an egress gateway can only be in one of
            > the following states:
            >
            >  A. TRILL header is Not-ECT because no TRILL node inserted ECN information.
            >
            >  B. TRILL header value == Native header value because the ingress gateway
            >     copied it from native to TRILL.
            >
            >  C. TRILL header is "CE" because an intermediate node indicated congestion.

            Sort of... But note that the TRILL header ECN bits
            can indicate non-ECT while the CCE bit is set, making
            the overall TRILL Header "CE".


            Right. That's part of case C. My states above assume
            application of Table 2 has already taken place.
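
To illustrate the point about the CCE bit, here is a minimal sketch of how an egress might fold the 2-bit TRILL ECN field and the CCE flag into one codepoint name before consulting Table 3. It assumes the 2-bit field reuses the IP ECN encoding from RFC 3168 (00 Not-ECT, 01 ECT(1), 10 ECT(0), 11 CE); the helper name `arriving_codepoint` is hypothetical and is not taken from the draft.

```python
# IP ECN field encoding from RFC 3168. Assumption: the 2-bit ECN field in
# the TRILL header uses the same encoding.
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def arriving_codepoint(ecn_bits: int, cce: bool) -> str:
    """Combine the 2-bit ECN field and the CCE flag into one codepoint name.

    Hypothetical helper: a set CCE flag yields "CE" whatever the 2-bit
    field says, which is the situation described in the exchange above.
    """
    if cce:
        return "CE"
    return ECN_NAMES[ecn_bits & 0b11]


print(arriving_codepoint(0b00, cce=True))   # field says Not-ECT, overall CE
print(arriving_codepoint(0b10, cce=False))  # plain ECT(0)
```

This is only meant to show why "Not-ECT bits plus CCE" lands in case C; the normative mapping is whatever Table 2 of the draft says.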


            > If that's correct, I would think that any state other than those three needs
            > to be marked with an (*). In particular, these two states fall into that
            > classification, and seem to require an asterisk:

[BB] These states are not tagged (*) cos they can arise with variants of ECN marking behaviour referred to in Section 4 (explained beneath each bullet below). By design (explained further below), a consistent encap and decap behaviour is intended to support all variants of ECN, because subnet ingress and egress nodes will very probably be deployed independently of each other and of intermediate nodes and/or L4 endpoints.

Nonetheless, you're right that this is completely non-obvious to a reader coming to this new. So let's change the text:

CURRENT:
    An asterisk in the above table indicates a currently unused
    combination that SHOULD be logged. In contrast to [RFC6040], in TRILL
    the drop condition is the result of a valid combination of events and
    need not be logged.

SUGGESTED (2nd attempt):

   An asterisk in the above table indicates a combination that is currently
   unused in all variants of ECN marking (see Section 4) and therefore
   SHOULD be logged.

   With one exception, the mappings in Table 3 are consistent with those
   for IP-in-IP tunnels [RFC6040], which ensures backward compatibility with
   all current and past variants of ECN marking (see Section 4). It also
   ensures forward compatibility with any future form of ECN marking that
   complies with the guidelines in [ECNencapGuide], including cases where
   ECT(1) represents a second level of marking severity below CE.

   The one exception is that the drop condition in Table 3 need not be
   logged because, with TRILL, it is the result of a valid combination of
   events.
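
For anyone who finds the table easier to read as data, here is a rough sketch of Table 3 as a lookup, with the asterisked cells driving the SHOULD-log behaviour and the drop cell left unlogged. The names `TABLE_3` and `egress_ecn` are made up for illustration, and returning the string "<drop>" is just a convenience of the sketch.

```python
import logging

NOT_ECT, ECT0, ECT1, CE, DROP = "Not-ECT", "ECT(0)", "ECT(1)", "CE", "<drop>"

# (inner native header, arriving TRILL codepoint) -> (outgoing, asterisked?)
# Transcribed from Table 3; True marks a "(*)" currently unused combination.
TABLE_3 = {
    (NOT_ECT, NOT_ECT): (NOT_ECT, False),
    (NOT_ECT, ECT0): (NOT_ECT, True),
    (NOT_ECT, ECT1): (NOT_ECT, True),
    (NOT_ECT, CE): (DROP, False),  # valid in TRILL, so need not be logged
    (ECT0, NOT_ECT): (ECT0, False),
    (ECT0, ECT0): (ECT0, False),
    (ECT0, ECT1): (ECT1, False),
    (ECT0, CE): (CE, False),
    (ECT1, NOT_ECT): (ECT1, False),
    (ECT1, ECT0): (ECT1, True),
    (ECT1, ECT1): (ECT1, False),
    (ECT1, CE): (CE, False),
    (CE, NOT_ECT): (CE, False),
    (CE, ECT0): (CE, False),
    (CE, ECT1): (CE, True),
    (CE, CE): (CE, False),
}


def egress_ecn(inner: str, trill: str) -> str:
    """Return the outgoing native ECN codepoint, or "<drop>"."""
    outgoing, unused = TABLE_3[(inner, trill)]
    if unused:
        logging.warning("unused ECN combination: inner=%s trill=%s", inner, trill)
    return outgoing


print(egress_ecn(CE, ECT0))    # CE, not asterisked
print(egress_ecn(ECT0, ECT1))  # ECT(1), not asterisked
```

Note that the two combinations queried in the DISCUSS, (CE, ECT(0)) and (ECT(0), ECT(1)), are deliberately not flagged.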

Thanks. That definitely helps; but see my comments below.
            >
            >   - Native==CE && TRILL==ECT(0)
[BB] You have indeed highlighted an awkward wrinkle here - we could have tagged this combination with '(*)', cos it is strictly currently unused by trill. However, I'm going to argue against a '(*)'...
My explanation starts with some history: Back in 2001 when ECN in IP was first defined [RFC3168], an IP-in-IP tunnel ingress set the outer to ECT(0) if ECN in the native header was anything but Not-ECT (0b00). This was intended to close off a perceived covert channel in IPsec. However, since then the security area agreed that this added over-paranoid complexity. So, since [RFC6040] an IP-in-IP tunnel ingress now just copies the native ECN to the outer, and TRILL ECN follows this new simpler model.
Altho this spec does not allow a TRILL ingress RBridge to exhibit the old IP-in-IP legacy behaviour, I think it best not to tag this combination as 'currently unused'. Because, wherever possible, we want trill's ECN encap and decap to mimic IP-in-IP (a principle explained further down this email).
A more specific reason: there is already a congestion monitoring technique being proposed that fakes the legacy IP-in-IP behaviour to measure congestion across a tunnel (draft-ietf-tsvwg-tunnel-congestion-feedback). Before it becomes an RFC, I wouldn't be surprised if that draft gets extended from IP-in-IP to IP-in-TRILL and other L2 protocols as well. If that likely event happened, we would have to update this trill-ecn spec to remove the '(*)' again. In the mean time, there's little harm in not logging a combination that is not harmful, even tho it is strictly 'currently unused'.
(sry about the triple negative!)
This is a great explanation. I understand the risks of putting forward-looking speculation in RFCs, but could I convince you to at least explain in the document that the reason for this state not being considered out of the ordinary is to mimic legacy IP-in-IP ECN behavior? Any acknowledgement of why this case is being treated the way it is would be sufficient to prevent implementors from deciding that the table is wrong.
            >
            >   - Native==ECT(0) && TRILL==ECT(1)
[BB] The standards track PCN variant of ECN [RFC6660] uses ECT(1) as a lower severity of congestion notification than CE. Hence the asymmetry where a native ECT(0) header can have an ECT(1) outer, but not vice versa.
As with the case above, it would be good (inasmuch as it would serve to un-confuse implementors) to include an explanation that this case makes sense due to RFC 6660's handling of ECT(0) versus ECT(1). Such a comment would also address my concern quoted below.
            I would defer to my co-author Bob Briscoe but it
            looks to me like you have a good point.

            Thanks; I'll wait to hear from Bob.

            > In addition, the behavior this table defines for Native==ECT(0) && TRILL==ECT(1)
            > is somewhat perplexing: for this case, the value in the TRILL header takes
            > precedence; however, when Native==ECT(1) && TRILL==ECT(0) the Native header
            > takes precedence. Or, put another way, this table defines ECT(1) to always
            > override ECT(0). I don't find any prose in here to indicate why this needs to be
            > treated differentially, so I'm left to conclude that this is a typographical
            > error. If that's not the case, please add motivating text to Table 3 explaining
            > why ECT(1) is treated differently than ECT(0) for baseline ECN behavior.

            As noted in Section 4.1, there is an ECN variation
            where ECT(0) just indicates ECT while ECT(1)
            indicates congestion of a lesser severity level has
            been encountered than that indicated by CE. I believe
            the dominance of ECT(1) over ECT(0) is designed to
            not interfere with this variation.

            Yes, and section 4 opens with the explanation that
            "Section 3 specifies interworking between TRILL and
            the original standardized form of ECN in IP
            [RFC3168]." Beyond that, I wouldn't expect any of the
            text in non-normative section 4 or its subsections to
            have bearing on the normative table in Section 3.
[BB]
* PCN is standards track, hence PCN-specific combinations are not tagged 'currently unused' in (normative) Section 3.
* L4S is experimental track, but it doesn't use any different combinations of ECN than standard ECN, so it doesn't alter the CU asterisks in section 3.
Section 4 is informative, only because it doesn't define the protocol spec; it merely explains how the normative spec in Section 3 allows for ECN variants (whether stds track or experimental).
Thanks. The answer to this that I find compelling is your explanation below about RFC 6040's relative ordering of ECN markings, which I think should be at least mentioned in passing here.
            My reading of RFC 8311 is that it contemplates a
            series of experiments beyond those currently under
            development. It may well be that the current
            experiments consider ECT(1) to have a higher severity
            than ECT(0), but that this may not make sense for
            future experiments. Even if it does, I don't see
            guidance in RFC 8311 (or any other update to RFC 3168)
            that suggests such a relationship between ECT(0) and
            ECT(1) exists.
As above, the possibility that ECT(1) is a higher severity than ECT(0) is already on the standards track (PCN [RFC6660]).
As far as possible, we want trill subnet endpoints to follow the same ECN behaviour as IP-in-IP tunnel endpoints [RFC6040]. The explicitly stated philosophy of RFC6040 and [ECNencapGuide] (which is referred to from the Intro to this trill spec), is for respectively all tunnel endpoints and subnet endpoints to have the same mechanical, dumb ECN behaviour, whatever the variant of ECN in use. Because in the incrementally deployed public Internet, one cannot know what tunnels and subnets might be encountered across different paths.
            On top of this (as implied by the existence of section
            4), the TRILL handling for ECN will need to vary from
            experiment to experiment. It seems reasonable that
            part of this variability would include different
            mapping of ECN bits by the egress gateway.
Nope. By design, all variants of ECN use the same encap and decap behaviours, otherwise they would be undeployable to all intents and purposes. Perhaps we should make that clearer in this sentence introducing Section 4:
CURRENT:
    The ECN wire protocol for TRILL (Section 2) has been designed to
    support the other known variants of ECN,
SUGGESTED:
    The ECN wire protocol for TRILL (Section 2) and the ingress (Section 3.1)
    and egress (Section 3.3) ECN behaviours have been designed to
    support the other known variants of ECN,
This sounds good. It might be worth mentioning that this behavior is intended to also be forward-compatible with future ECN variants, as long as they conform to RFC 6040's notion of relative codepoint priorities.
            Basically, I see two ways this can be resolved,
            although I'm happy to hear alternatives so long as
            they end up with the ECN and TRILL documents being in
            a consistent and future-proof state:

            Approach 1: Revise Table 3 so that ECT(0) and ECT(1)
            are treated the same (i.e., pick one of "TRILL header
            wins" or "Native header wins" -- I would suggest
            "Native header wins," for maximal compatibility with
            older ECN clients but I'm not dogmatic on this point),
            and then include a normative statement that allows RFC
            8311 experiments to override this mapping as makes
            sense for their design.

            Approach 2: Leave the table as is, but add an
            explanation of why ECT(1) is given preferential
            treatment over ECT(0).
[BB] I hope you're happy with the additional text I've suggested for why Table 3 was like it was. In summary, consistency with IP-in-IP is paramount wherever possible.
That's exactly what the additional text should say. :) I don't see that in your current proposal.
            Add a normative dependency from this document to a new
            document that updates RFC 8311 to add a requirement
            that any experiments that treat ECT(0) differently
            than ECT(1) MUST be designed such that ECT(0) always
            indicates a lower severity of congestion than ECT(1).
            I presume that this new document would need to be done
            in coordination with TSVWG.
[BB] That's already catered for in [RFC6040] (stds track) for IP-in-IP, quoting:
    o  In all other cases where the inner supports ECN, the decapsulator
       MUST set the outgoing ECN field to the more severe marking of the
       outer and inner ECN fields, where the ranking of severity from
       highest to lowest is CE, ECT(1), ECT(0), Not-ECT.  This in no way
       precludes cases where ECT(1) and ECT(0) have the same severity;

and in [ECNencapGuide] (intended BCP) for IP-in-any-L2, quoting:
    4.  Congestion indications may be encoded by a severity level.  For
        instance increasing levels of congestion might be encoded by
        numerically increasing indications, e.g. pre-congestion
        notification (PCN) can be encoded in each PDU at three severity
        levels in IP or MPLS [RFC6660].

        If the arriving inner header is an ECN-PDU, where the inner and
        outer headers carry indications of congestion of different
        severity, the more severe indication SHOULD be forwarded in
        preference to the less severe.
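
The "more severe marking wins" rule quoted above is small enough to sketch in a few lines. This is only an illustration under stated assumptions: the severity ranking is the one in the RFC 6040 quote, the handling of a Not-ECT inner header is read off the Not-ECT row of Table 3, and the names `SEVERITY` and `decap` are invented for the example.

```python
# Severity ranking quoted from RFC 6040: CE > ECT(1) > ECT(0) > Not-ECT.
SEVERITY = {"Not-ECT": 0, "ECT(0)": 1, "ECT(1)": 2, "CE": 3}


def decap(inner: str, outer: str) -> str:
    """Pick the outgoing ECN codepoint at a decapsulator or egress."""
    if inner == "Not-ECT":
        # The inner header does not support ECN, so a CE outer cannot be
        # propagated and the packet is dropped; otherwise the inner
        # Not-ECT is forwarded unchanged (Not-ECT row of Table 3).
        return "<drop>" if outer == "CE" else "Not-ECT"
    # Inner supports ECN: forward the more severe of the two markings.
    return max(inner, outer, key=SEVERITY.__getitem__)


# The two combinations raised in the DISCUSS, plus the reverse case.
print(decap("CE", "ECT(0)"))      # CE
print(decap("ECT(0)", "ECT(1)"))  # ECT(1)
print(decap("ECT(1)", "ECT(0)"))  # ECT(1)
```

Seen this way, ECT(1) overriding ECT(0) in both directions is a consequence of one ordering rule, not two special cases.
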
            I think Approach 1 is more straightforward, but if
            there's a feeling in the working group that we need
            default egress behavior that is forwards-compatible
            with yet-undesigned experiments, then I think Approach
            2 is what you need.

            /a
The way ECN encap & decap has been designed already follows your 'approach 2'; where the same consistent behaviour at all encaps and decaps can be exploited in both backward and forward compatible ways.
Given your explanation above, I agree that such is the case. I think including enough of that explanation that implementors don't find themselves second-guessing the specification in Table 3 would ultimately aid in interoperability.
Indeed, here's proof that this forward compatibility is not just theoretical...
L4S came along after all this, and required drop probability to be the square of the marking probability, but only for one of the ECN codepoints. We managed to contrive the marking behaviour of an intermediate trill RBridge that supports L4S so that any trill egress will achieve this precise squaring, even an egress without any ECN or L4S logic, and no squaring logic either (see Appendix A).
Cheers, and thx again for noticing all this.



Bob
_______________________________________________
trill mailing list
trill@ietf.org
https://www.ietf.org/mailman/listinfo/trill


--
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/
