Jonathan,

On 10/07/2019 11:38, Jonathan Morton wrote:
On 8 Jul, 2019, at 3:27 pm, Bob Briscoe <i...@bobbriscoe.net> wrote:

These are quite significant updates to outer fragment processing at the tunnel 
egress. But, given something has to be said, I can't think of a better way (see 
the original quoted email about why the logical OR of the ECN codepoints as 
defined in RFC3168 is no longer sufficient - and it's no simpler anyway).
If I may offer such an alternative approach, which avoids the need to keep 
persistent state at the reassembly point whilst still properly handling 
RFC3168, L4S and SCE expected semantics:

- If all incoming fragments are Not-ECT marked, the outgoing packet must also 
be so marked.

- If any fragment has CE set, the reassembled packet must have CE set.

(This guarantees correct RFC3168 and SCE behaviour for each conventional AQM marking 
action.  Your proposal doesn't, as it will generally result in fewer CE marks downstream 
especially if the smaller fragments end up being marked; subsequent upstream CE marks 
have to relieve a counter deficit before they will be honoured.  The tradeoff is that L4S 
may see some technical "over marking" but this should be tolerable.)
Yes, with byte-preserving, as packets are re-assembled the number of marked packets reduces. Counter-intuitively, that's correct, even for compatibility with TCP's single congestion response per RTT.

I originally suggested the requirement in RFC3168 to preserve the number of marked packets, but it's incorrect. It's not compatible with TCP's single response per RTT (or the response to the proportion of marks of other TCP-Friendly real-time congestion controls).

This is not a matter of compatibility with just one of SCE or L4S. The logical OR approach is wrong for both, and the byte-preserving approach is correct for both - see previous response to Markku.

Reasoning: the paramount requirement when reassembling fragments is to reconstruct the marking probability that would have occurred had the packets not been fragmented when the AQM in the tunnel marked them. The logical OR approach increases the marking probability as if congestion was higher, while byte-preserving keeps it constant.

If it helps, consider the reductio ad absurdum proof that, as fragments get smaller, the logical OR approach would ultimately result in every packet being marked.
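
A minimal numerical sketch of that argument, under the simplifying assumption (not stated in this thread) that the AQM marks each fragment independently with the same probability p regardless of fragment size. The function names are illustrative only.

```python
def or_mark_probability(p: float, n_fragments: int) -> float:
    """Probability that a reassembled packet carries CE under the logical OR
    rule, assuming each of its n fragments is marked independently with
    probability p (an assumption made for this sketch)."""
    return 1.0 - (1.0 - p) ** n_fragments


def byte_preserving_mark_fraction(p: float, n_fragments: int) -> float:
    """Long-run fraction of reassembled bytes that are marked when the number
    of marked bytes is preserved: it stays at p however many fragments there
    are, because marked bytes out equal marked bytes in."""
    return p


if __name__ == "__main__":
    p = 0.01
    # As the packet is cut into more (smaller) fragments, the OR rule drives
    # the marking probability towards 1; byte-preserving leaves it at p.
    for n in (1, 2, 8, 64, 512):
        print(n, or_mark_probability(p, n), byte_preserving_mark_fraction(p, n))
```

With one fragment per packet the two rules agree; the gap opens as the fragment count per packet grows.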

Regarding state, the problem isn't the amount of state, because reassembly requires per-packet state anyway and the approach I suggested adds only 2 ints per tunnel decap.

The problem seems to be common read-write access to these variables. However, it's not important that they are updated before forwarding continues. So updates can be queued in parallel to forwarding.

Also the state can include occasional errors, so it doesn't have to be strictly persistent - meaning it's unnecessary to preserve it across a re-boot or when a tunnel endpoint is moved.
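
For illustration, one way a byte-preserving decapsulator could keep two integers per tunnel is sketched below. This is a hypothetical sketch, not the example mechanism from the proposed wording: the counter names, the class name and the exact marking condition (mark when more CE bytes have arrived than have been forwarded as CE) are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One fragment: (payload length in bytes, arrived CE-marked?)
Fragment = Tuple[int, bool]


@dataclass
class DecapMarkState:
    """Hypothetical per-tunnel state: two running byte counters."""
    ce_bytes_in: int = 0   # CE-marked fragment bytes received so far
    ce_bytes_out: int = 0  # bytes forwarded in CE-marked reassembled packets

    def reassemble(self, fragments: List[Fragment]) -> bool:
        """Return True if the reassembled packet should be CE-marked."""
        total = sum(length for length, _ in fragments)
        self.ce_bytes_in += sum(length for length, ce in fragments if ce)
        # Mark only while marked bytes in exceed marked bytes out. Marking
        # charges the whole packet, which can leave a deficit that later
        # CE-marked fragments must make up before the next mark.
        if self.ce_bytes_in > self.ce_bytes_out:
            self.ce_bytes_out += total
            return True
        return False


if __name__ == "__main__":
    state = DecapMarkState()
    # 1500-byte packets whose 500-byte fragment is always CE-marked.
    packet = [(1000, False), (500, True)]
    print([state.reassemble(packet) for _ in range(6)])
```

In this sketch a third of the arriving bytes are marked, so one reassembled packet in three is marked, whereas logical OR would mark every one. The counters tolerate occasional errors: losing them only shifts when the next mark is emitted.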

- Notwithstanding the above rules, the ECT(0) vs ECT(1) choice should be made 
according to the majority of fragmented payload bytes so marked, on the 
individual packet being reassembled.  In the case of a tie, break in favour of 
ECT(0).
My original proposed wording deliberately allows such an algorithm. Byte-preserving was stated as a goal. The specific mechanism was only an example.

(I'm not convinced majority voting would be simpler than byte counting, which never leads to an exception case. But that's irrelevant anyway).


(We may expect that L4S packets will be entirely ECT(1) marked except for 
fragments or whole packets carrying CE; this also applies to obsolete Nonce Sum 
semantics.  SCE and RFC-3168 flows will be ECT(0) marked by default, with 
perhaps some ECT(1) marking applied by SCE middleboxes.  SCE is reasonably 
tolerant of disruption to its markings, because the control loop is 
fundamentally stable.)
As above, my proposed wording is nothing to do with support for one semantic or another. It is for all semantics.

- A mixture of Not-ECT and other ECN codepoints is unexpected and may imply 
upstream shenanigans.
Yup, the wording in 3168, and my wording both agree with you here.
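
For reference, the four stateless rules quoted above (all Not-ECT, any CE, byte-majority between ECT(0) and ECT(1), and the mixed case) can be sketched as follows. The function and exception names are hypothetical, and raising an exception for a Not-ECT mixture is purely an illustrative choice, since the quoted text only calls that case unexpected. The codepoint values are those of the ECN field in RFC 3168.

```python
from typing import List, Tuple

# ECN field codepoints as defined in RFC 3168.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


class MixedEcnError(ValueError):
    """Hypothetical signal for a mix of Not-ECT and ECN-capable fragments."""


def reassembled_ecn(fragments: List[Tuple[int, int]]) -> int:
    """fragments: (payload length in bytes, ECN codepoint) per fragment."""
    if not fragments:
        raise ValueError("no fragments to reassemble")
    codepoints = {cp for _, cp in fragments}
    if codepoints == {NOT_ECT}:
        return NOT_ECT                     # all Not-ECT stays Not-ECT
    if NOT_ECT in codepoints:
        # Unexpected mixture; what to do with it is left to the caller.
        raise MixedEcnError("Not-ECT mixed with other ECN codepoints")
    if CE in codepoints:
        return CE                          # any CE fragment sets CE
    # Otherwise take the byte majority, breaking ties in favour of ECT(0).
    ect1_bytes = sum(length for length, cp in fragments if cp == ECT1)
    ect0_bytes = sum(length for length, cp in fragments if cp == ECT0)
    return ECT1 if ect1_bytes > ect0_bytes else ECT0
```

This keeps no state between packets, which is the property being traded against byte preservation in the discussion above.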



Bob



  - Jonathan Morton


--
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/

_______________________________________________
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area
