Re: [6lo] Mirja Kühlewind's Discuss on draft-ietf-6lo-fragment-recovery-13: (with DISCUSS and COMMENT)

Pascal Thubert (pthubert) Thu, 19 Mar 2020 07:58:15 -0700

Hello Mirja

> Thanks for your updates. Sorry for my late reply. I unfortunately have some
> more comments. Please see below.


More comments => more thanks, you'll have to live with it 😊

As usual, I published a new revision for your convenience so you can see the
overall effect of your review.
Please see 
https://tools.ietf.org/rfcdiff?url2=draft-ietf-6lo-fragment-recovery-18.txt 
(and, as usual, I also found and fixed a typo there, this time a duplicated
"reduce the").

Please see below the discussion:

> >> ----------------------------------------------------------------------
> >> DISCUSS:
> >> ----------------------------------------------------------------------


> > 4.3.  Flow Control
> >
> >   The inter-frame gap is the only protection that [FRAG-FWD] imposes by
> >   default.  This document enables to group fragments in windows and
> >   request intermediate acknowledgements so the number of in-flight
> >   fragments can be bounded.  This document also adds an ECN mechanism
> >   that can be used to adapt the size of the window, the size of the
> >   fragments, and/or the inter-frame gap to protect the network.
> >
> >   This specification enables the source endpoint to apply a flow
> >   control mechanism to tune those parameters, but the mechanism itself
> >   is out of scope.  In most cases, the expectation is that most
> >   datagrams will represent only a few fragments, and that only the last
> >   fragment will be acknowledged.  A basic implementation of the source
> >   endpoint is NOT REQUIRED to variate the size of the window, the
> >   duration of the inter-frame gap or the size of a fragment in the
> >   middle of the transmission of a datagram, and it MAY ignore the ECN
> >   signal or simply reset the window to 1 (see Appendix C for more) till
> >   the end of this datagram upon detecting a congestion.
> >
> >   The size of the fragments is typically computed from the Link MTU to
> >   maximize the size of the resulting frames.  The size of the window
> >   and the duration of the inter-frame gap SHOULD be configurable, to
> >   roughly adapt the size of the window to the number of hops in an
> >   average path, and to follow the general recommendations in
> >   [FRAG-FWD], respectively.
> > “
> >
> Thanks for adding this. However, as I said a couple of times in my discuss
> there must be more guidance. This is not only about flow control but also
> about congestion control and it is not okay to declare congestion control out
> of scope. If you only do fragmentation but no retransmission, you don’t need
> to care about congestion control (but only flow control) as you don’t
> increase the actual network load by this. However, if you retransmit you are
> sending more data than the original sender (that hopefully is congestion
> controlled) and therefore you increase the load on the network and you MUST
> implement your own congestion control or some fixed rate limiting for that
> additional load. Saying this is out of scope and you want to do
> experimentation is not acceptable for a Proposed Standard.

I agree. I should be a lot more careful with my wording. 

Understanding that flow control is pacing by the receiver to protect the
receiver, and that congestion control protects the network (ECN, etc.):
stricto sensu, we are not doing any flow control, since the receiver is
expected to have a buffer for the whole datagram and does not need to be
protected. All we do is congestion control.

=> I should change "flow control" to "congestion control" throughout, and add 
text about the above, is that correct?

Note that all classical wireless interfaces perform ARQ. On one hop, you get 
the same effect of multiplying the traffic in the air vs. what the transport 
sees. The factor can be high, e.g., 64. On a mesh, we get the additional effect 
of multiple flows converging on the same node.

> 
> To be clear the request of this discuss is to give a normative recommendation
> for the default value of the window size and/or inter-frame gap.

Yes, and since there is no great expectation that ECN will be implemented, that 
default must be reasonable.
Also, we need to agree on the proposed mechanism to drop the window to 1 in case 
of congestion notification, or is that behind us already?
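
A minimal sketch of that fallback, to make sure we are discussing the same
behavior. The class and method names are hypothetical, and it assumes the
congestion indication reaches the sender together with an acknowledgment;
neither is taken from the draft text:

```python
class FragmentSender:
    """Hypothetical per-datagram sender state (illustration only)."""

    def __init__(self, configured_window):
        # The proposed text bounds Window_Size to at least 1 and less than 33.
        if not 1 <= configured_window <= 32:
            raise ValueError("window must be in 1..32")
        self.configured_window = configured_window
        self.window = configured_window

    def on_ack(self, ecn_set):
        # On a congestion indication, fall back to one fragment in flight
        # and stay there until the end of the current datagram.
        if ecn_set:
            self.window = 1

    def start_datagram(self):
        # The reduction only lasts for the datagram that saw the congestion.
        self.window = self.configured_window
```

With this shape, a sender configured with a window of 8 drops to 1 on the
first marked acknowledgment and only returns to 8 when the next datagram
starts.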


> 
> Further note, as you allow to adapt both the window and the inter-frame gap
> dynamically, you actually implement two control mechanisms here. I actually
> recommend to only use the inter-frame gap and don’t have window here. You
> say a couple of times in your reply below, that the window determines the
> ack rate, however, it is not clear to me because the ack rate should be a
> parameter at the receiver and not at the sender (maybe I don’t remember
> things correctly because it’s a while back since I read the draft and I
> couldn’t find anything about this in the draft quickly). If the window size
> however does define the ack rate, then maybe you should rename that
> parameter accordingly.

The ack is not for flow control (which we do not have) but in support of ARQ. 
The possibility to use it for congestion control is a desirable side effect. 
The fragmenting endpoint FSM may want to wait and see how things went for the 
fragments that it already sent. E.g., there's the case where the fragmenting 
endpoint would request an ack on the first fragment for a number of reasons, 
such as checking that a path is available and that the MTU is OK, or assessing 
an initial RTT. 
It may maximize the number of fragments in flight for congestion control. But 
whether to do any of that is left to implementation (so far).


> 
> However, if there is really a need for a window, I still recommend to talk
> less about adapting this value dynamically and make clear that having a fixed
> value is the recommended default. Therefore I recommend to remove the
> parameter MinWindowSize and MaxWindowSize because these should actually not
> be parameters that can be configured but are actual bounds. If someone
> decides to implement dynamic window adaptation, they can decide to
> re-introduce these parameters again and make them configurable but it doesn’t
> need to be part of this spec.

I can see that, yes. I still like the idea of dropping the window to 1 in case 
of ECN. 
Do you suggest removing that as well?
If so, should we increase the inter-frame gap in case of ECN? 
That may be better, though not simpler to specify or implement.



> 
> So it could be something like:
> 
> "Window_Size:  Window_Size MUST be at least 1 and less than 33. If the inter-
> frame gap is selected large enough to not overload the path and the one-way

I see the IFG as more efficient for flow control than for congestion control: 
increasing the IFG slows down the packets, but as long as the resulting rate is 
still faster than the bottleneck, it does not help much, does it? So I'm still 
unsatisfied with how to characterize an IFG that does not overload a path. But 
I'm not sure we can do better. I moved that piece to the IFG definition, if 
that's OK?
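
To put numbers on that concern, here is a rough sketch. The function names
and the figures in the example are made up for illustration, and it assumes a
single known bottleneck rate, which is exactly the part that is hard to know
in practice:

```python
def send_rate_bps(fragment_bits, tx_time_ms, ifg_ms):
    """Average sending rate when one fragment goes out every tx time + IFG."""
    return fragment_bits * 1000 / (tx_time_ms + ifg_ms)


def min_ifg_ms(fragment_bits, tx_time_ms, bottleneck_bps):
    """Smallest IFG that brings the sending rate down to the bottleneck rate.

    Any IFG shorter than this still feeds the bottleneck faster than it can
    drain, so its queue keeps growing.
    """
    period_ms = fragment_bits * 1000 / bottleneck_bps
    return max(0.0, period_ms - tx_time_ms)
```

For example, with 800-bit fragments that take 4 ms to transmit and an assumed
bottleneck of 50 kbit/s, an IFG of 4 ms still sends at 100 kbit/s, and the
IFG has to reach 12 ms before the sender stops overrunning the bottleneck.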

> delay is known, the Window_Size SHOULD be set to the one-way trip time
> divided by the inter-frame gap.  Otherwise a small value of X SHOULD be
> configured. Note that the Window_Size determines the ack rate. If the
> window_size is set to 32, this means only the last Fragment is
> acknowledged in the first round. If it is set to a smaller value, more acks
> are generated but the load on the forward path will be lower. Window_Size MAY
> be adapted dynamically to reduce load on the forward path in case of
> congestion.”

The assertion that the load on the forward path will be lower is usually 
incorrect in a typical LLN, since the radios are half duplex. In the example 
of 6TiSCH, an rfrag_ack consumes the exact same bandwidth as a fragment (one 
time slot). Also, the path of the rfrag_ack is the reverse LSP, so it goes 
across the exact same links.

The last sentence is already present in the text above it (all quoted below), 
so I'm also trying to avoid duplication.

> Still you also need to say more about how to set and dynamically adapt the
> inter-frame gap because that is probably the real limiting fact to avoid 
> network
> overload.
> 

Yes, I see that tuning the IFG impacts the rate and can help alleviate the 
congestion once you pass below the rate the bottleneck can give you. I did 
some adaptive CIR long ago in IBM FR switches and it can be made to work, and 
though it was a lot of fun, it's not any easier than window-based flow control. 
And it really depends on the relay doing the right thing, e.g., reacting 
quickly to growing queue latency and applying fair sharing.

> Also below you remove the recommendation for using the number of hops as
> window size but here you added it again. This is just incorrect. There is no
> dependency between the number of hops and the window size: If there is no
> bottleneck on the path, you can just send with line rate at the sender. 

The rationale was: if there is no bottleneck on the path and the window is less 
than the number of hops, then the sender will be blocked and the maximum 
throughput cannot be achieved. If the rfrag_ack is as slow as the frag, which 
is reasonable in an LLN, we're talking about a window of twice the number of 
hops to keep the fragments going.

I saw the number of hops as a starting point, but I'm (really) happy to stick 
to RTT/IFG which makes more sense considering the focus that you seem to 
recommend placing on IFG (and I agree with that too).
 
> If there
> is a bottleneck on the path and you send at a higher rate than the bottleneck
> then sooner or later the buffer at that hop will fill up completely. So the
> window size depends only on the bottleneck rate and end-to-end delay (BDP)
> (which lets you calculate the number of packets in flight) plus the buffer
> size at the bottleneck. The number of hops is irrelevant.

Yes, I understand that model. It was easier to apply some 25 years ago.
So far, the links in an LLN are usually the same and the PHYs are usually the 
same, so it's still usable.
But that is bound to change rapidly, as even LLN radios are going to be agile 
WRT PHY rate, meaning that the rate at the bottleneck will become hard to 
fathom and will change (rapidly) over time (same as Wi-Fi).
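
For reference, the model above can be sketched as follows. The function and
parameter names are hypothetical, and taking the end-to-end delay as a round
trip is my assumption. The point of the sketch is that every input except the
fragment size is something the sender has to guess once the PHY rate starts
to vary:

```python
def max_fragments_in_flight(bottleneck_bps, rtt_ms, buffer_fragments,
                            fragment_bits):
    """Window bound from the bandwidth-delay product plus bottleneck buffer."""
    # Bits that fit "in the pipe" at the bottleneck rate during one round trip.
    bdp_bits = bottleneck_bps * rtt_ms / 1000
    # Whole fragments in flight, plus what the bottleneck queue can absorb.
    return int(bdp_bits // fragment_bits) + buffer_fragments
```

With an assumed 50 kbit/s bottleneck, a 160 ms round trip, 800-bit fragments
and room for 2 fragments in the bottleneck queue, this gives 10 + 2 = 12
fragments.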

All in all we'd get:

"
   An implementation must control the rate at which it sends packets
   over the same path to allow the next hop to forward a packet before
   it gets the next.  In a wireless network that uses the same frequency
   along a path, more time must be inserted to avoid hidden terminal
   issues between fragments (more in Section 4.2).  An implementation
   should consider the generic recommendations from the IETF in the
   matter of congestion control and rate management in [RFC5033].  An
   implementation may perform a congestion control by using a dynamic
   value of the window size (Window_Size), adapting the fragment size
   (Fragment_Size), and may reduce the load by inserting an inter-frame
   gap that is longer than necessary.  In a large network where nodes
   contend for the bandwidth, a larger Fragment_Size consumes less
   bandwidth but also reduces fluidity and incurs higher chances of loss
   in transmission.

   This is controlled by the following parameters:


   inter-frame gap:  Indicates the minimum amount of time between
      transmissions.  The inter-frame gap protects the propagation of
      one transmission before the next one is triggered and creates a
      duty cycle that controls the ratio of air time and memory in
      intermediate nodes that a particular datagram will use.  The
      inter-frame gap controls the rate at which fragments are sent and
      SHOULD be selected large enough to protect the network.


<snip>

   Window_Size:  The Window_Size MUST be at least 1 and less than 33.

      *  If the round-trip time is known, the Window_Size SHOULD be set
         to the round-trip time divided by the time per fragment, that
         is the time to transmit a fragment plus the inter-frame gap.

      Otherwise:

      *  Setting the window_size to 32 is to be understood as only the
         last Fragment is acknowledged in each round.  This is the
         RECOMMENDED value in a half-duplex LLN where the fragment
         acknowledgment consumes roughly the same bandwidth on the same
         links as the fragments themselves

      *  If it is set to a smaller value, more acks are generated.  In a
         full-duplex network, the load on the forward path will be
         lower, and a small value of 3 SHOULD be configured.
"

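The Window_Size rule in the proposed text can be read as the following sketch.
This is only one possible reading: the function name, the millisecond units,
the rounding down, and the half_duplex switch are my assumptions for
illustration and are not in the text above.

```python
MIN_WINDOW = 1
MAX_WINDOW = 32  # "at least 1 and less than 33"


def window_size(rtt_ms=None, frag_tx_ms=None, ifg_ms=None, half_duplex=True):
    """Pick a Window_Size following the proposed text (illustrative only)."""
    if rtt_ms is not None:
        # Time per fragment: time to transmit a fragment plus inter-frame gap.
        time_per_fragment_ms = frag_tx_ms + ifg_ms
        if time_per_fragment_ms <= 0:
            raise ValueError("time per fragment must be positive")
        # Rounding down is an assumption; the text does not say how to round.
        window = int(rtt_ms // time_per_fragment_ms)
        return max(MIN_WINDOW, min(MAX_WINDOW, window))
    # Round-trip time unknown: 32 in a half-duplex LLN (only the last
    # fragment of each round is acknowledged), 3 in a full-duplex network.
    return MAX_WINDOW if half_duplex else 3
```

For instance, a 200 ms round trip with 10 ms per fragment on the air and a
10 ms inter-frame gap gives a window of 10, and without an RTT estimate a
half-duplex LLN falls back to 32.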

