Re: [6lo] Mirja Kühlewind's Discuss on draft-ietf-6lo-fragment-recovery-13: (with DISCUSS and COMMENT)

Mirja Kuehlewind Fri, 20 Mar 2020 06:49:26 -0700

Hi Pascal,

Okay! :-)


About the use of ECN, I agree as you say there should only be a few fragments 
and not increasing might be okay. However, you would need to clarify that the 
window is reset for each new datagram, I guess, right? Also I don’t think you 
necessarily need to reduce to 1 on CE marking but maybe halve the window or 
something. Or you leave this open like “If an E flag is received the window 
SHOULD be reduced, at least by 1 and at most to 1. Halving the window for each E 
flag received could be a good compromise but needs further experimentation.”…
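
To make the options concrete, here is a minimal sketch of the three reactions to an E flag mentioned above (reset to 1, halve, reduce by 1). The function name and the policy labels are hypothetical and not taken from the draft; the only constraint assumed is that the window never drops below 1.

```python
def reduce_window(window: int, policy: str = "halve") -> int:
    """Return the new window after one E flag; the result is never below 1.

    The policy names are illustrative only, they do not appear in the draft.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if policy == "reset":        # drop straight to 1
        return 1
    if policy == "halve":        # halve, rounding down
        return max(1, window // 2)
    if policy == "decrement":    # reduce by exactly 1
        return max(1, window - 1)
    raise ValueError(f"unknown policy: {policy}")
```

With a window that is reset for each new datagram, the sender would start again from the configured Window_Size and apply one of these reductions per E flag received.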

I wonder if it would be good to say a bit more about the recommended values for 
the window size, as I think 32 will usually not limit transmission in most 
networks (the limiting value will be the IFG), while a size of 3 is very 
conservative so as not to overload the network (and will be slower than the 
limits induced by the IFG). Is my understanding correct?
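
A rough way to check this intuition, under a deliberately simple model that is an assumption of this sketch and not something the draft states: the window allows at most window/RTT fragments per second, the IFG allows at most 1/(transmission time + IFG) fragments per second, and the smaller of the two is the effective limit. The function name and parameters are hypothetical.

```python
def limiting_factor(window: int, rtt_s: float, tx_time_s: float, ifg_s: float) -> str:
    """Say which limit caps the fragment rate in this simple model."""
    if window < 1 or rtt_s <= 0 or tx_time_s + ifg_s <= 0:
        raise ValueError("window must be >= 1 and all durations positive")
    window_rate = window / rtt_s             # fragments/s allowed by the window
    ifg_rate = 1.0 / (tx_time_s + ifg_s)     # fragments/s allowed by the IFG
    return "window" if window_rate < ifg_rate else "ifg"
```

For example, with an assumed RTT of 1 s, a 10 ms transmission time and a 90 ms IFG, a window of 32 leaves the IFG as the limit, while a window of 3 makes the window the limit.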

Mirja



> On 19. Mar 2020, at 20:12, Pascal Thubert (pthubert) <[email protected]> 
> wrote:
> 
> Hello Mirja
> 
> 
> 
>> 
>>> 
>>> 
>>> Please see below the discussion:
>>> 
>>>>>> ----------------------------------------------------------------------
>>>>>> DISCUSS:
>>>>>> -----------------------------------------
>> 
>>> 
>>> Note that all classical wireless interfaces perform ARQ. On one hop, you 
>>> get the same effect of multiplying the traffic in the air vs. what the 
>>> transport see. The factor can be high, e.g., 64. On a mesh, we get the 
>>> additional effect of multiple flows converging on a same node.
>> 
>> Yes but with only “one hop”/the network you are connected to directly, and 
>> there is usually also some kind of back-off mechanism that reacts to 
>> congestion/collision/contention on that layer.
>> 
>> 
>>> 
>>>> 
>>>> To be clear the request of this discuss is to give a normative 
>>>> recommendation
>>>> for the default value of the window size and/or inter-frame gap.
>>> 
>>> Yes, and since there is no great expectation that ECN will be implemented, 
>>> that must be reasonable.
>>> Also we want to agree on the proposed mechanism to drop the window to 1 in 
>>> case of congestion notification, or is that behind us already?
>> 
>> Dropping to 1 on CE mark is fine. However, when do you increase the window 
>> again? If you want to say something here, you have to specify that as well.
> 
> 
> If we keep things really simple it would not. Note that this applies to a 
> single datagram and that’s usually just a few fragments.
> 
> We could double at each round trip but by the time it takes effect the 
> datagram will be done...
> 
>> 
>>> 
>>> 
>>>> 
>>>> Further note, as you allow to adapt both the window and the inter-frame gap
>>>> dynamically, you actually implement two control mechanisms here. I actually
>>>> recommend to only use the inter-frame gap and not have a window here. You
>>>> say a couple of times in your reply below that the window determines the
>>>> ack rate; however, that is not clear to me because the ack rate should be a
>>>> parameter at the receiver and not at the sender (maybe I don’t remember
>>>> things correctly because it’s been a while since I read the draft and I
>>>> couldn’t find anything about this in the draft quickly). If the window size
>>>> does define the ack rate, then maybe you should rename that parameter
>>>> accordingly.
>>> 
>>> The ack is not for flow control (as we do not have it) but in support of 
>>> ARQ. The possibility to use it for congestion control is a desirable side 
>>> effect. The fragmenting endpoint FSM may want to wait and see how things 
>>> went for the fragments that it already sent. E.g., there's the case where 
>>> the fragmenting endpoint would use an ack on the first fragment for  a 
>>> number of reasons such as check that a path is available, that the MTU is 
>>> OK or assess an initial RTT. It may maximize the number of fragments in 
>>> flight for congestion control. But whether to do any of that is left to 
>>> implementation (so far).
>>> 
>>> 
>>>> 
>>>> However, if there is really a need for a window, I still recommend to talk
>>>> less about adapting this value dynamically and make clear that having a
>>>> fixed value is the recommended default. Therefore I recommend to remove the
>>>> parameters MinWindowSize and MaxWindowSize because these should actually not
>>>> be parameters that can be configured but are actual bounds. If someone
>>>> decides to implement dynamic window adaptation, they can decide to
>>>> re-introduce these parameters and make them configurable, but it doesn’t
>>>> need to be part of this spec.
>>> 
>>> I can see that, yes. I still like the idea to drop to 1 in case of ECN. 
>>> Do you suggest to remove that as well?
>>> If so, should we augment the inter-frame gap in case of ECN? 
>>> That may be better though not simpler to specify or implement.
>> 
>> That’s an option as well. Again when you reduce something you might as well 
>> need to specify when to increase it again and that means you are specifying 
>> basically a congestion control scheme.
> 
> My goal for this doc was to keep it dead simple, build it so we have the 
> necessary basis with ECN and windowing, and then play with it and learn from 
> it. Only then can we do a valuable spec.
> 
> Can we keep it at what the doc has now?
>> 
>>> 
>>> 
>>> 
>>>> 
>>>> So it could be something like:
>>>> 
>>>> "Window_Size:  Window_Size MUST be at least 1 and less than 33. If the
>>>> inter-frame gap is selected large enough to not overload the path and the
>>>> one-way
>>> 
>>> I see the IFG as more efficient for flow control than for congestion 
>>> control: Increasing IFG slows down the packets but as long as the result is 
>>> faster than the bottleneck, it does not help much does it? So I'm still  
>>> unsatisfied on how to characterize an IFG that does not overload a path. 
>>> But I'm not sure we can do better. I moved that piece to the IFG definition 
>>> if that's OK?
>> 
>> So how do you currently set the IFG? Both IFG and window_size can be used 
>> for both flow as well as congestion control, it only depends who generated 
>> the signal that is used to adapt the value, either the endpoint/receiver or 
>> the network/nodes on the path.
>> 
> 
> This is specified in minimal fragments. The IFG ensures that the previous 
> fragment is beyond interference range. In a single frequency mesh that is 
> multiple hops away.
> 
>> Using a window would be a window-based congestion control. Using the IFG 
>> would be a rate-based congestion control. But the principle is the same.
>> 
>> 
> I’d love to chat about that at the next IETF to get your view. On paper, the 
> rate-based approach does not guarantee the amount of bytes that this node 
> will pack at the bottleneck. What I found implementing it years ago was that 
> it is sensitive to when and how the congested node sets ECN. I ended up 
> adjusting only once per RTT...
> 
>>> 
>>>> delay is known, the Window_Size SHOULD be set to the one-way trip time
>>>> divided by the inter-frame gap.  Otherwise a small value of X SHOULD be
>>>> configured. Note that the Window_Size determines the ack rate. If the
>>>> Window_Size is set to 32, this means only the last Fragment is
>>>> acknowledged in the first round. If it is set to a smaller value, more
>>>> acks are generated but the load on the forward path will be lower.
>>>> Window_Size MAY be adapted dynamically to reduce load on the forward path
>>>> in case of congestion.”
>>> 
>>> The assertion that the load on the forwarding path will be lower is usually 
>>> incorrect in a typical  LLN, since the radios are half duplex. In the 
>>> example of 6TiSCH, an rfrag_ack consumes the exact same bandwidth as a 
>>> fragment (one time slot). Also the path of the rfrag_ack is the reverse 
>>> LSP, so it goes across the exact same links.
>> 
>> Okay. Yes makes sense.
>>> 
>>> The last sentence is already present above in the text above it, all quoted 
>>> below, so I'm also trying to avoid duplication.
>>> 
>>>> Still you also need to say more about how to set and dynamically adapt the
>>>> inter-frame gap because that is probably the real limiting factor to avoid
>>>> network overload.
>>>> 
>>> 
>>> Yes, I see that tuning IFG impacts the rate and can help alleviate the 
>>> congestion once you pass below the rate the bottleneck can give you. I've 
>>> done some adaptive CIR long ago in IBM FR switches and it can be made to 
>>> work, and though it was a lot of fun, it's not any easier than window-based 
>>> flow control. And it really depends on the relay doing the right thing, 
>>> e.g., reacting quickly on growing queue latency and applying fair sharing.
>>> 
>>>> Also below you remove the recommendation for using the number of hops as
>>>> window size but here you added it again. This is just incorrect. There is
>>>> no dependency between the number of hops and the window size: If there is
>>>> no bottleneck on the path, you can just send with line rate at the sender.
>>> 
>>> The rationale was: If there is no bottleneck on the path and the window is 
>>> less than the number of hops then the sender will be blocked and the 
>>> maximum throughput cannot be achieved. If the rfrag_ack is as slow as the 
>>> frag, which is reasonable in an LLN, we're talking about a window of twice 
>>> the number of hops to keep the fragments going.
>> 
>> I see that the idea was to get the frames flowing (rather than avoiding 
>> overload) under the assumption that there is no bottleneck on the path. 
>> However, in this case you don’t really need the window at all and using the 
>> IFG should be enough.
> 
> 
> Yep it is actually more constrained since IFG usually covers transmission 
> over multiple hops. You found that I removed hop count throughout this time 
> (hopefully) and followed your recommendation.
> 
>> 
>>> 
>>> I saw the number of hops as a starting point, but I'm (really) happy to 
>>> stick to RTT/IFG which makes more sense considering the focus that you seem 
>>> to recommend placing on IFG (and I agree with that too).
>>> 
>>>> If there is a bottleneck on the path and you send at a higher rate than the
>>>> bottleneck, then sooner or later the buffer at that hop will fill up
>>>> completely. So the window size depends only on the bottleneck rate and
>>>> end-to-end delay (BDP) (which lets you calculate the number of packets in
>>>> flight) plus the buffer size at the bottleneck. The number of hops is
>>>> irrelevant.
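
As a rough illustration of the BDP reasoning quoted above, the following sketch computes the largest window that a path can absorb as the bandwidth-delay product plus the bottleneck buffer, both counted in fragments. The function name, the parameters and the choice to round the BDP down are assumptions of this sketch; nothing here comes from the draft.

```python
import math


def bdp_window(bottleneck_fragments_per_s: float, rtt_s: float,
               buffer_fragments: int) -> int:
    """Fragments that fit in flight (BDP) plus those the bottleneck can queue."""
    if bottleneck_fragments_per_s <= 0 or rtt_s <= 0 or buffer_fragments < 0:
        raise ValueError("rate and RTT must be positive, buffer non-negative")
    in_flight = math.floor(bottleneck_fragments_per_s * rtt_s)  # BDP in fragments
    return in_flight + buffer_fragments
```

Note that the hop count does not appear anywhere in the computation: only the bottleneck rate, the end-to-end delay and the buffer size do.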
>>> 
>>> Yes, I understand that model. It was easier to apply some 25 years ago.
>>> So far the links in a LLN are usually the same and the PHYs are usually the 
>>> same so it's still usable.
>>> But that is bound to change rapidly as even LLN radios are going to be 
>>> agile WRT PHY rate. Meaning that the rate at the bottleneck will become 
>>> hard to fathom and will change (rapidly) over time (same as Wi-Fi).
>>> 
>>> All in all we'd get:
>>> 
>>> "
>>> An implementation must control the rate at which it sends packets
>>> over the same path to allow the next hop to forward a packet before
>> 
>> What does “same" relate to here?
> 
> Same next hop (in TSCH) and possibly multiple hops but usually it does not 
> know (requires link state which we usually do not have)
> 
> I can change "over the same path" to "via the same next hop" to keep it simple?
>> 
>> 
>>> it gets the next.  In a wireless network that uses the same frequency
>>> along a path, more time must be inserted to avoid hidden terminal
>>> issues between fragments (more in Section 4.2).  An implementation
>>> should consider the generic recommendations from the IETF in the
>>> matter of congestion control and rate management in [RFC5033].  An
>> 
>> Maybe RFC8085 is the better reference?
> 
> Happy to change 
> 
>> 
>>> implementation may perform a congestion control by using a dynamic
>>> value of the window size (Window_Size), adapting the fragment size
>>> (Fragment_Size), and may reduce the load by inserting an inter-frame
>>> gap that is longer than necessary.
>> 
>> This is the part that I don’t fully understand. Why do you need three 
>> different ways to enable congestion control instead of just having one?
> 
> To enable experimenting. The text above is true: all this is possible. But 
> the only thing that we mandate is resetting the window.
> 
> 
>> 
>> You already have the IFG. What's the benefit of having a window-based rate 
>> limit in addition?
>> 
> 
> Tuning IFG is complex. An implementation may prefer window-based over 
> rate-based control. Once we have experience we’ll build the necessary specs.
> 
> 
>>> In a large network where nodes
>>> contend for the bandwidth, a larger Fragment_Size consumes less
>>> bandwidth but also reduces fluidity and incurs higher chances of loss
>>> in transmission.
>>> 
>>> This is controlled by the following parameters:
>>> 
>>> 
>>> inter-frame gap:  Indicates the minimum amount of time between
>>>    transmissions.  The inter-frame gap protects the propagation of
>>>    one transmission before the next one is triggered and creates a
>>>    duty cycle that controls the ratio of air time and memory in
>>>    intermediate nodes that a particular datagram will use.  The
>>>    inter-frame gap controls the rate at which fragments are sent and
>>>    SHOULD be selected large enough to protect the network.
>> 
>> I think you need to provide some (normative) recommendation for the default 
>> configuration of the IFG. If that is specified in 
>> draft-ietf-6lo-minimal-fragment, a pointer would be good, along with more 
>> explanation.
>> 
> 
> It is, and I will. There’s already such a reference but I can probably do 
> better.
> 
> 
> 
>>> 
>>> 
>>> <snip>
>>> 
>>> Window_Size:  The Window_Size MUST be at least 1 and less than 33.
>>> 
>>>    *  If the round-trip time is known, the Window_Size SHOULD be set
>>>       to the round-trip time divided by the time per fragment, that
>>>       is the time to transmit a fragment plus the inter-frame gap.
>>> 
>>>    Otherwise:
>>> 
>>>    *  Setting the window_size to 32 is to be understood as only the
>>>       last Fragment is acknowledged in each round.  This is the
>>>       RECOMMENDED value in a half-duplex LLN where the fragment
>>>       acknowledgment consumes roughly the same bandwidth on the same
>>>       links as the fragments themselves
>>> 
>>>    *  If it is set to a smaller value, more acks are generated.  In a
>>>       full-duplex network, the load on the forward path will be
>>>       lower, and a small value of 3 SHOULD be configured.
>>> "
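
The Window_Size rules proposed above could be sketched as follows. This is only an illustration: the function name and parameters are hypothetical, and rounding the quotient down and clamping it to the 1..32 range are assumptions of this sketch, since the quoted text does not say how to round.

```python
from typing import Optional


def window_size(rtt_s: Optional[float] = None,
                tx_time_s: Optional[float] = None,
                ifg_s: Optional[float] = None,
                half_duplex: bool = True) -> int:
    """Pick a Window_Size following the proposed text (rounding is assumed)."""
    if rtt_s is not None and tx_time_s is not None and ifg_s is not None:
        time_per_fragment = tx_time_s + ifg_s   # transmit time plus the IFG
        if rtt_s <= 0 or time_per_fragment <= 0:
            raise ValueError("durations must be positive")
        # Round down, then keep the value at least 1 and less than 33.
        return max(1, min(32, int(rtt_s // time_per_fragment)))
    # RTT unknown: 32 in a half-duplex LLN, a small value of 3 otherwise.
    return 32 if half_duplex else 3
```

For instance, an assumed RTT of 2 s with 0.25 s of transmission time and a 0.25 s IFG gives a window of 4, while an unknown RTT falls back to 32 or 3 depending on the duplex mode.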
>>> 
>> 
>> If having the window is still useful, this is fine I think.
>> 
> 
> I don’t know but we’ll learn. Please let me know if we agree on the above and 
> I’ll update tomorrow (it’s family time now in CET).
> 
> Take care,
> 
> Pascal 
> 
>> Mirja
>> 
>> 
>> 
>>> 
>> 
