Dear IPPM members,

On re-reading the current responsiveness draft I stumbled over the following 
section:


Parallel vs Sequential Uplink and Downlink

Poor responsiveness can be caused by queues in either (or both) the upstream 
and the downstream direction. Furthermore, both paths may differ significantly 
due to access link conditions (e.g., 5G downstream and LTE upstream) or routing 
changes within the ISPs. To measure responsiveness under working conditions, 
the algorithm must explore both directions.

One approach could be to measure responsiveness in the uplink and downlink in 
parallel. It would allow for a shorter test run-time.

However, a number of caveats come with measuring in parallel:

        • Half-duplex links may not permit simultaneous uplink and downlink 
traffic. This restriction means the test might not reach the path's capacity in 
both directions at once and thus not expose all the potential sources of low 
responsiveness.
        • Debuggability of the results becomes harder: During parallel 
measurement it is impossible to differentiate whether the observed latency 
happens in the uplink or the downlink direction.

Thus, we recommend testing uplink and downlink sequentially. Parallel testing 
is considered a future extension.


I argue that this is not the correct diagnosis and hence not the correct 
decision.
For half-duplex links the given argument is not incorrect, but incomplete: 
when such a link is forced to multiplex more bi-directional traffic, it is 
quite likely to expose different "potential sources of low responsiveness" 
than either direction alone. (All TCP testing is bi-directional anyway, so we 
are only arguing about the amount of reverse traffic, not whether it exists; 
even if we switched to QUIC/UDP we would still need a feedback channel.) 
Ignoring either mode therefore seems ill-advised.
Debuggability is not "rocket science" either. All one needs is a three-value 
timestamp format (similar to what NTP uses): even without synchronized clocks 
one can establish baseline one-way delays (OWDs) and then, under 
bi-directional load, see which of these unloaded OWDs actually increases. So 
I argue that "it is impossible to differentiate whether the observed latency 
happens in the uplink or the downlink direction" is simply an incorrect 
assertion. (We are in fact already doing this successfully in the existing 
internet as part of the cake-autorate project 
[h++ps://github.com/lynxthecat/cake-autorate/tree/master], based on ICMP 
timestamps.) The relevant observation here is that we are not necessarily 
interested in veridical OWDs under idle conditions; we want to see which 
OWD(s) increase under working conditions, and that works with desynchronized 
clocks and is also robust against slow clock drift.
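
To make this concrete, here is a minimal sketch of that bookkeeping, assuming 
a three-timestamp probe. The names (pseudo_owds, direction_of_bloat) and all 
numbers are made up for illustration; this is not the cake-autorate code, 
just the same idea expressed in Python:

# A probe carries three timestamps (similar in spirit to NTP/ICMP timestamp):
#   t_send_local:  probe leaves the sender      (sender's clock)
#   t_recv_remote: probe arrives at the far end (remote clock)
#   t_send_remote: reply leaves the far end     (remote clock)
# The sender adds t_recv_local when the reply comes back (sender's clock).

def pseudo_owds(t_send_local, t_recv_remote, t_send_remote, t_recv_local):
    # Per-direction delays *including* the unknown clock offset:
    #   up   = true_uplink_owd   + offset
    #   down = true_downlink_owd - offset
    # The offset is unknown but near-constant, so it cancels when loaded
    # samples are compared against an idle baseline per direction.
    up = t_recv_remote - t_send_local
    down = t_recv_local - t_send_remote
    return up, down

def direction_of_bloat(up_loaded, down_loaded, up_base, down_base,
                       threshold_ms=15.0):
    # Report which direction's queueing delay grew under load.
    return {"uplink_bloated": up_loaded - up_base > threshold_ms,
            "downlink_bloated": down_loaded - down_base > threshold_ms}

# Toy numbers: the remote clock is ahead by an unknown 1000 ms, and the true
# idle OWD is 10 ms in each direction.
up_base, down_base = pseudo_owds(0.0, 1010.0, 1011.0, 21.0)    # 1010.0, -990.0
# Under bi-directional load the uplink queue adds ~40 ms:
up_load, down_load = pseudo_owds(100.0, 1150.0, 1151.0, 161.0) # 1050.0, -990.0
print(direction_of_bloat(up_load, down_load, up_base, down_base))
# -> {'uplink_bloated': True, 'downlink_bloated': False}

The individual values are meaningless in absolute terms, but the 
per-direction deltas isolate where the queue built up. In practice the 
baseline would be a minimum (or low percentile) over many idle probes rather 
than a single sample, and slow drift between the two clocks only shows up as 
a slow drift of the baselines, which can be tracked.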

Given these observations, I ask that we change this design parameter to 
require both measurement modes, defaulting to parallel testing (or to 
randomly select between the two modes, but report which one was chosen).

Best Regards
        Sebastian