Hello Christoph, thanks for your detailed response!
> On Jul 6, 2021, at 20:54, Christoph Paasch <[email protected]> wrote:
> 
> Hello Sebastian,
> 
> On 06/29/21 - 09:58, Sebastian Moeller wrote:
>> Hi Christoph,
>> 
>> one question below:
>> 
>>> On Jun 18, 2021, at 01:43, Christoph Paasch via Bloat
>>> <[email protected]> wrote:
>>> 
>>> Hello,
>>> 
>>> On 06/17/21 - 11:16, Matt Mathis via Bloat wrote:
>>>> Is there a paper or spec for RPM?
>>> 
>>> we try to publish an IETF draft on the methodology before the
>>> upcoming IETF in July.
>>> 
>>> But, in the meantime, please see inline:
>>> 
>>>> There are at least two different ways to define RPM, both of which
>>>> might be relevant.
>>>> 
>>>> At the TCP layer: it can be directly computed from a packet
>>>> capture. The trick is to time-reverse a trace and compute the
>>>> critical path backwards through the trace: what event triggered
>>>> each segment or ACK, and count round trips. This would be super
>>>> robust but does not include the queueing required in the kernel
>>>> socket buffers. I need to think some more about computing TCP RPM
>>>> from tcp_info or other kernel instrumentation - it might be
>>>> possible.
>>> 
>>> We explicitly opted against measuring purely TCP-level round-trip
>>> times, because there are countless transparent TCP proxies out
>>> there that would skew these numbers. Our goal with
>>> RPM/Responsiveness is to measure how an end-user would experience
>>> the network, which means DNS resolution, TCP handshake time, TLS
>>> handshake, and HTTP/2 request/response. Because, in the end, that's
>>> what actually matters to the users.
>>> 
>>>> A different RPM can be measured in the application, above TCP, for
>>>> example by ping-ponging messages. This would include the delays
>>>> traversing the kernel socket buffers, which have to be at least as
>>>> large as a full network RTT.
>>>> 
>>>> This is perhaps an important point: due to the retransmit and
>>>> reassembly queues (which are required to implement robust data
>>>> delivery), TCP must be able to hold at least a full RTT of data in
>>>> its own buffers, which means that under some conditions the RTT as
>>>> seen by the application has to be at least twice the network's
>>>> RTT, including any bloat in the network.
>>> 
>>> Currently, we measure RPM on separate connections (not the
>>> load-bearing ones). We are also measuring on the load-bearing
>>> connections themselves through H2 Ping frames, but for the reasons
>>> you described we haven't yet factored that into the RPM number.
>>> 
>>> One way may be to inspect with TCP_INFO whether or not the
>>> connections had retransmissions and then throw away the number. On
>>> the other hand, if the network becomes extremely lossy under
>>> working conditions, it does impact the user experience, so it could
>>> make sense to take this into account.
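
[SM] Tangent: should you go down the TCP_INFO route, on Linux the
retransmission counter is easy to get at from userspace. A rough,
untested sketch in Python (Linux-only; it assumes the classic 104-byte
struct tcp_info layout from <linux/tcp.h>, whose trailing u32 is
tcpi_total_retrans - newer kernels only append fields after it):

    import socket
    import struct

    def had_retransmissions(sock: socket.socket) -> bool:
        # Ask the kernel for the first 104 bytes of struct tcp_info
        # (the classic, pre-extension layout).
        info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
        # tcpi_total_retrans is the last u32 of that layout (offset 100).
        (total_retrans,) = struct.unpack_from("I", info, 100)
        return total_retrans > 0

A measurement tool could call something like this on each load-bearing
connection at the end of the test and discard (or flag) samples from
connections that saw retransmissions.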
>>> In the end, we realized how hard it is to accurately measure
>>> bufferbloat within a reasonable time-frame (our goal is to finish
>>> the test within ~15 seconds).
>> 
>> [SM] I understand that 10-15 seconds is the amount of time users
>> have been trained to expect an on-line speedtest to take, but
>> experiments with flent/RRUL showed that there are latency-affecting
>> processes on slower timescales that are better visible if one can
>> also run a test for 60-300 seconds (e.g. cyclic WiFi channel
>> probing). Does your tool optionally allow specifying a longer
>> run-time?
> 
> Currently the tool does not have a "deep-dive" mode. There are a few
> things (besides running longer) that a "deep-dive" mode could
> provide. For example, traceroute-style probes during the test to
> identify the location of the bufferbloat.

[SM] Oh, shiny ;) To be useful/interpretable, such a traceroute-style
path traversal should be performed from both sides of a link (I am
sure you know, but my go-to slide deck is
https://archive.nanog.org/sites/default/files/10_Roisman_Traceroute.pdf).
But it would be sweet if there were a reliable way to get
bi-directional traceroutes over the path one actually uses.

> Use H3 for testing and/or run TCP on a different port to identify
> traffic-classifiers/transparent TCP-proxies that treat things
> differently. Study the impact of TCP bulk transfer on UDP latency.
> And so on...
> Such a deep-dive mode would be possible in the command-line tool but
> very unlikely in the UI-mode.

[SM] Fair enough, thanks.

> Our primary goal in this first iteration is to provide a tool that
> gives a quick insight into how bad/good the bufferbloat is on the
> network in such a way that a non-expert user can run it and
> understand the result.

[SM] Worthy goal.

> We also want it to be using standard protocols, so that any basic
> web-server can be configured to serve as an endpoint to it, and
> because those are the protocols that the users are actually using in
> the end.

[SM] +1; yes, tests with the production protocols, ideally against the
"production" servers, seem like a great way forward.

Regards
	Sebastian

> Cheers,
> Christoph
> 
>> Thinking of it, to keep everybody on their toes, how about
>> occasionally running a test with a longer run-time (maybe after
>> asking the user's consent) and storing the test duration as part of
>> the results?
>> 
>> Best Regards
>> 	Sebastian
>> 
>>> We hope that with the IETF-draft we can get the right people
>>> together to iterate over it and squash out a very accurate
>>> measurement that represents what users would experience.
>>> 
>>> Cheers,
>>> Christoph
>>> 
>>>> Thanks,
>>>> --MM--
>>>> The best way to predict the future is to create it. - Alan Kay
>>>> 
>>>> We must not tolerate intolerance; however our response must be
>>>> carefully measured: too strong would be hypocritical and risks
>>>> spiraling out of control; too weak risks being mistaken for tacit
>>>> approval.
>>>> 
>>>> On Sat, Jun 12, 2021 at 9:11 AM Rich Brown <[email protected]>
>>>> wrote:
>>>> 
>>>>>> On Jun 12, 2021, at 12:00 PM, [email protected] wrote:
>>>>>> 
>>>>>> Some relevant talks / publicity at WWDC -- the first mentioning
>>>>>> CoDel, queueing, etc. Featuring Stuart Cheshire. iOS 15 adds a
>>>>>> developer test for loaded latency, reported in "RPM" or
>>>>>> round-trips per minute.
>>>>>> 
>>>>>> I ran it on my machine:
>>>>>> nowens@mac1015 ~ % /usr/bin/networkQuality
>>>>>> ==== SUMMARY ====
>>>>>> Upload capacity: 90.867 Mbps
>>>>>> Download capacity: 93.616 Mbps
>>>>>> Upload flows: 16
>>>>>> Download flows: 20
>>>>>> Responsiveness: Medium (840 RPM)
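
[SM] Side note for readers new to the unit: RPM is simply the
working-condition round-trip time recast as a rate per minute, so the
figure above corresponds to roughly 60 s / 840 ≈ 71 ms per round trip
under load.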
>>>>> Does anyone know how to get the command-line version for current
>>>>> (not upcoming) macOS? Thanks.
>>>>> 
>>>>> Rich

_______________________________________________
Bloat mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/bloat
