Thanks for the explanation. Right now, our criteria for phone/audio are "latency < 300 ms" and "jitter < 40 ms".
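For reference, a percentile-based criterion is straightforward to compute from the raw latency probes. A minimal sketch (nearest-rank method; the function names, threshold, and sample data are made up for illustration, not from our tool):

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ranked = sorted(samples)
    rank = -(-pct * len(ranked) // 100)  # ceil(pct * n / 100)
    return ranked[max(rank - 1, 0)]

def meets_phone_audio_criterion(latencies_ms, threshold_ms=300, pct=95):
    """Single criterion: 95th-percentile latency below the threshold."""
    return percentile(latencies_ms, pct) < threshold_ms
```

Note that with only a handful of probes per phase, the 95th percentile is frequently just the maximum sample, which is exactly the concern about small sample counts raised below.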
It seems like something along the lines of "95th percentile latency < 300 ms"
might be advisable in place of the two existing criteria?

Sina.

On Wed, Feb 24, 2021 at 11:15 PM Simon Barber <[email protected]> wrote:
>
> On February 24, 2021 9:57:13 PM Sina Khanifar <[email protected]> wrote:
>
>> Thanks for the feedback, Dave!
>>
>>> 0) "Average" jitter is a meaningless number. In the case of a
>>> videoconferencing application, what matters most is max jitter, where
>>> the app will choose to ride the top edge of that, rather than follow
>>> it. I'd prefer using a 98% number, rather than a 75% number, to weight
>>> where the typical delay in a videoconference might end up.
>>
>> Both DSLReports and Ookla's desktop app report jitter as an average
>> rather than as a max number, so I'm a little hesitant to go against
>> the norm - users might find it a bit surprising to see much larger
>> jitter numbers reported. We're also not taking a whole ton of latency
>> tests in each phase, so the 98% will often end up being the max
>> number.
>>
>> With regard to videoconferencing, we actually ran some real-world
>> tests of Zoom with various levels of bufferbloat/jitter/latency, and
>> calibrated our "real-world results" table on that basis. We used
>> average jitter in those tests ... I think if we used 98% or even 95%
>> the allowable number would be quite high.
>
> Video and audio cannot be played out until the packets have arrived, so
> late packets are effectively dropped, or the playback buffer must expand
> to accommodate the latest-arriving packets. If the playback buffer
> expands to accommodate them, then the whole conversation is delayed by
> that amount. More than a fraction of a percent of dropped packets
> results in a very poor video or audio experience; this is why average
> jitter is irrelevant, and peak or maximum latency is the correct
> measure to use.
>
> Yes, humans can tolerate quite a bit of delay.
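Simon's argument can be made concrete with a toy playout-buffer model (purely illustrative; the functions and numbers are assumptions, not from any real test). The playback delay a receiver must add is set by the worst packet delay it chooses to accommodate, so the tail of the delay distribution, not the average, determines both the conversational delay and the drop rate:

```python
def required_playout_delay(one_way_delays_ms, accept_fraction=0.999):
    """Extra buffering (ms) needed so that accept_fraction of packets
    arrive in time, measured relative to the fastest packet."""
    ranked = sorted(one_way_delays_ms)
    # The worst delay we choose to accommodate sets the buffer depth.
    cutoff = ranked[min(len(ranked) - 1, int(accept_fraction * len(ranked)))]
    return cutoff - ranked[0]

def dropped_fraction(one_way_delays_ms, playout_delay_ms):
    """Fraction of packets arriving too late to be played out."""
    base = min(one_way_delays_ms)
    late = sum(1 for d in one_way_delays_ms if d - base > playout_delay_ms)
    return late / len(one_way_delays_ms)
```

For example, a stream that is steady at 50 ms except for 1% of packets at 150 ms has a tiny average jitter, yet the receiver must either buffer an extra 100 ms or drop that 1% of packets.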
> The conversation is significantly less fluid, though.
>
> Simon
>
>>> 1) The worst case scenario of bloat affecting a user's experience is
>>> during a simultaneous up- and download, and I'd rather you did that
>>> than test them separately. You also get a more realistic figure for
>>> the actual achievable bandwidth under contention, and can expose
>>> problems like strict priority queuing in one direction or the other
>>> locking out further flows.
>>
>> We did consider this based on another user's feedback, but didn't
>> implement it. Perhaps we can do this next time we revisit, though!
>>
>>> This points to any number of problems (features!). It's certainly my
>>> hope that all the CDN makers at this point have installed bufferbloat
>>> mitigations. Testing a CDN's TCP IS a great idea, but as a
>>> bufferbloat test, maybe not so much.
>>
>> We chose to use a CDN because it seemed like the only feasible way to
>> saturate gigabit links at least somewhat consistently for a meaningful
>> part of the globe, without setting up a whole lot of servers at quite
>> high cost.
>>
>> But we weren't aware that bufferbloat could be abated from the CDN's
>> end. This is a bit surprising to me, as our test results indicate that
>> bufferbloat is regularly an issue even though we're using a CDN for
>> the speed and latency tests. For example, these are the results on my
>> own connection here (Cox, in Southern California), showing meaningful
>> bufferbloat:
>>
>> https://www.waveform.com/tools/bufferbloat?test-id=ece467bd-e07a-45ea-9db6-e64d8da2c1d2
>>
>> I get even larger bufferbloat effects when running the test on a 4G
>> LTE network:
>>
>> https://www.waveform.com/tools/bufferbloat?test-id=e99ae561-88e0-4e1e-bafd-90fe1de298ac
>>
>> If the CDN were abating bufferbloat, surely I wouldn't see results
>> like these?
>>
>>> 3) Are you tracking any ECN statistics at this point (ecnseen)?
>>
>> We are not, no.
>> I'd definitely be curious to see if we can add this in
>> the future, though!
>>
>> Best,
>>
>> On Wed, Feb 24, 2021 at 2:10 PM Dave Taht <[email protected]> wrote:
>>>
>>> So I've taken a tiny amount of time to run a few tests. For starters,
>>> thank you very much for the dedication and time put into creating
>>> such a usable website and FAQ.
>>>
>>> I have several issues, though I really haven't had time to delve deep
>>> into the packet captures. (Others, please try taking 'em, and put
>>> them somewhere?)
>>>
>>> 0) "Average" jitter is a meaningless number. In the case of a
>>> videoconferencing application, what matters most is max jitter, where
>>> the app will choose to ride the top edge of that, rather than follow
>>> it. I'd prefer using a 98% number, rather than a 75% number, to
>>> weight where the typical delay in a videoconference might end up.
>>>
>>> 1) The worst case scenario of bloat affecting a user's experience is
>>> during a simultaneous up- and download, and I'd rather you did that
>>> than test them separately. You also get a more realistic figure for
>>> the actual achievable bandwidth under contention, and can expose
>>> problems like strict priority queuing in one direction or the other
>>> locking out further flows.
>>>
>>> 2) I get absurdly great results from it, with or without SQM on, on a
>>> reasonably modern cable modem (buffercontrol and PIE, and a CMTS
>>> doing the right things).
>>>
>>> This points to any number of problems (features!). It's certainly my
>>> hope that all the CDN makers at this point have installed bufferbloat
>>> mitigations. Testing a CDN's TCP IS a great idea, but as a
>>> bufferbloat test, maybe not so much.
>>>
>>> The packet capture of the TCP flows DOES show about 60 ms jitter...
>>> but no loss. Your test shows:
>>>
>>> https://www.waveform.com/tools/bufferbloat?test-id=6fc7dd95-8bfa-4b76-b141-ed423b6580a9
>>>
>>> And is very jittery in the beginning of the test on its estimates.
>>> I really should be overjoyed at knowing a CDN is doing more of the
>>> right things, but in terms of a test... and Linux has also got a ton
>>> of mitigations on the client side.
>>>
>>> 3) As a side note, ECN actually is negotiated on the upload, if it's
>>> enabled on your system. Are you tracking any ECN statistics at this
>>> point (ecnseen)? It is not negotiated on the download (which is fine
>>> by me).
>>>
>>> I regrettably, at this precise moment, am unable to test a native
>>> cable modem at the same speed as an SQM box; I hope to get further on
>>> this tomorrow.
>>>
>>> Again, GREAT work so far, and I do think a test tool for all these
>>> CDNs (heck, one that tested all of them at the same time) is very,
>>> very useful.
>>>
>>> On Wed, Feb 24, 2021 at 10:22 AM Sina Khanifar <[email protected]> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> A couple of months ago my co-founder Sam posted an early beta of the
>>>> bufferbloat test that we’ve been working on, and Dave also linked to
>>>> it a couple of weeks ago.
>>>>
>>>> Thank you all so much for your feedback - we almost entirely
>>>> redesigned the tool and the UI based on the comments we received.
>>>> We’re almost ready to launch the tool officially today at this URL,
>>>> but wanted to show it to the list in case anyone finds any last bugs
>>>> that we might have overlooked:
>>>>
>>>> https://www.waveform.com/tools/bufferbloat
>>>>
>>>> If you find a bug, please share the "Share Your Results" link with
>>>> us, along with what happened. We capture some debugging information
>>>> on the backend, and having a share link allows us to diagnose any
>>>> issues.
>>>>
>>>> This is really more of a passion project than anything else for us –
>>>> we don’t anticipate we’ll try to commercialize it or anything like
>>>> that. We're very thankful for all the work the folks on this list
>>>> have done to identify and fix bufferbloat, and hope this is a useful
>>>> contribution.
>>>> I’ve personally been very frustrated by bufferbloat on a range of
>>>> devices, and decided it might be helpful to build another
>>>> bufferbloat test when the DSLReports test was down at some point
>>>> last year.
>>>>
>>>> Our goals with this project were:
>>>> * To build a second solid bufferbloat test in case DSLReports goes
>>>> down again.
>>>> * To build a test where bufferbloat is front and center as the
>>>> primary purpose of the test, rather than just a feature.
>>>> * To try to explain bufferbloat and its effect on a user's
>>>> connection as clearly as possible for a lay audience.
>>>>
>>>> A few notes:
>>>> * On the backend, we’re using Cloudflare’s CDN to perform the actual
>>>> download and upload speed tests. I know John Graham-Cumming has
>>>> posted to this list in the past; if he or anyone from Cloudflare
>>>> sees this, we’d love some help. Our Cloudflare Workers are being
>>>> bandwidth-throttled due to having a non-enterprise-grade account.
>>>> We’ve worked around this in a kludgy way, but we’d love to get it
>>>> resolved.
>>>> * We have lots of ideas for improvements, e.g. simultaneous
>>>> uploads/downloads, trying different file-size chunks, time-series
>>>> latency graphs, using WebRTC to test UDP traffic, etc., but in the
>>>> interest of getting things launched we're sticking with the current
>>>> feature set.
>>>> * There are a lot of browser-specific workarounds that we had to
>>>> implement, and latency itself is measured in different ways on
>>>> Safari/WebKit vs. Chromium/Firefox due to limitations of the
>>>> PerformanceTiming APIs. You may notice that latency differs across
>>>> browsers; however, the actual bufferbloat (the relative increase in
>>>> latency) should be pretty consistent.
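The relative-latency point in that last note can be sketched numerically (a hypothetical illustration, not the tool's actual code): because bufferbloat is reported as the increase of loaded over unloaded latency, a constant per-browser measurement offset shifts both readings equally and cancels out of the result.

```python
from statistics import median

def bufferbloat_ms(unloaded_samples, loaded_samples):
    """Median latency increase under load (ms). A constant per-browser
    measurement offset raises both medians equally, so it cancels."""
    return median(loaded_samples) - median(unloaded_samples)

# A hypothetical 25 ms Safari measurement overhead leaves the delta
# unchanged, even though both absolute readings look higher.
```

So a browser that reports every latency sample 25 ms high still reports the same bufferbloat figure as one with no offset.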
>>>> In terms of some of the changes we made based on the feedback we
>>>> received on this list:
>>>>
>>>> Based on Toke’s feedback:
>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015960.html
>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015976.html
>>>> * We changed the way the speed tests run to show an instantaneous
>>>> speed as the test is being run.
>>>> * We moved the bufferbloat grade into the main results box.
>>>> * We tried really hard to get as close to saturating gigabit
>>>> connections as possible. We completely redesigned the way we chunk
>>>> files, added a “warming up” period, and spent quite a bit of time
>>>> optimizing our code to minimize CPU usage, as we found that was
>>>> often the limiting factor in our speed test results.
>>>> * We changed the shield grades altogether and went through a few
>>>> different iterations of how to show the effect of bufferbloat on
>>>> connectivity, and ended up with a “table view” to try to show the
>>>> effect that bufferbloat specifically is having on the connection
>>>> (compared to when the connection is unloaded).
>>>> * We now link from the results table view to the FAQ, where the
>>>> conditions for each type of connection are explained.
>>>> * We also changed the way we measure latency and now use the faster
>>>> of either Google’s CDN or Cloudflare at any given location. We’re
>>>> also using the WebTiming APIs to get a more accurate latency number,
>>>> though this does not work on some mobile browsers (e.g. iOS Safari),
>>>> and as a result we show a higher latency on mobile devices. Since
>>>> our test is less a test of absolute latency and more a test of
>>>> relative latency with and without load, we felt this was workable.
>>>> * Our jitter is now an average (it was previously RMS).
>>>> * The “before you start” text was rewritten and moved above the
>>>> start button.
>>>> * We now spell out upload and download instead of having arrows.
>>>> * We hugely reduced the number of cross-site scripts. I was a bit
>>>> embarrassed by this, if I’m honest - I spent a long time building
>>>> web tools for the EFF, where we almost never allowed any cross-site
>>>> scripts. Our site is hosted on Shopify, and adding any features via
>>>> their app store ends up adding a whole lot of gunk. But we
>>>> uninstalled some apps, rewrote our template, and ended up removing a
>>>> whole lot of the gunk. There’s still plenty of room for improvement,
>>>> but it should be a lot better than before.
>>>>
>>>> Based on Dave Collier-Brown’s feedback:
>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015966.html
>>>> * We replaced the “unloaded” and “loaded” language with “unloaded”
>>>> and then “download active” and “upload active.” In the grade box we
>>>> indicate, for example, that “Your latency increased moderately under
>>>> load.”
>>>> * We tried to generally make it easier for non-techie folks to
>>>> understand by emphasizing the grade and adding the table showing how
>>>> bufferbloat affects some commonly-used services.
>>>> * We didn’t really change the candle charts too much - they’re
>>>> mostly just there to give a basic visual - and focused more on the
>>>> actual meat of the results above them.
>>>>
>>>> Based on Sebastian Moeller’s feedback:
>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015963.html
>>>> * We considered doing a bidirectional saturating load, but decided
>>>> to skip implementing it for now. It’s definitely something we’d like
>>>> to experiment with more in the future.
>>>> * We added a “warming up” period as well as a “draining” period to
>>>> help fill and empty the buffer. We haven’t added the option for an
>>>> extended test, but have this on our list of backlog changes to make
>>>> in the future.
>>>> Based on Y’s feedback:
>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015962.html
>>>> * We actually ended up removing the grades, but we explained our
>>>> criteria for the new table in the FAQ.
>>>>
>>>> Based on Greg White's feedback (shared privately):
>>>> * We added an FAQ answer explaining jitter and how we measure it.
>>>>
>>>> We’d love for you all to play with the new version of the tool and
>>>> send over any feedback you might have. We’re going to be in a
>>>> feature freeze before launch, but we'd love to get any bugs sorted
>>>> out. We'll likely put this project aside after we iron out a last
>>>> round of bugs and launch, and turn back to working on projects that
>>>> help us pay the bills, but we definitely hope to revisit and improve
>>>> the tool over time.
>>>>
>>>> Best,
>>>>
>>>> Sina, Arshan, and Sam.
>>>> _______________________________________________
>>>> Bloat mailing list
>>>> [email protected]
>>>> https://lists.bufferbloat.net/listinfo/bloat
>>>
>>> --
>>> "For a successful technology, reality must take precedence over
>>> public relations, for Mother Nature cannot be fooled" - Richard
>>> Feynman
>>>
>>> [email protected] <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
