Re: [Bloat] summarizing the bitag latency report?

Jonathan Morton via Bloat Mon, 14 Nov 2022 09:05:53 -0800

> On 12 Nov, 2022, at 1:16 am, Dave Taht via Bloat 
> <[email protected]> wrote:
> 
> If you were to try to summarize this *in a paragraph*, what would you say?
> 
> https://www.bitag.org/documents/BITAG_latency_explained.pdf


I can get it down to *three* paragraphs while conveying the essentials:

The quality of an Internet path is measured by three factors:  throughput, 
latency, and packet loss.  Of these three measures, throughput is typically the 
least important for application performance, so long as a modest threshold is 
met - for example the US "broadband" definition of 25 Mbps.  Packet loss is 
interpreted by computers as indicating congestion, which causes them to slow 
down network transfers unnecessarily; it also causes objectionable glitches in 
video and audio streams, and should thus be minimised.  Latency is the primary 
driver of perceived Internet quality for most applications in most 
circumstances.

Latency can be divided into "inherent" and "induced" components.  Inherent 
latency is simply the time it takes for a packet to traverse all the links in 
the path, outward and return.  Induced latency is the additional time spent 
deciding which of several links to direct the packet to, waiting for a shared 
medium, and/or stuck in a queue full of other packets going the same way.  Most 
applications are able to adapt to reasonable levels of inherent latency, but 
induced latency is much more difficult to manage due to its variability.  There 
are several ways to reduce induced latency without sacrificing throughput or 
increasing packet loss, chiefly Active Queue Management (AQM) and Fair Queuing, 
which can fruitfully be combined as in Smart Queue Management (SQM).  SQM is 
widely, but not yet universally, deployed on the Internet, and works very well.
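
To put rough numbers on the induced part: the delay added by a standing queue 
is simply the amount of data queued divided by the rate at which the link 
drains it.  A minimal sketch follows; the buffer sizes and link rate are 
illustrative assumptions of mine, not figures from the report.

```python
def queue_delay_ms(queued_bytes: int, link_rate_bps: float) -> float:
    """Milliseconds a new packet waits behind queued_bytes on the link."""
    return queued_bytes * 8 / link_rate_bps * 1000.0

# Hypothetical values: a 10 Mbit/s link with two different queue depths.
full_buffer = queue_delay_ms(1_000_000, 10e6)  # 1 MB queued: about 800 ms
short_queue = queue_delay_ms(15_000, 10e6)     # ~10 full-size packets: about 12 ms

print(f"full buffer: {full_buffer:.0f} ms, short queue: {short_queue:.0f} ms")
```

The inherent latency of the path is the same in both cases; only the queue 
depth differs.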

AQM is the practice of observing how big queues get, and signalling congestion 
in a deliberate way based on those observations.  Explicit Congestion 
Notification (ECN) can be used to perform that signalling without any packet 
loss.  On traffic that doesn't support ECN, deliberately dropping packets in a 
controlled way is necessary.  These 
congestion signals cause applications to reduce their load on the network to 
match available capacity, and thereby reduce queuing.  Fair Queuing works 
orthogonally to this by treating each flow of traffic individually, so that one 
flow inducing heavy delays in its queue doesn't affect another flow which is 
lighter.  This makes it easy for very different applications to coexist on the 
same path, which often happens when there are several users in the same 
household or office.  SQM uses Fair Queuing, and also applies a separate AQM to 
each flow, so that congestion signals are directed solely to heavy flows.
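
The combination described above can be sketched as a toy model: one queue per 
flow, served round-robin, with a congestion signal raised only for a flow whose 
own queue has grown long.  This is an illustration under my own assumptions, 
not the algorithm used by any real SQM implementation; the class name, the 
flow names, and the fixed threshold are all hypothetical, and practical AQMs 
typically judge how long packets have waited rather than counting them.

```python
from collections import deque

class ToySQM:
    """Per-flow queues, round-robin service, per-flow congestion signals."""

    def __init__(self, mark_threshold: int = 5):
        self.mark_threshold = mark_threshold  # packets; hypothetical value
        self.queues = {}   # flow id -> deque of waiting packets
        self.signals = {}  # flow id -> number of congestion signals sent

    def enqueue(self, flow, packet):
        q = self.queues.setdefault(flow, deque())
        q.append(packet)
        # AQM step: only the flow whose own queue is long gets signalled
        # (an ECN mark, or a deliberate drop for non-ECN traffic).
        if len(q) > self.mark_threshold:
            self.signals[flow] = self.signals.get(flow, 0) + 1

    def dequeue_round(self):
        """Fair Queuing step: send one packet from each non-empty flow."""
        return [(flow, q.popleft()) for flow, q in self.queues.items() if q]

sqm = ToySQM(mark_threshold=5)
for i in range(20):
    sqm.enqueue("bulk", i)   # heavy flow builds a long queue
sqm.enqueue("voip", 0)       # light flow arrives afterwards

first_round = sqm.dequeue_round()
print(first_round)   # the voip packet leaves in the very first round
print(sqm.signals)   # only the bulk flow has been signalled
```

In a single shared FIFO the voip packet would have waited behind all twenty 
bulk packets; here it waits behind one.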

If you really need it to be only *one* paragraph, the middle one might be the 
most essential.

> Also QoS, vs QoE. Try to imagine explaining the need to a CFO, or
> congresscritter. Feel free to take more than a paragraph.

QoS is Quality of Service.  QoE is Quality of Experience.  The two are very 
different concepts.

To illustrate this, consider a railway manager tasked with modernising his line 
by replacing steam trains with diesel ones.  He's a modern businessman keen to 
apply modern thinking to this task, so he delegates some underlings to gather 
data about the expected traffic flows on the line, as well as the types of 
train that are available for hire.

In the answers that come back, he focuses on two key figures:  the line carries 
1000 passengers per day, and each carriage can seat 50 passengers.  Simple 
arithmetic shows that this demand can be met by running 20 carriages per day, 
but the manager rounds this up to 24 carriages to allow some margin for error.  
After all, with the tremendous efficiency of diesel traction (compared to steam 
traction), he can afford to be a little generous.

One of the trains on offer is a 2000hp locomotive hauling 12 carriages - a very 
impressive sight, to be sure.  "Splendid," he thinks, "we can run that one 
twice a day, and that will meet demand with some margin to spare."  So that's 
what he does; once in the morning, and once in the evening.  The timetables are 
very easy to publish, too.

In the first month of operation, all of these trains turn up on time and with 
the correct number of carriages, and there are no breakdowns or accidents.  The 
specified capacity is therefore supplied.  This is an excellent "Quality of 
Service".

Yet the complaints start rolling in almost immediately.  Passengers who turn up 
wanting to travel at any time other than the two scheduled departures find 
themselves with an exceptionally long wait ahead of them.  Local police even 
report an 
increase in vagrancy complaints, due to passengers missing the evening train 
and having to sleep in the waiting rooms overnight.  This represents a very 
poor "Quality of Experience".

Learning from this misadventure, the manager goes back to his data and notes 
that one-carriage "railcars" are also available for hire.  For the next month's 
timetable, instead of the two 12-carriage trains each day, he will run one of 
these railcars every hour.  These will provide exactly the same seating 
capacity over the course of the day, but the waiting time will now be limited 
to a much more palatable duration.  (In Internet terms, he's optimised squarely 
for latency.)
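
The manager's arithmetic so far can be checked in a few lines.  The capacity 
figures come from the story; the waiting times rest on two assumptions of 
mine: trains are evenly spaced around the clock, and passengers turn up at 
uniformly random times, so the average wait is half the gap between trains.

```python
import math

PASSENGERS_PER_DAY = 1000
SEATS_PER_CARRIAGE = 50

# 1000 passengers / 50 seats = 20 carriages, before the manager's margin.
carriages_needed = math.ceil(PASSENGERS_PER_DAY / SEATS_PER_CARRIAGE)

def timetable(trains_per_day: int, carriages_per_train: int):
    """Return (daily seats, mean wait in hours, worst-case wait in hours)."""
    headway = 24 / trains_per_day  # assumes evenly spaced departures
    seats = trains_per_day * carriages_per_train * SEATS_PER_CARRIAGE
    return seats, headway / 2, headway

big_trains = timetable(2, 12)  # two 12-carriage trains a day
railcars = timetable(24, 1)    # one railcar every hour

print(carriages_needed)  # 20
print(big_trains)        # (1200, 6.0, 12.0)
print(railcars)          # (1200, 0.5, 1.0)
```

Both timetables offer the same 1200 seats a day (the "throughput"), but the 
typical wait (the "latency") differs by a factor of twelve.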

Still the complaints come in - but now from different sources.  No longer are 
passengers waiting for hours and sleeping overnight in stations.  Instead, 
rush-hour commuters who had previously found the 12-carriage trains convenient 
are finding the railcars too crowded.  Even with over a hundred passengers 
crammed in like sardines, many more are left on the platforms and arrive at 
work late - or worse, come home to a cold dinner and an annoyed wife.  Simply 
put, demand is not evenly distributed through the day, but concentrated on 
particular times; at other times, the railcars are sufficient for the 
relatively small number of passengers, or even run almost empty.

So again, even though the "Quality of Service" is provided just as specified, 
the "Quality of Experience" for the passengers is very poor.  Indeed the 
overcrowding leads to some railcars being delayed, due to the difficulty of 
getting everyone in and out of the doors, and the conductors have great 
difficulty in checking tickets, hence a noticeable reduction in fare revenue.

Things improve markedly when the manager brings in 6-carriage express trains 
for the morning, lunchtime, and evening commuters, and continues to run the 
railcars at hourly intervals in between them, except for the small hours when 
some trains are removed due to minimal demand.  Now there are enough carriages 
in the rush-hour trains to satisfy commuters, and there are still trains 
running at other times so that nobody needs to wait particularly long for one.

In fact, demand increases substantially due to the good "Quality of Experience" 
that this new timetable provides, such that by the end of the first year, many 
of the railcars are upgraded to 3-carriage trains, and the commuter expresses 
are lengthened to 8 carriages.  Fare revenue is more than doubled.  The 
modernisation effort is a success.

The lesson here is that QoS is merely the means by which you may attempt to 
achieve high QoE.  Meeting QoS does not guarantee QoE.  Only if the QoS is 
designed around the factors that genuinely influence QoE will you succeed.  
Unfortunately, many QoS schemes are inadequate for the needs of actual Internet 
users; this is because their designers have not kept up with the appropriate 
QoE factors.

 - Jonathan Morton
