From: "Bless, Roland (TM)" <[email protected]> 
models from queueing theory is that they only work for load < 1, whereas we are
using the network with load values ~1 (i.e., around one) due to congestion
control feedback loops that drive the bottleneck link to saturation (unless you
consider application limited traffic sources).
 
Let me remind people here that there is some kind of really weird thinking 
going on here about what should be typical behavior in the Internet when it is
working well.
 
No, the goal of the Internet is not to saturate all bottlenecks at maximum 
capacity. That is the opposite of the goal, and it is the opposite of a sane 
operating point.
 
Every user seeks low response time, typically a response time on the order of
the unloaded delay in the network, for ALL traffic (whether it's the response
to a file transfer, a voice frame, or a WWW request).
 
Queueing is always suboptimal if you can achieve the same goodput without
introducing queueing delay, because a queue built up at any link delays *all*
traffic sharing that link. The overall cost to all users goes up radically when
multiple streams share a link, because the queueing *delay* gets multiplied by
the number of flows affected!
 
So the most desirable operating point (which Kleinrock and his students
recently demonstrated with his "power metric") is to have each queue at every
link average < 1 packet in length (big or small packets, it doesn't matter,
actually).
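
To make that concrete: for the textbook M/M/1 queue (my choice of
illustration here - a deliberate simplification, not a model of any real
link), Kleinrock's "power" is throughput divided by delay, and it peaks at
50% utilization, exactly where the average number of packets at the link
is 1. A minimal sketch in Python:

    # Kleinrock's "power" metric (throughput / delay) for an M/M/1 queue.
    # A textbook simplification, not a claim about any real link.

    def mm1_power(rho, mu=1.0):
        """Power = throughput / mean delay at utilization rho (0 < rho < 1)."""
        lam = rho * mu               # arrival rate (packets/sec)
        delay = 1.0 / (mu - lam)     # mean time in system (queueing + service)
        return lam / delay           # = lam * (mu - lam)

    def mm1_mean_occupancy(rho):
        """Mean number of packets at the link (waiting + in service)."""
        return rho / (1.0 - rho)

    best_power, best_rho = max((mm1_power(r / 100.0), r / 100.0)
                               for r in range(1, 100))
    print("power peaks at utilization", best_rho)                 # -> 0.5
    print("mean occupancy there:", mm1_mean_occupancy(best_rho))  # -> 1.0

Push utilization past that point and delay grows much faster than
throughput does - the whole argument in one formula.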
 
Now the bigger issue is that this is unachievable when the flows in the network
are bursty. Poisson is the least bursty, and the easiest to analyze, of the
random processes generating flows. Typical Internet usage is incredibly bursty
at all time scales, though - the burstiness is fractal when observed for real
(at least if you look at time scales from 1 ms to 1 day as your unit of
analysis). Fractal random processes of this sort are not Poisson at all.
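
If you want to see the difference, here is a small numpy sketch (the
Pareto tail index of 1.5 is an illustrative assumption, not a measured
value) comparing how burstiness decays as you aggregate over longer
timescales. Poisson smooths out quickly; heavy-tailed interarrivals - the
standard building block of self-similar traffic models - stay bursty far
longer:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200_000

    # Poisson traffic: exponential interarrival times with mean 1.
    poisson_arrivals = np.cumsum(rng.exponential(1.0, n))

    # Heavy-tailed traffic: Pareto interarrivals (tail index 1.5 means
    # infinite variance; an illustrative choice), rescaled to mean 1.
    gaps = rng.pareto(1.5, n) + 1.0
    heavy_arrivals = np.cumsum(gaps / gaps.mean())

    def burstiness(arrivals, bin_size):
        """Coefficient of variation of packet counts per time bin."""
        edges = np.arange(0.0, arrivals[-1], bin_size)
        counts = np.histogram(arrivals, bins=edges)[0]
        return counts.std() / counts.mean()

    for scale in (1, 10, 100, 1000):
        print(f"bin={scale:4}  "
              f"poisson CV={burstiness(poisson_arrivals, scale):.3f}  "
              f"heavy-tailed CV={burstiness(heavy_arrivals, scale):.3f}")

The Poisson column falls off like 1/sqrt(bin size); the heavy-tailed
column falls off much more slowly. That slow decay is what "bursty at all
time scales" means in practice.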
 
So what is the best one ought to try to do?
 
Well, "keeping utilization at 100%" is never what real network operators seek. 
Never, ever. Instead, congestion control is focused on latency control, not 
optimizing utilization.
 
The only folks who seem to focus on utilization are the bean-counting
fraternity, because they seem to think the only cost is the wires, so you want
the wires to be full. That, in my opinion, and even in most accounting systems
that consider the whole enterprise rather than the wires/fibers/airtime alone,
is IGNORANT and STUPID.
 
Likewise, academics and vendors of switches care nothing about latency at
network scale. They focus on wirespeed as the only metric.
 
Well, in the old Bell Telephone days, the metric of the Bell System that really
mattered was not day-to-day utilization. Instead it was avoiding outages due to
peak load. That peak often came on "Mother's Day" - a few hours out of one day
once a year. Because an outage on Mother's Day (busy signals) meant major
frustration!
 
Why am I talking about this?
 
Because I have been trying for decades (and I am not alone) to apply a 
"Clue-by-Four" to the thick skulls of folks who don't think about the Internet 
at scale, or even won't think about an Enterprise Internet at scale (or 
Starlink at scale). And it doesn't sink in.
 
Andrew Odlyzko, a brilliant mathematician at Bell Labs for most of his career,
also tried to point out that the utilization of the "bottleneck links" in any
enterprise, up to the size of AT&T in the old days, was typically tuned to
< 10% of saturation at almost any time. Why? Because when the quality of
service of this critical infrastructure degraded (quality here meaning
completing tasks quickly, even when load is unusual), the CEO freaked out and
fired people.
 
And in fact, the wires are the cheapest resource - the computers and people
they connect, which can't do work while waiting out queueing delay, are vastly
more expensive to leave idle. Networks don't do "work" that matters. Queueing
isn't "efficient". It's evil.
 
Which is why dropping packets rather than queueing them is *good*, if the
sender will slow down and can resend them. The rate of intentionally dropped
packets should be nonzero under load, if an outsider is observing to measure
quality.
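
That is roughly the control law inside modern AQM schemes such as CoDel.
Here is a deliberately simplified cartoon of the idea (the 5 ms target
and 100 ms interval are CoDel's published defaults; the rest is my
simplification - the real algorithm also spaces successive drops at
interval/sqrt(count)):

    import time

    TARGET = 0.005    # 5 ms: tolerable standing queue delay (CoDel default)
    INTERVAL = 0.100  # 100 ms: how long delay must persist before we act

    class DropDecider:
        """Cartoon of a CoDel-style AQM: prefer dropping to queueing once
        packets have waited longer than TARGET for a whole INTERVAL."""

        def __init__(self):
            self.first_above = None  # when sojourn time first exceeded TARGET

        def should_drop(self, enqueue_time, now=None):
            now = time.monotonic() if now is None else now
            sojourn = now - enqueue_time     # how long this packet queued
            if sojourn < TARGET:
                self.first_above = None      # queue drains; all is well
                return False
            if self.first_above is None:
                self.first_above = now       # start the clock on this episode
                return False
            # Standing queue for a full interval: signal the sender by dropping.
            return now - self.first_above >= INTERVAL

    aqm = DropDecider()
    print(aqm.should_drop(enqueue_time=0.0, now=0.004))  # 4 ms: fine -> False
    print(aqm.should_drop(enqueue_time=0.0, now=0.010))  # episode begins -> False
    print(aqm.should_drop(enqueue_time=0.0, now=0.120))  # persistent -> True

The point of the design: a drop is a *signal* to the sender, and a brief
queue that drains on its own is never punished.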
 
I call this brain-miswiring about optimizing throughput to fill a bottleneck 
link the Hotrodder Fallacy. That's the idea that one should optimize like a 
drag racer optimizes his car - to burn up the tires and the engine to meet an 
irrelevant metric for automobiles. A nice hobby that has never improved any 
actual vehicle. (Even F1 racing is far more realistic, given that you want your
cars to last for the duration of the race.)
 
A problem with much of the "network research" community is that it never has 
actually looked at what networks are used for and tried to solve those 
problems. Instead, they define irrelevant problems and encourage all students 
and professors to pursue irrelevancy.
 
Now let's look at RRUL (Realtime Response Under Load). While it nicely looks at
latency for small packets under load, it actually disregards the performance of
the load streams, which are only used to "fill the pipe". Fortunately, they are
TCP, so they rate-limit themselves by window adjustment. But they are
speed-unlimited TCP streams that are meaningless.
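
(For reference, RRUL is what the flent tool runs. A typical invocation -
the server name here is a placeholder you would replace with your own
netperf server - looks something like:

    flent rrul -p all_scaled -l 60 -H netperf.example.com -o rrul.png

and the load-generating TCP streams in that plot are treated purely as
pipe filler, exactly as described above.)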
 
Actual situations (like what happens when someone starts using BitTorrent while
another person in the same household is playing a twitch multi-user FPS) don't
actually look like RRUL. Because in fact the big load is ALSO fractal.
BitTorrent demand isn't constant over time - far from it. It's bursty.
 
Everything in the Internet is bursty at different timescales. There are no CBR
(constant bit rate) flows.
 
So if we want to address the real congestion problems, we need realistic 
thinking about what the real problem is.
 
Unfortunately, this is not achieved by the kind of thinking that created
diffserv. Because everything is bursty, just with different timescales in some
cases. Even "flash override" priority traffic is incredibly bursty.
 
Coming back to Starlink - Starlink apparently is being designed by folks who 
really do not understand these fundamental ideas. Instead, they probably all 
worked in researchy environments where the practical realities of being part of 
a worldwide public Internet were ignored.
(The FCC folks are almost as bad. I have found no one at FCC engineering who
understands fractal burstiness - even w.r.t. the old Bell System.)
 
 
 