Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

dpreed Thu, 28 Feb 2013 12:58:27 -0800

At least someone actually saw what I've been seeing for years now in Metro area 
HSPA and LTE deployments.
 
As you know, when I first reported this on the e2e list I was told it could not 
possibly be happening and that I didn't know what I was talking about.  No one 
in the phone companies was even interested in replicating my experiments, just 
dismissing them.  It was sad.
 
However, I had the same experience on the original Honeywell 6180 dual CPU 
Multics deployment in about 1973.  One day all my benchmarks were running about 
5 times slower every other time I ran the code.  I suggested that one of the 
CPUs was running 5x slower, and it was probably due to the CPU cache being 
turned off.   The hardware engineer on site said that that was *impossible*.  
After 4 more hours of testing, I was sure I was right.  That evening, I got him 
to take the system down, and we hauled out an oscilloscope.  Sure enough, the 
gate that received the "cache hit" signal had died in one of the processors.   
The machine continued to run, since all that caused was for memory to be 
fetched every time, rather than using the cache.
 
Besides the value of finding the "root cause" of anomalies, the story points 
out that you really need to understand software and hardware sometimes.  The 
hardware engineer didn't understand the role of a cache, even though he fully 
understood timing margins, TTL logic, core memory (yes, this machine used core 
memory), etc.
 
We both understood oscilloscopes, fortunately.
 
In some ways this is like the LTE designers understanding TCP.   They don't.  
But sometimes you need to know about both in some depth.
 
Congratulations, Jim.  More Internet Plumbing Merit Badges for you.
 
-----Original Message-----
From: "Jim Gettys" <[email protected]>
Sent: Thursday, February 28, 2013 3:03pm
To: "Dave Taht" <[email protected]>
Cc: "David P Reed" <[email protected]>, "[email protected]" 
<[email protected]>
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel 
for Android




I've got a bit more insight into LTE than I did in the past, courtesy of the 
last couple days.
To begin with, LTE runs with several classes of service (they call them 
bearers).  Your VOIP traffic goes into one of them.
And I think there is another as well that is for guaranteed bit rate traffic.  
One transmit opportunity may have a bunch of chunks of data, and that data may 
be destined for more than one device (IIRC).  It's substantially different than 
WiFi.
But most of what we think of as Internet stuff (web surfing, dns, etc) all gets 
dumped into a single best effort ("BE") class.
The BE class is definitely badly bloated; I can't say how much because I don't 
really know yet (the test my colleague ran wasn't run long enough to be 
confident it filled the buffers).  But I will say worse than most cable modems 
I've seen.  I expect this will be true to different degrees on different 
hardware.  The other traffic classes haven't been tested yet for bufferbloat, 
though I suspect they will have it too.  I was told that those classes have 
much shorter queues, and when they grow, they dump the whole queue (because 
delivering late real time traffic is useless).  But trust *and* verify....  
Verification hasn't been done for anything but BE traffic, and that hasn't been 
quantified.
But each device gets a "fair" shot at bandwidth in the cell (or sector of a 
cell; they run 3 radios in each cell), where fair is basically time based; if 
you are at the edge of a cell, you'll get a lot less bandwidth than someone 
near a tower; and this fairness is guaranteed by a scheduler that runs in the 
base station (called an eNodeB, IIRC).  So the base station guarantees some 
sort of "fairness" between devices (a place where Linux's wifi stack today 
fails utterly, since there is a single queue per device, rather than one per 
station).
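To make the time-based notion of fairness concrete, here is a minimal sketch in Python. It is not the base station's actual scheduler; it only assumes that every device gets an equal share of airtime and that each device's achievable link rate depends on its signal quality. The function name and the example rates are made up for illustration.

```python
def airtime_fair_throughput(link_rates_mbps):
    """Split airtime equally and return the throughput each device sees.

    link_rates_mbps: hypothetical rate each device could achieve if it
    had the cell to itself (lower for devices near the cell edge).
    """
    if not link_rates_mbps:
        return []
    share = 1.0 / len(link_rates_mbps)  # equal fraction of airtime per device
    return [rate * share for rate in link_rates_mbps]


# Hypothetical example: one device near the tower, one at the cell edge.
near, edge = airtime_fair_throughput([60.0, 6.0])
print(near, edge)
```

With these made-up numbers the near device ends up with 30 Mbit/s and the edge device with 3 Mbit/s: equal time, very unequal bandwidth.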
Whether there are bloat problems at the link level in LTE due to error 
correction I don't know yet; but it wouldn't surprise me; I know there was in 
3g.  The people I talked to this morning aren't familiar with the HARQ layer in 
the system.
The base stations are complicated beasts; they have both a Linux system in them 
and a real-time operating system based device inside.  We don't know 
where the bottleneck(s) are yet.  I spent lunch upping their paranoia and 
getting them through some conceptual hurdles (e.g. multiple bottlenecks that 
may move, and the like).  They will try to get me some of the data so I can 
help them figure it out.  I don't know if the data flow goes through the linux 
system in the eNodeB or not, for example.
Most carriers are now trying to ensure that their backhauls from the base 
station are never congested, though that is another known source of problems.  
And then there is the lack of AQM at peering point routers....  You'd think 
they might run WRED there, but many/most do not.
- Jim



On Thu, Feb 28, 2013 at 2:08 PM, Dave Taht <[email protected]> wrote:




On Thu, Feb 28, 2013 at 1:57 PM,  <[email protected]> wrote:

Doesn't fq_codel need an estimate of link capacity?
No, it just measures delay. Since so far as I know the outgoing portion of LTE 
is not soft-rate limited, but sensitive to the actual available link bandwidth, 
fq_codel should work pretty well (if the underlying interfaces weren't horribly 
overbuffered) in that direction.
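As a rough illustration of "it just measures delay", here is a simplified Python sketch of the CoDel-style decision. It is not the kernel's fq_codel code: the real algorithm also spaces out successive drops with a control law and runs per flow. The class and method names are hypothetical; the 5 ms target and 100 ms interval are the usual fq_codel defaults.

```python
class DelayDetector:
    """Simplified CoDel-style check: is queueing delay persistently too high?"""

    def __init__(self, target=0.005, interval=0.100):
        self.target = target      # acceptable standing delay, in seconds
        self.interval = interval  # how long the delay must persist, in seconds
        self.first_above = None   # time after which a drop would be allowed

    def should_drop(self, enqueue_time, now):
        sojourn = now - enqueue_time  # time this packet spent in the queue
        if sojourn < self.target:
            self.first_above = None   # delay is fine again; reset the clock
            return False
        if self.first_above is None:
            self.first_above = now + self.interval  # start the clock
            return False
        return now >= self.first_above  # drop only if delay stayed high
```

Note that no link rate appears anywhere in the sketch: the only inputs are the packet's enqueue timestamp and the current time.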
I'm looking forward to some measurements of actual buffering at the device 
driver/device levels.
I don't know how inbound to the handset is managed via LTE.

Still quite a few assumptions left to smash in the above.
...
in the home router case....
...
When there are artificial rate limits in play (in, for example, a cable 
modem/CMTS, hooked up via gigE yet rate limiting to 24up/4mbit down), then a 
rate limiter (tbf,htb,hfsc) needs to be applied locally to move that rate 
limiter/queue management into the local device, so we can manage it better.
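For readers unfamiliar with these shapers: tbf and htb are both built on token buckets. A minimal Python sketch of the idea follows. It is an illustration only, not how the kernel qdiscs are implemented, and the class name, parameters, and numbers are hypothetical.

```python
class TokenBucket:
    """Minimal token bucket: allow a packet only if enough tokens are saved up."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # maximum number of tokens (bytes) held
        self.tokens = burst_bytes    # start with a full bucket
        self.last = 0.0              # time of the previous refill, in seconds

    def allow(self, size_bytes, now):
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False  # caller keeps the packet queued locally


# Hypothetical usage: shape to 8 kbit/s with a one-packet burst.
tb = TokenBucket(rate_bps=8000, burst_bytes=1500)
print(tb.allow(1500, now=0.0), tb.allow(1500, now=0.5))
```

If the configured rate is set a little below the modem's limit, packets wait in the local queue, where a queue manager can act on them, instead of in the modem's buffer.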
I'd like to be rid of the need to use htb and come up with a rate limiter that 
could be adjusted dynamically from a daemon in userspace, probing for short-term 
bandwidth fluctuations while monitoring the load. It needn't send that much 
data very often to come up with a stable result....
You've described one soft-rate sensing scheme (piggybacking on TCP), and I've 
thought up a few others, that could feed back from a daemon some samples into a 
soft(er) rate limiter that would keep control of the queues in the home 
router. I am thinking it's going to take way too long to fix the CPE and far 
easier to fix the home router via this method, and certainly it's too painful 
and inaccurate to merely measure the bandwidth once, then set a hard rate, when
So far as I know the gargoyle project was experimenting with this approach.
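One way such a feedback loop could look, sketched in Python. This is an assumption for illustration and not a description of what gargoyle or any existing daemon does: it smooths occasional bandwidth samples with an exponentially weighted moving average and sets the shaper slightly below the estimate. The class name, `alpha`, and `headroom` are all hypothetical.

```python
class SoftRateEstimator:
    """Smooth occasional bandwidth samples into a rate for a local shaper."""

    def __init__(self, initial_bps, alpha=0.25, headroom=0.9):
        self.estimate = float(initial_bps)  # current smoothed estimate, bit/s
        self.alpha = alpha                  # weight given to each new sample
        self.headroom = headroom            # shape slightly below the estimate

    def update(self, sample_bps):
        # Exponentially weighted moving average of the measured bandwidth.
        self.estimate = (1 - self.alpha) * self.estimate + self.alpha * sample_bps
        return self.shaper_rate()

    def shaper_rate(self):
        # Rate the daemon would hand to the rate limiter.
        return self.estimate * self.headroom
```

A daemon would call `update()` with each new sample and push the returned value into the rate limiter; a small `alpha` trades responsiveness for stability.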

A problem is in places that connect more than one device to the cable modem... 
then you end up with those needing to communicate their perception of the 
actual bandwidth beyond the link.


Where will it get that from the 4G or 3G uplink?


 
-----Original Message-----
From: "Maciej Soltysiak" <[email protected]>
Sent: Thursday, February 28, 2013 1:03pm
To: [email protected]
Subject: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for 
Android



Hiya,
Looks like Google's experimenting with 3.8 for Android: 
https://android.googlesource.com/kernel/common/+/experimental/android-3.8
Sounds great if this means they will utilize fq_codel, TFO, BQL, etc.
Anyway my nexus 7 says it has 3.1.10 and this 3.8 will probably go to Android 
5.0 so I hope Nexus 7 will get it too some day or at least 3.3+
Phoronix coverage: http://www.phoronix.com/scan.php?page=news_item&px=MTMxMzc
Their 3.8 changelog: 
https://android.googlesource.com/kernel/common/+log/experimental/android-3.8
Regards,
Maciej
_______________________________________________
Cerowrt-devel mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/cerowrt-devel



-- 
Dave Täht

Fixing bufferbloat with cerowrt: 
http://www.teklibre.com/cerowrt/subscribe.html
