[Bloat] [cr...@tereschau.net: [ih] Sally Floyd has died]

2019-08-27 Thread Dave Täht
I'd always wanted to meet her. RIP.

- Forwarded message from Craig Partridge  -

Date: Mon, 26 Aug 2019 18:47:29 -0600
From: Craig Partridge 
To: internet history 
Subject: [ih] Sally Floyd has died

Just saw a post from Eddie Kohler that Sally Floyd has died.

She packed a huge number of achievements into a relatively short career in
networking (less than 20 years from PhD to retirement).  She developed RED
for congestion control with Van Jacobson, identified the issues in Poisson
modeling with Vern Paxson, developed TCP SACK, to name a few of her
important contributions.  At one point (and perhaps still) she was one of
the top 10 cited women in computing.  I think she won almost every award
the communications field could give.

She was also a great colleague.  I got to work somewhat closely with her on
the End-to-End Task Force for many years.  She had a knack for presenting
stunning insights quietly and articulately, which made the insights all the
more stunning.

Our world is a bit smaller.

Craig



-- 
*
Craig Partridge's email account for professional society activities and
mailing lists.



- End forwarded message -

-- 
My email server only sends and accepts starttls-encrypted mail in transit.
One benefit - it has stopped all spam so far, cold. If you are not
encrypting by default, you are not going to get my mail, nor I yours.


Re: [Bloat] I have a new record for bloat

2017-03-31 Thread Dave Täht


On 3/31/17 3:07 PM, Alex Burr wrote:
> 
> 
> 
> 
> On Saturday, February 11, 2017 9:05 AM, Neil Davies  
> wrote:
> 
> 
>> The same mindset that created these sort of performance artefacts is also 
>> driving changes in DSL 
>> standards, so watch out for these sort of artefacts (again induced by things 
>> that you have no direct
>> control over) starting to occur in “high speed” DSL systems (such as VDSL) 
>> when conditions are less
> 
>> than perfect.
> 
> Are you referring to G.998.4? That's supposed to introduce a max of 63ms 
> delay. But it's been a while since I was involved in DSL, so maybe it's 
> something new...


It's not the encoding, it's the queuing. Unless they also adopt
something like "bql" for the ring buffers in their firmware, and
something like fq_codel slightly above that, the new stuff will bloat up
under distant conditions or when rate limited by the provider.
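
The standard workaround on the user side remains shaping below the sync
rate so the queue forms in a qdisc we control rather than in the modem's
firmware. A minimal sketch (interface name and rate are assumptions;
sqm-scripts automates the real thing):

  tc qdisc add dev eth0 root handle 1: htb default 10
  tc class add dev eth0 parent 1: classid 1:10 htb rate 18mbit ceil 18mbit
  tc qdisc add dev eth0 parent 1:10 fq_codel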


> 
> Alex


Re: [Bloat] Recommendations for fq_codel and tso/gso in 2017

2017-01-26 Thread Dave Täht


On 1/26/17 11:21 PM, Hans-Kristian Bakke wrote:
> Hi
> 
> After having had some issues with inconsistent tso/gso configuration
> causing performance issues for sch_fq with pacing in one of my systems,
> I wonder if it is still recommended to disable gso/tso for interfaces
> used with fq_codel qdiscs and shaping using HTB etc.

At lower bandwidths gro can do terrible things. Say you have a 1Mbit
uplink, and IW10. (At least one device (mvneta) will synthesise 64k of
gro packets)

A single IW10 burst from one flow injects 130ms of latency.
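
Back of the envelope, assuming 1500-byte packets and ignoring framing
overhead:

  10 packets * 1500 bytes * 8 bits/byte = 120,000 bits
  120,000 bits / 1 Mbit/s              ~= 120 ms to drain at the uplink rate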

> 
> If there is a trade off, at which bandwidth does it generally make more
> sense to enable tso/gso than to have it disabled when doing HTB shaped
> fq_codel qdiscs?

I stopped caring about tuning params at > 40Mbit and < 10Gbit, or rather,
about trying to get below 200usec of jitter/latency. (Others care.)

And: My expectation was generally that people would ignore our
recommendations on disabling offloads!

Yes, we should revise the sample sqm code and recommendations for a post
gigabit era to not bother with changing network offloads. Were you
modifying the old debloat script?
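
(For reference, "disabling offloads" in those old scripts amounted to
something like the line below - interface name assumed - which, per the
above, is probably not worth doing at gigabit and beyond:)

  ethtool -K eth0 tso off gso off gro off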

TBF & sch_Cake do peeling of gro/tso/gso back into packets, and then
interleave their scheduling, so GRO is both helpful (transiting the
stack faster) and harmless, at all bandwidths.

HTB doesn't peel. We also just ripped out hfsc from sqm-scripts (too
buggy), leaving the tbf + fq_codel, htb + fq_codel, and cake models there.

...

Cake is coming along nicely. I'd love a test in your 2Gbit bonding
scenario, particularly in a per host fairness test, at line or shaped
rates. We recently got cake working well with nat.
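
(With a tc that knows about cake - tc-adv today - the nat-aware, per-host
fair setup looks roughly like the line below; the bandwidth is a
placeholder, and "nat" tells cake to look up the pre-NAT addresses:)

  tc qdisc add dev eth0 root cake bandwidth 100mbit dual-dsthost nat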

http://blog.cerowrt.org/flent/steam/down_working.svg (ignore the latency
figure, the 6 flows were to spots all over the world)

> Regards,
> Hans-Kristian
> 
> 


Re: [Bloat] TCP BBR paper is now generally available

2016-12-08 Thread Dave Täht
drop tail works better than any single queue aqm in this scenario.


On 12/8/16 12:24 AM, Mikael Abrahamsson wrote:
> On Fri, 2 Dec 2016, Dave Taht wrote:
> 
>> http://queue.acm.org/detail.cfm?id=3022184
> 
> "BBR converges toward a fair share of the bottleneck bandwidth whether
> competing with other BBR flows or with loss-based congestion control."
> 
> That's not what I took away from your tests of having BBR and Cubic
> flows together, where BBR just killed Cubic dead.
> 
> What has changed since? Have you re-done your tests with whatever has
> changed, I must have missed that? Or did I misunderstand?
> 


Re: [Bloat] [bbr-dev] Re: "BBR" TCP patches submitted to linux kernel

2016-11-02 Thread Dave Täht


On 11/2/16 11:21 AM, Klatsky, Carl wrote:
>> On Tue, 1 Nov 2016, Yuchung Cheng wrote:
>>
>>> We are curious why you choose the single-queued AQM. Is it just for
>>> the sake of testing?
>>
>> Non-flow aware AQM is the most commonly deployed "queue
>> management" on the Internet today. Most of them are just stupid FIFOs
>> with taildrop, and the buffer size can be anywhere from super small to huge
>> depending on equipment used and how it's configured.
>>
>> Any proposed TCP congestion avoidance algorithm to be deployed on the
>> wider Internet has to some degree be able to handle this deployment
>> scenario without killing everything else it's sharing capacity with.
>>
>> Dave Täht's testing case where BBR just kills Cubic makes me very concerned.
> 
> If I am understanding BBR correctly, that is working in the sender to 
> receiver direction.  In Dave's test running TCP BBR & TCP CUBIC with a single 
> queue AQM, where CUBIC gets crushed.

The scenario as I constructed it was emulating a sender on the "home" side
of the link, using BBR and cubic through an emulated cablemodem running pie.
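
For anyone who wants to poke at something similar, a crude single-queue
AQM bottleneck can be emulated with stock tc - the device name and rate
here are placeholders, not the actual test settings:

  tc qdisc add dev eth1 root handle 1: htb default 1
  tc class add dev eth1 parent 1: classid 1:1 htb rate 20mbit
  tc qdisc add dev eth1 parent 1:1 pie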


> Silly question, but the single queue AQM was also operating in the
> sender to receiver direction for this test, yes?


Re: [Bloat] "BBR" TCP patches submitted to linux kernel

2016-09-29 Thread Dave Täht


On 9/29/16 6:54 PM, Mario Ferreira wrote:
> On Thu, Sep 29, 2016, 16:43 Dave Täht <d...@taht.net
> <mailto:d...@taht.net>> wrote:
> 
> 
> 
> On 9/29/16 4:24 AM, Mário Sérgio Fujikawa Ferreira wrote:
> > Is there a mailing list I can lurk in to follow on the development?
> >
> > I'm most interested on a delta to apply to Android 6.x Franco Kernel
> > (https://github.com/franciscofranco/angler)
> 
> Android is still shipping linux 3.10? How... quaint.
> 
> 
> That's for Huawei Nexus 6P. Google's flagship till Pixel XL arrives.
> I'll try getting BBR merged to Franco's kernel (officially or maintained
> as a fork)  and add tc-adv with fq_codel/cake.
> 
> Since this is mobile, I'm pretty sure it will present a whole new host
> of "data points". 

yes! (there have been a few studies of this, btw)

The part that we don't really know on the android handsets is how
much buffering there really is between qdisc, driver, and firmware,
which no doubt varies between models - and within a model on the
different wifi and 4G stacks. Odds are - just as with what we ripped out
of the ath9k/ath10k - so much intermediate buffering exists as to make
applying a latency-managing qdisc on top only marginally effective.

In the case of BBR, well, there is some hope that it would regulate
TCP on the uplink, but it will have no effect on the down (neither will
the qdiscs) - and it requires sch_fq to work properly - which
means that you'd have a choice between bbr + sch_fq, or
sch_cake/sch_fq_codel.
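
(For reference, on a stock mainline kernel that pairing is normally
selected with the two sysctls below; whether a given android kernel
exposes them is another question entirely:)

  sysctl -w net.core.default_qdisc=fq
  sysctl -w net.ipv4.tcp_congestion_control=bbr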

In all cases, acquiring more data and backports would be nice.
> 
> I'm swamped right now but I'll have time by november for both FreeBSD
> and Android 6 (7 doesn't have source code released just yet) . I'm no
> expert but I'll help as I can.
> 
> >  2. ipfw/dummynet AQM with FQ-Codel
> >   
>  
> (https://lists.freebsd.org/pipermail/freebsd-ipfw/2016-February/006026.html)
> 
> I made some comments on this; I'd like people to look at what is
> really going on in bsd + fq_codel at > 100mbit.
> 
> https://github.com/opnsense/core/issues/505
> 
> 
> I haven't been involved with the FreeBSD project on recent years but
> I'll try to advocate internally. 
> -- 
> 
> Mario S F Ferreira - Brazil - "I guess this is a signature."
> feature, n: a documented bug | bug, n: an undocumented feature
> 


Re: [Bloat] iperf3 and packet bursts

2016-09-20 Thread Dave Täht
Groovy. I note that I am really fond of the linux "timerfd" notion for
tickers; we use that throughout the high speed stats gathering code in
flent.

I'd really like a voip or ping tool that used those, and I've always
worried about iperf's internal notion of a sampling interval.
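
Back of the envelope on why a 100ms pacing timer microbursts so badly
(numbers assumed to roughly match Aaron's 50Mbps, 10:1 case below):

  50 Mbit/s * 0.1 s = 5 Mbit (~625 KB) released per 100ms tick
  at ~500 Mbit/s that burst arrives in ~10 ms, but drains at 50 Mbit/s
  over the full 100 ms, so the switch has to hold most of it (~560 KB) -
  far more than a small switch buffer. With a 1 ms tick each burst is
  only ~6 KB, which fits easily.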

On 9/20/16 3:00 PM, Aaron Wood wrote:
> We were using iperf3 at work to test a network path (we wanted the
> fixed-rate throttling that it can do).  And while TCP would fill the
> 50Mbps link from the ISP, UDP couldn't.  UDP couldn't get over 8Mbps of
> goodput, no matter what rate we specified on the command line.
> 
> We found a 100ms timer that's used to PWM the packet transmission to
> perform the throttling.  Fine for TCP, fine where the end-to-end
> physical links are the same rate.  But throw a 10:1 rate change through
> a switch into that, and suddenly you find out that the switch isn't that
> bloated.
> 
> I modified iperf3 to use a 1ms timer, and was able to get things much
> smoother.  I doubt it's as smooth as iperf3 gets on Linux when fq pacing
> is used, but it's a big improvement vs. the nice small buffers in switches.
> 
> I put together a writeup with graphs on my blog:
> http://burntchrome.blogspot.com/2016/09/iperf3-and-microbursts.html
> 
> I have a forked version of iperf3 on github:
> https://github.com/woody77/iperf
> 
> This uses the 1ms timer, and has a few other fixes, such as it resets
> the throttling calculations at the start of each stats interval.  That
> change stops iperf3 from transmitting at maximum rate after congestion
> has stopped it from achieving the target rate.  There will be another
> writeup on that, but I need to get some good sample data together for
> graphing.
> 
> -Aaron Wood
> 
> 


[Bloat] cross checked spam filters, no dice

2016-08-10 Thread Dave Täht
Well, lists.bufferbloat.net is not in any DNS RBLs and my SPF record is
fine. lists.bufferbloat.net HAD an expired cert - maybe that was the
source of the problem?


[Bloat] coping with overagressive spam filters

2016-08-10 Thread Dave Täht
I do not know why, but emails from both of my email accounts - both on
this server and from my gmail - have been landing in multiple people's
spam buckets. Please check your spam mailboxes and/or try sending a new
mail to these lists to see if it's related to the list or to me.



Re: [Bloat] Intel recommends against LRO in routing and bridging.

2016-08-10 Thread Dave Täht


On 8/9/16 12:13 AM, Aaron Wood wrote:
> Just came across this at the top of the README for the ixgbe driver:
> (http://downloadmirror.intel.com/22919/eng/README.txt)
> 
> WARNING:  The ixgbe driver compiles by default with the LRO (Large Receive
> Offload) feature enabled.  This option offers the lowest CPU utilization for
> receives, but is completely incompatible with *routing/ip forwarding* and
> *bridging*.  If enabling ip forwarding or bridging is a requirement, it is
> necessary to disable LRO using compile time options as noted in the LRO
> section later in this document.  The result of not disabling LRO when combined
> with ip forwarding or bridging can be low throughput or even a kernel panic.

I am not sure how true this remains (?). We certainly have gone through
hell with TSO and GRO, but it was my hope that most of those issues were
fixed over the past 2 years, in the drivers and in htb/cake/hfsc, etc.
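
(For what it's worth, on drivers that expose the flag, LRO can also be
flipped at runtime rather than at compile time - interface name assumed:)

  ethtool -K eth0 lro off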

I don't think that the existing pie in linux head is handling 64k
superpackets[1] well. tbf is doing more of the right things here now,
but wasn't till about a year back. I haven't tracked htb, and certainly
hfsc has shown some problems that I don't know have been fixed or not
(there was a kernel bug filed on it some time back, with stephen taking
the lead on it).

Cake's "peeling" mechanism for superpackets is also overly aggressive,
but we have not put much effort into tuning it yet. We have a variety of
potential solutions for that...

BQL tends to get a very big number for its estimator when superpackets
are in use, usually 4x what it settles at with superpackets turned off
via ethtool - but NOT having superpackets turned on can really hurt
throughput.
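
If anyone wants to eyeball this on their own hardware, BQL's current
estimate is visible in sysfs, and the offload state via ethtool
(interface and queue names assumed):

  cat /sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit
  ethtool -k eth0 | egrep 'segmentation-offload|receive-offload'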

The code we tested a while back on the bsd implementations (which
generally lack superpacket support) had trouble hitting fifo speeds at
high rates on the hardware under test (>400mbits).

[1] For want of a word, I'll use superpackets to refer to
TSO/GSO/GRO/LSO universally.

> 
> -Aaron
> 
> 


[Bloat] pie, codel, fq_pie, fq_codel tech report

2016-08-03 Thread Dave Täht
I am especially grateful for the full documentation of how to configure
the bsd versions of this stuff, but the rest of the report was pretty
good too.

http://caia.swin.edu.au/reports/160708A/CAIA-TR-160708A.pdf


Re: [Bloat] What is the best firewall software/distribution for Cake/fq_codel?

2016-01-21 Thread Dave Täht


On 1/20/16 8:31 AM, John Klimek wrote:
> I'm currently using pfSense on an x86 system and it's working great, but I'd 
> like to use fq_codel and the upcoming Cake algorithm/system.

There is work going on to finally port this stuff to BSD.

> What is considered to be the best firewall software/distribution that will 
> also be one of the first to support Cake?

At the moment cake remains under heavy development (see the cake
mailing list), but it does work on nearly every form of linux with a
little work. More benchmarking/testing is needed before the code can be
finalized.

It was my hope that "some vendor" - like ubnt, who were the first out of
the gate with good fq_codel support in their edgerouter series - would
be tagging along this time on cake, actively tracking and, more
importantly, *financially supporting* the work. No such (visible) luck,
except from nlnet. Sigh.

Hardwarewise it looks like armada 385 based products are the most open
and hackable, so things like the linksys 1200ac and turris omnia will be
good choices * in the future *.

I still consider the state of the firmware on that chipset to be very
immature, but there is a lot of activity on it and it's getting better
rapidly.

There is a mt76 based board that is looking decent also.

> 
> It sounds like I can use OpenWrt and/or CeroWrt, but I'm unsure about the 
> quality of the x86 version.  

CeroWrt, as we knew it, is "dead". Everything in it that was important
is in mainline linux now (and openwrt and derivatives). CeroWrt might
come to a resurrection of sorts as part of "make-wifi-fast".

x86 is not really a primary target for openwrt. Far too many variants -
I'd like to know of a good x86 board to use for openwrt, though - the
"quality" is there for sure - just have to find a decent embedded x86 board.

>Another suggestion I've seen was to use ipFire which supports fq_codel.

ipfire is pretty darn good for the x86 market.

> 
> Any suggestions?  Is there a "primary" firewall/distribution that Buffer 
> Bloat recommends?

Not in the business of choosing sides. The goal in life is to make this
stuff an available default in everything so nobody ever has to think
about it anymore.




Re: [Bloat] Hardware upticks

2016-01-05 Thread Dave Täht


On 1/5/16 11:29 AM, Steinar H. Gunderson wrote:
> On Tue, Jan 05, 2016 at 10:57:13AM -0800, Dave Täht wrote:
>> Context switch time is probably one of the biggest hidden nightmares in
>> modern OOO cpu architectures - they only go fast in a straight line. I'd
>> love to see a 1ghz processor that could context switch in 5 cycles.
> 
> It's called hyperthreading? ;-)
> 
> Anyway, the biggest cost of a context switch isn't necessarily the time used
> to set up registers and such. It's increased L1 pressure; your CPU is now
> running different code and looking at (largely) different data.

+10.

An L1/L2 Icache dedicated to interrupt-processing code could make a great
deal of difference, if only cpu makers and benchmarkers treated context
switch time as something to be valued.

Dcache, not so much, except for the intel architectures which are now
doing DMA direct to cache. (any arms doing that?)

> /* Steinar */
> 


[Bloat] bufferbloat email list server upgrade going slow and badly

2016-01-04 Thread Dave Täht
all the bufferbloat.net servers are in the process of migrating to a
new co-location facility. the lists - if not the archives - should be
alive again, at least.

There is a stupid bug somewhere stopping the archived pages from
making it onto the web.

I am concerned that some people's bloat email may now be auto-ending up
in a spam folder. (If so, please respond privately.) (This email is also
a test.)

The redmine webserver move is troublesome, also.

Furthermore, the mailman web service is under persistent attack by
a set of 286 (so far) ips flooding it with subscription requests, and
linode itself has been under enormous attack over the holidays...

there will be ssl updates and the like once all this is sorted but it
might take a week or two more. I might give up and try another co-lo also.

In the meantime, why not top it all off with breakfast at milliways?



Re: [Bloat] bufferbloat email list server upgrade going slow and badly

2016-01-04 Thread Dave Täht


On 1/4/16 4:10 PM, Stephen Hemminger wrote:

> Talk to davem, maybe kernel.org would be safer/better more robust?

Damned if I know - vger is one of my problems that I'd wanted to solve
with this move: 1) my old anti-spam setup made him crazy - now fixed -
and 2) vger doesn't use starttls. I'd so hoped that after 10+ years of
availability it was basically on universally, and that in the post-CISA
world we could put up at least this portion of the middle finger.

For now, and while I sort out other stuff, I have postfix being strict
about what it accepts (smtpd_*, inbound) and liberal about what it sends
(smtp_*, outbound):

smtp_tls_security_level=may
smtpd_tls_security_level=encrypt

Only 38 out of 532 email addresses on the bloat list are refusing
starttls. The instant anti-spam improvement of making tls mandatory for
email was pretty amazing...

The ongoing mailman subscribe attack looks to have been going on for
months and must be targeted at a metric ton of mailman servers.

It's only hitting three users at google, but "whoever+somerandomnumber"
is something I need to teach mailman to sort out. These are the users
getting the subscribe spam:

kemo.mart+67292...@gmail.com
kezukaya+93690...@gmail.com
touma3108+42493...@gmail.com

On my more paranoid days I'd think this was an attempt at a known
plaintext attack...

and, alas, poor linode: http://status.linode.com



[Bloat] The complete results from the latencyload test.

2011-03-20 Thread Dave Täht

Here are the complete results, instead of dribs and drabs. I don't know
when the test started, but this took a very long time.

Server responding, beginning test...
MinRTT: 1.1ms
Scenario 1: 0 uploads, 1 downloads... 8024 KiB/s down, 12.71 Hz smoothness
Scenario 2: 1 uploads, 0 downloads... 6526 KiB/s up, 5.52 Hz smoothness
Scenario 3: 0 uploads, 2 downloads... 6703 KiB/s down, 19.07 Hz smoothness
Scenario 4: 1 uploads, 1 downloads... 3794 KiB/s up, 4084 KiB/s down, 1.82 Hz smoothness
Scenario 5: 2 uploads, 0 downloads... 7503 KiB/s up, 4.11 Hz smoothness
Scenario 6: 0 uploads, 3 downloads... 7298 KiB/s down, 2.08 Hz smoothness
Scenario 7: 1 uploads, 2 downloads... 4323 KiB/s up, 4690 KiB/s down, 1.38 Hz smoothness
Scenario 8: 2 uploads, 1 downloads... 4743 KiB/s up, 3422 KiB/s down, 0.02 Hz smoothness
Scenario 9: 3 uploads, 0 downloads... 7558 KiB/s up, 1.39 Hz smoothness
Scenario 10: 0 uploads, 4 downloads... 6719 KiB/s down, 2.08 Hz smoothness
Scenario 11: 1 uploads, 3 downloads... 4372 KiB/s up, 4067 KiB/s down, 1.30 Hz smoothness
Scenario 12: 2 uploads, 2 downloads... 4725 KiB/s up, 3645 KiB/s down, 0.81 Hz smoothness
Scenario 13: 3 uploads, 1 downloads... 4917 KiB/s up, 3175 KiB/s down, 0.59 Hz smoothness
Scenario 14: 4 uploads, 0 downloads... 7706 KiB/s up, 1.43 Hz smoothness
Scenario 15: 0 uploads, 32 downloads... 6648 KiB/s down, 1.36 Hz smoothness
Scenario 16: 1 uploads, 31 downloads... 4401 KiB/s up, 4111 KiB/s down, 0.59 Hz smoothness
Scenario 17: 16 uploads, 16 downloads... 6108 KiB/s up, 2103 KiB/s down, 0.00 Hz smoothness
Scenario 18: 31 uploads, 1 downloads... 8003 KiB/s up, 520 KiB/s down, 0.03 Hz smoothness
Scenario 19: 32 uploads, 0 downloads... 7236 KiB/s up, 0.00 Hz smoothness

OVERALL:
    Upload Capacity: 5486 KiB/s
  Download Capacity: 2836 KiB/s
Link Responsiveness: 0 Hz
    Flow Smoothness: 0 Hz


-- 
Dave Taht
http://nex-6.taht.net


[Bloat] Better understanding decision-making across all layers of the stack

2011-03-17 Thread Dave Täht

Michael J. Schultz just put up a nice blog entry on how the receive side
of the current Linux network stack works. 

http://blog.beyond-syntax.com/2011/03/diving-into-linux-networking-i/

There are people on the bloat lists that understand wireless RF, people
that understand a specific driver, people that grok the mac layer,
there's a whole bunch of TCP and AQM folk, we have supercomputer guys
and embedded guys, cats and dogs, all talking in one space and yet...

Yet in my discussions[0] with specialists working at these various
layers of the kernel I've often spotted holes in knowledge on how
the layers actually work together to produce a result.

I'm no exception[1] - in the last few weeks of fiddling with the
debloat-testing tree I've learned that everything I knew about the Linux
networking stack is basically obsolete. With the advent of tickless
operations, GSO offload, threaded interrupts, soft-irqs, and other new
scheduling mechanisms, most of the rationale I'd had for running at a
1000HZ tick rate has vanished.

That said, I cannot honestly believe Linux is soft-clocked enough right
now to ensure low latency decision making across all the layers of the
networking stack, as my struggles with the new eBDP algorithm and the
iwl driver seem to be showing. 

Certainly low hanging fruit remains. 

For example, Dan Siemon just found (and, with Eric Dumazet fixed) a
long-standing bug in Linux's default pfifo_fast qdisc that has been
messing up ECN for a decade. [2]. That fix went *straight* into linus's
git head and net-stable.

It would be nice to have a clear, up-to-date picture - a flowchart - a
set of diagrams - of how, when, and where all the different
network servo mechanisms in the kernel interact, for several protocols,
from layer 0 to layer 7 and back again.

Call it, a day in the life of a set of network streams. [3]

Michael's piece above is a start, but only handles the receive side at a
very low level. When does a TCP packet get put on the txqueue? When does
a qdisc get invoked and a packet make it onto the device ring? How does
stuff get pulled from the receive buffer and fed back into the TCP
server loop? When and at what points do we decide to drop a packet? How
is ND handled differently from ARP or other low level packets? When does
napi kick in? What's the interaction between wireless retries and packet 
aggregation?

Pointers to more existing current and accurate documentation would be
nice too.

I think that a lot of debloating could be done in-between the layers of
the stack on both low and high end devices. Judging from this recent
thread [4] here, on the high end, there are disputes over the adequate
amount of driver buffering on 10GE[5] versus queue management[6], and
abstractions such as RED have actually been pushed into silicon[7]. How
do we best take advantage of those features going forward? [8]

In order to improve responsiveness and reduce delay and excessive
buffering up and down the stack, we could really use more
cross-disciplinary knowledge, and a more common understanding of how all
this stuff fits together, but writing such a document would require
multiple people to get their heads together to produce something
coherent. [9] Volunteers?

-- 
Dave Taht
http://nex-6.taht.net

[0] Dave Täht and Felix Fietkau (of openwrt)

http://mirrors.bufferbloat.net/podcasts/BPR-The_Wireless_Stack.mp3
  
I had intended to turn this discussion into a more formal podcast
format. I simply haven't had time. It's listenable as is, however. If
you want to learn more about how 802.11 wireless works, in particular,
how 802.11n packet aggregation works, toss that recording onto your
mp3 player and call up etags

[1] I also had to listen to this recording about 6 times to understand where
Felix and I had miscommunicated. It was a very educational conversation
for me, at least. (And convinced Felix to spend time on bufferbloat, too)

I also note that recording as much as possible of everything is the only
trait I share with Richard Nixon.

[2] ECN + pfifo problem clearly explained:
http://www.coverfire.com/archives/2011/03/13/pfifo_fast-and-ecn/
WAY TO GO DAN!
[3] I imagine the work would make for a good (series of) article(s) on
LWN, or perhaps the new Byte magazine.

[4] https://lists.bufferbloat.net/pipermail/bloat/2011-March/000240.html
[5] https://lists.bufferbloat.net/pipermail/bloat/2011-March/000260.html
[6] https://lists.bufferbloat.net/pipermail/bloat/2011-March/000265.html
[7] https://lists.bufferbloat.net/pipermail/bloat/2011-March/000281.html
[8] There have been interesting attempts at simplifying the Linux networking
stack, notably VJ's netchannels, which was sidetracked by the problems
of interacting with netfilter ( http://lwn.net/Articles/192767/ )

Openflow is also interesting as an example of what can be moved into hardware.

[9] I don't want all these footnotes and theoretical stuff to get in the
way of actually gaining a good set of pictures and understanding

[Bloat] Bufferbloat and Internet gaming

2011-03-12 Thread Dave Täht

John Carmack (of armadillo aerospace[2] and idsoftware fame) gave me
permission to repost from this private email...

John Carmack jo...@idsoftware.com writes:

 I would really like to help, but I don't know of a good resource for
 you [for 3D visualizations of bufferbloat].  If I could clone myself a
 couple times, I would put one of them on the task...

 Buffer bloat was a huge issue for us back in the modem days, when we
 had to contend with multiple internal buffers in modems for data
 compression and error correction, on top of the serial buffers in the
 host.  Old school modem players generally have memories of surge
 lag, where things were going smoothly until the action spiked and the
 play could then get multiple seconds of latency backed up into
 buffers.

 I had always advocated for query routines on the local host, because I
 would choose to not send a packet if I knew it was going to get stuck
 in a bloating buffer, but this has been futile.  There is a strong
 aversion to wanting to add anything at the device driver level, and
 visibility at the OS level has limited benefits.  Just the last year I
 identified some bufferbloat related issues on the iPhone WiFi stack,
 and there was a distinct lack of enthusiasm on Apple's part to wanting
 to touch anything around that.

 A point I would like to add to the discussion:

 For continuously updating data streams like games, you ideally want to
 never have more than one packet buffered at any given node, and if
 another packet arrives it should replace the currently buffered packet
 with the same source/dest address/port while maintaining the original
 position in the transmit queue.  This is orthogonal to the question of
 fairness scheduling among multiple streams.  Unfortunately, I suspect
 that there are enough applications that mix latency sensitive and bulk
 data transfer on the same port that it would be problematic to
 heuristically implement this behavior.  Having an IP header bit asking
 for it would be ideal, but that seems unlikely to fly any time soon...

 I am a little doubtful of the real world benefits that can be gained
 by updating routers at end user sites, because the problem is usually
 in the download direction.  A poster child case could be created with
 playing a game while doing a backup to a remote system, but that case
 should be much less common than multiple people in your house
 streaming video while you are playing a game.  If you do see some good
 results, I would be interested in hearing about them and possibly
 helping to promote the results.

 John Carmack


 -Original Message-
 From: Dave Täht [mailto:d...@taht.net] 
 Sent: Saturday, March 05, 2011 10:57 AM
 To: John Carmack
 Subject: Bufferbloat and 3D visualizations


 Dear John:

 Thx for the retweet the other day. I know you're a really busy guy, but...

 Really high on our list (as network guys, not graphics guys) is
 finding someone(s) that can turn the massive amounts of data we're
 accumulating about bufferbloat into a visualization or three that can
 show the dramatic impact bloat can have for latency under load for
 voip and gaming.

 If you know anyone, please forward this mail.

 When I see bufferbloat, I see something like gource's visualization
 of commit history:

 http://www.thealphablenders.com/2010/10/new-zealand-open-source-awards/

 - the bloated devices in the path exploding in size as the bottleneck
 moves, and exploding as the buffers are overrun.

 I also see (dynamically) stuff like this, with the vertical bars
 growing past the moon on a regular basis:

 http://personalpages.manchester.ac.uk/staff/m.dodge/cybergeography//atlas/geographic.html

 We released a debloat-testing[1] tree the other day, which uses the
 eBDP algorithm for rate control on wireless and incorporates the CHOKe
 and SFB AQMs. It's far from fully baked as yet, but is quite
 promising.

 In addition to jg's now well known TCP traces, I'm in the process this
 week of collecting RTP traces (VOIP), which are a lot easier to look at
 than TCP's methods of ramping up and down. There are those in the
 network business that think 40ms latency and jitter are ok... :(

 I'm also pretty interested in the structure of how (for example) quake
 packets do timestamps and how quake compensates for latency and packet
 loss.

(Minor update from a followup mail - quake used to mix and match but
John now recommends bulk transfers be done via TCP. I am thinking ccnx or
ledbat are possibly better alternatives at the moment)


 But I'm totally graphically untalented.

 --
 Dave Taht
 http://nex-6.taht.net

 [1] 
 https://lists.bufferbloat.net/pipermail/bloat-devel/2011-February/61.html

 Ad astra per aspera!

[2] http://www.armadilloaerospace.com/n.x/Armadillo/Home/Gallery/Videos

-- 
Dave Taht
http://nex-6.taht.net