[Bloat] Bufferbloat and Internet gaming

2011-03-12 Thread Dave Täht

John Carmack (of Armadillo Aerospace[2] and id Software fame) gave me
permission to repost this private email...

John Carmack jo...@idsoftware.com writes:

 I would really like to help, but I don't know of a good resource for
 you [for 3D visualizations of bufferbloat].  If I could clone myself a
 couple times, I would put one of them on the task...

 Buffer bloat was a huge issue for us back in the modem days, when we
 had to contend with multiple internal buffers in modems for data
 compression and error correction, on top of the serial buffers in the
 host.  Old-school modem players generally have memories of "surge
 lag", where things were going smoothly until the action spiked and
 play would then get multiple seconds of latency backed up in
 buffers.

 I had always advocated for query routines on the local host, because I
 would choose to not send a packet if I knew it was going to get stuck
 in a bloating buffer, but this has been futile.  There is a strong
 aversion to wanting to add anything at the device driver level, and
 visibility at the OS level has limited benefits.  Just last year I
 identified some bufferbloat-related issues in the iPhone WiFi stack,
 and there was a distinct lack of enthusiasm on Apple's part about
 touching anything around that.

 A point I would like to add to the discussion:

 For continuously updating data streams like games, you ideally want to
 never have more than one packet buffered at any given node, and if
 another packet arrives it should replace the currently buffered packet
 with the same source/dest address/port while maintaining the original
 position in the transmit queue.  This is orthogonal to the question of
 fairness scheduling among multiple streams.  Unfortunately, I suspect
 that there are enough applications that mix latency sensitive and bulk
 data transfer on the same port that it would be problematic to
 heuristically implement this behavior.  Having an IP header bit asking
 for it would be ideal, but that seems unlikely to fly any time soon...
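The per-flow "one packet buffered, newer replaces older in place" behavior John describes can be sketched in a few lines. This is only an illustration of the idea, not any real qdisc; the flow key and payloads are hypothetical:

```python
from collections import OrderedDict

class ReplaceInPlaceQueue:
    """At most one buffered packet per flow; a newer packet from the same
    flow replaces the stale one while keeping its transmit position."""

    def __init__(self):
        # flow key -> payload, kept in arrival order
        self._slots = OrderedDict()

    def enqueue(self, flow, payload):
        # Assigning to an existing key preserves its insertion order,
        # so a fresher game-state packet overwrites the stale one in
        # place; a new flow simply joins the tail.
        self._slots[flow] = payload

    def dequeue(self):
        # Transmit the oldest slot.
        flow, payload = next(iter(self._slots.items()))
        del self._slots[flow]
        return flow, payload
```

A flow key here would be the (source, dest, port) tuple from the mail; the sketch deliberately ignores the fairness-scheduling question, which John notes is orthogonal.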

 I am a little doubtful of the real world benefits that can be gained
 by updating routers at end user sites, because the problem is usually
 in the download direction.  A poster child case could be created with
 playing a game while doing a backup to a remote system, but that case
 should be much less common than multiple people in your house
 streaming video while you are playing a game.  If you do see some good
 results, I would be interested in hearing about them and possibly
 helping to promote the results.

 John Carmack


 -Original Message-
 From: Dave Täht [mailto:d...@taht.net] 
 Sent: Saturday, March 05, 2011 10:57 AM
 To: John Carmack
 Subject: Bufferbloat and 3D visualizations


 Dear John:

 Thx for the retweet the other day. I know you're a really busy guy, but...

 Really high on our list (as network guys, not graphics guys) is
 finding someone(s) that can turn the massive amounts of data we're
 accumulating about bufferbloat into a visualization or three that can
 show the dramatic impact bloat can have for latency under load for
 voip and gaming.

 If you know anyone, please forward this mail.

 When I see bufferbloat, I see something like gource's visualization
 of commit history:

 http://www.thealphablenders.com/2010/10/new-zealand-open-source-awards/

 - the bloated devices in the path swelling in size as the bottleneck
 moves, and exploding as their buffers are overrun.

 I also see (dynamically) stuff like this, with the vertical bars
 growing past the moon on a regular basis:

 http://personalpages.manchester.ac.uk/staff/m.dodge/cybergeography//atlas/geographic.html

 We released a debloat-testing[1] tree the other day, which uses the
 eBDP algorithm for rate control on wireless and incorporates the CHOKe
 and SFB AQMs. It's far from fully baked yet, but quite promising.

 In addition to jg's now well-known TCP traces, I'm in the process this
 week of collecting RTP (VoIP) traces, which are a lot easier to look
 at than TCP's methods of ramping up and down. There are those in the
 network business who think 40ms latency and jitter are ok... :(

 I'm also pretty interested in the structure of how (for example) quake
 packets do timestamps and how quake compensates for latency and packet
 loss.

(Minor update from a followup mail - quake used to mix and match, but
John now recommends that bulk transfers be done via TCP. I am thinking
CCNx or LEDBAT are possibly better alternatives at the moment.)


 But I'm totally graphically untalented.

 --
 Dave Taht
 http://nex-6.taht.net

 [1] 
 https://lists.bufferbloat.net/pipermail/bloat-devel/2011-February/61.html

 Ad astra per aspera!

[2] http://www.armadilloaerospace.com/n.x/Armadillo/Home/Gallery/Videos

-- 
Dave Taht
http://nex-6.taht.net
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Measuring latency-under-load consistently

2011-03-12 Thread Fred Baker
At the risk of sounding like someone mentioning a product, let me mention a 
product. This assumes, of course, that you're using Cisco equipment. But it 
allows you to measure delay (how long it takes to get from here to there), 
jitter (the first derivative of delay with respect to time), and packet loss.

http://www.cisco.com/en/US/tech/tk869/tk769/technologies_white_paper09186a00801b1a1e.shtml#definedelayjitterpacketloss


Re: [Bloat] Measuring latency-under-load consistently

2011-03-12 Thread Jonathan Morton

On 13 Mar, 2011, at 12:21 am, Fred Baker wrote:

 At the risk of sounding like someone mentioning a product, let me mention a 
 product. This assumes, of course, that you're using Cisco equipment. But it 
 allows you to measure delay (how long it takes to get from here to 
 there), jitter (the first derivative of delay with respect to time), and packet loss.

Ping does most of this, and is available on your actual computer.  A little 
post-processing of the output gives you jitter, if it doesn't supply that 
natively.

The point is, the existing tools don't typically measure latency *under load*.
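The ping-plus-post-processing approach is easy to sketch: saturate the link with a bulk transfer, run ping alongside it, and derive jitter from the RTT samples. This is a minimal illustration (the sample lines and the mean-absolute-difference jitter definition are my choices, not from the thread):

```python
import re

def ping_stats(lines):
    """Extract RTT samples from ping output and return (mean latency,
    jitter), with jitter as the mean absolute difference between
    successive RTTs."""
    rtts = [float(m.group(1)) for line in lines
            if (m := re.search(r"time=([\d.]+) ms", line))]
    mean = sum(rtts) / len(rtts)
    jitter = sum(abs(b - a) for a, b in zip(rtts, rtts[1:])) / (len(rtts) - 1)
    return mean, jitter

# Feed this the output of `ping` captured while a bulk upload or
# download is running, and compare against an idle-link baseline.
```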

 - Jonathan



Re: [Bloat] Measuring latency-under-load consistently

2011-03-12 Thread Jonathan Morton

On 13 Mar, 2011, at 12:23 am, richard wrote:

 However, large TCP windows do consume RAM in both server and client,
 and with a sufficient number of simultaneous clients, that could
 theoretically cause trouble.  Constraining TCP windows to near the
 actual BDP is more efficient all around.
 
 Yes - have had RAM exhaustion problems on busy servers with large video
 files - major headache.

Sounds like a good reason to switch to Vegas or at least Illinois on those 
servers.  Those are much less aggressive at growing the congestion window than 
the default CUBIC.

 - Jonathan



[Bloat] Thoughts on Stochastic Fair Blue

2011-03-12 Thread Jonathan Morton
Having read some more documents and code, I have some extra insight into SFB 
that might prove helpful.  Note that I haven't actually tried it yet, but it 
looks good anyway.  In control-systems parlance, this is effectively a 
multichannel I-type controller, where RED is a single-channel P-type controller.
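The P-type versus I-type distinction can be made concrete with a toy comparison: RED computes its marking probability directly from the instantaneous (averaged) queue length, while Blue/SFB accumulates it over time from overflow and idle events. This is a didactic sketch with made-up constants, not the kernel code:

```python
def red_p(qlen, min_th, max_th, max_p):
    """RED: marking probability proportional to queue length (P-type)."""
    if qlen <= min_th:
        return 0.0
    if qlen >= max_th:
        return max_p
    return max_p * (qlen - min_th) / (max_th - min_th)

class BlueP:
    """Blue/SFB: probability accumulated over time (integral action),
    bumped on buffer overflow and decayed when the link goes idle."""
    def __init__(self, d1=0.02, d2=0.002):
        self.p, self.d1, self.d2 = 0.0, d1, d2
    def on_overflow(self):
        self.p = min(1.0, self.p + self.d1)
    def on_idle(self):
        self.p = max(0.0, self.p - self.d2)
```

The integral behavior is what lets Blue converge on a marking rate even when the queue itself stays short.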

My first thought after reading just the paper was that unconditionally dropping 
the packets which increase the marking probability was suspect.  It should be 
quite possible to manage a queue using just ECN, without any packet loss, in 
simple cases such as a single bulk TCP flow.  Thus I am pleased to see that the 
SFB code in the debloat tree has separate thresholds for increasing the marking 
rate and tail-dropping.  They are fairly close together, but they are at least 
distinct.

My second observation is that marking and dropping both happen at the tail of 
the queue, not the head.  This delays the congestion information reaching the 
receiver, and from there to the sender, by the length of the queue - which does 
not appear to be self-tuned by the flow rate.  However, the default values 
appear to be sensible.

Another observation is that packets are not re-ordered by SFB, which (given the 
Bloom filter) is potentially a missed opportunity.  However, they can be 
re-ordered in the current implementation by using child qdiscs such as SFQ, 
with or without HTB in tandem to capture the queue from a downstream dumb 
device.  The major concern with this arrangement is the incompatibility with 
qdiscs that can drop packets internally, since this is not necessarily obvious 
to end-user admins.

I also thought of a different way to implement the hash rotation.  Instead of 
shadowing the entire set of buckets, simply replace the hash on one row at a 
time.  This requires that the next-to-minimum values for q_len and p_mark are 
used, rather than the strict minima.  It is still necessary to calculate two 
hash values for each packet, but the memory requirements are reduced at the 
expense of effectively removing one row from the Bloom filter.
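The next-to-minimum idea can be sketched as follows. SFB takes the minimum p_mark across a flow's buckets (one bucket per Bloom-filter row); under per-row hash rotation, the freshly re-hashed row's bucket holds stale state, so the second-smallest value is used instead. The hashing scheme and constants below are illustrative, not SFB's actual ones:

```python
import hashlib

ROWS, BINS = 4, 16

def bins_for(flow, salts):
    """One bin per row, chosen by a per-row salted hash of the flow tuple."""
    return [int(hashlib.sha256(f"{s}:{flow}".encode()).hexdigest(), 16) % BINS
            for s in salts]

def flow_p_mark(p_mark, flow, salts, rotating=False):
    """A flow's marking probability is the minimum p_mark over its
    buckets; while one row's hash is being rotated, fall back to the
    next-to-minimum, since that row's bucket is stale."""
    vals = sorted(p_mark[row][b]
                  for row, b in enumerate(bins_for(flow, salts)))
    return vals[1] if rotating else vals[0]
```

As the paragraph notes, the trade-off is one effective row lost from the Bloom filter in exchange for not shadowing a second full set of buckets.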

 - Jonathan
