On Jul 16, 2007, at 12:39 AM, David Barrett wrote:
> This looks really impressive!  However, the subversion browser appears
> busted (it's throwing a Python exception).

Oops, sure enough - some change must have broken it recently. Thanks for mentioning it; I'll check it out.

> I've checked out the code and it
> looks really clean. A few questions about the congestion-control code:

> 1) I see you have a few different modules: TCP, VEGAS, DELAY, and
> AGGRESSIVE.  What are DELAY and AGGRESSIVE?

DELAY is an experimental TCP-friendly delay-based scheme; AGGRESSIVE is sort of a "straw man" scheme that doesn't try to be TCP-friendly but instead explores somewhat aggressive behaviors that could still be reasonable in contexts where TCP-friendliness isn't required. Neither of these is particularly well-founded in theory or thoroughly tested, and I definitely wouldn't recommend using either of them as a default scheme without further refinement. (As you can see from the code, plain old TCP congestion control is currently the default; the others should be considered toys for experimentation.)

> 2) In TCP/VEGAS/DELAY mode, on a loss it seems to fall back as follows:
>
>         ssthresh = max( cwnd/2, CWND_MIN )
>
> But the spec (RFC 2581, eq. 3) recommends:
>
>         ssthresh = max( FlightSize/2, 2*SMSS )
>
> Is it intentional to fall back to half of cwnd rather than half of the
> in-flight count, or is this a bug? If it's intentional, what is the
> resulting effect?

It's intentional, but it only makes sense when you also take into account the 'cwndlim' variable and how it's used - grep for 'if (cwndlim) { ...'.

As far as I know, the primary reason for RFC 2581's recommendation of using flightsize/2 instead of cwnd/2 on a loss event is that the standard TCP congestion control algorithm allows the sender to increment the congestion window every round-trip time *even if there wasn't enough data to send during that round trip to fill the existing window*. Thus, over a long semi-idle period where data is "trickling" through the connection at a rate high enough to avoid triggering a new slow-start from scratch, but low enough not to cause any packet losses due to congestion, cwnd can slowly get incremented _way_ above any reasonable value. So when the send rate increases and you do eventually experience a loss event, it's important to use flightsize/2 instead of cwnd/2, because cwnd can be pretty much arbitrarily large and might no longer bear any relationship whatsoever to the actual available bandwidth.

But to me that just seems like a bug in the way the traditional scheme manages cwnd in the first place. It may be a "relatively" harmless bug as long as you use flightsize/2 on a loss event, but even then it's not completely harmless, because after a long semi-idle period it can still cause the sender to "stuff" the connection briefly with a truly unreasonable number of packets, most of which will definitely get lost, perhaps in the process causing a serious burst of congestion throughout the network that tramples on many other flows as well. A better solution seemed to me just to ensure that cwnd never gets incremented to an unreasonably high value in the first place. That's easy to do during congestion avoidance, for example, by incrementing cwnd once per round-trip *only* if data transmission was actually congestion-limited at some point during that round-trip. That's the purpose of the cwndlim flag. (A similar test is made during slow-start, as you can see from the code.) During a long semi-idle or underutilized period, cwnd will get incremented to represent the maximum bandwidth that actually gets used, and stay there; if the sender then subsequently starts sending data as quickly as possible, cwnd starts increasing again from that point.

Since cwnd never gets to be much greater than flightsize with this modification, it seemed safe and reasonable for loss events just to cut cwnd to cwnd/2 directly instead of using flightsize/2, since it's cwnd that we're really trying to cut in half anyway according to the congestion control theory that the algorithm is based on.
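
To make that concrete, here's a rough sketch of the idea in isolation - this is not the actual SST code, and the function names are made up for the example, but cwnd, ssthresh, cwndlim, and CWND_MIN correspond to the variables discussed above:

    #include <algorithm>

    static const unsigned CWND_MIN = 2;

    struct CongState {
        unsigned cwnd = CWND_MIN;        // congestion window, in packets
        unsigned ssthresh = 0xffffffffu; // slow-start threshold
        bool cwndlim = false;            // cwnd-limited this round-trip?
    };

    // The sender calls this whenever it has data ready but the window
    // is full, i.e. transmission really is congestion-limited.
    void markCwndLimited(CongState &cc) { cc.cwndlim = true; }

    // Called once per round-trip.
    void roundTripDone(CongState &cc) {
        if (cc.cwndlim) {               // grow only if the window was
            if (cc.cwnd < cc.ssthresh)  // actually a binding constraint
                cc.cwnd *= 2;           // slow start
            else
                cc.cwnd += 1;           // congestion avoidance
        }
        cc.cwndlim = false;             // re-arm for the next round-trip
    }

    // Called on a loss event: since cwnd now tracks flightsize closely,
    // cutting cwnd/2 directly is equivalent in spirit to RFC 2581's
    // flightsize/2.
    void lossEvent(CongState &cc) {
        cc.ssthresh = std::max(cc.cwnd / 2, CWND_MIN);
        cc.cwnd = cc.ssthresh;
    }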

I'm not a congestion control expert, though, and although this slight modification to the classic algorithm seems safe to me and has proved to behave very well in all of my testing so far, it obviously should be looked at by a real congestion control authority, and of course tested more thoroughly, before the code goes into production use.

> 3) I note you accept any subsequent sequence number for the RTO
> calculation, rather than requiring the exact packet. Very clever; is
> this more stable in a high-loss scenario?

Yes, that was the idea, although I haven't done any specific testing to verify that hunch.
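
The mechanism is roughly the following - again just an illustrative sketch with invented names, not the actual SST code: time one outstanding packet at a time, and accept an acknowledgment of any sequence number at or beyond the timed one as the sample, so losing the timed packet itself doesn't starve the RTT estimator:

    #include <cstdint>

    struct RttTimer {
        bool timing = false;    // is a sample in progress?
        uint64_t timedSeq = 0;  // sequence number whose RTT we're measuring
        double sentAt = 0;      // time the timed packet was sent
    };

    // Call when a packet is transmitted.
    void onSend(RttTimer &t, uint64_t seq, double now) {
        if (!t.timing) {
            t.timing = true;
            t.timedSeq = seq;
            t.sentAt = now;
        }
    }

    // Call when an acknowledgment arrives; returns true and fills in
    // 'rtt' if this ack completes a sample.
    bool onAck(RttTimer &t, uint64_t ackedSeq, double now, double &rtt) {
        if (t.timing && ackedSeq >= t.timedSeq) {  // any later seq counts
            rtt = now - t.sentAt;
            t.timing = false;
            return true;
        }
        return false;
    }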

> 4) What's all that power-computation business in the DELAY
> congestion-control mode? Is this basically using changes in throughput to
> trigger backoff rather than packet loss?  Is this Fast TCP?

The DELAY scheme in the code is inspired by several existing delay-sensitive congestion control schemes, but doesn't correspond exactly to any of them. It seemed to work reasonably well in the limited scenarios I've tested it under, but it's definitely not ready for serious use. In particular, I haven't done any real testing for stability under more adverse conditions such as many concurrent flows or paths with large bandwidth-delay product. The DELAY scheme is currently just a toy I was using to play with delay-based congestion control, and nothing more.
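
The "power" being computed is the usual network-power notion, throughput divided by delay. Purely as an illustration of that general idea (not what the DELAY code actually does), a power-based probe might look like this - grow or shrink the window each round-trip, and reverse direction whenever the last move reduced power:

    #include <algorithm>

    struct DelayProbe {
        double cwnd = 2;        // congestion window, in packets
        double lastPower = 0;   // power measured over the previous round-trip
        double direction = +1;  // +1 = growing the window, -1 = shrinking it
    };

    // Called once per round-trip with the throughput and RTT measured
    // over that round-trip.
    void roundTripDone(DelayProbe &d, double throughput, double rtt) {
        double power = throughput / rtt;    // the "power" metric
        if (power < d.lastPower)
            d.direction = -d.direction;     // the last move hurt; reverse
        d.cwnd = std::max(2.0, d.cwnd + d.direction);
        d.lastPower = power;
    }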

Although the VEGAS scheme "more or less" implements the standard TCP Vegas scheme, which at least has some real research behind it, I wouldn't really recommend it for serious use either. For example, I'm pretty sure it will have problems if the baseline available bandwidth over a connection changes dynamically due to a routing change or something like that - and as far as I can tell that's a fundamental problem with the actual TCP Vegas scheme, not just with my implementation of it. (And I think other researchers have discovered other, perhaps more serious, flaws in classic TCP Vegas since it was first published.)
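
For reference, the core of the standard Vegas test looks roughly like this (a sketch of the published algorithm, not of SST's implementation). The weakness I mentioned is visible here: baseRtt only ever decreases, so a routing change onto a longer path leaves it stale and makes 'diff' look permanently large:

    #include <algorithm>

    struct Vegas {
        double cwnd = 2;        // congestion window, in packets
        double baseRtt = 1e9;   // minimum RTT ever observed (seconds)
    };

    // Called once per round-trip with the RTT measured over that round-trip.
    void roundTripDone(Vegas &v, double rtt) {
        const double alpha = 2, beta = 4;       // thresholds, in queued packets
        v.baseRtt = std::min(v.baseRtt, rtt);
        double expected = v.cwnd / v.baseRtt;   // rate if no queueing
        double actual   = v.cwnd / rtt;         // rate actually achieved
        double diff = (expected - actual) * v.baseRtt;  // est. packets queued
        if (diff < alpha)
            v.cwnd += 1;        // too little data queued in the path: grow
        else if (diff > beta)
            v.cwnd = std::max(2.0, v.cwnd - 1); // too much queued: shrink
    }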

> In your experience, which of the congestion modes gives the best results,
> and under what scenarios?  Have you done much testing on this?

The only CC scheme currently in SST that's quite well-tested and in which I have high confidence is the (more-or-less) classic loss-based TCP scheme. I would like to get a solid delay-based scheme into SST, because delay-sensitive schemes tend to keep the connection's round-trip latency under load much closer to the path's minimum latency, rather than oscillating between that and that plus the maximum buffering at the bottleneck link. For things like VoIP we obviously would like consistently low latency if possible. Both of the toy delay-based schemes in SST do indeed achieve this nice "minimum latency preserving" behavior under the _very_ limited scenarios over which I've tested them, but I don't expect either of them to hold up as implemented over more diverse situations.

And I haven't worked hard to get a "proper" delay-based scheme into SST yet because it's still not clear to me what the "best" delay-based scheme is anyway - the field of delay-based CC still seems to be very much a topic of research and regular upheaval as far as I can tell, and I haven't been able to follow it consistently. TFRC (RFC 3448, RFC 4828) at least seems to be getting some traction in the standards bodies, but even that scheme still seems very ad hoc to me. So I really can't offer any great wisdom regarding congestion control at the moment - for now I'm just sticking to classic TCP congestion control for serious use, and hoping some delay-based scheme will eventually appear that's obviously "the right one", at which point I'll implement that in SST. If that doesn't happen, well, who knows...
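
For anyone unfamiliar with TFRC: rather than reacting to individual loss events directly, it paces the send rate to roughly match the TCP throughput equation from RFC 3448,

        X = s / ( R*sqrt(2*b*p/3) + t_RTO * 3*sqrt(3*b*p/8) * p * (1 + 32*p^2) )

where X is the allowed transmit rate in bytes/second, s the packet size, R the round-trip time, p the loss event rate, b the number of packets acknowledged per TCP ACK, and t_RTO the retransmission timeout.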

Cheers,
Bryan


-david

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:p2p-hackers-
[EMAIL PROTECTED] On Behalf Of Bryan Ford
Sent: Thursday, July 12, 2007 5:20 AM
To: theory and practice of decentralized computer networks
Subject: [p2p-hackers] Structured Stream Transport

[Partly in response to David's note on UDT...]

For anyone needing an application-linkable transport protocol library
that implements congestion control, reliability, security, NAT
traversal, and other cool features - and who isn't afraid of unstable
bleeding-edge code - there's an early user-space prototype
implementation of my "Structured Stream Transport" (SST) available here:

        http://pdos.csail.mit.edu/uia/sst/

The SST protocol itself provides an enhanced transport abstraction
that subsumes traditional streams and datagrams, and is described in
more detail in a paper to appear in this year's SIGCOMM:

        Structured Streams: a New Transport Abstraction
        http://www.brynosaurus.com/pub/net/sst-abs.html

I haven't widely announced the protocol or library "in the wild" yet
because the code isn't quite ready for production use and the
protocol spec is not yet complete or fully consistent with either the
code or the SIGCOMM paper.  So consider yourself warned. :)  I'm
hoping to fix most of the remaining barriers to real use within the
next couple months.  One immediate challenge for potential users of
the code is that it relies on the Qt 4 toolkit - only the QtCore and
QtNet libraries; note that it _doesn't_ need the monstrous ~7MB QtGui
library - but any dependency on Qt means integrating the Qt event
loop with whatever event loop your application uses (at least under
Unix or Mac).  This is obviously trivial if your application happens
to be Qt-based but takes a bit of work (probably not too much)
otherwise.  I'm willing to help anyone seriously interested in using
SST in a real application to smooth over these technical issues.
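
As a rough illustration of what that integration can look like - this is just one standard Qt pattern, nothing SST-specific, with placeholders where your own code and the SST objects would go - you can keep your existing loop and give Qt a slice of time on each iteration:

    #include <QtCore/QCoreApplication>
    #include <QtCore/QEventLoop>

    int main(int argc, char **argv)
    {
        QCoreApplication app(argc, argv);

        // ... create SST/Qt objects here ...

        for (;;) {
            // Let Qt service its own timers and sockets for up to 10 ms,
            // then return to the application's own event handling.
            app.processEvents(QEventLoop::AllEvents, 10 /* ms */);

            // ... run one iteration of your application's own
            //     select()/poll() loop here ...
        }
    }

(The cleaner long-term option is the reverse: let QCoreApplication::exec() own the main loop and hook your application's own file descriptors into it with QSocketNotifier.)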

Cheers,
Bryan


_______________________________________________
p2p-hackers mailing list
[email protected]
http://lists.zooko.com/mailman/listinfo/p2p-hackers
