Peer-to-peer overlay networks are inefficient on ADSL networks. ADSL networks are almost twice as efficient as SDSL networks. Better alternatives require redesigning the physical layer.
Peer-to-peer overlay networks are inefficient on ADSL networks.
---------------------------------------------------------------

I was reading an article by Steven Levy about Google not buying
[Skype][], and I was appalled by what Wesley Chan was quoted as saying:

> He concluded that one of Skype’s key assets – its peer-to-peer
> technology — was a mismatch for Google, which worked on the newer
> paradigm of cloud computing. “The worst thing about peer-to-peer is
> that it doesn’t work well with Google,” [Wesley] Chan told me
> [Steven Levy] during an amazing interview for IN THE PLEX in
> February 2010. “Peer-to-peer just eats up your bandwidth, right,
> it’s like the old technology.”

I was so disgusted that I made a note: “Google may not ‘be evil’, but
Wesley Chan surely is!” As long as we’re stuck in the paradigm of
cloud computing, where we depend on software running within the data
centers of Google and companies like that instead of on our own
machines, we’re very vulnerable to abuses.

But upon some thought, I concluded that I was putting the blame in
the wrong place. It’s ADSL’s fault, not Wesley Chan’s. He’s just
observing the incentive structure created by ADSL.

**ADSL**: an ADSL line is a connection to the rest of the network
that is statically partitioned between a high-bandwidth part
outwards, for data being sent *to* you, and a low-bandwidth part
inwards, for data being sent *from* you, typically about an order of
magnitude smaller.

In this environment, peer-to-peer programs really are inherently
inefficient: on average, they use just as much of your inwards
bandwidth as your outwards bandwidth, but your inwards bandwidth
costs you ten times as much. So, **on an ADSL network, peer-to-peer
networking is an order of magnitude less efficient** than
data-center-based applications.

But it gets worse.
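The arithmetic behind that order-of-magnitude claim can be made
concrete with a back-of-the-envelope Python sketch; the 8 Mbps down /
1 Mbps up figures are illustrative assumptions, not measurements of
any particular ADSL line:

```python
# Hypothetical ADSL line rates, chosen only for illustration.
DOWNSTREAM_MBPS = 8.0  # outwards: data being sent *to* you
UPSTREAM_MBPS = 1.0    # inwards: data being sent *from* you

def p2p_throughput(down, up):
    """A peer that, on average, uploads one copy of everything it
    downloads is throttled to the smaller (upstream) half."""
    return min(down, up)

def client_server_throughput(down, up):
    """A pure consumer of a data-center service only needs the
    downstream half, so it can use all of it."""
    return down

p2p = p2p_throughput(DOWNSTREAM_MBPS, UPSTREAM_MBPS)
cs = client_server_throughput(DOWNSTREAM_MBPS, UPSTREAM_MBPS)
print(p2p, cs, cs / p2p)  # 1.0 8.0 8.0 -- the order-of-magnitude penalty
```

The exact ratio depends on the tier you buy, but any asymmetric split
puts peer-to-peer traffic at this kind of structural disadvantage.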
[Skype]: http://www.stevenlevy.com/index.php/05/10/why-google-does-not-own-skype (Why Google does not own Skype, 2011-05-10, by Steven Levy)

ADSL networks are almost twice as efficient as SDSL networks.
-------------------------------------------------------------

When ADSL started to roll out in the late 1990s, I was horrified and
opposed. It seemed like an unthinkable violation of the egalitarian
ethics of the internet, designed for brainless consumers of “content”
rather than full participants. In July 2011, I changed my mind.
Here’s why.

### Content-centric networking models actual internet use better than TCP. ###

Van Jacobson’s “Content-Centric Networking” work is based on the
premise that almost all of our internet usage today consists not of
people connecting to remote computers that provide them some service
(the designed purpose of TELNET) or sending a message to a single
other person (like email or Skype or other VOIP) but rather
retrieving named pieces of data from some cloud storage space, or
adding them to it.

That is, it’s much more publish-and-retrieve (and possibly subscribe)
than request-response or send-and-receive; it’s one-to-many
communication spread over time, rather than synchronous one-to-one
communication. But it’s built on top of the distributed synchronous
one-to-one communications provided by TCP and UDP, plus a lot of
ad-hoc barely-working multi-centralized server software, so it
doesn’t work as well as it could. VJ’s plan is to put the
publish-and-retrieve into the network as much as possible, instead of
into the endpoints. I believe he is correct.

### SDSL is almost twice as costly as ADSL for content-centric use. ###

Let’s look at a simplified egalitarian internet. *All* the
communication is ultimately between ordinary people in their houses,
looking at each other’s cat photos and home videos; none of it is to
Hulu.
They are connected to interconnected telephone central offices over
long and expensive limited-bandwidth “last mile” links; the central
offices themselves are interconnected over much-higher-bandwidth
links. How can we design our internet to make efficient use of scarce
resources?

One scarce resource in this scenario is last-mile bandwidth. Assume
that the bandwidth of the last mile must be partitioned statically
between inwards (towards the CO) and outwards (towards the house)
directions, rather than negotiated dynamically.

The SDSL home-server story is that the bandwidth should be symmetric,
because every time I download a cat photo on the outward half of my
connection, someone else has to upload it on the inward half of
theirs, so the average number of cat photos per second is the same on
inward and outward links.

But wait! Consider all the cat photos that at least one person has
looked at over the internet. Most of them have been looked at by only
one person over the internet. But many of them have been looked at by
more than one person over the internet. (None of them, by definition,
have been looked at by less than one person over the internet.) That
means that the **average number of views-over-the-internet per
photo** is greater than 1. In fact, it’s probably substantially
greater than 1. Say, 5 or 10.

In the SDSL home-server scenario, when people look at a particular
cat photo 5 times, the home server sends the cat photo inwards to the
central office 5 times, which then sends it outwards to the
link-clicker. But that’s silly. It would be more efficient to cache
the cat photo in the central office the first time it gets sent out
from the home server, then serve it from cache. You’d get better
latency and, at least in theory, better reliability.
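The last-mile accounting above can be sketched in a few lines of
Python; the 5-views-per-photo figure is just the “say, 5 or 10” guess
made concrete, not a measured statistic:

```python
# Last-mile traffic (in copies of one photo), assuming an average of
# 5 views per published photo -- an illustrative guess, per the text.
VIEWS_PER_PHOTO = 5

def sdsl_home_server(views):
    """No cache at the CO: the home server re-uploads the photo
    inwards once per view, and it goes outwards once per view."""
    inward = views   # copies over the publisher's last mile
    outward = views  # copies over the viewers' last miles
    return inward, outward

def co_cache(views):
    """Cache at the central office: one inward upload ever, then
    every view is served outwards from the cache."""
    inward = 1
    outward = views
    return inward, outward

print(sdsl_home_server(VIEWS_PER_PHOTO))  # (5, 5)
print(co_cache(VIEWS_PER_PHOTO))          # (1, 5)
```

With the cache, inward demand drops to a fifth (or a tenth) of
outward demand, which is the whole case for an asymmetric split.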
Right now we do this by storing the cat photo in a data center on the
disk of some broken-ass web app that’s probably built on top of MySQL
(Facebook, say), but you could do it with Van Jacobson’s
content-centric networking protocols, too, or by putting a Varnish
instance in front of the inward half of your connection.

But, once you do this caching, however you do it, you have several
times as much bandwidth being used outward as being used inward.
Every cat photo goes over an inward link only a single time, and on
average goes outward several times, like 5 or 10. Most of your inward
bandwidth necessarily goes idle.

This isn’t limited to asynchronous communication like posting a cat
photo on your page and hoping people will look at it later. The same
thing holds for things like chat rooms: it uses less last-mile
bandwidth to have a server in a data center receive a single copy of
your line of chat, then send copies of it to everyone else in the
chat room, rather than forcing your client on your machine to send a
copy directly to each of the people in the room over your DSL
connection. (Multi-person videoconferencing is probably a more
compelling example.)

So, as long as you have to allocate the last-mile bandwidth
statically, you might as well allocate most of it to outward
bandwidth, rather than inward bandwidth. The horrifying existence of
abominations like Hulu and the iTunes Music Store, then, is not the
root cause of ADSL. ADSL is just a more efficient way of allocating
limited last-mile bandwidth, but it requires that the bulk of
communications between people be mediated through some kind of
co-located “cloud” that avoids the need to upload more than one copy
of each file over your limited last-mile connection.

The current legal and social structure of the “cloud” is far more
horrific than Hulu, though.
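The chat-room fan-out above can be made concrete with a toy Python
sketch; the nine-participant room is an arbitrary illustrative
assumption:

```python
# Copies of one line of chat crossing *your* scarce inward last-mile
# link, in a room with N other participants (N = 9 is illustrative).
N = 9

def direct_p2p_copies(n):
    """Your client sends one copy directly to each participant, so
    your own DSL line carries n inward (upstream) copies."""
    return n

def relayed_copies(n):
    """A relay in a data center receives a single inward copy and
    fans it out from there; your last mile carries just one copy."""
    return 1

print(direct_p2p_copies(N), relayed_copies(N))  # 9 1
```

The recipients’ outward links carry one copy each in both schemes;
only the sender’s inward link benefits, but that’s the scarce half.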
Instead of having a content-neutral distributed publish-and-retrieve
facility, we have Facebook arbitrarily deleting photos of women
breastfeeding and discussion groups where Saudi women advocate for
public transit in Riyadh, YouTube selling your eyeballs to the
highest bidder, and MySpace forcing “terms of service” on you that
you can’t possibly have time to read, but which Lori Drew was
nevertheless criminally prosecuted for violating.

Better alternatives require redesigning the physical layer.
-----------------------------------------------------------

Both ADSL and SDSL are inefficient compared to the way Wi-Fi works,
which is typical of radio networking. In Wi-Fi, data travels in only
one direction over the link at any given time: either inwards or
outwards. That means that you don’t have to settle for uploading your
cat photos at 10% or 50% of the link’s bandwidth; you can use 100%.
(In theory, anyway. Wi-Fi itself has a lot of protocol overhead.)

The static FDM bandwidth allocation used in SDSL and ADSL, in which
some frequency channels are reserved for each direction of the
communication, is primitive, obsolete 20th-century technology. New
equipment that used adaptive CDMA or dynamic TDMA could provide
marginally better downstream bandwidth when upstream is little used,
and dramatically better upstream bandwidth when needed. I don’t know
of any such equipment on the market.

Another alternative, better adapted to the realities of
content-centric networking, is to adopt a more physically-based
topology. As an example, it’s absurd that the block I live on has
hundreds of separate 3 MHz copper pairs running to it, perhaps
totaling a gigabit, mostly carrying duplicate traffic: many of the
same cat photos, news stories, and Wikipedia articles everyone else
is reading.
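The contrast between a static FDM split and a demand-driven scheme
can be sketched as a toy model in Python; the 10 Mbps raw link and
the 9/1 split are illustrative assumptions, and a real dynamic TDMA
system would pay turnaround and scheduling overheads this model
ignores:

```python
# Toy model of one last-mile link with 10 Mbps of raw capacity
# (an illustrative figure, not any real DSL or radio standard).
RAW_MBPS = 10.0

def static_fdm(down_demand, up_demand, down_mbps=9.0, up_mbps=1.0):
    """ADSL-style: each direction is capped by its fixed slice of
    the spectrum, no matter what the other direction is doing."""
    return min(down_demand, down_mbps), min(up_demand, up_mbps)

def dynamic_tdma(down_demand, up_demand):
    """Timeslots granted on demand: either direction may use the
    whole link when the other is idle; when saturated, demands are
    served proportionally. Overheads are ignored."""
    total = min(down_demand + up_demand, RAW_MBPS)
    scale = total / (down_demand + up_demand or 1)
    return down_demand * scale, up_demand * scale

# An upload-heavy moment: publishing cat photos, barely downloading.
print(static_fdm(0.5, 8.0))    # (0.5, 1.0) -- upstream stuck at its slice
print(dynamic_tdma(0.5, 8.0))  # (0.5, 8.0) -- upstream gets what it asks for
```

The same model shows the “marginally better downstream” case too:
with upstream idle, dynamic allocation hands the downstream the whole
10 Mbps instead of 9.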
Properly-thought-out content-centric networking --- still a
pipe-dream --- would enable us to cache those items locally and
securely, communicate with each other when necessary without routing
our packets through a phone-company central office, and use the
entire bandwidth of that gigabit when it’s left idle. We ought to be
able to use multi-gigabit LAN connections to back up encrypted copies
of our important files to each other’s computers so that we don’t
lose them.

-- 
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-tol

