Peer-to-peer overlay networks are inefficient on ADSL networks. ADSL networks are almost twice as efficient as SDSL networks. Better alternatives require redesigning the physical layer.
Peer-to-peer overlay networks are inefficient on ADSL networks.
---------------------------------------------------------------

I was reading an article by Steven Levy about Google not buying
[Skype][], and I was appalled by what Wesley Chan was quoted as saying:

> He concluded that one of Skype’s key assets – its peer-to-peer
> technology — was a mismatch for Google, which worked on the newer
> paradigm of cloud computing. “The worst thing about peer-to-peer is
> that it doesn’t work well with Google,” [Wesley] Chan told me
> [Steven Levy] during an amazing interview for IN THE PLEX in
> February 2010. “Peer-to-peer just eats up your bandwidth, right,
> it’s like the old technology.”

I was so disgusted that I made a note: “Google may not ‘be evil’, but
Wesley Chan surely is!” As long as we’re stuck in the paradigm of
cloud computing, where we depend on software running within the data
centers of Google and companies like that instead of on our own
machines, we’re very vulnerable to abuses.

But upon some thought, I concluded that I was putting the blame in
the wrong place. It’s ADSL’s fault, not Wesley Chan’s. He’s just
observing the incentive structure created by ADSL.

**ADSL**: an ADSL line is a connection to the rest of the network
that is statically partitioned between a high-bandwidth part
outwards, for data being sent *to* you, and a low-bandwidth part
inwards, for data being sent *from* you, typically about an order of
magnitude smaller.

In this environment, peer-to-peer programs really are inherently
inefficient: on average, they use just as much of your inwards
bandwidth as your outwards bandwidth, but your inwards bandwidth
costs you ten times as much. So, **on an ADSL network, peer-to-peer
networking is an order of magnitude less efficient** than
data-center-based applications.

But it gets worse.
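The arithmetic behind that order-of-magnitude claim can be made
concrete with a back-of-the-envelope Python sketch; the 8 Mbps down /
1 Mbps up figures are illustrative assumptions, not measurements of
any particular ADSL line:

```python
# Hypothetical ADSL line rates, chosen only for illustration.
DOWNSTREAM_MBPS = 8.0  # outwards: data being sent *to* you
UPSTREAM_MBPS = 1.0    # inwards: data being sent *from* you

def p2p_throughput(down, up):
    """A peer that, on average, uploads one copy of everything it
    downloads is throttled to the smaller (upstream) half."""
    return min(down, up)

def client_server_throughput(down, up):
    """A pure consumer of a data-center service only needs the
    downstream half, so it can use all of it."""
    return down

p2p = p2p_throughput(DOWNSTREAM_MBPS, UPSTREAM_MBPS)
cs = client_server_throughput(DOWNSTREAM_MBPS, UPSTREAM_MBPS)
print(p2p, cs, cs / p2p)  # 1.0 8.0 8.0 -- the order-of-magnitude penalty
```

The exact ratio depends on the tier you buy, but any asymmetric split
puts peer-to-peer traffic at this kind of structural disadvantage.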
[Skype]: http://www.stevenlevy.com/index.php/05/10/why-google-does-not-own-skype (Why Google does not own Skype, 2011-05-10, by Steven Levy)

ADSL networks are almost twice as efficient as SDSL networks.
-------------------------------------------------------------

When ADSL started to roll out in the late 1990s, I was horrified and
opposed. It seemed like an unthinkable violation of the egalitarian
ethics of the internet, designed for brainless consumers of “content”
rather than full participants. In July 2011, I changed my mind.
Here’s why.

### Content-centric networking models actual internet use better than TCP. ###

Van Jacobson’s “Content-Centric Networking” work is based on the
premise that almost all of our internet usage today consists not of
people connecting to remote computers that provide them some service
(the designed purpose of TELNET) or sending a message to a single
other person (like email or Skype or other VOIP) but rather
retrieving named pieces of data from some cloud storage space, or
adding them to it.

That is, it’s much more publish-and-retrieve (and possibly subscribe)
than request-response or send-and-receive; it’s one-to-many
communication spread over time, rather than synchronous one-to-one
communication. But it’s built on top of the distributed synchronous
one-to-one communications provided by TCP and UDP, plus a lot of
ad-hoc barely-working multi-centralized server software, so it
doesn’t work as well as it could. VJ’s plan is to put the
publish-and-retrieve into the network as much as possible, instead of
into the endpoints. I believe he is correct.

### SDSL is almost twice as costly as ADSL for content-centric use. ###

Let’s look at a simplified egalitarian internet. *All* the
communication is ultimately between ordinary people in their houses,
looking at each other’s cat photos and home videos; none of it is to
Hulu.
They are connected to interconnected telephone central offices over
long and expensive limited-bandwidth “last mile” links; the central
offices themselves are interconnected over much-higher-bandwidth
links. How can we design our internet to make efficient use of scarce
resources?

One scarce resource in this scenario is last-mile bandwidth. Assume
that the bandwidth of the last mile must be partitioned statically
between inwards (towards the CO) and outwards (towards the house)
directions, rather than negotiated dynamically.

The SDSL home-server story is that the bandwidth should be symmetric,
because every time I download a cat photo on the outward half of my
connection, someone else has to upload it on the inward half of
theirs, so the average number of cat photos per second is the same on
inward and outward links.

But wait! Consider all the cat photos that at least one person has
looked at over the internet. Most of them have been looked at by only
one person over the internet. But many of them have been looked at by
more than one person over the internet. (None of them, by definition,
have been looked at by less than one person over the internet.) That
means that the **average number of views-over-the-internet per
photo** is greater than 1. In fact, it’s probably substantially
greater than 1. Say, 5 or 10.

In the SDSL home-server scenario, when people look at a particular
cat photo 5 times, the home server sends the cat photo inwards to the
central office 5 times, which then sends it outwards to the
link-clicker. But that’s silly. It would be more efficient to cache
the cat photo in the central office the first time it gets sent out
from the home server, then serve it from cache. You’d get better
latency and, at least in theory, better reliability.
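The last-mile accounting above can be sketched in a few lines of
Python; the 5-views-per-photo figure is just the “say, 5 or 10” guess
made concrete, not a measured statistic:

```python
# Last-mile traffic (in copies of one photo), assuming an average of
# 5 views per published photo -- an illustrative guess, per the text.
VIEWS_PER_PHOTO = 5

def sdsl_home_server(views):
    """No cache at the CO: the home server re-uploads the photo
    inwards once per view, and it goes outwards once per view."""
    inward = views   # copies over the publisher's last mile
    outward = views  # copies over the viewers' last miles
    return inward, outward

def co_cache(views):
    """Cache at the central office: one inward upload ever, then
    every view is served outwards from the cache."""
    inward = 1
    outward = views
    return inward, outward

print(sdsl_home_server(VIEWS_PER_PHOTO))  # (5, 5)
print(co_cache(VIEWS_PER_PHOTO))          # (1, 5)
```

With the cache, inward demand drops to a fifth (or a tenth) of
outward demand, which is the whole case for an asymmetric split.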
Right now we do this by storing the cat photo in a data center on the
disk of some broken-ass web app that’s probably built on top of MySQL
(Facebook, say), but you could do it with Van Jacobson’s
content-centric networking protocols, too, or by putting a Varnish
instance in front of the inward half of your connection.

But, once you do this caching, however you do it, you have several
times as much bandwidth being used outward as being used inward.
Every cat photo goes over an inward link only a single time, and on
average goes outward several times, like 5 or 10. Most of your inward
bandwidth necessarily goes idle.

This isn’t limited to asynchronous communication like posting a cat
photo on your page and hoping people will look at it later. The same
thing holds for things like chat rooms: it uses less last-mile
bandwidth to have a server in a data center receive a single copy of
your line of chat, then send copies of it to everyone else in the
chat room, rather than forcing your client on your machine to send a
copy directly to each of the people in the room over your DSL
connection. (Multi-person videoconferencing is probably a more
compelling example.)

So, as long as you have to allocate the last-mile bandwidth
statically, you might as well allocate most of it to outward
bandwidth, rather than inward bandwidth. The horrifying existence of
abominations like Hulu and the iTunes Music Store, then, is not the
root cause of ADSL. ADSL is just a more efficient way of allocating
limited last-mile bandwidth, but it requires that the bulk of
communications between people be mediated through some kind of
co-located “cloud” that avoids the need to upload more than one copy
of each file over your limited last-mile connection.

The current legal and social structure of the “cloud” is far more
horrific than Hulu, though.
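The chat-room fan-out above can be made concrete with a toy Python
sketch; the nine-participant room is an arbitrary illustrative
assumption:

```python
# Copies of one line of chat crossing *your* scarce inward last-mile
# link, in a room with N other participants (N = 9 is illustrative).
N = 9

def direct_p2p_copies(n):
    """Your client sends one copy directly to each participant, so
    your own DSL line carries n inward (upstream) copies."""
    return n

def relayed_copies(n):
    """A relay in a data center receives a single inward copy and
    fans it out from there; your last mile carries just one copy."""
    return 1

print(direct_p2p_copies(N), relayed_copies(N))  # 9 1
```

The recipients’ outward links carry one copy each in both schemes;
only the sender’s inward link benefits, but that’s the scarce half.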
Instead of having a content-neutral distributed publish-and-retrieve
facility, we have Facebook arbitrarily deleting photos of women
breastfeeding and discussion groups where Saudi women advocate for
public transit in Riyadh, YouTube selling your eyeballs to the
highest bidder, and MySpace forcing “terms of service” on you that
you can’t possibly have time to read, but which Lori Drew was
nevertheless criminally prosecuted for violating.

Better alternatives require redesigning the physical layer.
-----------------------------------------------------------

Both ADSL and SDSL are inefficient compared to the way Wi-Fi works,
which is typical of radio networking. In Wi-Fi, data travels in only
one direction over the link at any given time: either inwards or
outwards. That means that you don’t have to settle for uploading your
cat photos at 10% or 50% of the link’s bandwidth; you can use 100%.
(In theory, anyway. Wi-Fi itself has a lot of protocol overhead.)

The static FDM bandwidth allocation used in SDSL and ADSL, in which
some frequency channels are reserved for each direction of the
communication, is primitive, obsolete 20th-century technology. New
equipment that used adaptive CDMA or dynamic TDMA could provide
marginally better downstream bandwidth when upstream is little used,
and dramatically better upstream bandwidth when needed. I don’t know
of any such equipment on the market.

Another alternative, better adapted to the realities of
content-centric networking, is to adopt a more physically-based
topology. As an example, it’s absurd that the block I live on has
hundreds of separate 3 MHz copper pairs running to it, perhaps
totaling a gigabit, mostly carrying duplicate traffic: many of the
same cat photos, news stories, and Wikipedia articles everyone else
is reading.
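The contrast between a static FDM split and a demand-driven scheme
can be sketched as a toy model in Python; the 10 Mbps raw link and
the 9/1 split are illustrative assumptions, and a real dynamic TDMA
system would pay turnaround and scheduling overheads this model
ignores:

```python
# Toy model of one last-mile link with 10 Mbps of raw capacity
# (an illustrative figure, not any real DSL or radio standard).
RAW_MBPS = 10.0

def static_fdm(down_demand, up_demand, down_mbps=9.0, up_mbps=1.0):
    """ADSL-style: each direction is capped by its fixed slice of
    the spectrum, no matter what the other direction is doing."""
    return min(down_demand, down_mbps), min(up_demand, up_mbps)

def dynamic_tdma(down_demand, up_demand):
    """Timeslots granted on demand: either direction may use the
    whole link when the other is idle; when saturated, demands are
    served proportionally. Overheads are ignored."""
    total = min(down_demand + up_demand, RAW_MBPS)
    scale = total / (down_demand + up_demand or 1)
    return down_demand * scale, up_demand * scale

# An upload-heavy moment: publishing cat photos, barely downloading.
print(static_fdm(0.5, 8.0))    # (0.5, 1.0) -- upstream stuck at its slice
print(dynamic_tdma(0.5, 8.0))  # (0.5, 8.0) -- upstream gets what it asks for
```

The same model shows the “marginally better downstream” case too:
with upstream idle, dynamic allocation hands the downstream the whole
10 Mbps instead of 9.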
Properly-thought-out content-centric networking --- still a
pipe-dream --- would enable us to cache those items locally and
securely, communicate with each other when necessary without routing
our packets through a phone-company central office, and use the
entire bandwidth of that gigabit when it’s left idle. We ought to be
able to use multi-gigabit LAN connections to back up encrypted copies
of our important files to each other’s computers so that we don’t
lose them.

-- 
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-tol

