Observations of BitTorrent behavior, with an oversimplified model
-----------------------------------------------------------------

(This varies a lot from torrent to torrent.)

On average, the number of seeders on a BitTorrent torrent is around
10% of the number of leechers, a number that gradually decreases until
the torrent dies.  This is not true for all torrents, but it's true of
a substantial number of them.

If you're using such a torrent to publish a file (such as a home video
or collection of multi-megapixel photographs) that you wish to remain
continuously available, you must operate your own permanent seed.
However, you'd like most of the file data in your torrent to travel
from one peer to another, rather than from your own seed to the peers.

To simplify, I assume that there are no incomplete downloads, and each
leecher becomes a seeder for some period of time after they finish ---
about 10% of the time they had been leeching.

As time goes on, the number of leechers in the torrent approaches the
time to download a complete copy of the file, multiplied by the
frequency of new requests; P, the number of total peers aside from the
permanent seed, is the number of leechers multiplied by a constant
around 1.1.
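
The steady-state estimate above is an instance of Little's law
(average number in the system = arrival rate x time spent in the
system).  Here is a minimal sketch of it; the function name and the
example figures (a 4-hour download, 2 new requests per hour) are my
own illustrative assumptions, not measurements.

```python
def steady_state_peers(download_hours, requests_per_hour, seed_fraction=0.1):
    """Return (leechers, P) under the simplified model in the text.

    seed_fraction is the assumed extra time each peer spends seeding,
    as a fraction of the time it spent leeching (about 10% here).
    """
    leechers = download_hours * requests_per_hour
    total_peers = leechers * (1 + seed_fraction)
    return leechers, total_peers


# Hypothetical torrent: 4 hours per download, 2 new requests per hour.
leechers, p = steady_state_peers(download_hours=4.0, requests_per_hour=2.0)
print(f"leechers ~ {leechers:.1f}, P ~ {p:.1f}")
```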

Whenever P is 1 or 0, all of the data transmitted in the system must
come from your permanent seed, and BitTorrent degrades to a download
protocol similar to FTP or HTTP, but with better protection against
corruption of data in transit.

Whenever P > 1, non-permanent peers can exchange data with one another
(since they're online at the same time), and the load on the permanent
seed is lessened.  Indeed, if the number of other peers remains above
1 permanently, the permanent seed need never send out data blocks
(under the simplifying assumptions above).

Where BitTorrent works well
---------------------------

To date, BitTorrent has mostly been used successfully with P in the
hundreds, sometimes with a permanent seed and sometimes without.
However, P is proportional to the frequency of new requests and to
the file size, and inversely proportional to the bandwidth available
to the peers, so BitTorrent has not been very successful at
distributing small files or those with only a few requestors at any
given time.
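
To make the proportionality concrete, here is a rough sketch that
computes P from file size, per-peer bandwidth, and request rate under
the same oversimplified model.  The function name and both example
workloads are assumptions chosen for illustration.

```python
def expected_peers(file_mb, peer_kb_per_s, requests_per_hour, seed_fraction=0.1):
    """Estimate P: proportional to file size and request rate,
    inversely proportional to the bandwidth available to each peer."""
    download_hours = (file_mb * 1000.0 / peer_kb_per_s) / 3600.0
    return download_hours * requests_per_hour * (1 + seed_fraction)


# Hypothetical popular large file: 700MB, 100KB/s peers, 60 requests/hour.
big = expected_peers(file_mb=700, peer_kb_per_s=100, requests_per_hour=60)
# Hypothetical small, unpopular file: 5MB, same peers, 1 request/hour.
small = expected_peers(file_mb=5, peer_kb_per_s=100, requests_per_hour=1)

print(f"large popular file:   P ~ {big:.0f}")
print(f"small unpopular file: P ~ {small:.3f}")
```

In the second case P is far below 1, so nearly every byte would have
to come from the permanent seed.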

Making BitTorrent work better through aggregation
-------------------------------------------------

One answer is multi-file torrents, or ISO torrents, in which a torrent
contains hundreds of megabytes of data belonging to many different
files.  This attacks the problem on two fronts: first, it increases
the number of requests for a particular torrent by aggregating demand
for many individual files into a single torrent, and second, it
increases the amount of time required to complete the download.

This may have problems when applied to very large files, however.

BitTorrent works much better than previous peer-to-peer file-sharing
systems for several reasons, of which the best known is that the
software embodies a social norm of reciprocation.  The most effective
way to get good service from a torrent is to send data to other
participants so that they will want to send data to you.  Therefore,
modifying your own copy of the software to improve your own service at
the cost of others is quite difficult.

(Most previous systems also suffered from not having "swarming
downloads" and from attempting to provide search and confidentiality
or deniability facilities.)

Suppose, though, you're downloading a 300MB OurMedia video, and you
find that it's packaged into a 10GB UDF filesystem.  When it finishes
downloading it will take up US$10 of disk space; it will cost you
about US$3 of bandwidth at US DSL rates to download, and a similar
amount to upload to others; and worst of all, it will take about a
day.  Downloading the video alone would cost you only about US$0.15
to download and US$0.15 to upload, and would take less than an hour.
You might modify your BitTorrent client's piece-selection algorithm
to preferentially request the parts of the UDF filesystem that
interest you, and preferentially talk to peers who have them.

If everyone did this, the result would be rather as if they were
divided among 33 different torrents: each peer would stay online for
less time, and other peers it wanted to talk to would come online
less often.
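
A back-of-the-envelope sketch of that effect, under assumptions that
are mine rather than measured: the videos are all the same size,
demand is spread evenly across them, and download time is
proportional to bytes fetched.  Both factors named above (less time
online, fewer interesting arrivals) then shrink by the number of
videos.

```python
def peers_per_video(aggregate_peers, n_videos):
    """Estimate useful peers per video if every client fetches only its own video.

    Each peer is online for 1/n_videos as long, and only 1/n_videos of
    new arrivals want the same video, so P falls by n_videos squared.
    """
    return aggregate_peers / (n_videos * n_videos)


# Hypothetical aggregate torrent with P = 200, split 33 ways.
print(f"P per video ~ {peers_per_video(200, 33):.2f}")
```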

Making aggregation more effective
---------------------------------

Suppose that instead of aggregating 33 OurMedia videos into a single
UDF filesystem, the parts of which can be sensibly requested
separately, we use an M-of-N secret-sharing scheme to encode the 33
videos into a file such that you must download the whole file, or
nearly the whole file, to recover any of the original videos from it.

Now we have aligned everyone's incentives properly, without making any
changes to the BitTorrent protocol or software --- simply by adding an
archiving/unarchiving step outside the purview of BitTorrent itself.
Nobody can gain any use from this torrent without downloading the
whole thing, or nearly so.
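
To illustrate the archiving/unarchiving step, here is a toy sketch.
Note that it is not an M-of-N secret-sharing scheme: I have swapped
in a simpler all-or-nothing packaging, which needs every byte rather
than "nearly the whole file".  The data is masked with a keystream
derived from a random key, and the key is stored masked by a hash of
the whole masked body, so the key cannot be recovered without the
complete body.  The function names are hypothetical, and this has not
been vetted as cryptography.

```python
import hashlib
import secrets


def _keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key using SHA-256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def package(data: bytes) -> bytes:
    """Mask data with a random key, then append the key masked by a hash of the body."""
    key = secrets.token_bytes(32)
    body = _xor(data, _keystream(key, len(data)))
    masked_key = _xor(key, hashlib.sha256(body).digest())
    return body + masked_key


def unpackage(blob: bytes) -> bytes:
    """Recover the data; this needs every byte of the blob to rebuild the key."""
    body, masked_key = blob[:-32], blob[-32:]
    key = _xor(masked_key, hashlib.sha256(body).digest())
    return _xor(body, _keystream(key, len(body)))


archive = b"stand-in for 33 concatenated videos"
assert unpackage(package(archive)) == archive
```

The packaged blob is what would be published as the torrent's single
file; a peer holding only some of its pieces cannot compute the hash
of the body, and so cannot unmask the key.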

The drawbacks of aggregation
----------------------------

The aggregation approach is costly.  Even if everyone participating in
the earlier-mentioned 10GB torrent is interested in only one video,
they have to pay an aggregation tax of 9.7GB --- uploaded, downloaded,
and stored --- merely to provide incentives for others to cooperate
with them.  Economically, this is nearly a deadweight loss.  It can be
diminished slightly by aggregating related files instead of randomly
selected ones.

As a consequence, ISPs will wish their customers would download
smaller torrents, and pay-for-download services (PayPal me $2 and I'll
give you this 300MB file) will have a total cost advantage over
peer-to-peer downloading.
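
Putting the dollar figures quoted earlier side by side makes the
point.  This sketch uses only numbers from the text; disk space for
the 300MB video alone is left out because no figure is given for it,
and the variable names are mine.

```python
# Costs in US dollars, taken from the figures quoted in the text.
aggregated = {"disk": 10.00, "download": 3.00, "upload": 3.00}
video_alone = {"download": 0.15, "upload": 0.15}
paid_download = {"fee": 2.00, "download": 0.15}

aggregated_total = sum(aggregated.values())
video_alone_total = sum(video_alone.values())
paid_total = sum(paid_download.values())

print(f"10GB aggregated torrent:      ${aggregated_total:.2f}")
print(f"300MB torrent, if it worked:  ${video_alone_total:.2f}")
print(f"pay-for-download:             ${paid_total:.2f}")
```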

Other approaches to supporting small and unpopular files with BitTorrent
------------------------------------------------------------------------

Trickling
---------

When P = 1, the permanent seed controls the amount of time the client
is online by how fast it sends the requested file.  Consequently it
can lengthen the time to download the file to increase the likelihood
that P will increase as another peer comes online.  However, if there
are only a few ephemeral peers and they are talking to one another
quickly, each time the permanent seed sends the last block of the
file, the other peers will quickly communicate it among themselves and
soon go offline.  Still, this is a way to triple or quadruple your
effective bandwidth, which is worth something.
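
A rough way to see why stretching the download helps: if new
requests arrive independently at a steady average rate (a Poisson
assumption of mine, not something measured), the chance that at least
one other peer shows up while the first is still downloading grows
with the download time.  The function name and the example rate are
illustrative.

```python
import math


def chance_of_overlap(requests_per_hour, download_hours):
    """Probability that at least one new peer arrives during the download,
    assuming Poisson arrivals at the given average rate."""
    return 1.0 - math.exp(-requests_per_hour * download_hours)


# Hypothetical unpopular file: one request every 6 hours on average.
rate = 1 / 6
for hours in (0.5, 2.0, 6.0):
    print(f"download stretched to {hours:>4} h: "
          f"{chance_of_overlap(rate, hours):.0%} chance of a second peer")
```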

CacheLogic
----------

CacheLogic is a company that sells peer-to-peer caching devices to
ISPs.  The idea is that the ISP saves a copy of whatever BitTorrent
data passes through it, and serves it to you from their local copy
whenever you request it.  (I'm not clear on whether it merely
participates as a peer in the torrent or actually hijacks your
attempted connections to other peers.)

Payment
-------

MojoNation, and later MNet, had the idea that you would exchange some
sort of currency for services such as file downloads.  This allows the
reciprocity enforcement mechanism to extend beyond pairwise
relationships into group relationships, and to provide a measure of
otherwise incommensurable goods.  This would provide an incentive for
people to download files they didn't themselves want, so that they
could earn credit from other people who did want them, and then use
that credit to download things they did want.
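
A toy ledger shows how credit lets reciprocity extend beyond pairs.
This is purely a hypothetical sketch of the idea, not MojoNation's or
MNet's actual design; the class, the peer names, and the
one-credit-per-megabyte pricing are all assumptions.

```python
from collections import defaultdict


class CreditLedger:
    """Tracks credit earned by uploading and spent by downloading."""

    def __init__(self):
        self.balance = defaultdict(int)

    def record_transfer(self, uploader, downloader, megabytes):
        # Assumed pricing: one credit per megabyte transferred.
        self.balance[uploader] += megabytes
        self.balance[downloader] -= megabytes


ledger = CreditLedger()
# Alice serves Bob a file she never wanted for herself...
ledger.record_transfer(uploader="alice", downloader="bob", megabytes=300)
# ...and spends the credit on a file from Carol, to whom she sent nothing.
ledger.record_transfer(uploader="carol", downloader="alice", megabytes=300)

print(dict(ledger.balance))
```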

I have written about this a bit on the CommerceNet zLab Wiki.
