Observations of BitTorrent behavior, with an oversimplified model
-----------------------------------------------------------------

(This varies a lot from torrent to torrent.) On average, the number of seeders on a BitTorrent torrent is around 10% of the number of leechers, a number that gradually decreases until the torrent dies. This is not true for all torrents, but it's true of a substantial number of them.

If you're using such a torrent to publish a file (such as a home video or a collection of multi-megapixel photographs) that you wish to remain continuously available, you must operate your own permanent seed. However, you'd like most of the file data in your torrent to travel from one peer to another, rather than from your own seed to the peers.

To simplify, I assume that there are no incomplete downloads, and that each leecher becomes a seeder for some period of time after finishing --- about 10% of the time they had been leeching. As time goes on, the number of leechers in the torrent approaches the time to download a complete copy of the file multiplied by the frequency of new requests; P, the total number of peers aside from the permanent seed, is the number of leechers multiplied by a constant around 1.1.

Whenever P is 1 or 0, all of the data transmitted in the system must come from your permanent seed, and BitTorrent degrades to a download protocol similar to FTP or HTTP, but with better protection against corruption of data in transit.

Whenever P > 1, non-permanent peers can exchange data with one another (since they're online at the same time), and the load on the permanent seed is lessened. Indeed, if the number of other peers remains above 1 permanently, the permanent seed need never send out data blocks (under the simplifying assumptions above).

Where BitTorrent works well
---------------------------

To date, BitTorrent has mostly been used successfully with P in the hundreds, sometimes with a permanent seed and sometimes without.
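To make the scale concrete, the steady-state estimate of P from the simplified model above can be sketched in a few lines of Python. This is only an illustration of that model, not a measurement: the function name, its parameters, and the example figures (a 300MB file, 100 kB/s per peer, one new request every ten minutes) are all assumptions chosen for the example.

```python
def steady_state_peers(file_size_bytes, peer_bandwidth_bytes_per_s,
                       requests_per_s, seeding_fraction=0.10):
    """Estimate (leechers, P) under the oversimplified model.

    Assumes no incomplete downloads, and that each leecher then seeds for
    seeding_fraction of the time it spent leeching.
    """
    # Time for one peer to download a complete copy of the file.
    download_time_s = file_size_bytes / peer_bandwidth_bytes_per_s
    # Leechers approach download time multiplied by request frequency.
    leechers = download_time_s * requests_per_s
    # P counts leechers plus the short-lived seeders (a factor around 1.1).
    total_peers = leechers * (1.0 + seeding_fraction)
    return leechers, total_peers


# Hypothetical example: 300MB file, 100 kB/s per peer, one request per 10 minutes.
leechers, p = steady_state_peers(300 * 10**6, 100 * 10**3, 1 / 600)
print(f"leechers = {leechers:.2f}, P = {p:.2f}")
```

With these made-up inputs the model gives about 5 leechers and a P of about 5.5, far below the hundreds seen in torrents that work well.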
However, since P is proportional to the frequency of new requests and to the file size, and inversely proportional to the bandwidth to these peers, BitTorrent has not been very successful at distributing small files or those with only a few requestors at any given time.

Making BitTorrent work better through aggregation
-------------------------------------------------

One answer is multi-file torrents, or ISO torrents, in which a torrent contains hundreds of megabytes of data belonging to many different files. This attacks the problem on two fronts: first, it increases the number of requests for a particular torrent by aggregating demand for many individual files into a single torrent, and second, it increases the amount of time required to complete the download. This may have problems when applied to very large files, however.

BitTorrent works much better than previous peer-to-peer file-sharing systems for several reasons, of which the best known is that the software embodies a social norm of reciprocation. The most effective way to get good service from a torrent is to send data to other participants so that they will want to send data to you. Therefore, modifying your own copy of the software to improve your own service at the cost of others is quite difficult. (Most previous systems also suffered from not having "swarming downloads" and from attempting to provide search and confidentiality or deniability facilities.)

Suppose, though, you're downloading a 300MB OurMedia video, and you find that it's packaged into a 10GB UDF filesystem. The whole filesystem will take up US$10 of disk space when it finishes downloading, cost you about US$3 of bandwidth at US DSL rates to download and a similar amount of bandwidth to upload to others, and, worst of all, take about a day. Downloading the video alone would cost you only about US$0.15 to download and US$0.15 to upload, and take less than an hour.
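As a rough check on the timing in this example, the following Python sketch derives the transfer rate implied by "10GB in about a day" and applies it to the 300MB video. It assumes that "about a day" means exactly 24 hours, that sizes are in decimal units, and that both downloads proceed at the same rate; the function names are illustrative only.

```python
GB = 10**9  # decimal gigabyte (assumption)
MB = 10**6  # decimal megabyte (assumption)


def implied_rate_bytes_per_s(size_bytes, seconds):
    """Average rate needed to move size_bytes in the given time."""
    return size_bytes / seconds


def download_minutes(size_bytes, rate_bytes_per_s):
    """Minutes needed to download size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s / 60


# Rate implied by fetching the 10GB filesystem in 24 hours.
rate = implied_rate_bytes_per_s(10 * GB, 24 * 3600)
# Time to fetch only the 300MB video at that same rate.
minutes = download_minutes(300 * MB, rate)
print(f"implied rate: {rate / 1000:.1f} kB/s, video alone: {minutes:.1f} minutes")
```

Under those assumptions the video alone takes about 43 minutes, consistent with "less than an hour".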
You might modify your BitTorrent client's piece-selection algorithm to preferentially request the parts of the UDF filesystem that interest you, and to preferentially talk to peers who have them. If everyone does this, the result is rather as if the peers were divided among 33 different torrents; each peer stays online for less time, and other peers it will want to talk to come online less often.

Making aggregation more effective
---------------------------------

Suppose that instead of aggregating 33 OurMedia videos into a single UDF filesystem, the parts of which can be sensibly requested separately, we use an M-of-N secret-sharing scheme to encode the 33 videos into a file such that you must download the whole file, or nearly the whole file, to recover any of the original videos from it. Now we have aligned everyone's incentives properly, without making any changes to the BitTorrent protocol or software --- simply by adding an archiving/unarchiving step outside the purview of BitTorrent itself. Nobody can gain any use from this torrent without downloading the whole thing, or nearly so.

The drawbacks of aggregation
----------------------------

The aggregation approach is costly. Even if everyone participating in the earlier-mentioned 10GB torrent is interested in only one video, they have to pay an aggregation tax of 9.7GB --- uploaded, downloaded, and stored --- merely to provide incentives for others to cooperate with them. Economically, this is nearly a deadweight loss. It can be diminished slightly by aggregating related files instead of randomly selected ones.

As a consequence, ISPs will wish their customers would download smaller torrents, and pay-for-download services (PayPal me $2 and I'll give you this 300MB file) will have a total cost advantage over peer-to-peer downloading.
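The aggregation tax quoted above can be checked with a few lines of arithmetic. The Python sketch below assumes decimal units and treats every video as exactly 300MB; the function name and return values are illustrative, not part of any BitTorrent tool.

```python
GB = 10**9  # decimal gigabyte (assumption)
MB = 10**6  # decimal megabyte (assumption)


def aggregation_tax(aggregate_bytes, wanted_bytes):
    """Return (unwanted bytes, unwanted fraction, whole files that fit)."""
    tax_bytes = aggregate_bytes - wanted_bytes
    tax_fraction = tax_bytes / aggregate_bytes
    # How many files of the wanted size fit in the aggregate.
    files_in_aggregate = aggregate_bytes // wanted_bytes
    return tax_bytes, tax_fraction, files_in_aggregate


tax, fraction, count = aggregation_tax(10 * GB, 300 * MB)
print(f"tax = {tax / GB:.1f}GB ({fraction:.0%} of the torrent), {count} videos fit")
```

For a peer who wants a single 300MB video out of the 10GB aggregate, 9.7GB (97% of the transfer) is overhead, and 33 such videos fit in the aggregate, matching the figures used in the text.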
Other approaches to supporting small and unpopular files with BitTorrent
------------------------------------------------------------------------

Trickling
---------

When P = 1, the permanent seed controls the amount of time the client is online by how fast it sends the requested file. Consequently, it can lengthen the time to download the file, increasing the likelihood that P will rise as another peer comes online. However, if there are only a few ephemeral peers and they are talking to one another quickly, then each time the permanent seed sends the last block of the file, the other peers will quickly pass it among themselves and soon go offline. Still, this is a way to triple or quadruple your effective bandwidth, which is worth something.

CacheLogic
----------

CacheLogic is a company that sells peer-to-peer caching devices to ISPs. The idea is that the ISP saves a copy of whatever BitTorrent data passes through its network, and serves that data to you from its local copy whenever you request it. (I'm not clear on whether the device merely participates as a peer in the torrent or actually hijacks your attempted connections to other peers.)

Payment
-------

MojoNation, and later MNet, had the idea that you would exchange some sort of currency for services such as file downloads. This allows the reciprocity enforcement mechanism to extend beyond pairwise relationships into group relationships, and provides a common measure for otherwise incommensurable goods. It would give people an incentive to download files they didn't themselves want, so that they could earn credit from other people who did want them, and then use that credit to download things they did want. I have written about this a bit on the CommerceNet zLab Wiki.

