>ANY localization discovery protocol must also include cache discovery, treating caches as special nodes which will only serve a set of subranges.
I agree that the overall localization solution contemplated by the IETF probably should include cache discovery and usage. (In particular, the overall solution must not preclude the inclusion of caching, although caching may or may not be part of any particular deployment.) I'm not yet 100% convinced that the localization discovery protocol and the cache discovery protocol need to be the *same* protocol. But to be fair, I changed my mind on this subject just at IETF Dublin...

-- Rich

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Nicholas Weaver
Sent: Wednesday, December 03, 2008 2:27 PM
To: Y. R. Yang
Cc: Le Blond, Stevens; Nicholas Weaver; Arnaud Legout; [email protected]
Subject: Re: [alto] Paper on "Pushing BitTorrent Locality to the Limit"

On Dec 3, 2008, at 9:18 AM, Y. R. Yang wrote:
>
> I found this discussion on simulation, controlled experiments, and real
> trials quite interesting. I found the paper "Pushing BitTorrent Locality to
> the Limit" quite interesting.
> Here are a few observations from our experiences:

The other factor is that you can also push the mental dial up to 100% localization:

With HTTP/server, without caches:
  N copies inbound on the transit link
  0 copies outbound on the transit link
  N copies through the ISP's internal cloud
  N copies inbound on the last hop
  0 copies outbound on the last hop
  Performance is limited by the web server's upload bandwidth, the transit link, and the end user's download bandwidth.
  ISP's cost is the transit link and the last hop.
  Content provider's cost is their transit cost (often much less than ISP transit cost).

With HTTP/server and distributed caches (Akamai, CoralCache, inline HTTP cache), assuming perfection:
  1 copy inbound on the transit link
  0 copies outbound on the transit link
  N+1 copies through the ISP's internal cloud
  N copies inbound on the last hop
  0 copies outbound on the last hop
  Performance is limited by the cache's bandwidth and the end user's download bandwidth.
  ISP's cost is just the last hop.
  Content provider's cost is the distributed cache's cost (which can be low or can be VERY expensive, I'm looking at you Akamai!).

With P2P, some localization, 1 < X < N:
  ~X copies inbound on the transit link
  ~X copies outbound on the transit link
  N+X copies through the ISP's internal cloud
  N copies inbound on the last hop
  N copies outbound on the last hop
  Performance is limited by the end user's download bandwidth, the transit link, and the end user's upload bandwidth + ability to freeride.
  ISP's cost is both transit in and out, and the last hop in AND out.
  Content provider's cost is very low (can go to near 0 if it disallows most freeriding once the swarm is seeded).

With 100% localization and without caches:
  1 copy inbound on the transit link
  1 copy outbound on the transit link
  N+1 copies through the ISP's internal cloud
  N copies inbound on the last hop
  N copies outbound on the last hop
  Performance is limited by the end user's download bandwidth, the transit link, and the end user's upload bandwidth + ability to freeride.
  ISP's cost is the last hop in AND out.
  Content provider's cost is very low.

The interesting thing about localization is that, even in the limit, it does not eliminate the traffic on the last hop uplinks, and it does not eliminate the performance bottleneck of the end user's upload bandwidth, or the dependence on freeriding if you want to download faster than the uplink allows. This is actually a big problem: for many technologies (DSL, cable) the last mile is highly asymmetric and, for some (e.g., cable), the last hop uplink can often be a serious point of shared congestion and cost (adding bandwidth means killing TV channels!). Thus, for some significant networks, P2P is a huge magnification in aggregate cost even with perfect localization.

OTOH, if you add caches, then you get really GOOD behavior because, unlike HTTP caches, a failure in the cache doesn't cause a failure of the system (the cache is out of path). Adding such caches can be done without client changes (a cache is just another client node which allows freeriding ONLY to local nodes), while removing the uplink as a performance bottleneck (the cache can be placed in the ISP's internal cloud). And for legitimate content, you don't have all the legal headaches you do on BitTorrent.
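The per-link tallies above can be captured in a toy model. This is only a sketch of the bookkeeping in the message, for a swarm of N downloaders behind one ISP; the function and scenario names are illustrative, not from any protocol:

```python
def link_copies(n, scenario, x=None):
    """Copies of the content crossing each link class, per the
    scenarios in the message. x is the number of copies crossing
    the transit link under partial localization (1 < x < n)."""
    if scenario == "http":               # HTTP/server, no caches
        return dict(transit_in=n, transit_out=0, internal=n,
                    last_hop_in=n, last_hop_out=0)
    if scenario == "http_cache":         # perfect distributed cache
        return dict(transit_in=1, transit_out=0, internal=n + 1,
                    last_hop_in=n, last_hop_out=0)
    if scenario == "p2p_partial":        # P2P, some localization
        return dict(transit_in=x, transit_out=x, internal=n + x,
                    last_hop_in=n, last_hop_out=n)
    if scenario == "p2p_local":          # P2P, 100% localization, no cache
        return dict(transit_in=1, transit_out=1, internal=n + 1,
                    last_hop_in=n, last_hop_out=n)
    raise ValueError(scenario)
```

Note what the model makes explicit: even at 100% localization, P2P still puts N copies outbound on the last hop, which is exactly the asymmetric-uplink problem discussed above.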
Thus:

With 100% localization and with caches:
  1 copy inbound on the transit link
  1 copy outbound on the transit link
  N+1 copies through the ISP's internal cloud
  N copies inbound on the last hop
  X < N copies outbound on the last hop

This suggests the following:

ANY localization discovery protocol must also include cache discovery, treating caches as special nodes which will only serve a set of subranges.

Any commercial P2P content distribution scheme should include caches. Such caches can be cheap & cheerful (a low-power 1U server with 2x1TB disks, a GigE port, and a boot-from-CD or boot-from-net bootstrap startup; if you can't get this down to $1.5K, you're doing something wrong) but should be accounted for right from the start.

Simply put: without caches, P2P costs ISPs a lot of money to save the content provider money, and you get everyone working at cross purposes. And the customers suffer too, if you overload their outbound link. But with caches, the costs radically shift the other way, so that P2P can save both the ISP AND content providers a lot of money. At $0.10/GB, you only have to shift 15 TB out of the cache to pay for the cost of the cache. If your cache is serving 100 Mbps, that's only about 14 days of operation. Even at a paltry 10 Mbps duty (so ~10 1 Mbps customers) it's still a ~140-day payback...

_______________________________________________
alto mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/alto
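The payback arithmetic quoted above can be checked with a short script. Figures ($1.5K cache, $0.10/GB transit) come from the message; the function name and defaults are illustrative:

```python
def payback_days(rate_mbps, cache_cost_usd=1500.0, usd_per_gb=0.10):
    """Days of serving at rate_mbps needed for transit savings
    to cover the cache hardware cost."""
    gb_to_ship = cache_cost_usd / usd_per_gb           # 15,000 GB = 15 TB
    seconds = gb_to_ship * 8e9 / (rate_mbps * 1e6)     # GB -> bits, / (Mbit/s)
    return seconds / 86400.0

print(round(payback_days(100), 1))   # ~13.9 days at a sustained 100 Mbps
print(round(payback_days(10), 1))    # ~138.9 days at 10 Mbps
```

The exact figures come out to roughly 14 and 139 days, matching the ballpark numbers in the message.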
