I found this discussion on simulation, controlled experiments, and real trials quite interesting, as well as the paper "Pushing BitTorrent Locality to the Limit". Here are a few observations from our experience:
- Current simulation techniques do not scale well enough to evaluate large-scale P2P applications. We started with simulations on quite powerful clusters, using tens of thousands of CPU hours. Unfortunately, we could simulate only a few hundred clients, and even at that scale we could not afford packet-level simulation. A key issue in simulation was therefore how to determine TCP throughput. We used a TCP throughput formula, but its inputs depend heavily on network traffic dynamics, so the values we used have so far had to be artificial.

- Controlled experiments are great. One challenge is scale: it is hard to get access to enough clients to run repeated, controlled experiments (PlanetLab gives at most a couple hundred usable clients). The authors of the paper did quite well here. Beyond the difficulty of obtaining clients, controlling the network is challenging: as the paper observes, background traffic can have a significant effect on the performance gain. In particular, one possible explanation for why ISP efficiency (lower delay, or locality) and application performance (i.e., file-sharing download speed) may align well is that TCP gives lower throughput to flows that use more network resources (e.g., delay, as a measure of the length of the links traversed). Optimizing network efficiency can thus improve application performance at the same time. (I am talking about the general case; there are exceptions, and there can be other explanations, such as faster ramp-up due to shorter TCP connections.) But this explanation assumes that there is contention in the network (i.e., that client end-host speed is not always the bottleneck). We can certainly use controlled experiments to create contention in the network and observe the effects. If the results come back negative, we know that our explanation is likely false. In this sense, controlled experiments can be of great value.
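To make the point about TCP throughput and delay concrete, here is a minimal sketch. It assumes the Mathis et al. approximation, throughput ≈ (MSS / RTT) * sqrt(3/2) / sqrt(p); the text above does not say which formula was actually used, so that choice, the function name, and all input values are illustrative assumptions.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Estimate steady-state TCP throughput in bits per second.

    Uses the Mathis et al. approximation (an assumption, not necessarily
    the formula used in the simulations discussed above).
    """
    if mss_bytes <= 0 or rtt_s <= 0:
        raise ValueError("mss_bytes and rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_rate)


# Hypothetical inputs: same MSS and loss rate, different round-trip times.
local_bps = mathis_throughput_bps(mss_bytes=1460, rtt_s=0.020, loss_rate=0.01)
remote_bps = mathis_throughput_bps(mss_bytes=1460, rtt_s=0.100, loss_rate=0.01)

print(f"local peer  (20 ms RTT):  {local_bps / 1e6:.2f} Mbit/s")
print(f"remote peer (100 ms RTT): {remote_bps / 1e6:.2f} Mbit/s")
```

Under this approximation throughput is inversely proportional to RTT, which is one way to see why shorter (more local) paths could help download speed when the network, rather than the end host, is the bottleneck. The loss rate p is exactly the kind of input that depends on traffic dynamics and has to be guessed in a simulation.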
- Ultimately, it is trials with real users in the wild that answer the basic question: can network information help network efficiency and application performance? There are so many factors that may contribute to the results, and we may not even be aware of some of them (e.g., implicit ISP traffic engineering factors). Negative results or extreme cases from controlled experiments (like those in the paper) can be particularly enlightening in helping us understand our problem better.

Just a few observations.

Richard

On Wed, 3 Dec 2008, 4:45pm +0100, Arnaud Legout wrote:

> Hi,
>
> Maciej Wojciechowski wrote:
> > The only thing that is different in this approach from the standard
> > simulation is that a regular bittorrent client is used. The ISPs topology,
> > seeders to leechers ratio, upload speeds and so on are purely artificial.
> > The main problem with bittorrent simulations is not the inaccuracy of the
> > simulated software but wrong assumptions about how the network really looks
> > like. With respect to that, the abovementioned experiment is not much
> > different from "simulating bittorrent with parameters that have unknown
> > relation to real-world values". Since much of the bittorrent behavior
> > characteristics remain unknown (although many great measurement papers have
> > been published) it is very hard to do credible simulations of protocol
> > performance in changed conditions.
>
> You are right that you cannot obtain the best of each world.
>
> However, it is plain wrong to claim that our results are equivalent to
> what would have been obtained with simulations, or that our results do
> not bring any new significant insight compared to previous works. I hope
> that a detailed reading of the paper will convince you. If this is not
> the case, I would be pleased to discuss specific concerns.
>
> As we explain in section 3.2, the results we obtained would have been
> hard, if not impossible, to obtain with simulations.
> We show that the dynamics of the packets and of BitTorrent algorithms have
> a major impact on the inter-ISP traffic savings. In particular, we found
> that an initial seed insufficiently provisioned may increase the inter-ISP
> traffic in case of locality.
>
> Also, arguing that it is equivalent to run simulations than controlled
> experiments makes me feel going back ten years ago. And yes, running a
> real BitTorrent client is one of the major difference compared to a
> simulation, but I don't believe this difference can be discarded as a
> minor one.
>
> If you have specific concerns on the methodology, I would be pleased to
> discuss them.
>
> Running in the wild experiments is important, but it just gives one part
> of the picture. The other part can only be obtained running controlled
> experiments, and varying well chosen parameters. For instance, even if
> both Ono and P4P papers significantly improved the comprehension of P2P
> locality, they do not answer all questions (and never claim to do so).
> However, there are still some fundamental problems to explore, as
> explained in the introduction of our paper.
>
>
> Regards,
> Arnaud.
>
> --
> Arnaud Legout, Ph.D.
>
> INRIA Sophia Antipolis - Planète     Phone : 00.33.4.92.38.78.15
> 2004 route des lucioles - BP 93      Fax   : 00.33.4.92.38.79.78
> 06902 Sophia Antipolis CEDEX         E-mail: [EMAIL PROTECTED]
> FRANCE                               Web   :
> http://www-sop.inria.fr/planete/Arnaud.Legout/index.html
>
> _______________________________________________
> alto mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/alto
>
