I found this discussion on simulation, controlled experiments, and real 
trials quite interesting, and likewise the paper "Pushing BitTorrent 
Locality to the Limit". Here are a few observations from our 
experiences:

- Current simulation techniques do not scale well enough to evaluate 
  large-scale P2P applications. We started with simulations and used quite 
  powerful clusters, consuming tens of thousands of CPU hours. 
  Unfortunately, we could simulate only up to a few hundred clients. Even 
  for a few hundred clients, we could not afford packet-level simulation. 
  Thus a key issue we encountered in simulation was how to determine the 
  TCP throughput. We used a TCP throughput formula, but actual throughput 
  depends heavily on network traffic dynamics, so our throughput values 
  have so far had to be artificial.

- Controlled experiments are great. One challenge of controlled 
  experiments is scalability: it is hard to get access to enough clients 
  to do repeated, controlled experiments (PlanetLab would give at most a 
  couple hundred usable clients). The authors of the paper did quite well 
  here.

  Besides the difficulty of obtaining clients, controlling the network is 
  challenging: as observed in the paper, background network traffic can 
  have significant effects on the performance gain. In particular, one 
  possible explanation for why ISP efficiency (lower delay, or locality) 
  and application performance (i.e., file-sharing download speed) may 
  align well is that TCP gives lower throughput to flows using more 
  network resources (i.e., delay, which measures the length of the links 
  traversed). Thus optimizing network efficiency can at the same time 
  improve application performance. (I am talking about the general case; 
  there are exceptions, and there can also be other explanations, such as 
  faster ramp-up due to shorter TCP connections.) But the aforementioned 
  explanation assumes that there is contention in the network (i.e., the 
  client end-host speed is not always the bottleneck). We can certainly 
  use controlled experiments to create contention in the network and 
  observe the effects. If the results come back negative, we know that 
  our explanation is likely to be false. So in this sense, controlled 
  experiments can be of great value.

- Ultimately, it is trials with real users in the wild that answer the 
  basic question: can network information help network efficiency and 
  application performance? There are so many factors that may contribute 
  to the results, and we may not even be aware of some of them (e.g., 
  implicit ISP traffic engineering factors). Negative results or extreme 
  cases from controlled experiments (as in the paper) can be particularly 
  enlightening in helping us understand our problem better.
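
To make the TCP point above concrete, here is a rough sketch of the kind of 
throughput formula I have in mind. This is not necessarily the formula we 
used in our simulations; I am assuming the well-known Mathis et al. 
approximation, rate ~ (MSS / RTT) * sqrt(3/2) / sqrt(p), and the function 
name, MSS, RTT, and loss values below are purely illustrative. It shows why 
a flow with a longer delay gets less throughput when the loss rate is the 
same, which is the alignment between locality and download speed that I 
described.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state TCP throughput in bits per second.

    Assumed model (Mathis et al.): rate ~ (MSS / RTT) * sqrt(3/2) / sqrt(p).
    It only holds for a long-lived flow under light, random loss.
    """
    if mss_bytes <= 0 or rtt_s <= 0:
        raise ValueError("MSS and RTT must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss rate must be in (0, 1]")
    return 8 * mss_bytes / rtt_s * math.sqrt(1.5 / loss_rate)


# Illustrative values only: same MSS and loss rate, different delays.
local = mathis_throughput_bps(mss_bytes=1460, rtt_s=0.020, loss_rate=0.01)
remote = mathis_throughput_bps(mss_bytes=1460, rtt_s=0.100, loss_rate=0.01)

# Throughput is inversely proportional to RTT in this model, so the
# low-delay (local) flow gets about 5x the rate of the high-delay one.
ratio = local / remote
print(f"local/remote throughput ratio: {ratio:.1f}")
```

Note that the loss rate p is exactly the input that depends on traffic 
dynamics, which is why values obtained this way remain artificial unless p 
is measured under realistic contention.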
 
Just a few observations.

Richard

On Wed, 3 Dec 2008, 4:45pm +0100, Arnaud Legout wrote:

> Hi,
> 
> Maciej Wojciechowski wrote:
> > The only thing that is different in this approach from the standard
> > simulation is that a regular bittorrent client is used. The ISPs topology,
> > seeders to leechers ratio, upload speeds and so on are purely artificial.
> > The main problem with bittorrent simulations is not the inaccuracy of the
> > simulated software but wrong assumptions about what the network really looks
> > like. With respect to that, the abovementioned experiment is not much
> > different from "simulating bittorrent with parameters that have unknown
> > relation to real-world values".  Since much of the bittorrent behavior
> > characteristics remain unknown (although many great measurement papers have
> > been published) it is very hard to do credible simulations of protocol
> > performance in changed conditions.
> 
> You are right that you cannot obtain the best of each world.
> 
> However, it is plain wrong to claim that our results are equivalent to 
> what would have been obtained with simulations, or that our results do 
> not bring any new significant insight compared to previous works. I hope 
> that a detailed reading of the paper will convince you. If this is not 
> the case, I would be pleased to discuss specific concerns.
> 
> As we explain in section 3.2, the results we obtained would have been 
> hard, if not impossible, to obtain with simulations. We show that the 
> dynamics of the packets and of BitTorrent algorithms have a major impact 
> on the inter-ISP traffic savings. In particular, we found that an 
> initial seed insufficiently provisioned may increase the inter-ISP 
> traffic in case of locality.
> 
> Also, arguing that running simulations is equivalent to running 
> controlled experiments makes me feel like going back ten years. And yes, 
> running a real BitTorrent client is one of the major differences 
> compared to a simulation, but I don't believe this difference can be 
> discarded as a minor one.
>
> If you have specific concerns on the methodology, I would be pleased to 
> discuss them.
> 
> Running experiments in the wild is important, but it gives just one part 
> of the picture. The other part can only be obtained by running controlled 
> experiments and varying well-chosen parameters. For instance, even if 
> both Ono and P4P papers significantly improved the comprehension of P2P 
> locality, they do not answer all questions (and never claim to do so). 
> However, there are still some fundamental problems to explore, as 
> explained in the introduction of our paper.
> 
> 
> Regards,
> Arnaud.
> 
> -- 
> Arnaud Legout, Ph.D.
> 
> INRIA Sophia Antipolis - Planète  Phone : 00.33.4.92.38.78.15
> 2004 route des lucioles - BP 93   Fax   : 00.33.4.92.38.79.78
> 06902 Sophia Antipolis CEDEX      E-mail: [EMAIL PROTECTED]
> FRANCE                            Web   :
> http://www-sop.inria.fr/planete/Arnaud.Legout/index.html
> 
> _______________________________________________
> alto mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/alto
> 
_______________________________________________
alto mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/alto
