Hi Bruce, Thanks for your answer. Below you say that in your experience, reactive recovery is needed in smaller-scale overlays. Do you have any results to share regarding the maximum level of churn and overlay size that reactive recovery is able to handle? I have been experimenting in PlanetLab with a 500-node Chord-based P2PSIP overlay, which uses only periodic recovery. The churn rate has been such that there is one node leaving and joining every five seconds on average. At this churn rate and network size, periodic recovery seems to perform rather well.
Jouni Bruce Lowekamp wrote: > Jouni, > > Thanks for the careful read. I think there are a couple issues that > your email raises. > > The first is a general one of determining the requirements for the > mandatory DHT algorithm for p2psip (the wg has voted to make at least > one mandatory, I guess more is an option too). Rhea's research shows > that for large overlays with high churn rates, reactive recovery imposes > too high a load and recommends periodic recovery. I agree with that > conclusion. However, many of the scenarios proposed for p2psip have > smaller overlays and lower churn rates. In those cases, reactive > recovery would be preferable. Even in large-scale overlays, one could > argue a p2psip overlay would expect session times on the longer end of > the scale if people are leaving it up for incoming calls. Regardless, > if we have just one DHT specified, I don't think we're going to be able > to reach an ideal solution for all use cases. > > Looking at the algorithm in RELOAD, I believe it specifies reactive > recovery for the neighborhood set (leaf set in Rhea's terminology) and > periodic recovery for the finger table (routing table). Actually, I > don't think it specifies what exactly to do when a finger table entry > fails, but it only specifies periodic checks for finger table entries > and emphasizes that since they're just an optimization, it's OK to leave > the finger table entries suboptimal. So it's something of a compromise > between the two options, using reactive recovery where it matters more > and periodic (or choice left to the implementor) recovery where it > doesn't matter. > > I personally think that the balance in the current draft is good for > what I expect to be the typical range of p2psip deployments. I > completely agree that it's not appropriate for a large-scale high-churn > overlay. However, my experience with smaller-scale overlays (office pbx > scale) suggests that we do need reactive recovery in those environments. > If the WG decides that the mandatory DHT should specify tunable > behavior or that we should have multiple mandatory DHTs to support a > range of environments, I'm OK with that. > > Parenthetically, I'm a bit embarassed that I've never rewritten the > section on periodically searching for new finger table entries. I > intended to rewrite that to specify that the existing peer in a finger > table entry (that is in that range) should not be replaced with a > "better match" because we should prefer stable peers over better > matches. I think that change would be more in line with current best > practices for dht routing. > > Bruce > > > Jouni Mäenpää wrote: >> Hi, >> >> The RELOAD draft specifies in chapter 12 a modified version of the Chord >> algorithm. It seems to me that this modified Chord algorithm uses >> reactive recovery from node failures instead of periodic recovery, which >> is used by the original Chord algorithm as specified in [1]. In reactive >> recovery, a node reacts to the loss of a node in its neighborhood set by >> sending a copy of it neighbor table to all nodes in the neighborhood >> set. Periodic recovery, in contrast, takes place independently of >> changes in the neighborhood set. In periodic recovery, a node >> periodically shares its neighborhood set with each of the members of >> that set. >> >> It has been shown in [2] that reactive recovery is only efficient at low >> churn rates. With moderate and high churn rates, periodic recovery uses >> less bandwidth and the resulting lower contention of the network leads >> to lower latencies. My concern is that because it uses reactive >> recovery, the performance of RELOAD's Chord algorithm may degrade under >> moderate and high churn rates. >> >> As an example, the first version of the Bamboo DHT algorithm used >> reactive recovery. As a consequence, Bamboo suffered a performance >> degradation under churn. To solve this issue, Bamboo was modified to use >> periodic recovery from node failures. >> >> Any comments? >> >> [1] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. >> Balakrishnan. Chord: A scalable peer-to-peer lookup service for Internet >> applications. Technical Report TR-819, MIT, March 2001 >> [2] S. Rhea, D. Geels, T. Roscoe, and J. Kubiatowicz. Handling churn in >> a DHT. In Proc. of USENIX Technical Conference, June 2004 >> >> Regards, >> Jouni >> _______________________________________________ >> P2PSIP mailing list >> [email protected] >> https://www.ietf.org/mailman/listinfo/p2psip >> > > _______________________________________________ P2PSIP mailing list [email protected] https://www.ietf.org/mailman/listinfo/p2psip
