Vidya,

Thanks for the comments.  Responses inline.

On Mon, May 11, 2009 at 3:47 PM, Narayanan, Vidya <[email protected]> wrote:
> I've listed a number of comments on Section 9 below.  But if the 
> responses can be separated out under each topic, it will help the discussion 
> quite a bit, given the length of this email.
>
> Routing (Section 9.3):
> ----------------------
> The routing rule as specified does not work in all cases.  If a node with ID 
> N and immediate successor S receives a message for resource id k such that N 
> < k < S, then since no node with ID k is directly attached to N, nor is there 
> a node whose ID is between N and k, the message cannot be routed.  The routing 
> rule should instead read as follows:
>
> "If a peer is not responsible for a Resource-ID k, but is directly connected 
> to a node with Node-ID k, then it routes the message to that node.  
> Otherwise, it routes the request to the peer in the routing table that has 
> the largest Node-ID that is in the interval between the peer and k.  **If no 
> such node is found, it finds the smallest node id that is greater than k and 
> routes the message to that node.**  The routing table is the union of the 
> neighbor table and the finger table."
>
> The new text is as enclosed in "**  **" above.
>

You're right, that text misses the last step of Chord routing.  Will update.
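
For concreteness, here's a quick sketch of the corrected rule, including
that new last step.  This is illustrative only, not proposed draft text:
integer Node-IDs mod 2^id_bits, and a flat set standing in for the union
of the neighbor and finger tables.

```python
def route_next_hop(self_id, k, routing_table, id_bits=128):
    """Pick the next hop for Resource-ID k, per the corrected rule.

    routing_table is the union of the neighbor table and finger table,
    modeled here as a set of integer Node-IDs mod 2**id_bits.
    Distances are measured clockwise around the ring from self_id.
    """
    mod = 1 << id_bits
    clockwise = lambda a: (a - self_id) % mod
    if k in routing_table:
        # Directly connected to a node with Node-ID k: route to it.
        return k
    preds = [p for p in routing_table if 0 < clockwise(p) < clockwise(k)]
    if preds:
        # Largest Node-ID in the interval between this peer and k.
        return max(preds, key=clockwise)
    # New last step: no known peer lies between us and k, so route to
    # the node with the smallest Node-ID greater than k (clockwise).
    return min(routing_table, key=lambda p: (p - k) % mod)
```

E.g. with 8-bit IDs, self 10 and table {20, 40, 80}: a message for 15
hits the new last step and goes to 20, the smallest ID clockwise past 15.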


> Redundancy (Section 9.4):
> -------------------------
> On the open issue, there should be no need to wait for successful redundant 
> copy storage to return a STORE response.  It doesn't make sense to send a 
> failure if one of the redundant storage attempts fails anyway - so, the wait 
> becomes a pointless exercise.
>
>   "Note that a malicious node can return a success response but not
>   store the data locally or in the replica set.  Requesting peers that
>   wish to ensure that the replication actually occurred SHOULD [[Open
>   Issue:  SHOULD or MAY?]] contact each peer listed in the replicas
>   field of the Store response and retrieve a copy of the data."
>
> The above text is not just relevant for redundant copies of data stored - it 
> is true for even the actual resource-id owner.  Also, in light of not waiting 
> for the replica responses, the STORE response will not contain a list of 
> replicas - so, this needs to be corrected.
>

Yes, that last text doesn't apply just to replicas.  Will reword.

The responsible peer can, however, include the Node-IDs of the peers on
which it will place replicas, even if it doesn't wait to confirm their
responses.
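
As a sketch, the requester-side verification described in the quoted text
could look like the following, where fetch() is a hypothetical callback
that issues a Fetch request to a single peer and returns the value it
holds (or None):

```python
def unverified_replicas(resource_id, stored_value, replica_peers, fetch):
    """Contact each peer listed in the replicas field of the Store
    answer and return the peers that do NOT hold the expected data.
    fetch(peer, resource_id) is a hypothetical per-peer Fetch call."""
    return [p for p in replica_peers
            if fetch(p, resource_id) != stored_value]
```

An empty result means every listed replica actually held the data.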

> Joining (Section 9.5):
> ----------------------
> It is not clear why the join procedure requires all neighbor and finger 
> connections to be established prior to sending the JOIN message itself.  To 
> be functional in the overlay, only the immediate successor relationships must 
> be correctly in place.  The other successors, predecessors and fingers can be 
> established subsequently.  This also keeps nodes from establishing 
> unnecessary connections before they have successfully joined.  There is 
> also the question of whether a JP is supposed to establish a TLS connection 
> to the BP to send messages through it, which is presently not addressed in 
> the draft.  It seems to me that a more streamlined join process can be 
> defined as follows:
>

You're right that it's just an optimization to have finger table
entries, but it's not completely clear to me that adding a new peer
without finger table entries provides a benefit to the overlay.  I can
see that being a SHOULD, though.

Yes, messages sent to the bootstrap peer are sent with (D)TLS.



> 1.  JP connects to its chosen bootstrap node. JP MUST form a TLS session with 
> the bootstrap node, using the certificate or PSK required for overlay 
> membership.
> 2.  JP sends a Join request message to its own Node-ID n, through the BP.  
> This is routed to the admitting peer (AP).  The AP sends a response to the 
> Join.
> 3.  JP also sends an Attach request message to n through the BP.  This may be 
> sent in parallel with the Join message.  The AP sends an attach response, 
> resulting in connection establishment between the JP and AP.
> 4.  AP does a series of Store requests to JP to store the data that JP will 
> be responsible for.
> 5.  AP sends JP an Update explicitly labeling JP as its predecessor.  AP also 
> sends an Update to all its other neighbors with the new value of its neighbor 
> set (including JP).  At this point, JP is part of the ring and responsible 
> for a section of the overlay.  AP SHOULD now store any data that JP is 
> responsible for as part of replicas it will store (since AP is still JP's 
> successor, it will need to store the data as well anyway).
> 6.  JP sends Attach requests to the neighbors and establishes connections 
> with the required successors and predecessors.
> 7.  JP establishes finger connections by sending Attach requests to peers 
> (n+2^i). (Comments on finger selection below).
>
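To make the ordering concrete, the steps above can be sketched as
follows.  Peer is a toy stand-in (no networking, no TLS, no real ring
arithmetic), the Join/Attach exchange through the BP is collapsed into
the direct connection it produces, and responsibility is crudely
approximated; this is not a proposed implementation.

```python
class Peer:
    """Toy stand-in for a peer: a Node-ID, a neighbor set, and a
    local store.  Hypothetical, for illustrating message order only."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbors = set()
        self.store = {}

def join(jp, ap):
    """Steps 2-6 of the proposed join, after JP has formed a (D)TLS
    session with its bootstrap peer (step 1, not modeled)."""
    # 2-3. Join + Attach routed to JP's own Node-ID reach the admitting
    #      peer AP; the Attach answer gives JP a direct connection to AP.
    jp.neighbors.add(ap.node_id)
    # 4. AP Stores onto JP the data JP is now responsible for.  (A real
    #    peer decides responsibility with ring intervals; handing over
    #    ids at or below JP's Node-ID is a crude placeholder.)
    handed_over = {k: v for k, v in ap.store.items() if k <= jp.node_id}
    jp.store.update(handed_over)
    # 5. AP Updates JP as its predecessor and notifies its other
    #    neighbors; AP keeps the handed-over data as a replica, since
    #    it is still JP's successor.
    ap.neighbors.add(jp.node_id)
    # 6-7. JP would now Attach to the remaining successors/predecessors
    #      and to the finger targets jp.node_id + 2**i (not modeled).
    return handed_over
```
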
> Updates (Section 9.7):
> ----------------------
> Section 9.7.1. states that every time a connection is lost, the peer should 
> remove it from its neighbor table and find a different peer.  Section 9.7.3. 
> talks about attempting to re-establish connections when a neighbor is 
> detected to be invalid.  These are at odds.  It seems to make sense that the 
> peer would always try to re-establish connections and only remove that 
> neighbor if the connection cannot be re-established.
>

Working on some new text along these lines for the next revision.
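
Something along these lines, say: retry the connection first, and only
remove the neighbor if re-establishment fails.  reconnect() here is a
hypothetical callback returning True on success; the names and the
retry count are illustrative, not draft text.

```python
def handle_connection_loss(neighbors, neighbor, reconnect, max_attempts=3):
    """On a lost connection, try to re-establish it before giving up.
    Only if all attempts fail is the neighbor removed from the
    neighbor table (after which the peer would find a replacement)."""
    for _ in range(max_attempts):
        if reconnect(neighbor):
            return True          # connection restored; keep the neighbor
    neighbors.discard(neighbor)  # give up: drop it and find another peer
    return False
```
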

> Further, having the enrollment server involved in specifying the frequency of 
> updates and probes seems strange.  How is the enrollment server supposed to 
> know the magic values here? It seems like it should be enough to specify a 
> recommended value and leave it at that.  Since the finger stabilization is 
> for optimization and the neighbor stabilization is reactive anyway in the 
> present draft, having these parameters in the config document seems to be of 
> little use.
>

Are you revisiting the issue of wanting to split the authority for
enrollment and config files, or are you questioning whether it should
be possible to specify update frequency in a config file at all?


> Finger entries (Section 9.7.3 and elsewhere):
> ---------------------------------------------
> The draft requires 16 finger table entries - at first, I took those to mean 
> 16 different resource ids that will be probed per the formula.  After reading 
> 9.7.3, it appears that it is talking about 16 different fingers that a node 
> should maintain.  In a small overlay, that's ridiculously large.  The text is 
> also unclear on what happens when the overlay shrinks.  Further, the text 
> states that a peer SHOULD consider the finger table entry valid if it is in 
> the range [n + 2^(numBitsInNodeId-i), n + 1.5 x 2^(numBitsInNodeId-i)]. For 
> small overlays, there may be no entries in this range for every i in [0, 127].
>

The text was left the way it is in an effort to make it simpler, but I
think there may be an issue in there where too many MUSTs are
prohibiting common optimizations.  Definitely need to take a look at
that.  I think the right answer here is to check the existing text for
being too pedantic and compare its length & complexity with the
optimization you describe (which I agree is how this should really be
implemented).


> A node should only need log N fingers to route in O(log N) hops.  What we need 
> to specify is an algorithm that uses the same pattern of resource ids to 
> probe - in a small overlay, the fingers corresponding to many of those will 
> collapse to the same node.  In our implementation, the algorithm works as 
> follows:
>
> A node looks at the list of resource-ids corresponding to f=(n+2^i), for all 
> values of i=[0,127].  Let's say that it first probes n+2^0 and obtains a 
> finger of f0.  It checks f0 against the next set of resource-ids to probe and 
> determines the next larger resource-id to probe from the list.  It populates 
> the finger table entries for all other fingers f for which n < f < f0, with 
> f0 as the corresponding finger.  Depending on the number of nodes in the 
> overlay, this should result in approximately log N fingers in the finger table 
> at any time.  Periodic stabilization should adjust the number of fingers 
> automatically.  In fact, the number of fingers a node has should provide a 
> rough estimate (with accuracy in the order of log N) of the size of the 
> overlay.  The current text suggests that the number of fingers should be 
> decided based on the overlay size, which does not seem intuitive or necessary.
>
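The collapsing behavior described above can be sketched like this, using
a global node list in place of actually probing the overlay, and 8-bit
IDs so the numbers stay readable.  All three function names are made up
for illustration; a real peer learns each successor via routing, and
Vidya's version probes incrementally rather than scanning every target.

```python
def finger_targets(n, id_bits=8):
    """Resource-IDs n + 2**i (mod 2**id_bits) for i in [0, id_bits)."""
    mod = 1 << id_bits
    return [(n + (1 << i)) % mod for i in range(id_bits)]

def successor(k, node_ids, mod):
    """Node-ID closest clockwise from k -- stands in for probing the
    overlay (a real peer has no global node list)."""
    return min(node_ids, key=lambda p: (p - k) % mod)

def collapsed_fingers(n, node_ids, id_bits=8):
    """Distinct fingers after collapsing targets that resolve to the
    same node; roughly log N entries for an overlay of N nodes."""
    mod = 1 << id_bits
    fingers = []
    for t in finger_targets(n, id_bits):
        f = successor(t, node_ids, mod)
        if f != n and f not in fingers:
            fingers.append(f)
    return fingers
```

With nodes {0, 32, 64, 128, 200} and n = 0, the eight targets collapse
to just three fingers (32, 64, 128), as the algorithm predicts.
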
> Leaving (Section 9.9):
> ----------------------
> In the event of a graceful leave, a peer should be able to send updates to 
> all members of the neighbor set.  It seems redundant to have the peer send a 
> Leave message, followed by every peer receiving the Leave to send updates.  
> Is there a reason for this?
>

Yeah, it's pretty excessive.  I haven't personally looked at it too
hard because I tend to think about ungraceful leaves more, and
periodic updates remove the issue entirely.

Bruce



> Thanks,
> Vidya
> _______________________________________________
> P2PSIP mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/p2psip
>