[P2PSIP] Section 9 of RELOAD - Comments

Narayanan, Vidya Mon, 11 May 2009 12:50:09 -0700

I've described a number of comments on section 9 below.  But, if the responses 
can be separated out under each topic, it will help the discussion quite a bit, 
given the length of this email.


Routing (Section 9.3): 
----------------------
The routing rule as specified does not work in all cases.  If a node with ID N 
and immediate successor S receives a message for resource id k such that  N < k 
< S, since k is neither a node directly attached to N and nor is there a node 
whose ID is between N and k, the message cannot be routed.  The routing rule 
should instead read as follows: 

"If a peer is not responsible for a Resource-ID k, but is directly connected to 
a node with Node-ID k, then it routes the message to that node.  Otherwise, it 
routes the request to the peer in the routing table that has the largest 
Node-ID that is in the interval between the peer and k.  **If no such node is 
found, it finds the smallest node id that is greater than k and routes the 
message to that node.**  The routing table is the union of the neighbor table 
and the finger table."

The new text is as enclosed in "**  **" above.  

Redundancy (Section 9.4): 
-------------------------
On the open issue, there should be no need to wait for successful redundant 
copy storage to return a STORE response.  It doesn't make sense to send a 
failure if one of the redundant storage attempts fails anyway - so, the wait 
becomes a pointless exercise.  

   "Note that a malicious node can return a success response but not
   store the data locally or in the replica set.  Requesting peers that
   wish to ensure that the replication actually occurred SHOULD [[Open
   Issue:  SHOULD or MAY?]] contact each peer listed in the replicas
   field of the Store response and retrieve a copy of the data."

The above text is not just relevant for redundant copies of data stored - it is 
true for even the actual resource-id owner.  Also, in light of not waiting for 
the replica responses, the STORE response will not contain a list of replicas - 
so, this needs to be corrected.  

Joining (Section 9.5): 
----------------------
It is not clear why the join procedure requires all neighbor and finger 
connections to be established prior to sending the JOIN message itself.  To be 
functional in the overlay, only the immediate successor relationships must be 
correctly in place.  The other successors, predecessors and fingers can be 
established subsequently.  This also doesn't cause nodes to establish 
unnecessary connections unless the node has successfully joined.  There is also 
the question of whether a JP is supposed to establish a TLS connection to the 
BP to send messages through it, which is presently not addressed in the draft.  
It seems to me that a more streamlined join process can be defined as follows: 

1.  JP connects to its chosen bootstrap node. JP MUST form a TLS session with 
the bootstrap node, using the certificate or PSK required for overlay 
membership.  
2.  JP sends a Join request message to its own Node-ID n, through the BP.  This 
is routed to the admitting peer (AP).  The AP sends a response to the Join.  
3.  JP also sends an Attach request message to n through the BP.  This may be 
sent in parallel with the Join message.  The AP sends an attach response, 
resulting in connection establishment between the JP and AP.  
4.  AP does a series of Store requests to JP to store the data that JP will be 
responsible for.  
5.  AP sends JP an Update explicitly labeling JP as its predecessor.  AP also 
sends an Update to all its other neighbors with the new value of its neighbor 
set (including JP).  At this point, JP is part of the ring and responsible for 
a section of the overlay.  AP SHOULD now store any data that JP is responsible 
for as part of replicas it will store (since AP is still JP's successor, it 
will need to store the data as well anyway).  
6.  JP sends Attach requests to the neighbors and establishes connections with 
the required successors and predecessors. 
7.  JP establishes finger connections by sending Attach requests to peers 
(n+2^i). (Comments on finger selection below).  

Updates (Section 9.7): 
----------------------
Section 9.7.1. states that every time a connection is lost, the peer should 
remove it from its neighbor table and find a different peer.  Section 9.7.3. 
talks about attempting to re-establish connections when a neighbor is detected 
to be invalid.  These are at odds.  It seems to make sense that the peer would 
always try to re-establish connections and only remove that neighbor if the 
connection cannot be re-established.  

Further, having the enrollment server involved in specifying the frequency of 
updates and probes seems strange.  How is the enrollment server supposed to 
know the magic values here? It seems like it should be enough to specify a 
recommended value and leave it at that.  Since the finger stabilization is for 
optimization and the neighbor stabilization is reactive anyway in the present 
draft, having these parameters in the config document seem to be of little use. 
  

Finger entries (Section 9.7.3 and elsewhere): 
---------------------------------------------
The draft requires 16 finger table entries - at first, I took those to mean 16 
different resource ids that will be probed per formula.  After reading 9.7.3, 
it appears that it is talking about 16 different fingers that a node should 
maintain.  In a small overlay, that's ridiculously large.  The text is also 
unclear on what happens when the overlay shrinks.  Further, the text states 
that a peer SHOULD consider the finger table entry valid if it is in the range 
[n + 2^(numBitsInNodeId-i), n + 1.5 x 2^(numBitsInNodeId-i)]. For small 
overlays, there may be no entries in this range for every i in [0, 127].  

A node should only need log N fingers to route in OlogN hops.  What we need to 
specify is an algorithm that uses the same pattern in resource ids to probe - 
the fingers corresponding to multiple of those will collapse to the same node 
in a small overlay.  In our implementation, the algorithm works as follows: 

A node looks at the list of resource-ids corresponding to f=(n+2^i), for all 
values of i=[0,127].  Let's say that it first probes n+2^0 and obtains a finger 
of f0.  It checks f0 against the next set of resource-ids to probe and 
determines the next larger resource-id to probe from the list.  It populates 
the finger table entries for all other fingers f for which n < f < f0, with f0 
as the corresponding finger.  Depending on the number of nodes in the overlay, 
this should result in approximately logN fingers in the finger table at any 
time.  Periodic stabilization should adjust the number of fingers 
automatically.  In fact, the number of fingers a node has should provide a 
rough estimate (with accuracy in the order of log N) of the size of the 
overlay.  The current text suggests that the number of fingers should be 
decided based on the overlay size, which does not seem intuitive or necessary.  

Leaving (Section 9.9): 
----------------------
In the event of a graceful leave, a peer should be able to send updates to all 
members of the neighbor set.  It seems redundant to have the peer send a Leave 
message, followed by every peer receiving the Leave to send updates.  Is there 
a reason for this? 

Thanks,
Vidya
_______________________________________________
P2PSIP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/p2psip

[P2PSIP] Section 9 of RELOAD - Comments

Reply via email to