Eric,

Using AttachReqAns.send_update is a fantastic idea. Brilliant!

Having AP send an UPDATE before JP sends JOIN gives JP a precise set of
Node-IDs (S1,S2,S3,P1,P2,P3) to ATTACH to.  The cost is an extra UPDATE
message, but it preempts the race condition (AP's predecessors included),
and JP can also attach to S1,S2,S3,P1,P2,P3 simultaneously. It's well worth it!
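
As a rough illustration of the simultaneous attach, here is a minimal Python
sketch. The attach() coroutine and the attach_all() helper are hypothetical
placeholders, not names from the draft; a real ATTACH involves ICE negotiation
that is only stubbed out here.

```python
import asyncio


# Hypothetical stand-in for a real ATTACH exchange (ICE checks, etc.).
async def attach(node_id: str) -> str:
    await asyncio.sleep(0)  # placeholder for the network round trips
    return node_id


async def attach_all(neighbors: list[str]) -> list[str]:
    # Start every ATTACH at once; none waits for another to complete.
    return await asyncio.gather(*(attach(n) for n in neighbors))


# Neighbor set as it might arrive in AP's UPDATE (labels from this thread).
update_neighbors = ["P3", "P2", "P1", "S1", "S2", "S3"]
connected = asyncio.run(attach_all(update_neighbors))
```

The point is only that the ATTACH to S2 no longer depends on the result of the
ATTACH to S1, because every Node-ID is known up front.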

The AttachReqAns.send_update flag was added in base-10 back in August
of this year. Our code is quite old.  Thanks for the pointer.

I am also glad that we are in agreement on defining a tie-breaker heuristic
based on Node-ID comparison to address the race condition.  We should
move ahead and add it to the next draft.
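
To make the idea concrete, here is a minimal Python sketch of such a
tie-breaker. It assumes Node-IDs are compared as unsigned integers and that
the ATTACH initiated by the lesser Node-ID is the one dropped; the thread has
not settled which direction wins, so treat that choice, and the function names,
as assumptions.

```python
def attach_to_drop(initiator_a: int, initiator_b: int) -> int:
    """Return the Node-ID of the initiator whose ATTACH is dropped.

    Assumption: the ATTACH started by the lesser Node-ID loses.
    """
    if initiator_a == initiator_b:
        raise ValueError("Node-IDs must differ")
    return min(initiator_a, initiator_b)


def keep_my_attach(my_node_id: int, peer_node_id: int) -> bool:
    # Both peers evaluate the same comparison, so they reach the same
    # decision without exchanging any extra messages.
    return attach_to_drop(my_node_id, peer_node_id) != my_node_id


# Toy values: JP has Node-ID 0x50, P1 has Node-ID 0x30.
jp_keeps = keep_my_attach(0x50, 0x30)
p1_keeps = keep_my_attach(0x30, 0x50)
```

With these toy values JP keeps its ATTACH and P1 drops its own, so exactly one
of the two concurrent ATTACHes survives.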

--Michael


On 11/5/2010 7:03 AM, Eric Rescorla wrote:
Hi Michael,

Section 11 is an example, not normative. The normative description is in S 9.4.

In 9.4 (as revised by Bruce) you send ATTACHes to the expected routing table
entries. Eventually, we want JP to end up with connections to nodes around the
overlay at predictable locations (the finger table) and to adjacent nodes (the
neighbor table).

- The finger table can be formed directly by sending attaches.
- We discover the neighbor table by iteration.

This is done in step 3 (which will become step 2).
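
A minimal Python sketch of that iteration, following the "Resource-ID AP+1"
probing described further down in this thread. The ring, the small integer
Node-IDs, and the helpers responsible_peer() and discover_successors() are
illustrative assumptions; routing is simplified to "the first peer whose
Node-ID is greater than or equal to the Resource-ID".

```python
import bisect

ID_SPACE = 2 ** 128  # assumed size of the Node-ID space


def responsible_peer(ring: list[int], resource_id: int) -> int:
    """Toy routing: first Node-ID >= resource_id, wrapping around the ring."""
    i = bisect.bisect_left(ring, resource_id % ID_SPACE)
    return ring[i % len(ring)]


def discover_successors(ring: list[int], ap: int, count: int) -> list[int]:
    """Probe AP+1, then S1+1, and so on, as an ATTACH to each ID would."""
    found: list[int] = []
    current = ap
    for _ in range(count):
        nxt = responsible_peer(ring, (current + 1) % ID_SPACE)
        if nxt == ap or nxt in found:
            break  # wrapped all the way around a small overlay
        found.append(nxt)
        current = nxt
    return found


# Toy overlay standing in for P3, P2, P1, AP, S1, S2, S3.
ring = sorted([10, 20, 30, 40, 50, 60, 70])
successors = discover_successors(ring, ap=40, count=3)
```

Each probe depends on the answer to the previous one, so the successors are
found one at a time; nothing in this procedure reveals AP's predecessors.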

Steps 7 and 8 result in JP attempting to connect to any neighbors he hasn't
connected to in step 3 and similarly any such neighbors trying to form those
connections. With Chord, this should mainly just be the predecessor set of JP,
which doesn't seem so bad. I would generally expect implementations to pace out
their responses to updates a bit (though I admit this isn't in the draft), so
I'm not sure how many simultaneous connections you will get. I agree, however,
that it makes sense to have a tiebreaker, and I think smallest node-id is the
logical one. [Note that if JP uses the new "send_update" flag, he can elicit an
UPDATE from AP immediately and start forming the connections as soon as he
ATTACHes.]

WRT the suggestion that you defer ATTACHes to the successor set until you
receive the UPDATE from AP, I agree that this makes the code somewhat simpler
(though not that much), but at the cost of JP having a quite incomplete
connection set at the time it is allegedly part of the overlay. I think that's
a bad tradeoff, so I would prefer to leave that part of the algorithm as-is.

-Ekr


On Wed, Nov 3, 2010 at 10:08 PM, Michael Chen <[email protected]> wrote:

    Bruce,

    We are neglecting predecessors of AP. Let me re-frame this problem a bit as
    follows:

    JP is joining an overlay that looks like this: ...,P3,P2,P1,AP,S1,S2,S3,...
    where AP will be the admitting peer of JP, P1 is the immediate predecessor
    of AP, and S1 is the immediate successor of AP, etc.  Assume that JP has
    ATTACHed to AP via its bootstrap node (BP).

    Join Method 1
    =============
    Using the technique described in Section 11 of base-11 draft, JP can ATTACH
    to S1 using Resource-ID AP+1, to S2 using Resource-ID S1+1, and to S3 using
    Resource-ID S2+1. JP has no means to find P1,P2,P3 before JOIN.

    Then JP sends JOIN to AP and later receives UPDATE from AP. This UPDATE will
    have the neighbor table: P2,P1,JP,AP,S1,S2,S3 (maybe P3 as well). The same
    UPDATE will also be sent by AP to P1, P2 and maybe P3.  The JP entry in
    this UPDATE is new to P1, P2 and P3, which will lead these 3 peers to
    ATTACH to JP. Likewise, P1, P2 and P3 are new to JP, so JP will ATTACH to
    them.  The result is the race condition we are talking about in this thread.

    Join Method 2
    =============
    I am proposing this alternative in which JP does not attempt to ATTACH to
    any of the successors of AP and sends JOIN immediately after ATTACHing to AP.
    As usual, AP sends STORE followed by an UPDATE with the same neighbor
    table: (P3),P2,P1,JP,AP,S1,S2,S3. Since all peers in this table except
    AP are new to JP, JP starts to ATTACH to each of them, one by one or
    concurrently (an advantage over method 1).

    The same race condition still exists when S1,S2,S3,P1,P2,P3 receive the
    UPDATE from AP with the same neighbor table noting the newcomer JP, which
    has not ATTACHed to any of them.

    I am in favor of method 2, because it is much easier to implement. JP
    knows the exact Node-IDs it needs to ATTACH to.  No calculation is needed.
    JP can ATTACH to all 6 peers simultaneously (the ATTACH to S2 does not have
    to wait for the completion of the ATTACH to S1).

    However, for either method, the draft should address the race condition.
    I believe a tie-breaker heuristic should be adopted by this draft. Here
    are some possibilities:

    1) Compare the transaction IDs of the two concurrent ATTACHes as unsigned
    integers and drop the lesser one and its resulting ICE check.

    2) Compare the Node-ID of the two peers involved as unsigned integers and
    drop the ATTACH initiated by the "lesser" peer.

    3) Peers already in the overlay (S1,S2,S3,P1,P2,P3) never ATTACH to the
    new immediate predecessor (JP) of the UPDATE sender (AP).

    Thanks

    --Michael


    On 11/3/2010 6:27 AM, Bruce Lowekamp wrote:

        inline

        On Tue, Nov 2, 2010 at 8:59 PM, Michael Chen <[email protected]> wrote:

            Bruce,

            As I said in the original post, the premise is JP uses the
            technique stated in Section 11 of the draft (and discussed in the
            thread titled "Clarification of initial attach procedure"), where
            JP did not PING and ATTACH to AP or NP before sending JOIN to AP.

        I believe in the other thread, we concluded that Section 9.4 needs to
        be revised to use Attach requests rather than Pings as currently
        written.  However, the example in Section 11 currently has JP sending
        Attach requests to both AP and NP as well as its entire finger table
        before sending the Join.  Section 9.4, step 3 (of the current draft)
        likewise has Attach requests being sent to the entire routing table
        before the Join is sent.  I don't believe there's a problem here.

        Bruce

    _______________________________________________
    P2PSIP mailing list
    [email protected] <mailto:[email protected]>
    https://www.ietf.org/mailman/listinfo/p2psip


