There are two proposals, one in the draft and one in your email. The general assumption behind these proposals is that a node participates in the overlay as a full peer and uses these solutions for achieving hop-by-hop reliability.

I do not agree with the proposals in the draft or in your email. For simplicity, let's refer to these solutions as Lesser-reliability-over-UDP (LSROU). The reasons, some of which I have stated in earlier posts, are:

1) Neither LSROU nor even TCP-over-UDP is a universal solution. It is well known that a TURN server can be necessary even for UDP, especially behind cascaded NATs or NATs with endpoint-dependent filtering.


2) The base draft has tried to incorporate solutions that work in all scenarios. This is why we have recursive routing. LSROU does not work in all scenarios. A combination of LSROU and TCP also does not work in all scenarios. Bottom line: relaying is unavoidable. Clearly, the less relaying the better, and LSROU is considered to be one way of reducing it. However, see (3) and (5).


3) TCP inbound connections through NATs are considered more problematic than UDP. The available data I am aware of (Characterizing paper, IMC'05) suggests that for 100% of the common types of deployed NATs, it is possible to establish direct TCP connections. You noted that this is not as universal as discussed in the paper, but how much less? Can we put a number on it? Do we have data?


4) We cannot make any assumptions on the size of the data sent over LSROU.


5) LSROU relies only on *timeouts* to recover each loss. TCP recovers loss using *TDACK* (triple duplicate ACKs) and *timeout*. A node that participates in the overlay as a full peer and recovers losses through LSROU *timeouts* alone is the weakest link in the routing chain.


6) Let's suppose that TCP inbound connections were much more problematic than UDP, so that a lot of peers would run LSROU. Imagine that the only way these peers running LSROU recover losses is using timeouts. Can we imagine the poor routing performance of this system? Shall we standardize it?


7) The present text in the draft uses TFRC-SP. TFRC-SP mandates a gap of 10ms between transmissions. Imagine 4 packets traversing 5 hops, with each transmission delayed by 10ms. Even if there are no losses, the last packet will leave the fourth hop after 120ms.
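
One way to arrive at the 120ms figure is sketched below. It assumes that every hop waits for the whole message before forwarding it, sends the first packet immediately, and that propagation and processing delays are zero. None of these assumptions comes from the draft, and the function name and parameters are only illustrative.

```python
def last_packet_departure_ms(packets, hops, gap_ms):
    """Time at which the last packet of a message leaves hop number `hops`.

    Assumptions (illustrative, not taken from the draft):
    - each hop forwards only after it holds the whole message,
    - the first packet at each hop is sent immediately,
    - propagation and processing delays are zero.
    """
    # Within one hop, the last packet waits for (packets - 1) pacing gaps.
    per_hop_ms = (packets - 1) * gap_ms
    return hops * per_hop_ms


# 4 packets, 10ms pacing gap, measured at the fourth hop.
print(last_packet_departure_ms(packets=4, hops=4, gap_ms=10))  # 120
```

Under these assumptions the pacing delay grows linearly with both the number of packets and the number of hops, even when nothing is lost.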


My proposal (a rough text) is as follows.

"RELOAD uses TCP for achieving hop-by-hop reliability and relies on existing techniques to solve the inbound TCP connection problem. When a direct connection fails, the node (a) only participates as a client or (b) participates as a peer and uses a TCP TURN server to achieve a 1-hop connection with its connection table entries.

Alternatively, a peer can use a TCP-over-UDP protocol to establish direct connections and to achieve reliability. However, we do not specify such a protocol."

-s


On Wed, 25 Mar 2009, Bruce Lowekamp wrote:

Salman,

Based on your list, I'm going to assume you agree with the current
proposal.  RELOAD supports (and prefers) a TCP overlay link protocol,
and it offers a UDP-based protocol when that doesn't work.  I'd fully
support a simple "use RFCXXXX" for a TCP-over-UDP protocol, but since
there isn't one, we have a goal of something that should work, even if
it doesn't reach "ideal" (TCP-like) performance.  If you believe more
needs to be done to specify a real TCP over UDP, I fully support you
advancing that in TSV.

Bruce


On Thu, Mar 19, 2009 at 12:13 AM, Salman  Abdul Baset
<[email protected]> wrote:
On Wed, 18 Mar 2009, Bruce Lowekamp wrote:

That paper in particular, and the reasons UDP connections are more
reliably formed than TCP, have been discussed numerous times in
MMUSIC, and I really don't think we should be repeating the whole
conversation here.  But the summary is that it's not nearly as
universal a solution as indicated in that paper.

Bruce

Sure. I have already mentioned that this is an IMC'05 paper and that more recent
data, if available, is helpful and needed.

There are at least four solutions to the hop-by-hop reliability problem:

(1) Clients
Nodes behind TCP *un*friendly NATs can always act as clients and establish
TCP connection(s) with reachable node(s). The reachable nodes can be behind
friendly NATs or they can have a public IP address.

(2) Full peer but use relay peer(s)
A node participates as a peer. It establishes TCP connections with reachable
peers, which in turn establish TCP connections with the node's connection
table entries.

(3) Full peer with techniques for direct TCP connection establishment
A node participates as a peer and uses TCP traversal techniques for
establishing a direct connection (including Dean's upcoming ones :)

(4) Full peer with TCP-over-UDP
Since TCP traversal may fail, design/reuse a reliable, congestion-controlled
protocol over UDP.


Note that:
(1) and (2) always work.

(3) and (4) do not work well behind cascaded NATs. (4) fails behind
UDP-blocking firewalls.

(4) is more feasible than (3), since UDP has a better chance of connection
establishment when NATs are not cascaded. However, UDP-blocking firewalls
need to be factored into this feasibility discussion. Again, any recent data
is helpful.
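
The notes above can be restated as a small lookup. This is only a sketch: it treats "do not work well" as "not usable", which is a simplification, and the function and parameter names are hypothetical.

```python
def workable_options(cascaded_nats, udp_blocked):
    """Return the option numbers (1-4) from the list above that remain usable.

    Simplification: an option that "does not work well" is treated
    as unusable.
    """
    options = [1, 2]  # (1) client and (2) relay peers always work
    if not cascaded_nats:
        options.append(3)  # direct TCP traversal
    if not cascaded_nats and not udp_blocked:
        options.append(4)  # TCP-over-UDP also needs UDP to get through
    return options


print(workable_options(cascaded_nats=False, udp_blocked=True))  # [1, 2, 3]
```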


For (4), the TCP-over-UDP protocol needs to be well-designed and
well-implemented. Otherwise, peers doing TCP-over-UDP may be the weakest
link in the routing chain. Approaches that recover every loss using a timeout
may not be the most feasible ones.
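
A rough sketch of why timeout-only recovery is a concern. It compares the delay before a lost packet is retransmitted with and without duplicate-ACK detection. The model is deliberately crude: it assumes the later packets are sent right after the lost one, and the 100ms RTT and 1000ms retransmission timeout are illustrative values, not measurements.

```python
def recovery_delay_ms(packets_after_loss, rtt_ms, rto_ms,
                      use_dup_acks=True, dup_ack_threshold=3):
    """Approximate delay between sending a packet that gets lost
    and retransmitting it.

    Crude model: if enough later packets arrive to trigger
    `dup_ack_threshold` duplicate ACKs, the sender learns of the loss
    after about one RTT. Otherwise it has to wait for the timeout.
    """
    if use_dup_acks and packets_after_loss >= dup_ack_threshold:
        return rtt_ms
    return rto_ms


# Illustrative values only.
tcp_like = recovery_delay_ms(3, rtt_ms=100, rto_ms=1000)
timeout_only = recovery_delay_ms(3, rtt_ms=100, rto_ms=1000, use_dup_acks=False)
print(tcp_like, timeout_only)  # 100 1000
```

In this model a sender with duplicate-ACK detection still falls back to the timeout when fewer than three packets follow the lost one.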


-salman


_______________________________________________
P2PSIP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/p2psip
