Hi! Spencer asked me to bring up this issue, which originated on the TRAM mailing list in a thread on STUN discovery of middleboxes.
In a few projects, we’ve seen strange things happening on mobile networks. Wireshark on the server tells us that a message has left and that the TCP layer has acknowledged its delivery, but nothing shows up in Wireshark on the other end. Then I saw a message on a SIP implementors mailing list saying that one vendor has started using UDP-style retransmits when using TCP as a transport, which made me start to worry that this is a more widespread problem.

If I understand it right, mobile carriers are deploying some sort of middlebox that acts as a TCP proxy. I guess the idea is that they want to disable retransmits on the radio link, so they confirm reliable delivery prematurely. If something then goes wrong beyond the proxy, it’s too late to do anything about it.

I had a few discussions about this with fellows at the IETF meeting in Berlin, and it seemed to be a known problem. One developer pointed me to a paper discussing this:

https://www.cs.montana.edu/mwittie/publications/Goel16Detecting.pdf

From the abstract: “In current cellular networks, a myriad of middleboxes disregard the end-to-end principle to enable network operators to deploy services such as content caching, compression, and protocol optimization to improve end-to-end network performance.”

Good reading! (and a bit scary)

This mess led me, sadly, to suggest that we need to discuss breaking the assumption that TCP delivery is always reliable, and implement retransmits even over TCP in the STUN protocol. STUN was designed to discover middleboxes, with a focus on NAT. This is just another middlebox to discover.

The bigger picture is even scarier: what happens if our reliable transport suddenly is no longer reliable? One developer from a well-known mobile system vendor said “well, I guess that using TLS may help”…

Have you seen this behaviour in networks close to you? What are your experiences?

/O
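
To make the retransmit idea a bit more concrete, here is a minimal sketch of what application-level retransmits of a STUN Binding Request over a stream transport might look like. It is an illustration under assumptions, not a proposal for the spec: the timing simply borrows the doubling schedule that RFC 5389 defines for UDP (initial RTO of 500 ms, up to 7 sends), and the names `transact`, `retransmit_offsets`, `send` and `recv` are hypothetical. The `send` and `recv` callables stand in for whatever writes to and reads from the TCP connection.

```python
import os
import struct

MAGIC_COOKIE = 0x2112A442   # fixed STUN magic cookie (RFC 5389)
BINDING_REQUEST = 0x0001    # STUN Binding Request message type


def build_binding_request(transaction_id: bytes) -> bytes:
    """Build a 20-byte STUN header with no attributes."""
    if len(transaction_id) != 12:
        raise ValueError("STUN transaction ID must be 12 bytes")
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def retransmit_offsets(rto: float = 0.5, max_sends: int = 7) -> list:
    """Send times in seconds from the start, with a doubling interval
    (the RFC 5389 UDP schedule, reused here for TCP as an assumption)."""
    offsets, elapsed, interval = [], 0.0, rto
    for _ in range(max_sends):
        offsets.append(elapsed)
        elapsed += interval
        interval *= 2
    return offsets


def transact(send, recv, rto: float = 0.5, max_sends: int = 7):
    """Send a Binding Request and resend it until a matching reply arrives.

    send(data) writes bytes to the connection (hypothetical callable).
    recv(timeout) returns reply bytes, or None on timeout (hypothetical).
    Returns (reply, attempts); reply is None if every attempt timed out.
    Simplification: the wait after the last send keeps doubling, whereas
    RFC 5389 uses a shorter final wait for UDP.
    """
    transaction_id = os.urandom(12)
    request = build_binding_request(transaction_id)
    interval = rto
    for attempt in range(1, max_sends + 1):
        # Resend the identical request, even though TCP may already
        # have reported the earlier copy as acknowledged.
        send(request)
        reply = recv(interval)
        # A reply belongs to this transaction if it echoes the transaction ID
        # (bytes 8..19 of the header); anything else is treated as a timeout.
        if reply is not None and len(reply) >= 20 and reply[8:20] == transaction_id:
            return reply, attempt
        interval *= 2
    return None, max_sends
```

Resending the identical request with the same transaction ID is a deliberate choice in this sketch: if the first copy did get through after all, the other side can recognise the later copies as duplicates of the same transaction.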
