Martin,
(I put a variation of this comment in the meeting and in slack, but I wanted to
expand on it some. Sorry, but this got long. Four hours is not enough sleep.)
Thanks for expressing your viewpoint as a summary. As suggested by the
chairs, I am continuing the discussion on the mailing list and not on Slack.
Multipath seems pretty clearly useful for certain cases. I think that the
meeting today answered at least the first two of the BoF questions I posed
earlier on the list. So if we are to regard this as a BoF, it met its goals
(thanks chairs). There is some uncertainty about the first question about
having a clear problem to solve, but I am of the view that we could muddle
through with some combination of either ignoring our differences or working
around them. The third question regarding constituency is where I didn't find
a satisfactory answer. I want to be clear though, this is no fault of the
proponents. At the current time, I am convinced that formally starting work on
multipath would be unwise.
Multipath aims to improve performance either through latency, robustness, or
throughput. Application awareness and involvement in scheduling seemed to be
the key factor that enables finding the optimal usage pattern or scheduling
algorithm that allows multipath to deliver on those goals. Applications and
users are in the best place to balance goals against other factors like cost or
whatever else matters most. (For reference, I recall the same point being made
by Roberto and Christian most clearly, but several others made the same point.)
Christoph did a good job of showing how this applies to very specific use
cases, and I thought I saw that in the Alibaba presentation also, but we didn't
quite get enough time to get the necessary detail in either presentation. One
potential advantage in this regard is that QUIC implementations are often
closer to applications, so they might be in a good position to integrate better.
There are two different things that must be considered in a multipath
transport protocol:
- the information that the two peers exchange
- the algorithms that the two peers use
The MPTCP spec defines how the TCP options are used to carry information
between peers and how the different subflows can be managed. It does not
define how a path manager works or how a packet scheduler works; this
level of specification is not required for interoperability.
Since the publication of RFC6824, different use cases have been deployed
with MPTCP. As indicated by Christoph, those did not require any change
to the protocol, only tuning the packet schedulers and path manager to
better match the deployed use case.
I believe that the same distinction should apply to a multipath variant
of QUIC. Compared to TCP, QUIC has a lot of protocol features that make
it easy to add multipath capabilities and we are very close to having a
multipath capable transport.
Concerning the algorithms (packet scheduler and path managers), these
need to implement some kind of business logic related to the relative
cost of the different links that are used. In MPTCP, we were constrained
by the kernel/userspace boundary that made it more difficult from a
systems viewpoint to allow the applications to easily tune the kernel
mechanisms, but several solutions were found. Given that QUIC runs in
userspace and is linked into the application, it becomes easier to allow
the application to provide a set of functions that implement the
application's business logic in a flexible manner.
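
To make this concrete, here is a minimal sketch of what "the application
provides the scheduling function" could look like. It is not taken from any
existing QUIC or MPTCP implementation: PathState, Connection, lowest_rtt and
prefer_unmetered are hypothetical names, and the per-path fields (smoothed
RTT, free congestion window, a "metered" flag) are assumptions about what a
stack might expose to the application.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PathState:
    """Hypothetical per-path view that the transport exposes to the application."""
    name: str        # e.g. "wifi" or "cellular"
    srtt_ms: float   # smoothed RTT estimate for this path
    cwnd_free: int   # bytes congestion control still allows on this path
    metered: bool    # True if sending on this path costs the user money


# A scheduler picks the path for the next packet, or returns None to wait.
Scheduler = Callable[[List[PathState], int], Optional[PathState]]


def lowest_rtt(paths: List[PathState], size: int) -> Optional[PathState]:
    """Default policy: the lowest-RTT path that has room for the packet."""
    usable = [p for p in paths if p.cwnd_free >= size]
    return min(usable, key=lambda p: p.srtt_ms, default=None)


def prefer_unmetered(paths: List[PathState], size: int) -> Optional[PathState]:
    """Application policy: use metered paths only when no unmetered path has room."""
    usable = [p for p in paths if p.cwnd_free >= size]
    unmetered = [p for p in usable if not p.metered]
    return min(unmetered or usable, key=lambda p: p.srtt_ms, default=None)


class Connection:
    """Toy connection: the wire protocol is fixed, the scheduler is pluggable."""

    def __init__(self, paths: List[PathState], scheduler: Scheduler = lowest_rtt):
        self.paths = paths
        self.scheduler = scheduler

    def send(self, size: int) -> Optional[str]:
        """Schedule one packet; return the chosen path name, or None if blocked."""
        path = self.scheduler(self.paths, size)
        if path is None:
            return None
        path.cwnd_free -= size
        return path.name
```

An application that cares about cost would create the connection with
scheduler=prefer_unmetered; one that cares only about latency would keep the
default. Nothing about the exchanged frames changes between the two, which is
the same split between protocol and algorithms described above for MPTCP.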
However, many of the cases that were presented were exactly the sorts of opaque intermediation that is almost the antithesis of that ideal.
The fundamental reason why there is intermediation is that the current
Internet architecture is not multipath capable. With widely deployed
multipath transport, users of devices that need to aggregate the
bandwidth of different paths to get the performance they need would be
able to obtain it without having to rely on various types of middleboxes
that are deployed to overcome this architectural limitation.
If QUIC gets widely deployed beyond the current web-centric use case,
there is an opportunity to have a transport protocol that can address
all these problems in a clean and end-to-end manner.
Otherwise, all the Internet users that need to combine several Internet
access links for performance (throughput, reliability, ...) reasons will
need to continue to rely on proprietary protocols or middleboxes that
introduce their own problems to be able to fully use their network.
Similarly, David's assertion that multipath is orthogonal to MASQUE is reliant
on the assumption that application involvement is not that important. In these
cases, it's not clear that using multipath is strictly good.
I should unpack that a little. For those people who are making scheduling
decisions outside of the endpoint (possible examples being the satellite case
and the 3GPP case), it's not clear that this is anything endpoints can prevent.
An endpoint probably can't stop a network provider from using ECMP either.
Similarly, it is not clear how an application endpoint could be aware of these
decisions at a level that would allow them to understand and adapt to this
treatment. The result is that these cases have a far more ambiguous value
proposition. Improvements come with trade-offs: for instance, the application
might get better throughput, but it comes at a cost to latency. So I conclude
that while these intermediary-based designs might provide an aggregate gain,
they will probably not realize the full performance gains that come from
end-to-end awareness and control.
During the meeting, David complained that ATSSS was not
beneficial for end users. This is not totally true. End users want to be
connected to the Internet using their mobile devices. This requires the
deployment of mobile networks. One approach is to deploy a single mobile
network (say 5G) and always attach the end user's devices to this
network. That's the best approach for single path protocols like TCP,
UDP and QUIC. Another approach is to recognise that a mobile network
like 5G can be combined with existing WiFi networks that are also widely
deployed. Using techniques that allow mobile devices to roam over
different WiFi networks makes it possible for them to use 5G and/or WiFi
even while moving. This is what Apple does with Apple Music and Siri.
Users do not understand why a solution that works with some applications
cannot work with all of them. This is the motivation for the deployment
of ATSSS by network operators.
For IETF insiders, see also the BANANA or LOOPS BoFs which were strictly
network-based analogues of these. Many of the same concerns that caused those
BoFs to fail apply to those use cases.
If we had deployed multipath transport, BANANA and the hybrid access
networks would never have had to be discussed. The existence of these BoFs is
a result of the architectural limitations of our current protocols.
Maybe we accept the application of the protocol to these questionable ends as
acceptable collateral if we are able to deploy at the endpoints. Maybe we
allow intermediaries to seek marginal improvements, but try to ensure that we
have a clear path to deploying something better in the long term. But there is
a risk that deployment in the network could interact poorly with more-ideal
end-to-end solutions and even prevent those deployments.
In-network deployments depend on business needs. There is a clear
business need for combining different types of access networks. With
Multipath QUIC it would become possible to use them on an end-to-end
basis without any intermediary. If we stick to single path transport,
then we will force the development and the deployment of new
intermediaries...
Olivier