It sounds like this problem is not inherent to single-connection multi-path,
but will be present in any multi-path implementation, including
multiple-tcp-connections used with application-layer muxing.
If this is correct, then it isn’t really a QUIC problem, but rather an
implementation/scheduling/CC problem.
That isn’t saying that it isn’t interesting or important to solve, but rather
that the protocol itself need not change to solve the problem for generic-QUIC
transport.
H3, OTOH, may suffer from this at L7+ proxies without some changes to QUIC or
H3, but that is a much longer conversation that doesn’t require multi-path to
happen.
-=R
From: Yunfei Ma <[email protected]>
Date: Sunday, July 18, 2021 at 1:17 AM
To: Charles 'Buck' Krasic <[email protected]>, Mirja Kuehlewind
<[email protected]>, Roberto Peon <[email protected]>
Cc: "matt.joras" <[email protected]>, 李振宇 <[email protected]>, Christian Huitema <[email protected]>, Yanmei Liu
<[email protected]>, "lucaspardue.24.7" <[email protected]>, quic <[email protected]>, Qing An
<[email protected]>, Yunfei Ma <[email protected]>
Subject: Re: Multi-path QUIC Extension Experiments
Hi Charles, Roberto, and Mirja:
Thanks a lot for your questions. As all three of you are curious about the
definition of MP-HoL, I am putting my answer into one reply.
Short answer: MP-HoL blocking is not caused by flow control; rather, it
stems from path heterogeneity. In other words, MP-HoL blocking can happen
even when the flow control limit is not reached (as Charles pointed out, you can
set a large limit on the client side).
More specifically, when you want to send out packets on different paths at the
same time, there is a scheduler to decide how to split your packets and put
them on different paths. However, in mobile networks, the network paths could
have very different path delays. MP-HoL blocking arises when packets sent
earlier on the slow path arrive later than packets sent later on the fast
path, causing out-of-order arrival. As a consequence, the out-of-order packets
are not eligible to be delivered to the application, so the fast path has to wait.
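The reordering described above can be illustrated with a minimal sketch. It assumes packets of one stream must be handed to the application in sequence order; the `Packet` type, the function name, and the delay values (100 ms slow path, 20 ms fast path) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    seq: int           # position in the in-order byte stream
    sent_at: float     # send time in ms
    path_delay: float  # one-way delay of the chosen path in ms (assumed value)


def in_order_delivery_times(packets: List[Packet]) -> Dict[int, float]:
    """Return, per sequence number, the earliest time the packet can be
    handed to the application when delivery must follow sequence order."""
    delivered: Dict[int, float] = {}
    ready_at = 0.0
    for p in sorted(packets, key=lambda pkt: pkt.seq):
        arrival = p.sent_at + p.path_delay
        # A packet cannot be delivered before every earlier packet has arrived.
        ready_at = max(ready_at, arrival)
        delivered[p.seq] = ready_at
    return delivered


packets = [
    Packet(seq=1, sent_at=0.0, path_delay=100.0),  # sent first, on the slow path
    Packet(seq=2, sent_at=5.0, path_delay=20.0),   # sent later, on the fast path
]
times = in_order_delivery_times(packets)
# Time packet 2 spends waiting in the receive buffer for packet 1.
hol_wait = times[2] - (packets[1].sent_at + packets[1].path_delay)
print(times, hol_wait)
```

In this sketch packet 2 arrives at 25 ms but cannot be delivered until packet 1 arrives at 100 ms, so the fast path's data waits on the slow path.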
For example, say we want to send out two packets that belong to the same video
frame with a min-RTT scheduler, which is the default in MPTCP. For each packet, the
scheduler selects a path on which to transmit it. The selection has two
criteria: (1) the path's congestion window is not full and (2) the selected
path has a smaller RTT than the other. If somehow, at the moment of
transmitting, the fast path's cwnd is full (some traffic has been sent before),
the first packet is then put on the slow path by the scheduler. Later, an ACK
is received and the fast path becomes available, so the scheduler puts the
second packet on the fast path. As a result, there is an out-of-order arrival.
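The two-packet example can be sketched as follows. This is an illustrative min-RTT selection rule only, not the MPTCP or the draft's actual scheduler; the `Path` type, `pick_path`, the packet-counted cwnd, and all numeric values are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Path:
    name: str
    srtt_ms: float      # smoothed RTT estimate (assumed value)
    cwnd: int           # congestion window, counted in packets for simplicity
    in_flight: int = 0  # packets sent but not yet acknowledged

    def has_room(self) -> bool:
        return self.in_flight < self.cwnd


def pick_path(paths: List[Path]) -> Optional[Path]:
    """Min-RTT rule: among paths whose cwnd is not full, pick the lowest RTT."""
    candidates = [p for p in paths if p.has_room()]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.srtt_ms)


fast = Path("fast", srtt_ms=20.0, cwnd=10, in_flight=10)  # cwnd already full
slow = Path("slow", srtt_ms=100.0, cwnd=10, in_flight=3)
paths = [fast, slow]

first = pick_path(paths)   # fast path is full, so packet 1 goes to the slow path
first.in_flight += 1
fast.in_flight -= 1        # an ACK arrives and frees room on the fast path
second = pick_path(paths)  # packet 2 now goes to the fast path
second.in_flight += 1
choices = (first.name, second.name)
print(choices)
```

Packet 1 ends up on the slow path and packet 2 on the fast path, which is the ordering that produces the out-of-order arrival.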
What makes the problem even more difficult is that in mobile networks, the RTTs
can change quickly, which makes accurate prediction very difficult. In the worst
case, the scheduler thinks it is using the fast path when it is actually
using the slow path. As you can see, in order to make multi-path
transport efficient, it is important to solve this problem, and that's what we
are doing in this project.
I hope I have answered your questions. If not, please let me know.
Cheers,
Yunfei
On Fri, Jul 16, 2021 at 12:51 PM Charles 'Buck' Krasic
<[email protected]> wrote:
"don't overcommit" includes the common practice of setting very large limits on
the client side, where in aggregate the case of the server being flow-control limited is
effectively non-existent.
I am curious to hear clarification of the precise definition of MP-HoL blocking
here. Is it not flow control, but rather path aliasing where distinct paths
are actually sharing some physical link(s)?
On Fri, Jul 16, 2021 at 12:13 PM Roberto Peon
<[email protected]> wrote:
I too am curious!
There are only two ways to handle flow control—overcommit, or don’t overcommit.
The “don’t overcommit” choice leads to blocking, since any of that resource
allocated to one path can’t be used by the other.
The “overcommit” choice either leads to OOM, or throwing out some successfully
transmitted and received data.
Underlying this is a fun question: Which inefficiency is worse? Not using
resources that should be used (i.e. from choosing to not overcommit), or
sometimes redundantly using a resource (from choosing to overcommit)?
I’m also curious which implementation strategies we end up adopting in general
around this and, if enough implementations choose overcommit, whether we
need different protocol mechanisms to bound the redundancy.
-=R
From: QUIC <[email protected]> on behalf of Mirja Kuehlewind
<[email protected]>
Date: Friday, July 16, 2021 at 6:15 AM
To: "Ma, Yunfei" <[email protected]>, Robin MARX <[email protected]>, Yanmei Liu
<[email protected]>
Cc: "matt.joras" <[email protected]>, 李振宇 <[email protected]>, Christian Huitema
<[email protected]>, "lucaspardue.24.7" <[email protected]>,
quic <[email protected]>, Qing An <[email protected]>
Subject: Re: Multi-path QUIC Extension Experiments
Hi Yunfei,
Thanks as well for sharing your results! Can you explain a bit more
what you mean by MP-HoL blocking? Is this because of the flow control limits?
If so, wouldn’t it make sense to reserve a certain “space” for each path?
Mirja
From: QUIC <[email protected]> on behalf of "Ma, Yunfei"
<[email protected]>
Date: Thursday, 15. July 2021 at 04:18
To: Robin MARX <[email protected]>, Yanmei Liu
<[email protected]>
Cc: "matt.joras" <[email protected]>, 李振宇 <[email protected]>, Christian Huitema
<[email protected]>, "lucaspardue.24.7" <[email protected]>,
quic <[email protected]>, Qing An <[email protected]>
Subject: Re: Re: Multi-path QUIC Extension Experiments
Hi Robin,
Thanks so much for your questions!
First, the head of line blocking discussed here is called multi-path
head-of-line blocking or MP-HoL blocking, and its root cause is quite different
from the stream HoL blocking usually discussed in QUICv1. The MP-HoL blocking
happens when one path blocks the other path, not when one stream blocks the
other stream. Please note that we indeed use multiple streams, for example,
different video requests are carried in different QUIC streams. QUIC’s stream
multiplexing ability and its benefits still hold in this scenario.
Second, regarding packet scheduling mode, right now, in our Taobao A/B test, we
transmit packets on multiple paths simultaneously. However, you can definitely
use traffic switching only and choose to switch when one path cannot meet
your bandwidth requirement. Basically, if you use multiple paths
simultaneously, you get the most elasticity from a resource pooling
perspective. It really comes down to what your application needs. We will also
update the packet scheduling section soon in a newer version of the draft, in
which we plan to include more discussions on the packet scheduling policy.
Third, regarding the benefits of more bandwidth versus the "downsides": whether
you want more bandwidth depends on your application. For videos, yes, more bandwidth is
extremely helpful in improving the long-tail QoE, which is an important target for
Taobao. We find multi-path QUIC helps us improve two important metrics, rebuffer rate and
video start-up delay. In the past, with multi-path scheduling that does not
collaborate closely enough with applications, as in MPTCP, MP-HoL blocking was the
downside that crippled performance. However, the user-space nature of QUIC gives
us the opportunity to solve this problem, so our conclusion now is that you can enjoy the
benefits of more bandwidth and more reliable connectivity from multi-path without much of
the “downsides”.
I hope my answer is helpful, but feel free to let me know if you have any
additional comments.
Cheers,
Yunfei
------------------Original Mail ------------------
Sender: Robin MARX <[email protected]>
Send Date: Wed Jul 14 07:39:37 2021
Recipients: Yanmei Liu <[email protected]>
CC: quic <[email protected]>, Ma, Yunfei <[email protected]>, Christian Huitema
<[email protected]>, Qing An <[email protected]>, 李振宇
<[email protected]>, matt.joras <[email protected]>, lucaspardue.24.7
<[email protected]>
Subject:Re: Multi-path QUIC Extension Experiments
Hello Yanmei,
Thanks for the additional results on an interesting topic. I'm looking forward
to reading the SIGCOMM paper.
I was a bit surprised to (apparently) see HOL blocking mentioned as a major
issue, as that's one of the things QUIC aims to be better at than TCP.
It's a bit difficult to understand from the slides, but it seems like you're
sending packets for a single stream (Stream ID 1 in the diagrams) on both the
slow and fast path, which would indeed induce HOL blocking.
Consequently, I was wondering what the practical reasons are for you to
multiplex packets for a single stream over multiple paths, as opposed to for
example attaching a single stream to a single path (say: high priority streams
use the fast path for all their packets).
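The alternative raised here, attaching each stream to a single path, could look roughly like the sketch below. It is a hypothetical policy for illustration only and is not taken from the draft or the slides: the function name, the urgency scale (lower means more urgent), the `cutoff` parameter, and the RTT values are all assumptions.

```python
from typing import Dict


def pin_streams(urgency: Dict[int, int],
                srtt_ms: Dict[str, float],
                cutoff: int = 1) -> Dict[int, str]:
    """Pin each stream to exactly one path.

    Streams with urgency <= cutoff go to the lowest-RTT path; the rest are
    spread round-robin over the remaining paths (or the only path, if there
    is just one).
    """
    by_rtt = sorted(srtt_ms, key=srtt_ms.get)
    fastest = by_rtt[0]
    others = by_rtt[1:] or [fastest]
    pinned: Dict[int, str] = {}
    next_other = 0
    for stream_id in sorted(urgency):
        if urgency[stream_id] <= cutoff:
            pinned[stream_id] = fastest
        else:
            pinned[stream_id] = others[next_other % len(others)]
            next_other += 1
    return pinned


# Hypothetical stream IDs with urgencies, and two paths with assumed RTTs.
assignment = pin_streams({0: 0, 4: 3, 8: 5}, {"wifi": 20.0, "lte": 100.0})
print(assignment)
```

With pinning, packets of one stream never reorder across paths, at the cost of a stream being unable to borrow the other path's cwnd.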
I see this mentioned a bit in the draft under "packet scheduling", where it
talks about switching paths once the cwnd is full for one. That indeed leads to the
behaviour seen in the slides, but that's my question: why would you take those approaches
then?
Are there so many cases where the additional "bandwidth" from using multiple
paths' cwnds for a single stream outweighs the downsides of HOL blocking? Relatedly: what
are the packet loss rates you've observed on real networks?
Have you experimented with e.g., tying streams to paths more closely? Does that
work better or worse? Why?
I'm mainly wondering how these tradeoffs evolve depending on the type of paths
available and if it's possible to make a model to drive this logic.
I assume there is much existing work on this for MPTCP, but I also assume some
of that changes due to QUIC's independent streams / stream prioritization
flexibility.
Thank you in advance and with best regards,
Robin
On Sun, 11 Jul 2021 at 20:48, Yanmei Liu
<[email protected]>
wrote:
Hi everyone,
We have finished some experiments on deploying the multi-path QUIC
extension (https://datatracker.ietf.org/doc/draft-liu-multipath-quic/)
in Alibaba Taobao short-form video streaming, and the experiment results are
summarized in the slides (attached file).
If anyone is interested in the experimental details of multi-path QUIC,
please let us know.
All feedback and suggestions are appreciated!
Best regards,
Yanmei
--
dr. Robin Marx
Postdoc researcher - Web protocols
Expertise centre for Digital Media
Cellphone +32(0)497 72 86 94
www.uhasselt.be
Universiteit Hasselt - Campus Diepenbeek
Agoralaan Gebouw D - B-3590 Diepenbeek
Kantoor EDM-2.05