Re: [Bloat] BBR implementations, knobs to turn?

2020-11-30 Thread Aaron Wood
>
> On the CPE side we have met willingness to investigate these issues
> from early on, but it seems that buffer handling is much harder on CPE
> chipsets than on base station chipsets, in particular on 5G.  We have
> had some very good results on 4G, but they do not translate to 5G.
>

My own experience with various CPE work (working with ODMs to get hardware
built) is that those building CPE are stuck with what the silicon vendors
will support, much like where we currently are with home routers.  The AP
firmware and drivers have bloated buffers and there's very little that can
be done to change that.

The large OEMs (the very large, well-known, retail brands) have the volume
to put pressure on the silicon vendors to fix this.  But there's not much
incentive for a silicon vendor to address this issue for a smaller customer.


Re: [Bloat] BBR implementations, knobs to turn?

2020-11-23 Thread erik.taraldsen

From: Toke Høiland-Jørgensen

> > I really appreciate that you are reaching out to the bufferbloat community
> > for this real-life 5G mobile testing.  Let's all help out Erik.
>
> Yes! FYI, I've been communicating off-list with Erik for quite some
> time, he's doing great work but fighting the usual uphill battle to get
> others to recognise the issues; so +1, let's give him all the help we
> can :)

Thanks!  I'll need all the patting on the back which can be provided. :)


> > From your graphs, it does look like you are measuring latency
> > under load, e.g. while the curl download/upload is running.  This is
> > great as this is the first rule of bufferbloat measuring :-)  (and Luca
> > hinted to this)

I've been using Flent since day one of the Fixed Wireless project.  In fact
it was a part of the CPE RFQ process that all participating vendors had to
deliver test results from Flent as part of the technical response.  Not so
much to save ourselves the work of testing, but to force the vendors to see
how their equipment handles latency, and to try to establish Flent as a
standard part of their test suite.
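
As a rough sketch, the kind of run we ask for looks like this (the
netperf server name is a placeholder for the vendor's own lab server):

  # RRUL: load both directions for 60s while sampling latency
  flent rrul -p all_scaled -l 60 -H netperf.example.com \
        -t "CPE under test" -o cpe-rrul.png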

Toke and Dave have been of great help, both in interpreting Flent results
and as moral support, as it can be an uphill battle to raise awareness of
the bufferbloat issues.


> > The Huawei policer/shaper sounds scary.  And a 1000-packet-deep queue
> > also sounds like a recipe for bufferbloat.  I would of course like to
> > re-write the Huawei policer/shaper with the knowledge and techniques we
> > know from our bufferbloat work in the Linux kernel.  (If only I knew
> > someone who codes on 5G solutions and could implement this on their
> > hardware solution, and provide a better product.  Cc. Carlo)

I have tried on several occasions to get the vendors to subscribe to this
mailing list.  And individuals here have been willing to do consulting
work for vendors in Telenor's RFQs, but as far as I know they have not
been contacted.


> Your other points about bloated queues etc. are spot on. Ideally, we
> could get operators to fix their gear, but working around the issues
> like Erik is doing can work in the meantime. And it's great to see that
> it seems like Telenor is starting to roll this out; as far as I can tell
> that has taken quite a bit of advocacy from Erik's side to get there! :)

As you say Toke, I have had some success.  I believe we will have the
largest impact on the base station side, in particular with Huawei.

On the CPE side we have met willingness to investigate these issues from
early on, but it seems that buffer handling is much harder on CPE chipsets
than on base station chipsets, in particular on 5G.  We have had some
very good results on 4G, but they do not translate to 5G.


-Erik


Re: [Bloat] BBR implementations, knobs to turn?

2020-11-20 Thread Toke Høiland-Jørgensen via Bloat
Jesper Dangaard Brouer  writes:

> Hi Erik,
>
> I really appreciate that you are reaching out to the bufferbloat community
> for this real-life 5G mobile testing.  Let's all help out Erik.

Yes! FYI, I've been communicating off-list with Erik for quite some
time, he's doing great work but fighting the usual uphill battle to get
others to recognise the issues; so +1, let's give him all the help we
can :)

> From your graphs, it does look like you are measuring latency
> under load, e.g. while the curl download/upload is running.  This is
> great as this is the first rule of bufferbloat measuring :-)  (and Luca
> hinted to this)
>
> The Huawei policer/shaper sounds scary.  And a 1000-packet-deep queue
> also sounds like a recipe for bufferbloat.  I would of course like to
> re-write the Huawei policer/shaper with the knowledge and techniques we
> know from our bufferbloat work in the Linux kernel.  (If only I knew
> someone who codes on 5G solutions and could implement this on their
> hardware solution, and provide a better product.  Cc. Carlo)
>
> Are you familiar with Toke's (cc) work/PhD on handling bufferbloat on
> wireless networks?  (Hint: Airtime fairness)
>
> Solving bufferbloat in wireless networks requires more than applying
> fq_codel on the bottleneck queue; it requires airtime fairness: doing
> scheduling based on clients' use of radio time and transmit opportunities
> (TXOP), instead of shaping based on bytes.  (This is why it can, if you
> are very careful, make sense to hold back packets a bit to generate a
> packet aggregate that only consumes one TXOP.)
>
> The culprit is that each client/mobile phone will be sending at a
> different rate, and scheduling based on bytes will cause a client with
> a low rate to consume too large a share of the shared radio airtime.
> That basically sums up Toke's PhD ;-)

Much as I of course appreciate the call-out, airtime fairness itself is
not actually much of an issue with mobile networks (LTE/5G/etc)... :)

The reason being that they use TDMA scheduling enforced by the base
station; so there's a central controller that enforces airtime usage
built into the protocol, which ensures fairness (unless the operator
explicitly configures it to be unfair for policy reasons). So the new
insight in my PhD is not so much "airtime fairness is good for wireless
links" as it is "we can achieve airtime fairness in CSMA/CA-scheduled
networks like WiFi".

Your other points about bloated queues etc. are spot on. Ideally, we
could get operators to fix their gear, but working around the issues
like Erik is doing can work in the meantime. And it's great to see that
it seems like Telenor is starting to roll this out; as far as I can tell
that has taken quite a bit of advocacy from Erik's side to get there! :)

-Toke


Re: [Bloat] BBR implementations, knobs to turn?

2020-11-20 Thread Jesper Dangaard Brouer
Hi Erik,

I really appreciate that you are reaching out to the bufferbloat community
for this real-life 5G mobile testing.  Let's all help out Erik.

From your graphs, it does look like you are measuring latency
under load, e.g. while the curl download/upload is running.  This is
great as this is the first rule of bufferbloat measuring :-)  (and Luca
hinted to this)

The Huawei policer/shaper sounds scary.  And a 1000-packet-deep queue
also sounds like a recipe for bufferbloat.  I would of course like to
re-write the Huawei policer/shaper with the knowledge and techniques we
know from our bufferbloat work in the Linux kernel.  (If only I knew
someone who codes on 5G solutions and could implement this on their
hardware solution, and provide a better product.  Cc. Carlo)
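
If that queue lived on a Linux box one could get at, the rewrite would
conceptually be a one-liner.  A sketch, assuming a Linux bottleneck with
eth0 facing the radio and a 30 Mbit/s subscription (both just examples):

  # replace a deep FIFO shaper with a flow-queueing AQM shaper (sch_cake)
  tc qdisc replace dev eth0 root cake bandwidth 30mbit

The point being that the queue should be managed in time (drop/mark
early), not merely sized in packets.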

Are you familiar with Toke's (cc) work/PhD on handling bufferbloat on
wireless networks?  (Hint: Airtime fairness)

Solving bufferbloat in wireless networks requires more than applying
fq_codel on the bottleneck queue; it requires airtime fairness: doing
scheduling based on clients' use of radio time and transmit opportunities
(TXOP), instead of shaping based on bytes.  (This is why it can, if you
are very careful, make sense to hold back packets a bit to generate a
packet aggregate that only consumes one TXOP.)

The culprit is that each client/mobile phone will be sending at a
different rate, and scheduling based on bytes will cause a client with
a low rate to consume too large a share of the shared radio airtime.
That basically sums up Toke's PhD ;-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Cc. Marek due to his Twitter post [1] and link to a 5G-BBR blog post [2]:
 [1] https://twitter.com/majek04/status/1329708548297732097
 [2] https://blog.acolyer.org/2020/10/05/understanding-operational-5g/


On Thu, 19 Nov 2020 14:35:27 +  wrote:

> Hello Luca
> 
> 
> The current PGW is a policer.   What the next version will be, I'm not sure.
> 
> 
> However, on parts of the Huawei RAN the policing rate is instead applied
> as a shaper on the eNodeB (radio antenna), with a queue 1000 packets deep.
> It not only shapes down to 30 Mbit/s, but also tries to aggregate packets
> to keep a level speed whenever the radio interface is used, meaning it
> holds packets back a bit so it can send in bulk at 30 Mbit/s even when
> there is less than 30 Mbit/s of user traffic.  (30 Mbit/s being an
> example subscription speed.)
> 
> 
> We are rolling out a fix to turn off that Huawei shaper, but it is not
> done nationwide yet.
> 
> The test device is in a lab area, using a setup close to, but not
> entirely the same as, our production 5G setup from Ericsson.  There
> should not be any shapers involved in the downstream path here.  There
> is, however, a bloated buffer on the upstream path which we are working
> on correcting.
> 
> 
> The curl graphs are "time to complete a curl download of x file
> size", using an Apache web server running BBR.
> 
> 
> -Erik
> 
> 
> 
> From: Luca Muscariello
> Sent: 19 November 2020 14:32
> To: Taraldsen Erik
> Cc: Jesper Dangaard Brouer; priyar...@google.com; bloat; Luca Muscariello
> Subject: Re: [Bloat] BBR implementations, knobs to turn?
> 
> Hi Erik,
> 
> one question about the PGW: is it a policer or a shaper that you have 
> installed?
> Also, have you tried to run a ping session before and in parallel to the curl 
> sessions?
> 
> Luca
> 
> 
> 
> On Thu, Nov 19, 2020 at 2:15 PM <erik.tarald...@telenor.com> wrote:
> Update:
> The 5G router was connected to a new base station.  Now the limiting factor
> of throughput is the policer on the PGW in the mobile core, not the radio
> link itself.  The SIM card used is limited to 30 Mbit/s.  This scenario
> favours the new server.  I have attached graphs comparing radio-link-limited
> vs PGW-policer results, and a zoomed-in graph of the policer.
> 
> 
> We have Huawei RAN and Ericsson RAN, rate-limited and non-rate-limited
> subscriptions, 4G and 5G access, and we are migrating to a new core with a
> new PGW (policer).  Starting to be a bit of a matrix to set up tests for.
> 
> 
> -Erik
> 
> 
> 
> From: Jesper Dangaard Brouer <bro...@redhat.com>
> Sent: 17 November 2020 16:07
> To: Taraldsen Erik; Priyaranjan Jha
> Cc: bro...@redhat.com; ncardw...@google.com; bloat@lists.bufferbloat.net
> Subject: Re: [Bloat] BBR implementations, knobs to turn?
> 
> On Tue, 17 Nov 2020 10:05:24 + <erik.tarald...@telenor.com> wrote:
> 
> > Thank you for the response Neal

Re: [Bloat] BBR implementations, knobs to turn?

2020-11-19 Thread Luca Muscariello
Hi Erik,

one question about the PGW: is it a policer or a shaper that you have
installed?
Also, have you tried to run a ping session before and in parallel to the
curl sessions?
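
Something along these lines would show it (server and file names are
placeholders):

  # baseline latency on the idle link
  ping -c 30 server.example.com
  # latency under load: keep ping running while curl saturates the link
  ping server.example.com > ping-under-load.txt &
  curl -o /dev/null http://server.example.com/1G.bin
  kill %1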

Luca



On Thu, Nov 19, 2020 at 2:15 PM  wrote:

> Update:
> The 5G router was connected to a new base station.  Now the limiting
> factor of throughput is the policer on the PGW in the mobile core, not the
> radio link itself.  The SIM card used is limited to 30 Mbit/s.  This
> scenario favours the new server.  I have attached graphs comparing
> radio-link-limited vs PGW-policer results, and a zoomed-in graph of the
> policer.
>
>
> We have Huawei RAN and Ericsson RAN, rate-limited and non-rate-limited
> subscriptions, 4G and 5G access, and we are migrating to a new core with
> a new PGW (policer).  Starting to be a bit of a matrix to set up tests for.
>
>
> -Erik
>
>
> 
> From: Jesper Dangaard Brouer
> Sent: 17 November 2020 16:07
> To: Taraldsen Erik; Priyaranjan Jha
> Cc: bro...@redhat.com; ncardw...@google.com; bloat@lists.bufferbloat.net
> Subject: Re: [Bloat] BBR implementations, knobs to turn?
>
> On Tue, 17 Nov 2020 10:05:24 +  wrote:
>
> > Thank you for the response Neal
>
> Yes. And it is impressive how many highly qualified people are on the
> bufferbloat list.
>
> > old_hw # uname -r
> > 5.3.0-64-generic
> > (Ubuntu 19.10 on a Xeon workstation, integrated network card, 1Gbit
> > GPON access.  Used as proof of concept from the lab at work)
> >
> >
> > new_hw # uname -r
> > 4.18.0-193.19.1.el8_2.x86_64
> > (CentOS 8.2 on a Xeon rack server, discrete 10Gbit network card,
> > 40Gbit server farm link (low utilization on link), intended as fully
> > supported and run service.  Not possible to have newer kernel and
> > still get service agreement in my organization)
>
> Let me help out here.  The CentOS/RHEL8 kernels have a huge amount of
> backports.  I've attached a patch/diff of net/ipv4/tcp_bbr.c changes
> missing in RHEL8.
>
> It looks like these patches are missing in CentOS/RHEL8:
>  [1] https://git.kernel.org/torvalds/c/78dc70ebaa38aa3
>  [2] https://git.kernel.org/torvalds/c/a87c83d5ee25cf7
>
> Could missing patch [1] result in the issue Erik is seeing?
> (It explicitly mentions improvements for WiFi...)
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer


Re: [Bloat] BBR implementations, knobs to turn?

2020-11-17 Thread Jesper Dangaard Brouer
On Tue, 17 Nov 2020 10:05:24 +  wrote:

> Thank you for the response Neal

Yes. And it is impressive how many highly qualified people are on the
bufferbloat list.

> old_hw # uname -r
> 5.3.0-64-generic
> (Ubuntu 19.10 on a Xeon workstation, integrated network card, 1Gbit
> GPON access.  Used as proof of concept from the lab at work)
>  
> 
> new_hw # uname -r
> 4.18.0-193.19.1.el8_2.x86_64
> (CentOS 8.2 on a Xeon rack server, discrete 10Gbit network card,
> 40Gbit server farm link (low utilization on link), intended as fully
> supported and run service.  Not possible to have newer kernel and
> still get service agreement in my organization)

Let me help out here.  The CentOS/RHEL8 kernels have a huge amount of
backports.  I've attached a patch/diff of net/ipv4/tcp_bbr.c changes
missing in RHEL8.

It looks like these patches are missing in CentOS/RHEL8:
 [1] https://git.kernel.org/torvalds/c/78dc70ebaa38aa3
 [2] https://git.kernel.org/torvalds/c/a87c83d5ee25cf7

Could missing patch [1] result in the issue Erik is seeing?
(It explicitly mentions improvements for WiFi...)
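
A quick way to check whether a given kernel tree carries the backport is
to grep for the field that patch [1] introduces (a sketch; run it from
the root of the kernel source tree in question):

  # 0 matches on RHEL8's tcp_bbr.c, non-zero once [1] is applied
  grep -c extra_acked net/ipv4/tcp_bbr.c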

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
--- /home/hawk/git/redhat/kernel-rhel8/net/ipv4/tcp_bbr.c	2020-01-30 17:38:20.832726582 +0100
+++ /home/hawk/git/kernel/net-next/net/ipv4/tcp_bbr.c	2020-11-17 15:38:22.665729797 +0100
@@ -115,6 +115,14 @@ struct bbr {
 		unused_b:5;
 	u32	prior_cwnd;	/* prior cwnd upon entering loss recovery */
 	u32	full_bw;	/* recent bw, to estimate if pipe is full */
+
+	/* For tracking ACK aggregation: */
+	u64	ack_epoch_mstamp;	/* start of ACK sampling epoch */
+	u16	extra_acked[2];		/* max excess data ACKed in epoch */
+	u32	ack_epoch_acked:20,	/* packets (S)ACKed in sampling epoch */
+		extra_acked_win_rtts:5,	/* age of extra_acked, in round trips */
+		extra_acked_win_idx:1,	/* current index in extra_acked array */
+		unused_c:6;
 };
 
 #define CYCLE_LEN	8	/* number of phases in a pacing gain cycle */
@@ -128,6 +136,14 @@ static const u32 bbr_probe_rtt_mode_ms =
 /* Skip TSO below the following bandwidth (bits/sec): */
 static const int bbr_min_tso_rate = 120;
 
+/* Pace at ~1% below estimated bw, on average, to reduce queue at bottleneck.
+ * In order to help drive the network toward lower queues and low latency while
+ * maintaining high utilization, the average pacing rate aims to be slightly
+ * lower than the estimated bandwidth. This is an important aspect of the
+ * design.
+ */
+static const int bbr_pacing_margin_percent = 1;
+
 /* We use a high_gain value of 2/ln(2) because it's the smallest pacing gain
  * that will allow a smoothly increasing pacing rate that will double each RTT
  * and send the same number of packets per RTT that an un-paced, slow-starting
@@ -174,6 +190,15 @@ static const u32 bbr_lt_bw_diff = 4000 /
 /* If we estimate we're policed, use lt_bw for this many round trips: */
 static const u32 bbr_lt_bw_max_rtts = 48;
 
+/* Gain factor for adding extra_acked to target cwnd: */
+static const int bbr_extra_acked_gain = BBR_UNIT;
+/* Window length of extra_acked window. */
+static const u32 bbr_extra_acked_win_rtts = 5;
+/* Max allowed val for ack_epoch_acked, after which sampling epoch is reset */
+static const u32 bbr_ack_epoch_acked_reset_thresh = 1U << 20;
+/* Time period for clamping cwnd increment due to ack aggregation */
+static const u32 bbr_extra_acked_max_us = 100 * 1000;
+
 static void bbr_check_probe_rtt_done(struct sock *sk);
 
 /* Do we estimate that STARTUP filled the pipe? */
@@ -200,21 +225,33 @@ static u32 bbr_bw(const struct sock *sk)
 	return bbr->lt_use_bw ? bbr->lt_bw : bbr_max_bw(sk);
 }
 
+/* Return maximum extra acked in past k-2k round trips,
+ * where k = bbr_extra_acked_win_rtts.
+ */
+static u16 bbr_extra_acked(const struct sock *sk)
+{
+	struct bbr *bbr = inet_csk_ca(sk);
+
+	return max(bbr->extra_acked[0], bbr->extra_acked[1]);
+}
+
 /* Return rate in bytes per second, optionally with a gain.
  * The order here is chosen carefully to avoid overflow of u64. This should
  * work for input rates of up to 2.9Tbit/sec and gain of 2.89x.
  */
 static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain)
 {
-	rate *= tcp_mss_to_mtu(sk, tcp_sk(sk)->mss_cache);
+	unsigned int mss = tcp_sk(sk)->mss_cache;
+
+	rate *= mss;
 	rate *= gain;
 	rate >>= BBR_SCALE;
-	rate *= USEC_PER_SEC;
+	rate *= USEC_PER_SEC / 100 * (100 - bbr_pacing_margin_percent);
 	return rate >> BW_SCALE;
 }
 
 /* Convert a BBR bw and gain factor to a pacing rate in bytes per second. */
-static u32 bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain)
+static unsigned long bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain)
 {
 	u64 rate = bw;
 
@@ -242,18 +279,12 @@ static void bbr_init_pacing_rate_from_rt
 	sk->sk_pacing_rate = bbr_bw_to_pacing_rate(sk, bw, bbr_high_gain);
 }
 
-/* Pace using current bw estimate and a gain factor. In order to help drive the
- * network 

Re: [Bloat] BBR implementations, knobs to turn?

2020-11-16 Thread Neal Cardwell via Bloat
A couple questions:

- I guess this is Linux TCP BBRv1 ("bbr" module)? What's the OS
distribution and exact kernel version ("uname -r")?

- What do you mean when you say "The old server allows for more
re-transmits"?

- If BBRv1 is suffering throughput problems due to high retransmit rates,
then usually the retransmit rate is around 15% or higher. If the retransmit
rate is that high on a radio link that is being tested, then that radio
link may be having issues that should be investigated separately?
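
As a rough way to estimate the server-side retransmit rate over a test
(these are the standard kernel SNMP counters; they are cumulative, so
sample before and after the transfer and take the difference):

  nstat -az TcpOutSegs TcpRetransSegs
  # retransmit rate ~= delta(TcpRetransSegs) / delta(TcpOutSegs)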

- Would you be able to take a tcpdump trace of the well-behaved and
problematic traffic and share the pcap or a plot?

https://github.com/google/bbr/blob/master/Documentation/bbr-faq.md#how-can-i-visualize-the-behavior-of-linux-tcp-bbr-connections

- Would you be able to share the output of "ss -tin" from a recently built
"ss" binary, near the end of a long-lived test flow, for the well-behaved
and problematic cases?

https://github.com/google/bbr/blob/master/Documentation/bbr-faq.md#how-can-i-monitor-linux-tcp-bbr-connections
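
Concretely, something like this on the server during a test run
(interface and port are placeholders for whatever the test setup uses):

  # capture packet headers of the test flow for later plotting
  tcpdump -i eth0 -s 128 -w bbr-flow.pcap port 80 &
  # near the end of the flow, snapshot the kernel's view of the socket
  ss -tin 'sport = :80'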

best,
neal



On Mon, Nov 16, 2020 at 10:25 AM  wrote:

> I'm in the process of replacing a throughput test server.  The old server
> is running a 1Gbit Ethernet card on a 1Gbit link and Ubuntu; the new one a
> 10Gbit card on a 40Gbit link and CentOS.  Both have low load and Xeon
> processors.
>
>
> The purpose is for field installers to verify the bandwidth sold to the
> customers using known clients against known servers.  (4G and 5G fixed
> installations mainly).
>
>
> What I'm finding is that the new server is consistently delivering
> slightly lower throughput than the old server.  The old server allows for
> more re-transmits and has a slightly higher congestion window than the new
> server.
>
>
> Is there any way to tune BBR to allow for more re-transmits (which seems
> to be the limiting factor)?  Or other suggestions?
>
>
>
> (Frankly I think the old server is too aggressive for general-purpose use.
> It seems to starve out other TCP sessions more than the new server.  So for
> delivering regular content to users the new implementation seems more
> balanced, but that is not the target here.  We want to stress-test the
> radio link.)
>
>
> Regards Erik


[Bloat] BBR implementations, knobs to turn?

2020-11-16 Thread erik.taraldsen
I'm in the process of replacing a throughput test server.  The old server is
running a 1Gbit Ethernet card on a 1Gbit link and Ubuntu; the new one a 10Gbit
card on a 40Gbit link and CentOS.  Both have low load and Xeon processors.


The purpose is for field installers to verify the bandwidth sold to the 
customers using known clients against known servers.  (4G and 5G fixed 
installations mainly).


What I'm finding is that the new server is consistently delivering slightly 
lower throughput than the old server.  The old server allows for more 
re-transmits and has a slightly higher congestion window than the new server.


Is there any way to tune BBR to allow for more re-transmits (which seems to be
the limiting factor)?  Or other suggestions?
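
For reference, the server-side knobs I know of boil down to selecting the
congestion control and the pacing qdisc; a sketch of what we run (fq is,
as I understand it, only strictly needed for pacing on older kernels):

  # inspect current settings
  sysctl net.ipv4.tcp_congestion_control net.core.default_qdisc
  # enable BBR with fq-based pacing
  sysctl -w net.core.default_qdisc=fq
  sysctl -w net.ipv4.tcp_congestion_control=bbr

Upstream BBRv1 itself does not seem to expose any runtime tunables, which
may be part of the answer to my own question.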



(Frankly I think the old server is too aggressive for general-purpose use.  It
seems to starve out other TCP sessions more than the new server.  So for
delivering regular content to users the new implementation seems more balanced,
but that is not the target here.  We want to stress-test the radio link.)


Regards Erik