>Please try again, with a fixed tcp_rmem[1] on receiver, taking into
>account bigger memory requirement for MTU 9000
>Rationale : TCP should be ready to receive 10 full frames before
>autotuning takes place (these 10 MSS are typically in a single GRO
> packet)
>At 9000 MTU, one frame typically consumes 12KB (or 16KB on some
>arches/drivers)
>TCP uses a 50% factor rule, accounting 18000 bytes of kernel memory per MSS.
>->
>echo "4096 180000 15728640" >/proc/sys/net/ipv4/tcp_rmem
>diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>index 9e8a6c1aa0190cc248b3b99b073a4c6e45884cf5..81b5d9375860ae583e08045fb25b089c456c60ab 100644
>--- a/net/ipv4/tcp_input.c
>+++ b/net/ipv4/tcp_input.c
>@@ -534,6 +534,7 @@ static void tcp_init_buffer_space(struct sock *sk)
>
> tp->rcv_ssthresh = min(tp->rcv_ssthresh, tp->window_clamp);
> tp->snd_cwnd_stamp = tcp_jiffies32;
>+ tp->rcvq_space.space = min(tp->rcv_ssthresh, tp->rcvq_space.space);
>}
Yes, this worked, and it looks like
echo "4096 140000 15728640" >/proc/sys/net/ipv4/tcp_rmem
is actually enough to trigger TCP autotuning. Given that the current default
tcp_rmem[1] doesn't work well with a 9000 MTU, I am curious whether there is a
specific reason behind choosing 131072 as tcp_rmem[1]. I think the number
itself has to be divisible by the page size (4K) and by 16KB, given what you
said that each jumbo frame packet may consume up to 16KB.
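Just to check that I am reading the sizing rule correctly, here is a quick
back-of-the-envelope sketch (plain userspace C, not kernel code; the 10-frame
target and the ~18000 bytes accounted per MSS are taken from your mail, the
rest is my own arithmetic):

#include <stdio.h>

int main(void)
{
        const int frames_needed   = 10;     /* autotuning wants room for ~10 full frames */
        const int bytes_per_mss   = 18000;  /* 50% factor rule: ~2x the ~9000-byte MSS   */
        const int current_default = 131072; /* today's tcp_rmem[1]                       */

        /* Memory the receiver must be able to account before autotuning can start. */
        printf("needed tcp_rmem[1] : %d bytes\n", frames_needed * bytes_per_mss);

        /* How many jumbo-frame MSS the current default actually leaves room for. */
        printf("current default    : %d bytes -> ~%d MSS\n",
               current_default, current_default / bytes_per_mss);

        /* 131072 is 2^17, so it divides evenly into 4K pages and 16KB buffers. */
        printf("131072 %% 4096 = %d, 131072 %% 16384 = %d\n",
               current_default % 4096, current_default % 16384);
        return 0;
}

That gives 180000 bytes needed versus roughly 7 MSS covered by 131072, which,
if I have read your rule correctly, would explain why autotuning never kicks
in at 9000 MTU with the current default.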
If the patch I proposed would be risky for users who have an MTU of 1500
because of its higher memory footprint, then in my opinion we should get the
patch you proposed merged instead of asking admins to do this manual tuning.
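For comparison, applying the same rule of thumb to a 1500 MTU (again just my
arithmetic on top of your numbers, so please correct me if the per-MSS
accounting is different there):

#include <stdio.h>

int main(void)
{
        const int frames = 10;              /* same 10-full-frames target        */
        const int per_mss_1500 = 2 * 1460;  /* 50% factor rule, MSS for MTU 1500 */
        const int per_mss_9000 = 18000;     /* 50% factor rule, MSS for MTU 9000 */

        printf("MTU 1500: ~%d bytes needed (well under the 131072 default)\n",
               frames * per_mss_1500);
        printf("MTU 9000: ~%d bytes needed (well over the 131072 default)\n",
               frames * per_mss_9000);
        return 0;
}

So the 1500 MTU case looks comfortably covered by the current default, and
only jumbo-frame receivers would need the bigger value.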
Thank you.
Hazem
On 07/12/2020, 17:28, "Eric Dumazet" <[email protected]> wrote:
On Mon, Dec 7, 2020 at 6:17 PM Mohamed Abuelfotoh, Hazem
<[email protected]> wrote:
>
> >Thanks for testing this, Eric. Would you be able to share the MTU
> >config commands you used, and the tcpdump traces you get? I'm
> >surprised that receive buffer autotuning would work for advmss of
> >around 6500 or higher.
>
> Packet capture before applying the proposed patch
>
>
> https://tcpautotuningpcaps.s3.eu-west-1.amazonaws.com/sender-bbr-bad-unpatched.pcap?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJNMP5ZZ3I4FAQGAQ%2F20201207%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Date=20201207T170123Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=a599a0e0e6632a957e5619007ba5ce4f63c8e8535ea24470b7093fef440a8300
>
> Packet capture after applying the proposed patch
>
>
> https://tcpautotuningpcaps.s3.eu-west-1.amazonaws.com/sender-bbr-good-patched.pcap?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJNMP5ZZ3I4FAQGAQ%2F20201207%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Date=20201207T165831Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=f18ec7246107590e8ac35c24322af699e4c2a73d174067c51cf6b0a06bbbca77
>
> kernel version & MTU and configuration from my receiver & sender is
> attached to this e-mail, please be aware that EC2 is doing MSS clamping so you
> need to configure MTU as 1500 on the sender side if you don’t have any MSS
> clamping between sender & receiver.
>
> Thank you.
>
> Hazem
Please try again, with a fixed tcp_rmem[1] on receiver, taking into
account bigger memory requirement for MTU 9000
Rationale : TCP should be ready to receive 10 full frames before
autotuning takes place (these 10 MSS are typically in a single GRO
packet)
At 9000 MTU, one frame typically consumes 12KB (or 16KB on some
arches/drivers)
TCP uses a 50% factor rule, accounting 18000 bytes of kernel memory per MSS.
->
echo "4096 180000 15728640" >/proc/sys/net/ipv4/tcp_rmem
>
>
> On 07/12/2020, 16:34, "Neal Cardwell" <[email protected]> wrote:
>
>
>
>
> > On Mon, Dec 7, 2020 at 11:23 AM Eric Dumazet <[email protected]> wrote:
> >
> > On Mon, Dec 7, 2020 at 5:09 PM Mohamed Abuelfotoh, Hazem
> > <[email protected]> wrote:
> > >
> > > >Since I can not reproduce this problem with another NIC on x86, I
> > > >really wonder if this is not an issue with ENA driver on PowerPC
> > > >perhaps ?
> > >
> > >
> > > I am able to reproduce it on x86 based EC2 instances using ENA or Xen
> > > netfront or Intel ixgbevf driver on the receiver so it's not specific to
> > > ENA, we were able to easily reproduce it between 2 VMs running in virtual
> > > box on the same physical host considering the environment requirements I
> > > mentioned in my first e-mail.
> > >
> > > What's the RTT between the sender & receiver in your
> > > reproduction? Are you using bbr on the sender side?
> >
> >
> > 100ms RTT
> >
> > Which exact version of linux kernel are you using ?
>
> Thanks for testing this, Eric. Would you be able to share the MTU
> config commands you used, and the tcpdump traces you get? I'm
> surprised that receive buffer autotuning would work for advmss of
> around 6500 or higher.
>
> thanks,
> neal
>
>
>
>
>
Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg,
R.C.S. Luxembourg B186284
Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington
Road, Dublin 4, Ireland, branch registration number 908705