Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-28 Thread Marcelo Ricardo Leitner
On Wed, Feb 21, 2018 at 02:37:48PM -0500, David Miller wrote:
> From: Eric Dumazet 
> Date: Mon, 19 Feb 2018 11:56:46 -0800
> 
> > Switching TCP to GSO mode, relying on core networking layers
> > to perform the eventual adaptation for dumb devices, was overdue.
> > 
> > 1) Most TCP development is done with TSO in mind.
> > 2) Fewer high-resolution timers need to be armed for TCP pacing.
> > 3) GSO can benefit from the xmit_more hint.
> > 4) Receiver GRO is more effective (as if TSO were really used on the sender)
> >    -> fewer ACK packets and less overhead.
> > 5) Write queues have less overhead (one skb holds about 64KB of payload).
> > 6) SACK coalescing just works (no payload in skb->head).
> > 7) The rtx rb-tree contains fewer packets, so SACK processing is cheaper.
> > 8) Removal of legacy code; fewer maintenance hassles.
> > 
> > Note that I have left the sendpage/zerocopy paths untouched, but they can
> > probably benefit from the same strategy.
> > 
> > Thanks to Oleksandr Natalenko for reporting a performance issue with
> > BBR/fq_codel, which was the main reason I worked on this patch series.
> 
> Series applied, thanks Eric.
> 
> SCTP might want to do something similar, and if so we can get rid
> of sk_can_gso() too.

Cc'ing linux-sctp and adding this to the ToDo here, although it may be
too soon for SCTP: GSO support was added just a few months ago, and
considering that it is not as widely used as TCP, I fear we may still
have issues that haven't shown up yet.

  Marcelo


Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-21 Thread David Miller
From: Eric Dumazet 
Date: Mon, 19 Feb 2018 11:56:46 -0800

> Switching TCP to GSO mode, relying on core networking layers
> to perform the eventual adaptation for dumb devices, was overdue.
> 
> 1) Most TCP development is done with TSO in mind.
> 2) Fewer high-resolution timers need to be armed for TCP pacing.
> 3) GSO can benefit from the xmit_more hint.
> 4) Receiver GRO is more effective (as if TSO were really used on the sender)
>    -> fewer ACK packets and less overhead.
> 5) Write queues have less overhead (one skb holds about 64KB of payload).
> 6) SACK coalescing just works (no payload in skb->head).
> 7) The rtx rb-tree contains fewer packets, so SACK processing is cheaper.
> 8) Removal of legacy code; fewer maintenance hassles.
> 
> Note that I have left the sendpage/zerocopy paths untouched, but they can
> probably benefit from the same strategy.
> 
> Thanks to Oleksandr Natalenko for reporting a performance issue with
> BBR/fq_codel, which was the main reason I worked on this patch series.

Series applied, thanks Eric.

SCTP might want to do something similar, and if so we can get rid
of sk_can_gso() too.


Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
Hi.

On Wednesday, February 21, 2018, 0:21:37 CET, Eric Dumazet wrote:
> My latest patch (fixing BBR underestimation of cwnd)
> was meant for the net tree, on a NIC where SG/TSO/GSO are disabled
> (i.e. when sk->sk_gso_max_segs is not set to 'infinite').
> 
> It is really packet-scheduler independent.
> 
> Tested here with pfifo_fast, TSO/GSO off.

Well, before the patch, with BBR and sg off, I get ~450 Mbps for fq and
~115 Mbps for pfifo_fast here. So, compared to what I see with the patch
(850 and 200 Mbps respectively), it is definitely an improvement.

Thanks.




Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Eric Dumazet
On Tue, 2018-02-20 at 21:45 +0100, Oleksandr Natalenko wrote:
> On Tuesday, February 20, 2018, 21:09:37 CET, Eric Dumazet wrote:
> > Also, you can tune your NIC to accept only a few MSS per GSO/TSO packet:
> > 
> > ip link set dev eth0 gso_max_segs 2
> > 
> > So even if TSO/GSO is there, BBR should not use sk->sk_gso_max_segs to
> > size its bursts, since burst sizes also impact GRO on the
> > receiver.
> 
> net-next + 7 patches (6 from the patchset + this one).

My latest patch (fixing BBR underestimation of cwnd)
was meant for the net tree, on a NIC where SG/TSO/GSO are disabled
(i.e. when sk->sk_gso_max_segs is not set to 'infinite').

It is really packet-scheduler independent.

Tested here with pfifo_fast, TSO/GSO off.
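
A minimal sketch of that setup (assuming eth0 is the NIC under test and
the tcp_bbr module is available; lpaa24 is the netperf peer used below):

modprobe tcp_bbr                           # make bbr selectable via netperf's -K option
ethtool -K eth0 sg off tso off gso off     # disable SG/TSO/GSO on the sender
tc qdisc replace dev eth0 root pfifo_fast  # plain pfifo_fast, no pacing qdisc
for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done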

Before patch:
for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
691   (ss -temoi shows cwnd is stuck around 6)
667
651
631
517

After patch:
# for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
1733   (ss -temoi shows cwnd is around 386)
1778
1746
1781
1718

Thanks.



Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
On Tuesday, February 20, 2018, 21:09:37 CET, Eric Dumazet wrote:
> Also, you can tune your NIC to accept only a few MSS per GSO/TSO packet:
> 
> ip link set dev eth0 gso_max_segs 2
> 
> So even if TSO/GSO is there, BBR should not use sk->sk_gso_max_segs to
> size its bursts, since burst sizes also impact GRO on the
> receiver.

net-next + 7 patches (6 from the patchset + this one).

Before playing with gso_max_segs:

BBR+fq
sg on:  4.39 Gbits/sec
sg off: 1.33 Gbits/sec

BBR+fq_codel
sg on:  4.02 Gbits/sec
sg off: 1.41 Gbits/sec

BBR+pfifo_fast
sg on:  3.66 Gbits/sec
sg off: 1.41 Gbits/sec

Reno+fq
sg on:  5.69 Gbits/sec
sg off: 1.53 Gbits/sec

Reno+fq_codel
sg on:  6.33 Gbits/sec
sg off: 1.50 Gbits/sec

Reno+pfifo_fast
sg on:  6.26 Gbits/sec
sg off: 1.48 Gbits/sec

After "ip link set dev eth1 gso_max_segs 2":

BBR+fq
sg on:  806 Mbits/sec
sg off: 886 Mbits/sec

BBR+fq_codel
sg on:  206 Mbits/sec
sg off: 207 Mbits/sec

BBR+pfifo_fast
sg on:  220 Mbits/sec
sg off: 200 Mbits/sec

Reno+fq
sg on:  2.16 Gbits/sec
sg off: 1.27 Gbits/sec

Reno+fq_codel
sg on:  2.45 Gbits/sec
sg off: 1.52 Gbits/sec

Reno+pfifo_fast
sg on:  2.31 Gbits/sec
sg off: 1.54 Gbits/sec
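
For reference, one way to reproduce a single cell of the matrix above (a
rough sketch, assuming iperf3 as the traffic generator, eth1 as the test
NIC, and <server> as a placeholder for the receiving VM):

ethtool -K eth1 sg off                    # or 'sg on' for the other rows
tc qdisc replace dev eth1 root fq_codel   # or fq / pfifo_fast
iperf3 -c <server> -C bbr -t 30           # or -C reno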

Oleksandr




Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Eric Dumazet
On Tue, 2018-02-20 at 11:56 -0800, Eric Dumazet wrote:
> On Tue, Feb 20, 2018 at 11:51 AM, Oleksandr Natalenko wrote:
> > On Tuesday, February 20, 2018, 20:39:49 CET, Eric Dumazet wrote:
> > > I am not trying to compare BBR and Reno on a lossless link.
> > > 
> > > Reno is running as fast as possible and will win when bufferbloat is
> > > not an issue.
> > > 
> > > If bufferbloat is not an issue, simply use Reno and be happy ;)
> > > 
> > > My patch helps BBR only; I thought that was obvious ;)
> > 
> > Umm, yes, and my point was rather something like "the speed on a lossless
> > link while using BBR with and without this patch is the same". Sorry for
> > the confusion. I guess the key word here is "lossless".
> 
> That is with the other patches _not_ applied?
> 
> Here the gain is quite big, since BBR can set up a slightly better
> cwnd, allowing proper GRO on the receiver.

Also, you can tune your NIC to accept only a few MSS per GSO/TSO packet:

ip link set dev eth0 gso_max_segs 2

So even if TSO/GSO is there, BBR should not use sk->sk_gso_max_segs to
size its bursts, since burst sizes also impact GRO on the
receiver.
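
A small sketch of how to inspect and restore that knob (assuming eth0 and
an iproute2 that prints gso_max_segs in detailed link output):

ip -d link show dev eth0 | grep -o 'gso_max_segs [0-9]*'   # current limit, 65535 by default
ip link set dev eth0 gso_max_segs 2                        # cap each GSO/TSO packet at 2 MSS
ip link set dev eth0 gso_max_segs 65535                    # restore the default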





Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
On Tuesday, February 20, 2018, 20:56:24 CET, Eric Dumazet wrote:
> That is with the other patches _not_ applied?

Yes, the other patches are not applied. It is v4.15.4 + this patch only + BBR +
fq_codel or pfifo_fast. Shall I re-test on net-next with the whole
patchset (because it does not apply cleanly to 4.15)?

Oleksandr




Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Eric Dumazet
On Tue, Feb 20, 2018 at 11:51 AM, Oleksandr Natalenko wrote:
> On Tuesday, February 20, 2018, 20:39:49 CET, Eric Dumazet wrote:
>> I am not trying to compare BBR and Reno on a lossless link.
>>
>> Reno is running as fast as possible and will win when bufferbloat is
>> not an issue.
>>
>> If bufferbloat is not an issue, simply use Reno and be happy ;)
>>
>> My patch helps BBR only; I thought that was obvious ;)
>
> Umm, yes, and my point was rather something like "the speed on a lossless link
> while using BBR with and without this patch is the same". Sorry for the
> confusion. I guess the key word here is "lossless".

That is with the other patches _not_ applied?

Here the gain is quite big, since BBR can set up a slightly better
cwnd, allowing proper GRO on the receiver.
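
One easy way to watch that on the sender (assuming ss from iproute2 and a
single flow to the receiver; <receiver> is a placeholder):

ss -temoi dst <receiver> | grep -o 'cwnd:[0-9]*'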


Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
On Tuesday, February 20, 2018, 20:39:49 CET, Eric Dumazet wrote:
> I am not trying to compare BBR and Reno on a lossless link.
> 
> Reno is running as fast as possible and will win when bufferbloat is
> not an issue.
> 
> If bufferbloat is not an issue, simply use Reno and be happy ;)
> 
> My patch helps BBR only; I thought that was obvious ;)

Umm, yes, and my point was rather something like "the speed on a lossless link
while using BBR with and without this patch is the same". Sorry for the
confusion. I guess the key word here is "lossless".

Oleksandr




Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Eric Dumazet
On Tue, Feb 20, 2018 at 11:35 AM, Oleksandr Natalenko wrote:
> Hi.
>
> On Tuesday, February 20, 2018, 19:57:42 CET, Eric Dumazet wrote:
>> Actually, timer drifts are not horrible (at least on my lab hosts).
>>
>> But BBR has a pessimistic way of sensing the burst size, as it is tied to
>> TSO/GSO being there.
>>
>> The following patch helps a lot.
>
> Not really, at least when applied to v4.15.4. I am still getting 2 Gbps less
> between VMs when using BBR instead of Reno.
>
> Am I doing something wrong?

I am not trying to compare BBR and Reno on a lossless link.

Reno is running as fast as possible and will win when bufferbloat is
not an issue.

If bufferbloat is not an issue, simply use Reno and be happy ;)

My patch helps BBR only; I thought that was obvious ;)


Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
Hi.

On Tuesday, February 20, 2018, 19:57:42 CET, Eric Dumazet wrote:
> Actually, timer drifts are not horrible (at least on my lab hosts).
> 
> But BBR has a pessimistic way of sensing the burst size, as it is tied to
> TSO/GSO being there.
> 
> The following patch helps a lot.

Not really, at least when applied to v4.15.4. I am still getting 2 Gbps less
between VMs when using BBR instead of Reno.

Am I doing something wrong?

Oleksandr




Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Eric Dumazet
On Tue, 2018-02-20 at 07:39 -0800, Eric Dumazet wrote:
> On Tue, 2018-02-20 at 10:32 +0100, Oleksandr Natalenko wrote:
> > Hi.
> > 
> > 19.02.2018 20:56, Eric Dumazet wrote:
> > > Switching TCP to GSO mode, relying on core networking layers
> > > to perform the eventual adaptation for dumb devices, was overdue.
> > > 
> > > 1) Most TCP development is done with TSO in mind.
> > > 2) Fewer high-resolution timers need to be armed for TCP pacing.
> > > 3) GSO can benefit from the xmit_more hint.
> > > 4) Receiver GRO is more effective (as if TSO were really used on the sender)
> > >    -> fewer ACK packets and less overhead.
> > > 5) Write queues have less overhead (one skb holds about 64KB of payload).
> > > 6) SACK coalescing just works (no payload in skb->head).
> > > 7) The rtx rb-tree contains fewer packets, so SACK processing is cheaper.
> > > 8) Removal of legacy code; fewer maintenance hassles.
> > > 
> > > Note that I have left the sendpage/zerocopy paths untouched, but they can
> > > probably benefit from the same strategy.
> > > 
> > > Thanks to Oleksandr Natalenko for reporting a performance issue with
> > > BBR/fq_codel, which was the main reason I worked on this patch series.
> > 
> > Thanks for dealing with this so fast.
> > 
> > Does this mean that the option to optimise internal TCP pacing is still
> > an open question?
> 
> It is not an optimization that is needed, but rather taking into account that
> highres timers can have latencies of ~2 usec or more.
> 
> When sending 64KB TSO packets, having an extra 2 usec after every ~54 usec
> (at 10Gbit) has no big impact, since TCP computes a slightly inflated
> pacing rate anyway.
> 
> But when sending one MSS-sized packet every usec, this can definitely
> show up as a big slowdown.
> 
> But the answer is yes, I will take a look at this timer drift.

Actually, timer drifts are not horrible (at least on my lab hosts).

But BBR has a pessimistic way of sensing the burst size, as it is tied to
TSO/GSO being there.

The following patch helps a lot.

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b2bca373f8bee35267df49b5947a6793fed71a12..6818042cd8a9a1778f54637861647091afd9a769 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1730,7 +1730,7 @@ u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
 	 */
 	segs = max_t(u32, bytes / mss_now, min_tso_segs);
 
-	return min_t(u32, segs, sk->sk_gso_max_segs);
+	return segs;
 }
 EXPORT_SYMBOL(tcp_tso_autosize);
 
@@ -1742,9 +1742,10 @@ static u32 tcp_tso_segs(struct sock *sk, unsigned int mss_now)
 	const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
 	u32 tso_segs = ca_ops->tso_segs_goal ? ca_ops->tso_segs_goal(sk) : 0;
 
-	return tso_segs ? :
-	       tcp_tso_autosize(sk, mss_now,
-				sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs);
+	if (!tso_segs)
+		tso_segs = tcp_tso_autosize(sk, mss_now,
+				sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs);
+	return min_t(u32, tso_segs, sk->sk_gso_max_segs);
 }
 
 /* Returns the portion of skb which can be sent right away */



Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Eric Dumazet
On Tue, 2018-02-20 at 10:32 +0100, Oleksandr Natalenko wrote:
> Hi.
> 
> 19.02.2018 20:56, Eric Dumazet wrote:
> > Switching TCP to GSO mode, relying on core networking layers
> > to perform the eventual adaptation for dumb devices, was overdue.
> > 
> > 1) Most TCP development is done with TSO in mind.
> > 2) Fewer high-resolution timers need to be armed for TCP pacing.
> > 3) GSO can benefit from the xmit_more hint.
> > 4) Receiver GRO is more effective (as if TSO were really used on the sender)
> >    -> fewer ACK packets and less overhead.
> > 5) Write queues have less overhead (one skb holds about 64KB of payload).
> > 6) SACK coalescing just works (no payload in skb->head).
> > 7) The rtx rb-tree contains fewer packets, so SACK processing is cheaper.
> > 8) Removal of legacy code; fewer maintenance hassles.
> > 
> > Note that I have left the sendpage/zerocopy paths untouched, but they can
> > probably benefit from the same strategy.
> > 
> > Thanks to Oleksandr Natalenko for reporting a performance issue with
> > BBR/fq_codel, which was the main reason I worked on this patch series.
> 
> Thanks for dealing with this so fast.
> 
> Does this mean that the option to optimise internal TCP pacing is still
> an open question?

It is not an optimization that is needed, but rather taking into account that
highres timers can have latencies of ~2 usec or more.

When sending 64KB TSO packets, having an extra 2 usec after every ~54 usec
(at 10Gbit) has no big impact, since TCP computes a slightly inflated
pacing rate anyway.

But when sending one MSS-sized packet every usec, this can definitely
show up as a big slowdown.

But the answer is yes, I will take a look at this timer drift.
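
A quick back-of-the-envelope check of those numbers, assuming a wire rate
of 10 Gbit/s:

  64KB TSO packet:   65536 * 8 / 10e9 ~= 52 usec on the wire,
                     so +2 usec of timer latency is only ~4% extra gap.
  one 1500B packet:  1500 * 8 / 10e9  ~= 1.2 usec on the wire,
                     so +2 usec of timer latency more than doubles the gap.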



Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko

Hi.

19.02.2018 20:56, Eric Dumazet wrote:

> Switching TCP to GSO mode, relying on core networking layers
> to perform the eventual adaptation for dumb devices, was overdue.
> 
> 1) Most TCP development is done with TSO in mind.
> 2) Fewer high-resolution timers need to be armed for TCP pacing.
> 3) GSO can benefit from the xmit_more hint.
> 4) Receiver GRO is more effective (as if TSO were really used on the sender)
>    -> fewer ACK packets and less overhead.
> 5) Write queues have less overhead (one skb holds about 64KB of payload).
> 6) SACK coalescing just works (no payload in skb->head).
> 7) The rtx rb-tree contains fewer packets, so SACK processing is cheaper.
> 8) Removal of legacy code; fewer maintenance hassles.
> 
> Note that I have left the sendpage/zerocopy paths untouched, but they can
> probably benefit from the same strategy.
> 
> Thanks to Oleksandr Natalenko for reporting a performance issue with
> BBR/fq_codel, which was the main reason I worked on this patch series.


Thanks for dealing with this so fast.

Does this mean that the option to optimise internal TCP pacing is still
an open question?


Oleksandr


Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-19 Thread Soheil Hassas Yeganeh
On Mon, Feb 19, 2018 at 2:56 PM Eric Dumazet  wrote:
> Switching TCP to GSO mode, relying on core networking layers
> to perform the eventual adaptation for dumb devices, was overdue.
> 
> 1) Most TCP development is done with TSO in mind.
> 2) Fewer high-resolution timers need to be armed for TCP pacing.
> 3) GSO can benefit from the xmit_more hint.
> 4) Receiver GRO is more effective (as if TSO were really used on the sender)
>    -> fewer ACK packets and less overhead.
> 5) Write queues have less overhead (one skb holds about 64KB of payload).
> 6) SACK coalescing just works (no payload in skb->head).
> 7) The rtx rb-tree contains fewer packets, so SACK processing is cheaper.
> 8) Removal of legacy code; fewer maintenance hassles.
> 
> Note that I have left the sendpage/zerocopy paths untouched, but they can
> probably benefit from the same strategy.
> 
> Thanks to Oleksandr Natalenko for reporting a performance issue with
> BBR/fq_codel, which was the main reason I worked on this patch series.

> Eric Dumazet (6):
>   tcp: switch to GSO being always on
>   tcp: remove sk_can_gso() use
>   tcp: remove sk_check_csum_caps()
>   tcp: tcp_sendmsg() only deals with CHECKSUM_PARTIAL
>   tcp: remove dead code from tcp_set_skb_tso_segs()
>   tcp: remove dead code after CHECKSUM_PARTIAL adoption

Acked-by: Soheil Hassas Yeganeh 

Very nice patch series! Thank you, Eric!

>  include/net/sock.h    | 10 +---
>  net/core/sock.c       |  2 +-
>  net/ipv4/tcp.c        | 57 ---
>  net/ipv4/tcp_input.c  |  3 ---
>  net/ipv4/tcp_ipv4.c   | 13 +++---
>  net/ipv4/tcp_output.c | 40 +-
>  6 files changed, 26 insertions(+), 99 deletions(-)
> --
> 2.16.1.291.g4437f3f132-goog


[PATCH net-next 0/6] tcp: remove non GSO code

2018-02-19 Thread Eric Dumazet
Switching TCP to GSO mode, relying on core networking layers
to perform the eventual adaptation for dumb devices, was overdue.

1) Most TCP development is done with TSO in mind.
2) Fewer high-resolution timers need to be armed for TCP pacing.
3) GSO can benefit from the xmit_more hint.
4) Receiver GRO is more effective (as if TSO were really used on the sender)
   -> fewer ACK packets and less overhead.
5) Write queues have less overhead (one skb holds about 64KB of payload).
6) SACK coalescing just works (no payload in skb->head).
7) The rtx rb-tree contains fewer packets, so SACK processing is cheaper.
8) Removal of legacy code; fewer maintenance hassles.

Note that I have left the sendpage/zerocopy paths untouched, but they can
probably benefit from the same strategy.

Thanks to Oleksandr Natalenko for reporting a performance issue with
BBR/fq_codel, which was the main reason I worked on this patch series.

Eric Dumazet (6):
  tcp: switch to GSO being always on
  tcp: remove sk_can_gso() use
  tcp: remove sk_check_csum_caps()
  tcp: tcp_sendmsg() only deals with CHECKSUM_PARTIAL
  tcp: remove dead code from tcp_set_skb_tso_segs()
  tcp: remove dead code after CHECKSUM_PARTIAL adoption

 include/net/sock.h    | 10 +---
 net/core/sock.c       |  2 +-
 net/ipv4/tcp.c        | 57 ---
 net/ipv4/tcp_input.c  |  3 ---
 net/ipv4/tcp_ipv4.c   | 13 +++---
 net/ipv4/tcp_output.c | 40 +-
 6 files changed, 26 insertions(+), 99 deletions(-)

-- 
2.16.1.291.g4437f3f132-goog