Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Wed, Mar 22, 2017 at 06:47:42AM -0700, Eric Dumazet wrote: > On Mon, 2017-03-20 at 16:27 -0300, Marcelo Ricardo Leitner wrote: > > This warning is a hint, and can not assume senders are not dumb. > > > > Agreed. But we can make it consider such cases. What about the following > > patch? (untested) > > > > I think we can directly account for the size of the timestamps in there, > > as that won't make a difference to congestion control in case it's > > wrong, and also validate against MTU if we have it. I didn't subtract > > the headers from MTU on purpose, as dealing with ipv4/ipv6 there is > > not worth for the same reason. > > > > This should silent this false-positive. > > > Note that the problem could have its origin on a middle box, > not on the host terminating the TCP flow. > > So we can try hard, but we can't eliminate false positives. Agreed both. > > Maybe replace the 12 by MAX_TCP_OPTION_SPACE ? Yes, can be. Thanks. Marcelo
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Mon, 2017-03-20 at 16:27 -0300, Marcelo Ricardo Leitner wrote: > This warning is a hint, and can not assume senders are not dumb. > > Agreed. But we can make it consider such cases. What about the following > patch? (untested) > > I think we can directly account for the size of the timestamps in there, > as that won't make a difference to congestion control in case it's > wrong, and also validate against MTU if we have it. I didn't subtract > the headers from MTU on purpose, as dealing with ipv4/ipv6 there is > not worth for the same reason. > > This should silent this false-positive. Note that the problem could have its origin on a middle box, not on the host terminating the TCP flow. So we can try hard, but we can't eliminate false positives. Maybe replace the 12 by MAX_TCP_OPTION_SPACE ?
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Sun, Mar 19, 2017 at 12:20:26PM -0700, Eric Dumazet wrote: > On Sun, 2017-03-19 at 13:14 +0100, Markus Trippelsdorf wrote: > > On 2017.02.06 at 19:12 -0200, Marcelo Ricardo Leitner wrote: > > > On Fri, Feb 03, 2017 at 06:47:33AM -0800, Eric Dumazet wrote: > > > > On Fri, 2017-02-03 at 12:28 -0200, Marcelo Ricardo Leitner wrote: > > > > > > > > > Aren't you mixing the endpoints here? MSS is the largest amount of > > > > > data > > > > > that the peer can receive in a single segment, and not how much it > > > > > will > > > > > send. For the sending part, that depends on what the other peer > > > > > announced, and we can have 2 different MSS in a single connection, one > > > > > for each peer. > > > > > > > > > > If a peer later wants to send larger segments, it can, but it must > > > > > respect the mss advertised by the other peer during handshake. > > > > > > > > > > > > > I am not mixing endpoints, you are. > > > > > > > > If you need to be convinced, please grab : > > > > https://patchwork.ozlabs.org/patch/723028/ > > > > > > > > And just watch "ss -temoi ..." > > > > > > I still don't get it, but I also hit the warning on my laptop, using > > > iwlwifi. Not sure what I did in order to trigger it, it was by accident. > > > > After many weeks without any warning, I've hit the issue again today: Nice! > > > > TCP: eth0: Driver has suspect GRO implementation, TCP performance may be > > compromised. rcv_mss:1448 advmss:1448 len:1460 > > > > It is very possible the sender suddenly forgot to use TCP timestamps. By those 12 bytes, seems so, yes. > This warning is a hint, and can not assume senders are not dumb. Agreed. But we can make it consider such cases. What about the following patch? (untested) I think we can directly account for the size of the timestamps in there, as that won't make a difference to congestion control in case it's wrong, and also validate against MTU if we have it. I didn't subtract the headers from MTU on purpose, as dealing with ipv4/ipv6 there is not worth for the same reason. This should silent this false-positive. ---8<--- diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 96b67a8b18c3..96a99446ddce 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -126,7 +126,8 @@ int sysctl_tcp_invalid_ratelimit __read_mostly = HZ/2; #define REXMIT_LOST1 /* retransmit packets marked lost */ #define REXMIT_NEW 2 /* FRTO-style transmit of unsent/new packets */ -static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb) +static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb, +unsigned int len) { static bool __once __read_mostly; @@ -137,8 +138,9 @@ static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb) rcu_read_lock(); dev = dev_get_by_index_rcu(sock_net(sk), skb->skb_iif); - pr_warn("%s: Driver has suspect GRO implementation, TCP performance may be compromised.\n", - dev ? dev->name : "Unknown driver"); + if (!dev || len >= dev->mtu) + pr_warn("%s: Driver has suspect GRO implementation, TCP performance may be compromised.\n", + dev ? dev->name : "Unknown driver"); rcu_read_unlock(); } } @@ -161,8 +163,9 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb) if (len >= icsk->icsk_ack.rcv_mss) { icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, tcp_sk(sk)->advmss); - if (unlikely(icsk->icsk_ack.rcv_mss != len)) - tcp_gro_dev_warn(sk, skb); + /* The + 12 accounts for the possible lack of timestamps */ + if (unlikely(icsk->icsk_ack.rcv_mss + 12 < len)) + tcp_gro_dev_warn(sk, skb, len); } else { /* Otherwise, we make more careful check taking into account, * that SACKs block is variable.
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Sun, 2017-03-19 at 13:14 +0100, Markus Trippelsdorf wrote: > On 2017.02.06 at 19:12 -0200, Marcelo Ricardo Leitner wrote: > > On Fri, Feb 03, 2017 at 06:47:33AM -0800, Eric Dumazet wrote: > > > On Fri, 2017-02-03 at 12:28 -0200, Marcelo Ricardo Leitner wrote: > > > > > > > Aren't you mixing the endpoints here? MSS is the largest amount of data > > > > that the peer can receive in a single segment, and not how much it will > > > > send. For the sending part, that depends on what the other peer > > > > announced, and we can have 2 different MSS in a single connection, one > > > > for each peer. > > > > > > > > If a peer later wants to send larger segments, it can, but it must > > > > respect the mss advertised by the other peer during handshake. > > > > > > > > > > I am not mixing endpoints, you are. > > > > > > If you need to be convinced, please grab : > > > https://patchwork.ozlabs.org/patch/723028/ > > > > > > And just watch "ss -temoi ..." > > > > I still don't get it, but I also hit the warning on my laptop, using > > iwlwifi. Not sure what I did in order to trigger it, it was by accident. > > After many weeks without any warning, I've hit the issue again today: > > TCP: eth0: Driver has suspect GRO implementation, TCP performance may be > compromised. rcv_mss:1448 advmss:1448 len:1460 > It is very possible the sender suddenly forgot to use TCP timestamps. This warning is a hint, and can not assume senders are not dumb.
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On 2017.02.06 at 19:12 -0200, Marcelo Ricardo Leitner wrote: > On Fri, Feb 03, 2017 at 06:47:33AM -0800, Eric Dumazet wrote: > > On Fri, 2017-02-03 at 12:28 -0200, Marcelo Ricardo Leitner wrote: > > > > > Aren't you mixing the endpoints here? MSS is the largest amount of data > > > that the peer can receive in a single segment, and not how much it will > > > send. For the sending part, that depends on what the other peer > > > announced, and we can have 2 different MSS in a single connection, one > > > for each peer. > > > > > > If a peer later wants to send larger segments, it can, but it must > > > respect the mss advertised by the other peer during handshake. > > > > > > > I am not mixing endpoints, you are. > > > > If you need to be convinced, please grab : > > https://patchwork.ozlabs.org/patch/723028/ > > > > And just watch "ss -temoi ..." > > I still don't get it, but I also hit the warning on my laptop, using > iwlwifi. Not sure what I did in order to trigger it, it was by accident. After many weeks without any warning, I've hit the issue again today: TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised. rcv_mss:1448 advmss:1448 len:1460 -- Markus
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On 2017.02.06 at 19:12 -0200, Marcelo Ricardo Leitner wrote: > On Fri, Feb 03, 2017 at 06:47:33AM -0800, Eric Dumazet wrote: > > On Fri, 2017-02-03 at 12:28 -0200, Marcelo Ricardo Leitner wrote: > > > > > Aren't you mixing the endpoints here? MSS is the largest amount of data > > > that the peer can receive in a single segment, and not how much it will > > > send. For the sending part, that depends on what the other peer > > > announced, and we can have 2 different MSS in a single connection, one > > > for each peer. > > > > > > If a peer later wants to send larger segments, it can, but it must > > > respect the mss advertised by the other peer during handshake. > > > > > > > I am not mixing endpoints, you are. > > > > If you need to be convinced, please grab : > > https://patchwork.ozlabs.org/patch/723028/ > > > > And just watch "ss -temoi ..." > > I still don't get it, but I also hit the warning on my laptop, using > iwlwifi. Not sure what I did in order to trigger it, it was by accident. I am running with your debugging patch applied since the beginning of February and was not able to reproduce the issue ever again. So I think your code is innocent and another bug (,that seems to be fixed since then) somehow caused the kernel to jump to the function. -- Markus
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Fri, Feb 03, 2017 at 06:47:33AM -0800, Eric Dumazet wrote: > On Fri, 2017-02-03 at 12:28 -0200, Marcelo Ricardo Leitner wrote: > > > Aren't you mixing the endpoints here? MSS is the largest amount of data > > that the peer can receive in a single segment, and not how much it will > > send. For the sending part, that depends on what the other peer > > announced, and we can have 2 different MSS in a single connection, one > > for each peer. > > > > If a peer later wants to send larger segments, it can, but it must > > respect the mss advertised by the other peer during handshake. > > > > I am not mixing endpoints, you are. > > If you need to be convinced, please grab : > https://patchwork.ozlabs.org/patch/723028/ > > And just watch "ss -temoi ..." I still don't get it, but I also hit the warning on my laptop, using iwlwifi. Not sure what I did in order to trigger it, it was by accident. Marcelo
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Fri, 2017-02-03 at 12:28 -0200, Marcelo Ricardo Leitner wrote: > Aren't you mixing the endpoints here? MSS is the largest amount of data > that the peer can receive in a single segment, and not how much it will > send. For the sending part, that depends on what the other peer > announced, and we can have 2 different MSS in a single connection, one > for each peer. > > If a peer later wants to send larger segments, it can, but it must > respect the mss advertised by the other peer during handshake. > I am not mixing endpoints, you are. If you need to be convinced, please grab : https://patchwork.ozlabs.org/patch/723028/ And just watch "ss -temoi ..."
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Fri, Feb 03, 2017 at 06:16:06AM -0800, Eric Dumazet wrote: > On Fri, 2017-02-03 at 11:53 -0200, Marcelo Ricardo Leitner wrote: > > On Fri, Feb 03, 2017 at 05:24:06AM -0800, Eric Dumazet wrote: > > > On Fri, 2017-02-03 at 09:54 -0200, Marcelo Ricardo Leitner wrote: > > > > On Thu, Feb 02, 2017 at 05:59:24AM -0800, Eric Dumazet wrote: > > > > > On Thu, 2017-02-02 at 05:31 -0800, Eric Dumazet wrote: > > > > > > > > > > > Anyway, I suspect the test is simply buggy ;) > > > > > > > > > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > > > > > index > > > > > > 41dcbd568cbe2403f2a9e659669afe462a42e228..5394a39fcce964a7fe7075b1531a8a1e05550a54 > > > > > > 100644 > > > > > > --- a/net/ipv4/tcp_input.c > > > > > > +++ b/net/ipv4/tcp_input.c > > > > > > @@ -164,7 +164,7 @@ static void tcp_measure_rcv_mss(struct sock > > > > > > *sk, const struct sk_buff *skb) > > > > > > if (len >= icsk->icsk_ack.rcv_mss) { > > > > > > icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, > > > > > >tcp_sk(sk)->advmss); > > > > > > - if (unlikely(icsk->icsk_ack.rcv_mss != len)) > > > > > > + if (unlikely(icsk->icsk_ack.rcv_mss != len && > > > > > > skb_is_gso(skb))) > > > > > > tcp_gro_dev_warn(sk, skb); > > > > > > } else { > > > > > > /* Otherwise, we make more careful check taking into > > > > > > account, > > > > > > > > > > This wont really help. > > > > > > > > > > Our tcp_sk(sk)->advmss can be lower than the MSS used by the remote > > > > > peer. > > > > > > > > > > ip ro add advmss 512 > > > > > > > > I don't follow. With a good driver, how can advmss be smaller than the > > > > MSS used by the remote peer? Even with the route entry above, I get > > > > segments just up to advmss, and no warning. > > > > > > > > > > A TCP flow has two ends. > > > > Indeed, though should be mostly about only one of them. > > > > > > > > Common MTU = 1500 > > > > > > One can have advmss 500, the other one no advmss (or the standard 1460 > > > one) > > > > Considering the rx side of peer A. Peer A advertises a given MSS to peer > > B and should not receive any segment from peer B larger than so. > > I'm failing to see how advmss can be smaller than the segment size just > > received. > > tcp_sk(sk)->advmss records what the peer announced during its SYN (or > SYNACK) message, in the MSS option. > > Nothing prevents the peer to change its mind later. > > Eg starting with MSS 512, then switch later to sending packets of 1024 > or 1400 bytes. Aren't you mixing the endpoints here? MSS is the largest amount of data that the peer can receive in a single segment, and not how much it will send. For the sending part, that depends on what the other peer announced, and we can have 2 different MSS in a single connection, one for each peer. If a peer later wants to send larger segments, it can, but it must respect the mss advertised by the other peer during handshake. > > So the innocent NIC driver is not the problem here. > >
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Fri, 2017-02-03 at 11:53 -0200, Marcelo Ricardo Leitner wrote: > On Fri, Feb 03, 2017 at 05:24:06AM -0800, Eric Dumazet wrote: > > On Fri, 2017-02-03 at 09:54 -0200, Marcelo Ricardo Leitner wrote: > > > On Thu, Feb 02, 2017 at 05:59:24AM -0800, Eric Dumazet wrote: > > > > On Thu, 2017-02-02 at 05:31 -0800, Eric Dumazet wrote: > > > > > > > > > Anyway, I suspect the test is simply buggy ;) > > > > > > > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > > > > index > > > > > 41dcbd568cbe2403f2a9e659669afe462a42e228..5394a39fcce964a7fe7075b1531a8a1e05550a54 > > > > > 100644 > > > > > --- a/net/ipv4/tcp_input.c > > > > > +++ b/net/ipv4/tcp_input.c > > > > > @@ -164,7 +164,7 @@ static void tcp_measure_rcv_mss(struct sock *sk, > > > > > const struct sk_buff *skb) > > > > > if (len >= icsk->icsk_ack.rcv_mss) { > > > > > icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, > > > > > tcp_sk(sk)->advmss); > > > > > - if (unlikely(icsk->icsk_ack.rcv_mss != len)) > > > > > + if (unlikely(icsk->icsk_ack.rcv_mss != len && > > > > > skb_is_gso(skb))) > > > > > tcp_gro_dev_warn(sk, skb); > > > > > } else { > > > > > /* Otherwise, we make more careful check taking into > > > > > account, > > > > > > > > This wont really help. > > > > > > > > Our tcp_sk(sk)->advmss can be lower than the MSS used by the remote > > > > peer. > > > > > > > > ip ro add advmss 512 > > > > > > I don't follow. With a good driver, how can advmss be smaller than the > > > MSS used by the remote peer? Even with the route entry above, I get > > > segments just up to advmss, and no warning. > > > > > > > A TCP flow has two ends. > > Indeed, though should be mostly about only one of them. > > > > > Common MTU = 1500 > > > > One can have advmss 500, the other one no advmss (or the standard 1460 > > one) > > Considering the rx side of peer A. Peer A advertises a given MSS to peer > B and should not receive any segment from peer B larger than so. > I'm failing to see how advmss can be smaller than the segment size just > received. tcp_sk(sk)->advmss records what the peer announced during its SYN (or SYNACK) message, in the MSS option. Nothing prevents the peer to change its mind later. Eg starting with MSS 512, then switch later to sending packets of 1024 or 1400 bytes. So the innocent NIC driver is not the problem here.
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Fri, Feb 03, 2017 at 05:24:06AM -0800, Eric Dumazet wrote: > On Fri, 2017-02-03 at 09:54 -0200, Marcelo Ricardo Leitner wrote: > > On Thu, Feb 02, 2017 at 05:59:24AM -0800, Eric Dumazet wrote: > > > On Thu, 2017-02-02 at 05:31 -0800, Eric Dumazet wrote: > > > > > > > Anyway, I suspect the test is simply buggy ;) > > > > > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > > > index > > > > 41dcbd568cbe2403f2a9e659669afe462a42e228..5394a39fcce964a7fe7075b1531a8a1e05550a54 > > > > 100644 > > > > --- a/net/ipv4/tcp_input.c > > > > +++ b/net/ipv4/tcp_input.c > > > > @@ -164,7 +164,7 @@ static void tcp_measure_rcv_mss(struct sock *sk, > > > > const struct sk_buff *skb) > > > > if (len >= icsk->icsk_ack.rcv_mss) { > > > > icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, > > > >tcp_sk(sk)->advmss); > > > > - if (unlikely(icsk->icsk_ack.rcv_mss != len)) > > > > + if (unlikely(icsk->icsk_ack.rcv_mss != len && > > > > skb_is_gso(skb))) > > > > tcp_gro_dev_warn(sk, skb); > > > > } else { > > > > /* Otherwise, we make more careful check taking into > > > > account, > > > > > > This wont really help. > > > > > > Our tcp_sk(sk)->advmss can be lower than the MSS used by the remote > > > peer. > > > > > > ip ro add advmss 512 > > > > I don't follow. With a good driver, how can advmss be smaller than the > > MSS used by the remote peer? Even with the route entry above, I get > > segments just up to advmss, and no warning. > > > > A TCP flow has two ends. Indeed, though should be mostly about only one of them. > > Common MTU = 1500 > > One can have advmss 500, the other one no advmss (or the standard 1460 > one) Considering the rx side of peer A. Peer A advertises a given MSS to peer B and should not receive any segment from peer B larger than so. I'm failing to see how advmss can be smaller than the segment size just received. > > So if we compare apple and orange, result might be shocking ;) Yes heh just not seeing the mix here.. > > If you want to reproduce this use the "ip ro add advmss 512" hint, > and/or play with sysctl_tcp_mtu_probing I tried the route with advmss, no luck so far. Still digging.. Marcelo
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Fri, 2017-02-03 at 09:54 -0200, Marcelo Ricardo Leitner wrote: > On Thu, Feb 02, 2017 at 05:59:24AM -0800, Eric Dumazet wrote: > > On Thu, 2017-02-02 at 05:31 -0800, Eric Dumazet wrote: > > > > > Anyway, I suspect the test is simply buggy ;) > > > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > > index > > > 41dcbd568cbe2403f2a9e659669afe462a42e228..5394a39fcce964a7fe7075b1531a8a1e05550a54 > > > 100644 > > > --- a/net/ipv4/tcp_input.c > > > +++ b/net/ipv4/tcp_input.c > > > @@ -164,7 +164,7 @@ static void tcp_measure_rcv_mss(struct sock *sk, > > > const struct sk_buff *skb) > > > if (len >= icsk->icsk_ack.rcv_mss) { > > > icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, > > > tcp_sk(sk)->advmss); > > > - if (unlikely(icsk->icsk_ack.rcv_mss != len)) > > > + if (unlikely(icsk->icsk_ack.rcv_mss != len && skb_is_gso(skb))) > > > tcp_gro_dev_warn(sk, skb); > > > } else { > > > /* Otherwise, we make more careful check taking into account, > > > > This wont really help. > > > > Our tcp_sk(sk)->advmss can be lower than the MSS used by the remote > > peer. > > > > ip ro add advmss 512 > > I don't follow. With a good driver, how can advmss be smaller than the > MSS used by the remote peer? Even with the route entry above, I get > segments just up to advmss, and no warning. > A TCP flow has two ends. Common MTU = 1500 One can have advmss 500, the other one no advmss (or the standard 1460 one) So if we compare apple and orange, result might be shocking ;) If you want to reproduce this use the "ip ro add advmss 512" hint, and/or play with sysctl_tcp_mtu_probing
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On 2017.02.03 at 09:54 -0200, Marcelo Ricardo Leitner wrote: > On Thu, Feb 02, 2017 at 05:59:24AM -0800, Eric Dumazet wrote: > > On Thu, 2017-02-02 at 05:31 -0800, Eric Dumazet wrote: > > > > > Anyway, I suspect the test is simply buggy ;) > > > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > > index > > > 41dcbd568cbe2403f2a9e659669afe462a42e228..5394a39fcce964a7fe7075b1531a8a1e05550a54 > > > 100644 > > > --- a/net/ipv4/tcp_input.c > > > +++ b/net/ipv4/tcp_input.c > > > @@ -164,7 +164,7 @@ static void tcp_measure_rcv_mss(struct sock *sk, > > > const struct sk_buff *skb) > > > if (len >= icsk->icsk_ack.rcv_mss) { > > > icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, > > > tcp_sk(sk)->advmss); > > > - if (unlikely(icsk->icsk_ack.rcv_mss != len)) > > > + if (unlikely(icsk->icsk_ack.rcv_mss != len && skb_is_gso(skb))) > > > tcp_gro_dev_warn(sk, skb); > > > } else { > > > /* Otherwise, we make more careful check taking into account, > > > > This wont really help. > > > > Our tcp_sk(sk)->advmss can be lower than the MSS used by the remote > > peer. > > > > ip ro add advmss 512 > > I don't follow. With a good driver, how can advmss be smaller than the > MSS used by the remote peer? Even with the route entry above, I get > segments just up to advmss, and no warning. > > Though yeah, interesting that this driver doesn't even support GRO. FCS > perhaps? > > Markus, do you have other interfaces in your system? Which MTU do you > use, and please try the (untested) patch below, to gather more debug > info: No, eth0 is the only interface. MTU = 1500. Sure, I will try your patch. But I don't know how to reproduce the issue, so you will have to wait until it triggers again. -- Markus
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Thu, Feb 02, 2017 at 05:59:24AM -0800, Eric Dumazet wrote: > On Thu, 2017-02-02 at 05:31 -0800, Eric Dumazet wrote: > > > Anyway, I suspect the test is simply buggy ;) > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > index > > 41dcbd568cbe2403f2a9e659669afe462a42e228..5394a39fcce964a7fe7075b1531a8a1e05550a54 > > 100644 > > --- a/net/ipv4/tcp_input.c > > +++ b/net/ipv4/tcp_input.c > > @@ -164,7 +164,7 @@ static void tcp_measure_rcv_mss(struct sock *sk, const > > struct sk_buff *skb) > > if (len >= icsk->icsk_ack.rcv_mss) { > > icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, > >tcp_sk(sk)->advmss); > > - if (unlikely(icsk->icsk_ack.rcv_mss != len)) > > + if (unlikely(icsk->icsk_ack.rcv_mss != len && skb_is_gso(skb))) > > tcp_gro_dev_warn(sk, skb); > > } else { > > /* Otherwise, we make more careful check taking into account, > > This wont really help. > > Our tcp_sk(sk)->advmss can be lower than the MSS used by the remote > peer. > > ip ro add advmss 512 I don't follow. With a good driver, how can advmss be smaller than the MSS used by the remote peer? Even with the route entry above, I get segments just up to advmss, and no warning. Though yeah, interesting that this driver doesn't even support GRO. FCS perhaps? Markus, do you have other interfaces in your system? Which MTU do you use, and please try the (untested) patch below, to gather more debug info: ---8<--- diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index bfa165cc455a..eddd5b6a28b1 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -128,6 +128,7 @@ int sysctl_tcp_invalid_ratelimit __read_mostly = HZ/2; static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb) { + struct inet_connection_sock *icsk = inet_csk(sk); static bool __once __read_mostly; if (!__once) { @@ -137,8 +138,9 @@ static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb) rcu_read_lock(); dev = dev_get_by_index_rcu(sock_net(sk), skb->skb_iif); - pr_warn("%s: Driver has suspect GRO implementation, TCP performance may be compromised.\n", - dev ? dev->name : "Unknown driver"); + pr_warn("%s: Driver has suspect GRO implementation, TCP performance may be compromised. rcv_mss:%u advmss:%u len:%u\n", + dev ? dev->name : "Unknown driver", + icsk->icsk_ack.rcv_mss, tcp_sk(sk)->advmss, skb->len); rcu_read_unlock(); } }
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Thu, 2017-02-02 at 05:31 -0800, Eric Dumazet wrote: > Anyway, I suspect the test is simply buggy ;) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index > 41dcbd568cbe2403f2a9e659669afe462a42e228..5394a39fcce964a7fe7075b1531a8a1e05550a54 > 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -164,7 +164,7 @@ static void tcp_measure_rcv_mss(struct sock *sk, const > struct sk_buff *skb) > if (len >= icsk->icsk_ack.rcv_mss) { > icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, > tcp_sk(sk)->advmss); > - if (unlikely(icsk->icsk_ack.rcv_mss != len)) > + if (unlikely(icsk->icsk_ack.rcv_mss != len && skb_is_gso(skb))) > tcp_gro_dev_warn(sk, skb); > } else { > /* Otherwise, we make more careful check taking into account, This wont really help. Our tcp_sk(sk)->advmss can be lower than the MSS used by the remote peer. ip ro add advmss 512 So the test is not universal.
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Thu, 2017-02-02 at 13:34 +0100, Markus Trippelsdorf wrote: > On 2017.02.02 at 04:32 -0800, Eric Dumazet wrote: > > On Thu, 2017-02-02 at 12:52 +0100, Markus Trippelsdorf wrote: > > > Hi, > > > > > > from time to time I see the following warning in my kernel log: > > > > > > TCP: eth0: Driver has suspect GRO implementation, TCP performance may be > > > compromised. > > > > > > This happens although I run "/usr/sbin/ethtool -K eth0 gro off" in my > > > local boot script. > > > What is the warning trying to tell me? > > > > > > > Please report > > > > ethtool -i eth0 > > driver: ATL1E > version: 1.0.0.7-NAPI > firmware-version: L1e > expansion-rom-version: > bus-info: :02:00.0 > supports-statistics: no > supports-test: no > supports-eeprom-access: no > supports-register-dump: yes > supports-priv-flags: no > Note that this driver does not implement GRO yet. Hard to believe there is such push back on GRO in 2017. Anyway, I suspect the test is simply buggy ;) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 41dcbd568cbe2403f2a9e659669afe462a42e228..5394a39fcce964a7fe7075b1531a8a1e05550a54 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -164,7 +164,7 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb) if (len >= icsk->icsk_ack.rcv_mss) { icsk->icsk_ack.rcv_mss = min_t(unsigned int, len, tcp_sk(sk)->advmss); - if (unlikely(icsk->icsk_ack.rcv_mss != len)) + if (unlikely(icsk->icsk_ack.rcv_mss != len && skb_is_gso(skb))) tcp_gro_dev_warn(sk, skb); } else { /* Otherwise, we make more careful check taking into account,
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On 2017.02.02 at 04:32 -0800, Eric Dumazet wrote: > On Thu, 2017-02-02 at 12:52 +0100, Markus Trippelsdorf wrote: > > Hi, > > > > from time to time I see the following warning in my kernel log: > > > > TCP: eth0: Driver has suspect GRO implementation, TCP performance may be > > compromised. > > > > This happens although I run "/usr/sbin/ethtool -K eth0 gro off" in my > > local boot script. > > What is the warning trying to tell me? > > > > Please report > > ethtool -i eth0 driver: ATL1E version: 1.0.0.7-NAPI firmware-version: L1e expansion-rom-version: bus-info: :02:00.0 supports-statistics: no supports-test: no supports-eeprom-access: no supports-register-dump: yes supports-priv-flags: no -- Markus
Re: "TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
On Thu, 2017-02-02 at 12:52 +0100, Markus Trippelsdorf wrote: > Hi, > > from time to time I see the following warning in my kernel log: > > TCP: eth0: Driver has suspect GRO implementation, TCP performance may be > compromised. > > This happens although I run "/usr/sbin/ethtool -K eth0 gro off" in my > local boot script. > What is the warning trying to tell me? > Please report ethtool -i eth0
"TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised." message with "ethtool -K eth0 gro off"
Hi, from time to time I see the following warning in my kernel log: TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised. This happens although I run "/usr/sbin/ethtool -K eth0 gro off" in my local boot script. What is the warning trying to tell me? -- Markus