On Fri, 2015-06-05 at 17:46 -0700, Martin KaFai Lau wrote:
> The problem is caught by this WARN_ON(len > skb->len) in tcp_fragment():
> 
> [<ffffffff810510ca>] warn_slowpath_null+0x1a/0x20
> [<ffffffff8160ec90>] tcp_fragment+0x2a0/0x2b0
> [<ffffffff81604e06>] tcp_mark_head_lost+0x196/0x230
> [<ffffffff8160585d>] tcp_update_scoreboard+0x4d/0x80
> [<ffffffff8160a9ac>] tcp_fastretrans_alert+0x6ac/0xa90
> [<ffffffff8160b834>] tcp_ack+0x9d4/0x10e0
> [<ffffffff8160c699>] tcp_rcv_established+0x309/0x7e0
> 
> The WARN_ON pointed out that tcp_skb_pcount (i.e.
> TCP_SKB_CB(skb)->tcp_gso_segs) and skb->len are inconsistent.
> 
> The WARN_ON stack goes away after setting net.ipv4.tcp_mtu_probing to 0.
> 
> v2
> - Replace the skb slicing code with the existing tcp_trim_head(),
>   as suggested by Eric Dumazet.
> 
> v1
> - Call tcp_set_skb_tso_segs() for all slicing cases.
> 
> Signed-off-by: Martin KaFai Lau <ka...@fb.com>
> Reported-by: Grant Zhang <gzh...@fastly.com>
> Cc: Grant Zhang <gzh...@fastly.com>
> Cc: Eric Dumazet <eduma...@google.com>
> Cc: Neal Cardwell <ncardw...@google.com>
> Cc: Yuchung Cheng <ych...@google.com>
> ---
>  net/ipv4/tcp_output.c | 12 ++----------
>  1 file changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index a369e8a..4ae4f0c 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1977,16 +1977,8 @@ static int tcp_mtu_probe(struct sock *sk)
>               } else {
>                       TCP_SKB_CB(nskb)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags &
>                                                  ~(TCPHDR_FIN|TCPHDR_PSH);
> -                     if (!skb_shinfo(skb)->nr_frags) {
> -                             skb_pull(skb, copy);
> -                             if (skb->ip_summed != CHECKSUM_PARTIAL)
> -                                     skb->csum = csum_partial(skb->data,
> -                                                              skb->len, 0);
> -                     } else {
> -                             __pskb_trim_head(skb, copy);
> -                             tcp_set_skb_tso_segs(sk, skb, mss_now);
> -                     }
> -                     TCP_SKB_CB(skb)->seq += copy;
> +                     tcp_skb_pcount_set(skb, 0);
> +                     tcp_trim_head(sk, skb, copy);
>               }
>  
>               len += copy;


I think the invariant should be that if a packet has never been sent,
its pcount should already be 0.

(It is cleared in do_tcp_sendpages() and tcp_sendmsg(): it seems we
already hacked these functions in the past :( )

So we might need to track down the places where we violate this rule,
then get rid of the tcp_skb_pcount_set(skb, 0) calls in
do_tcp_sendpages() and tcp_sendmsg().

Here, trimming a packet that was never sent (by definition) should not
need to force pcount to 0; it should already be 0.
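
To make the invariant concrete, here is a small user-space sketch. It is
not kernel code: struct toy_skb and the toy_* helpers are hypothetical
names made up for illustration, and the model assumes mss > 0 and
copy <= len. It only mimics the relationship between skb->len and
tcp_skb_pcount(skb) discussed above, with pcount == 0 meaning "not
computed yet".

```c
#include <stdbool.h>

/* Toy stand-in for the two fields discussed here; not the kernel sk_buff. */
struct toy_skb {
	unsigned int len;    /* payload bytes, playing the role of skb->len */
	unsigned int pcount; /* segment count; 0 means "not computed yet" */
};

/* Hypothetical helper: derive pcount from len and mss, rounding up. */
static void toy_set_pcount(struct toy_skb *skb, unsigned int mss)
{
	skb->pcount = (skb->len + mss - 1) / mss;
}

/* Hypothetical raw trim: shortens the payload but leaves pcount alone. */
static void toy_raw_trim(struct toy_skb *skb, unsigned int copy)
{
	skb->len -= copy;
}

/* len and pcount agree if pcount is unset or equals ceil(len / mss). */
static bool toy_consistent(const struct toy_skb *skb, unsigned int mss)
{
	return skb->pcount == 0 || skb->pcount == (skb->len + mss - 1) / mss;
}
```

In this model, a never-sent buffer that still has pcount == 0 stays
consistent after toy_raw_trim(), so nothing has to be reset. A buffer
whose pcount was already computed (say 3 for len 3000 and mss 1000)
becomes inconsistent once 2000 bytes are trimmed, which is the kind of
mismatch the WARN_ON in tcp_fragment() reported.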


