Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-12-04 Thread Cong Wang
On Tue, Nov 27, 2018 at 10:10 PM Cong Wang  wrote:
>
> When an ethernet frame is padded to meet the minimum ethernet frame
> size, the padding octets are not covered by the hardware checksum.
> Fortunately the padding octets are ususally zero's, which don't affect
> checksum. However, we have a switch which pads non-zero octets, this
> causes kernel hardware checksum fault repeatedly.
>
> Prior to commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE 
> are friends"),
> skb checksum is forced to be CHECKSUM_NONE when padding is detected.
> After it, we need to keep skb->csum updated, like what we do for FCS.
>
> The logic is a bit complicated when dealing with both FCS and padding,
> so I wrap it up in a helper function mlx5e_csum_padding().
>
> I tested this patch with RXFCS on and off, it works fine without any
> warning in both cases.
>
> Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are 
> friends"),
> Cc: Saeed Mahameed 
> Cc: Eric Dumazet 
> Signed-off-by: Cong Wang 

Nacked-by: Cong Wang 

This patch has no value for upstream. Let's discard it. Please don't
use or rework on this patch for any purpose.

Thanks!


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-12-04 Thread Cong Wang
On Tue, Dec 4, 2018 at 5:06 PM Saeed Mahameed  wrote:
>
> Hi Cong, sorry to hear that, i will take your patch evaluate and test
> personally, will do the needed changes and post it later.

Please don't. I am withdrawing my SoB too. To avoid any legal
issues, please speak to a legal expert before you taking any action
on my self-Nacked patch.


>
> for now i feel that you don't want csum complete in your servers and we
> already have the tool for that, just do:

Thanks but we are happy to just carry the fix locally.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-12-04 Thread Saeed Mahameed
On Tue, 2018-12-04 at 12:50 -0800, Cong Wang wrote:
> On Tue, Dec 4, 2018 at 11:21 AM Saeed Mahameed
>  wrote:
> > we are still working on it.
> 
> No, I give up.
> 
> Sorry for wasting time. Let's save everyone's time by discarding it!!
> :)

Hi Cong, sorry to hear that, i will take your patch evaluate and test
personally, will do the needed changes and post it later.

for now i feel that you don't want csum complete in your servers and we
already have the tool for that, just do:

ethtool --set-priv-flags  rx_no_csum_complete on
ethtool --show-priv-flags 
Private flags for p5p1:
rx_cqe_moder   : on
tx_cqe_moder   : off
rx_cqe_compress: off
rx_striding_rq : off
rx_no_csum_complete: on

this will disable csum complete and avoid the csum error for padded
short packets.


Thanks,
Saeed.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-12-04 Thread Cong Wang
On Tue, Dec 4, 2018 at 11:21 AM Saeed Mahameed
 wrote:
> we are still working on it.

No, I give up.

Sorry for wasting time. Let's save everyone's time by discarding it!! :)


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-12-04 Thread Saeed Mahameed
On Mon, Dec 3, 2018 at 3:17 PM David Miller  wrote:
>
> From: Cong Wang 
> Date: Tue, 27 Nov 2018 22:10:13 -0800
>
> > When an ethernet frame is padded to meet the minimum ethernet frame
> > size, the padding octets are not covered by the hardware checksum.
> > Fortunately the padding octets are ususally zero's, which don't affect
> > checksum. However, we have a switch which pads non-zero octets, this
> > causes kernel hardware checksum fault repeatedly.
> >
> > Prior to commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE 
> > are friends"),
> > skb checksum is forced to be CHECKSUM_NONE when padding is detected.
> > After it, we need to keep skb->csum updated, like what we do for FCS.
> >
> > The logic is a bit complicated when dealing with both FCS and padding,
> > so I wrap it up in a helper function mlx5e_csum_padding().
> >
> > I tested this patch with RXFCS on and off, it works fine without any
> > warning in both cases.
> >
> > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are 
> > friends"),
> > Cc: Saeed Mahameed 
> > Cc: Eric Dumazet 
> > Signed-off-by: Cong Wang 
>
> Saeed, are you going to take care of this?

Yes, I will take it to mlx5 net branch and submit it to you when it is ready.
Cong just submitted v3 with new title:
[Patch net v3] mlx5: force CHECKSUM_NONE for short ethernet frames

we are still working on it.

Thanks !


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-12-03 Thread Cong Wang
On Mon, Dec 3, 2018 at 3:17 PM David Miller  wrote:
>
> Saeed, are you going to take care of this?

David, I will send v3 to switch to CHECKSUM_NONE for these short
frames, which also could avoid parsing IP headers.

Thanks.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-12-03 Thread David Miller
From: Cong Wang 
Date: Tue, 27 Nov 2018 22:10:13 -0800

> When an ethernet frame is padded to meet the minimum ethernet frame
> size, the padding octets are not covered by the hardware checksum.
> Fortunately the padding octets are ususally zero's, which don't affect
> checksum. However, we have a switch which pads non-zero octets, this
> causes kernel hardware checksum fault repeatedly.
> 
> Prior to commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE 
> are friends"),
> skb checksum is forced to be CHECKSUM_NONE when padding is detected.
> After it, we need to keep skb->csum updated, like what we do for FCS.
> 
> The logic is a bit complicated when dealing with both FCS and padding,
> so I wrap it up in a helper function mlx5e_csum_padding().
> 
> I tested this patch with RXFCS on and off, it works fine without any
> warning in both cases.
> 
> Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are 
> friends"),
> Cc: Saeed Mahameed 
> Cc: Eric Dumazet 
> Signed-off-by: Cong Wang 

Saeed, are you going to take care of this?


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-29 Thread Saeed Mahameed
On Wed, Nov 28, 2018 at 8:09 PM Eric Dumazet  wrote:
>
> On Wed, Nov 28, 2018 at 7:53 PM Cong Wang  wrote:
> >
> > On Wed, Nov 28, 2018 at 7:50 PM Eric Dumazet  wrote:
> > >
> > > On Wed, Nov 28, 2018 at 7:40 PM Cong Wang  
> > > wrote:
> > > >
> > > > On Wed, Nov 28, 2018 at 4:07 PM Eric Dumazet  
> > > > wrote:
> > > > >
> > > > > A NIC is supposed to deliver frames, even the ones that 'seem' bad.
> > > >
> > > > A quick test shows this is not the case for mlx5.
> > > >
> > > > With the trafgen script you gave to me, with tot_len==40, the dest host
> > > > could receive all the packets. Changing tot_len to 80, tcpdump could no
> > > > longer see any packet. (Both sender and receiver are mlx5.)
> > > >
> > > > So, packets with tot_len > skb->len are clearly dropped before tcpdump
> > > > could see it, that is likely by mlx5 hardware.
> > >
> > > Or a router, or a switch.
> > >
> > > Are your two hosts connected back to back ?
> >
> > Both should be plugged into a same switch. I fail to see why a
> > switch could parse IP header as the packet is nothing of interest,
> > like a IGMP snooping.
>
> Well, _something_ is dropping the frames.
> It can be mlx5, or something else.
>

mlx5 HW should deliver such packets.
just tested with scapy :
sender:

#scapy
p = Ether(dst="24:8a:07:b4:24:6e")/IP(dst="1.2.3.4", len = 1000)
sendp(p, iface = "p5p1")

receiver:
tcpdump: listening on p5p1, link-type EN10MB (Ethernet), capture size
262144 bytes
16:24:26.427563 80:18:44:e5:2f:c4 > 24:8a:07:b4:24:6e, ethertype IPv4
(0x0800), length 60: truncated-ip - 954 bytes missing! (tos 0x0, ttl
64, id 1, offset 0, flags [none], proto Options (0), length 1000)
10.20.2.212 > 1.2.3.4:  ip-proto-0 980


> Does ethtool -S show any increasing counter ?


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-29 Thread Saeed Mahameed
On Wed, Nov 28, 2018 at 4:05 PM Cong Wang  wrote:
>
> On Wed, Nov 28, 2018 at 3:57 PM Cong Wang  wrote:
> > But again, I kinda feel the hardware already does the sanity check,
> > otherwise we have much more serious trouble in mlx5e_lro_update_hdr()
> > which parses into TCP header.
> >
>
> While we are on this page, mlx5e_lro_update_hdr() incorrectly assumes
> TCP header is located right after struct iphdr, which is wrong if we could
> have IP options on this path.
>
> It could the hardware which already verified this corner case though.

yes HW will not do LRO in case of ip options, so no problem in
mlx5e_lro_update_hdr


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Eric Dumazet
On Wed, Nov 28, 2018 at 7:53 PM Cong Wang  wrote:
>
> On Wed, Nov 28, 2018 at 7:50 PM Eric Dumazet  wrote:
> >
> > On Wed, Nov 28, 2018 at 7:40 PM Cong Wang  wrote:
> > >
> > > On Wed, Nov 28, 2018 at 4:07 PM Eric Dumazet  wrote:
> > > >
> > > > A NIC is supposed to deliver frames, even the ones that 'seem' bad.
> > >
> > > A quick test shows this is not the case for mlx5.
> > >
> > > With the trafgen script you gave to me, with tot_len==40, the dest host
> > > could receive all the packets. Changing tot_len to 80, tcpdump could no
> > > longer see any packet. (Both sender and receiver are mlx5.)
> > >
> > > So, packets with tot_len > skb->len are clearly dropped before tcpdump
> > > could see it, that is likely by mlx5 hardware.
> >
> > Or a router, or a switch.
> >
> > Are your two hosts connected back to back ?
>
> Both should be plugged into a same switch. I fail to see why a
> switch could parse IP header as the packet is nothing of interest,
> like a IGMP snooping.

Well, _something_ is dropping the frames.
It can be mlx5, or something else.

Does ethtool -S show any increasing counter ?


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Cong Wang
On Wed, Nov 28, 2018 at 7:50 PM Eric Dumazet  wrote:
>
> On Wed, Nov 28, 2018 at 7:40 PM Cong Wang  wrote:
> >
> > On Wed, Nov 28, 2018 at 4:07 PM Eric Dumazet  wrote:
> > >
> > > A NIC is supposed to deliver frames, even the ones that 'seem' bad.
> >
> > A quick test shows this is not the case for mlx5.
> >
> > With the trafgen script you gave to me, with tot_len==40, the dest host
> > could receive all the packets. Changing tot_len to 80, tcpdump could no
> > longer see any packet. (Both sender and receiver are mlx5.)
> >
> > So, packets with tot_len > skb->len are clearly dropped before tcpdump
> > could see it, that is likely by mlx5 hardware.
>
> Or a router, or a switch.
>
> Are your two hosts connected back to back ?

Both should be plugged into a same switch. I fail to see why a
switch could parse IP header as the packet is nothing of interest,
like a IGMP snooping.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Eric Dumazet
On Wed, Nov 28, 2018 at 7:40 PM Cong Wang  wrote:
>
> On Wed, Nov 28, 2018 at 4:07 PM Eric Dumazet  wrote:
> >
> > A NIC is supposed to deliver frames, even the ones that 'seem' bad.
>
> A quick test shows this is not the case for mlx5.
>
> With the trafgen script you gave to me, with tot_len==40, the dest host
> could receive all the packets. Changing tot_len to 80, tcpdump could no
> longer see any packet. (Both sender and receiver are mlx5.)
>
> So, packets with tot_len > skb->len are clearly dropped before tcpdump
> could see it, that is likely by mlx5 hardware.

Or a router, or a switch.

Are your two hosts connected back to back ?


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Cong Wang
On Wed, Nov 28, 2018 at 4:07 PM Eric Dumazet  wrote:
>
> A NIC is supposed to deliver frames, even the ones that 'seem' bad.

A quick test shows this is not the case for mlx5.

With the trafgen script you gave to me, with tot_len==40, the dest host
could receive all the packets. Changing tot_len to 80, tcpdump could no
longer see any packet. (Both sender and receiver are mlx5.)

So, packets with tot_len > skb->len are clearly dropped before tcpdump
could see it, that is likely by mlx5 hardware.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Eric Dumazet



On 11/28/2018 04:05 PM, Cong Wang wrote:

> While we are on this page, mlx5e_lro_update_hdr() incorrectly assumes
> TCP header is located right after struct iphdr, which is wrong if we could
> have IP options on this path.
> 
> It could the hardware which already verified this corner case though.
> 

GRO makes sure IPv4 header has no options, so I would not be surprised
that LRO on mlx5 has the same logic.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Eric Dumazet
On Wed, Nov 28, 2018 at 3:57 PM Cong Wang  wrote:
> But again, I kinda feel the hardware already does the sanity check,
> otherwise we have much more serious trouble in mlx5e_lro_update_hdr()
> which parses into TCP header.
>

LRO is a different beast.

For packets that are not recognized as LRO candidates
(for example because their IP length is bigger than the frame length),
we exactly take the code path you want to change.

A NIC is supposed to deliver frames, even the ones that 'seem' bad.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Cong Wang
On Wed, Nov 28, 2018 at 3:57 PM Cong Wang  wrote:
> But again, I kinda feel the hardware already does the sanity check,
> otherwise we have much more serious trouble in mlx5e_lro_update_hdr()
> which parses into TCP header.
>

While we are on this page, mlx5e_lro_update_hdr() incorrectly assumes
TCP header is located right after struct iphdr, which is wrong if we could
have IP options on this path.

It could the hardware which already verified this corner case though.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Cong Wang
On Wed, Nov 28, 2018 at 3:50 PM Eric Dumazet  wrote:
>
> On Wed, Nov 28, 2018 at 2:16 PM Cong Wang  wrote:
> >
> > On Wed, Nov 28, 2018 at 7:00 AM Eric Dumazet  wrote:
> > >
> > > Nice packet of death alert.
> > >
> > > pad_len can be 0xFF67  here, if frame_len is smaller than pad_offset.
> >
> > Unless IP header is malformed, how could it be?
>
> This is totally something an attacker can forge.

Of course, as in the email I sent to mellanox guys,__vlan_get_protocol()
could _literately_ exhaust all skb->len. If no sufficient skb tail room,
we could even possibly crash.

But again, I kinda feel the hardware already does the sanity check,
otherwise we have much more serious trouble in mlx5e_lro_update_hdr()
which parses into TCP header.

Thanks.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Eric Dumazet
On Wed, Nov 28, 2018 at 2:16 PM Cong Wang  wrote:
>
> On Wed, Nov 28, 2018 at 7:00 AM Eric Dumazet  wrote:
> >
> > Nice packet of death alert.
> >
> > pad_len can be 0xFF67  here, if frame_len is smaller than pad_offset.
>
> Unless IP header is malformed, how could it be?

This is totally something an attacker can forge.

ip_rcv_core()
...
len = ntohs(iph->tot_len);
if (skb->len < len) {
   __IP_INC_STATS(net, IPSTATS_MIB_INTRUNCATEDPKTS);
   goto drop;

No crash, but we drop and increment appropriate SNMP counter.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Cong Wang
On Wed, Nov 28, 2018 at 7:00 AM Eric Dumazet  wrote:
>
> Nice packet of death alert.
>
> pad_len can be 0xFF67  here, if frame_len is smaller than pad_offset.

Unless IP header is malformed, how could it be?

Speaking of IP header sanity, I am totally aware of it, I don't check it because
I know get_ip_proto() doesn't check either, it must be hardware which verifies
the sanity.


>
> Really I suggest you set ip_summed to CHECKSUM_NONE, then remove the
> initial test ( if (likely(frame_len > ETH_ZLEN)) ...)
>
> Until the firmware is fixed.

Hmm, why setting to CHECKSUM_NONE could get rid of the minimum ethernet
frame check? I am lost, there is no bug for packet > ETH_ZLEN _for me_, what
needs to fix here?

Overall, you keep pushing me to fix a bug I don't observe. I don't understand
why. If you see it, please come up with your own patch? Why do I have to fix
the problem you see??


>
> Otherwise frames with a wrong checksum and some non zero padding could
> potentially
> be seen as correct frames. (Probability of 1/65536)
>
> Do not focus on your immediate problem (small packets being padded by
> a non malicious entity)

Again, why _I_ should fix a problem I never observe? Why is it not you who
fix the problem you find during code review? No to mention I have no environment
to test it even if I really want to fix. I can' take such a risk.

Thanks.


Re: [Patch net v2] mlx5: fixup checksum for short ethernet frame padding

2018-11-28 Thread Eric Dumazet
On Tue, Nov 27, 2018 at 10:10 PM Cong Wang  wrote:
>
> When an ethernet frame is padded to meet the minimum ethernet frame
> size, the padding octets are not covered by the hardware checksum.
> Fortunately the padding octets are ususally zero's, which don't affect
> checksum. However, we have a switch which pads non-zero octets, this
> causes kernel hardware checksum fault repeatedly.
>
> Prior to commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE 
> are friends"),
> skb checksum is forced to be CHECKSUM_NONE when padding is detected.
> After it, we need to keep skb->csum updated, like what we do for FCS.
>
> The logic is a bit complicated when dealing with both FCS and padding,
> so I wrap it up in a helper function mlx5e_csum_padding().
>
> I tested this patch with RXFCS on and off, it works fine without any
> warning in both cases.
>
> Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are 
> friends"),
> Cc: Saeed Mahameed 
> Cc: Eric Dumazet 
> Signed-off-by: Cong Wang 
> ---
>  .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 39 +++
>  1 file changed, 39 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index 16985ca3248d..9b6bd2b51556 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -732,6 +732,37 @@ static u8 get_ip_proto(struct sk_buff *skb, __be16 proto)
> ((struct ipv6hdr *)ip_p)->nexthdr;
>  }
>
> +static void mlx5e_csum_padding(struct sk_buff *skb, int network_depth,
> +  __be16 proto, bool has_fcs)
> +{
> +   u32 frame_len = has_fcs ? skb->len - ETH_FCS_LEN : skb->len;
> +   u32 pad_offset, pad_len;
> +   void *pad, *ip_p;
> +
> +   /* We only handle short frames here, and we know they are linear. */
> +   if (likely(frame_len > ETH_ZLEN))
> +   return;
> +
> +   ip_p = skb->data + network_depth;
> +   if (proto == htons(ETH_P_IP)) {
> +   struct iphdr *ipv4 = ip_p;
> +
> +   pad_offset =  network_depth + be16_to_cpu(ipv4->tot_len);
> +   } else { /* Either IPv4 or IPv6 here. */
> +   struct ipv6hdr *ipv6 = ip_p;
> +
> +   pad_offset = network_depth + sizeof(struct ipv6hdr) +
> +be16_to_cpu(ipv6->payload_len);
> +   }
> +   pad_len = frame_len - pad_offset;
> +   if (pad_len == 0)
> +   return;

Nice packet of death alert.

pad_len can be 0xFF67  here, if frame_len is smaller than pad_offset.

Really I suggest you set ip_summed to CHECKSUM_NONE, then remove the
initial test ( if (likely(frame_len > ETH_ZLEN)) ...)

Until the firmware is fixed.

Otherwise frames with a wrong checksum and some non zero padding could
potentially
be seen as correct frames. (Probability of 1/65536)

Do not focus on your immediate problem (small packets being padded by
a non malicious entity)

> +
> +   pad = skb->data + pad_offset;
> +   skb->csum = csum_block_add(skb->csum, csum_partial(pad, pad_len, 0),
> +  pad_offset);
> +}
> +
>  static inline void mlx5e_handle_csum(struct net_device *netdev,
>  struct mlx5_cqe64 *cqe,
>  struct mlx5e_rq *rq,
> @@ -772,6 +803,14 @@ static inline void mlx5e_handle_csum(struct net_device 
> *netdev,
> skb->csum = csum_block_add(skb->csum,
>(__force 
> __wsum)mlx5e_get_fcs(skb),
>skb->len - ETH_FCS_LEN);
> +
> +   /* CQE csum doesn't cover padding octets in short ethernet
> +* frames. And the pad field is appended prior to calculating
> +* and appending the FCS field.
> +*/
> +   mlx5e_csum_padding(skb, network_depth, proto,
> +  !!(netdev->features & NETIF_F_RXFCS));
> +
> stats->csum_complete++;
> return;
> }
> --
> 2.19.1
>