John,

This looks functionally OK, but could you repost this with the following 
changes:

- Include a commit message explaining what was broken and how it is fixed.
- Refer to the commit that introduced the deleted code with a "Fixes: <commit> 
(“Title”)” line.
- Include a "Signed-off-by:” -line.


> On Jan 5, 2017, at 9:12 AM, John Hurley <john.hur...@netronome.com> wrote:
> 
> Ok, this makes sense now. Ive updated and tested the patch:
> 

Include info about the kernel versions in which you tested to the commit 
message.

Then some final nits below,

  Jarno

> 
> From 93bc4c59b4eb814ed4bc3da3b7459498863add1e Mon Sep 17 00:00:00 2001
> From: John Hurley <john.hur...@netronome.com 
> <mailto:john.hur...@netronome.com>>
> Date: Thu, 5 Jan 2017 17:08:37 +0000
> Subject: [PATCH 1/1] ensure correct checksum when a packet is sent to NAT
> 
> ---
>  datapath/conntrack.c | 47 +++++++++++++++++------------------------------
>  1 file changed, 17 insertions(+), 30 deletions(-)
> 
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index d942884..3faa209 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -314,6 +314,11 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
>       u8 nexthdr;
>       int err;
>  
> +#if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
> +     bool dst_set = false;
> +     struct rtable rt = { .rt_flags = 0 };
> +#endif
> +
>       ct = nf_ct_get(skb, &ctinfo);
>       if (!ct || ctinfo == IP_CT_RELATED_REPLY)
>               return NF_ACCEPT;
> @@ -352,43 +357,25 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
>  #if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
>       /* Linux 4.5 and older depend on skb_dst being set when recalculating
>        * checksums after NAT helper has mangled TCP or UDP packet payload.
> -      * This dependency is avoided when skb is CHECKSUM_PARTIAL or when UDP
> -      * has no checksum.
> -      *
> -      * The dependency is not triggered when the main NAT code updates
> -      * checksums after translating the IP header (address, port), so this
> -      * fix only needs to be executed on packets that are both being NATted
> -      * and that have a helper assigned.
> +      * skb_dst is cast to a rtable struct and the flags examined.
> +      * Forcing these flags to have RTCF_LOCAL not set ensures checksum mod
> +      * is carried out in the same way as kernel versions > 4.5
>        */
> -     if (ct->status & IPS_NAT_MASK && skb->ip_summed != CHECKSUM_PARTIAL) {
> -             u8 ipproto = (proto == NFPROTO_IPV4)
> -                     ? ip_hdr(skb)->protocol : nexthdr;
> -             u16 offset = 0;
> -
> -             switch (ipproto) {
> -             case IPPROTO_TCP:
> -                     offset = offsetof(struct tcphdr, check);
> -                     break;
> -             case IPPROTO_UDP:
> -                     /* Skip if no csum. */
> -                     if (udp_hdr(skb)->check)
> -                             offset = offsetof(struct udphdr, check);
> -                     break;
> -             }
> -             if (offset) {
> -                     if (unlikely(!pskb_may_pull(skb, protoff + offset + 2)))
> -                             return NF_DROP;
> -
> -                     skb->csum_start = skb_headroom(skb) + protoff;
> -                     skb->csum_offset = offset;
> -                     skb->ip_summed = CHECKSUM_PARTIAL;
> -             }
> +     if (ct->status & IPS_NAT_MASK && skb->ip_summed != CHECKSUM_PARTIAL
> +             && !skb_rtable(skb)) {

On a second thought, maybe using "!skb_dst(skb)” directly would be more 
appropriate?

> +             dst_set = true;
> +             skb_dst_set(skb, (struct dst_entry *)&rt);

"skb_dst_set(skb, &rt.dst);” seems to be a more common way of doing this in the 
kernel.

>       }
>  #endif
>       err = helper->help(skb, protoff, ct, ctinfo);
>       if (err != NF_ACCEPT)
>               return err;
>  
> +#if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
> +     if (dst_set)
> +             skb_dst_set(skb, NULL);
> +#endif
> +
>       /* Adjust seqs after helper.  This is needed due to some helpers (e.g.,
>        * FTP with NAT) adusting the TCP payload size when mangling IP
>        * addresses and/or port numbers in the text-based control connection.
> -- 
> 2.7.4
> 
> 
> On Wed, Jan 4, 2017 at 9:33 PM, Jarno Rajahalme <ja...@ovn.org 
> <mailto:ja...@ovn.org>> wrote:
> 
>> On Jan 4, 2017, at 9:50 AM, John Hurley <john.hur...@netronome.com 
>> <mailto:john.hur...@netronome.com>> wrote:
>> 
>> From 64ce83672aab5634990426e0a51af16d882ac2f9 Mon Sep 17 00:00:00 2001
>> From: John Hurley <john.hur...@netronome.com 
>> <mailto:john.hur...@netronome.com>>
>> Date: Wed, 4 Jan 2017 16:52:43 +0000
>> Subject: [PATCH] ensure correct checksum when a packet is sent to NAT helper
>>  in kernel < 4.6
>> 
>> ---
>>  datapath/conntrack.c | 45 ++++++++++++++++-----------------------------
>>  1 file changed, 16 insertions(+), 29 deletions(-)
>> 
>> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
>> index d942884..fa3b0b5 100644
>> --- a/datapath/conntrack.c
>> +++ b/datapath/conntrack.c
>> @@ -314,6 +314,10 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
>>      u8 nexthdr;
>>      int err;
>>  
>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
>> +    struct rtable rt;
>> +#endif
>> +
> 
> This should be properly initialized (e.g., " = { .rt_flags = 0 };”). With 
> this RFCF_LOCAL will be unset (see below) and everything else will also be 
> initialized to 0.
> 
>>      ct = nf_ct_get(skb, &ctinfo);
>>      if (!ct || ctinfo == IP_CT_RELATED_REPLY)
>>              return NF_ACCEPT;
>> @@ -352,43 +356,26 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 
>> proto)
>>  #if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
>>      /* Linux 4.5 and older depend on skb_dst being set when recalculating
>>       * checksums after NAT helper has mangled TCP or UDP packet payload.
>> -     * This dependency is avoided when skb is CHECKSUM_PARTIAL or when UDP
>> -     * has no checksum.
>> -     *
>> -     * The dependency is not triggered when the main NAT code updates
>> -     * checksums after translating the IP header (address, port), so this
>> -     * fix only needs to be executed on packets that are both being NATted
>> -     * and that have a helper assigned.
>> +     * skb_dst is cast to a rtable struct and the flags examined.
>> +     * Forcing these flags to have RTCF_LOCAL set allows checksum mod
>> +     * to be carried out in the same way as kernel versions > 4.5
> 
> As noted in the previous email, the reverse is the case. When RTCF_LOCAL is 
> not set, then the checksum update code will behave the same as in kernel 
> versions > 4.5.
> 
>>       */
>>      if (ct->status & IPS_NAT_MASK && skb->ip_summed != CHECKSUM_PARTIAL) {
> 
> Maybe it would be a bit more robust to set the dst only if not already set, 
> i.e., add “!skb_rtable(skb)” to the condition above?
> 
>> -            u8 ipproto = (proto == NFPROTO_IPV4)
>> -                    ? ip_hdr(skb)->protocol : nexthdr;
>> -            u16 offset = 0;
>> -
>> -            switch (ipproto) {
>> -            case IPPROTO_TCP:
>> -                    offset = offsetof(struct tcphdr, check);
>> -                    break;
>> -            case IPPROTO_UDP:
>> -                    /* Skip if no csum. */
>> -                    if (udp_hdr(skb)->check)
>> -                            offset = offsetof(struct udphdr, check);
>> -                    break;
>> -            }
>> -            if (offset) {
>> -                    if (unlikely(!pskb_may_pull(skb, protoff + offset + 2)))
>> -                            return NF_DROP;
>> -
>> -                    skb->csum_start = skb_headroom(skb) + protoff;
>> -                    skb->csum_offset = offset;
>> -                    skb->ip_summed = CHECKSUM_PARTIAL;
>> -            }
>> +            rt.rt_flags = RTCF_LOCAL;
> 
> Given the initialization above this line is not needed any more.
> 
>> +            skb_dst_set(skb, (struct dst_entry *)&rt);
>>      }
>>  #endif
>>      err = helper->help(skb, protoff, ct, ctinfo);
>>      if (err != NF_ACCEPT)
>>              return err;
>>  
>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
>> +    if (ct->status & IPS_NAT_MASK && skb->ip_summed != CHECKSUM_PARTIAL) {
> 
> The checksum code may change skb->ip_summed value, so we cannot use this 
> check here any more. Maybe set a boolean (e.g., "dst_set = true;”) if the 
> skb_dst_set call was done above and use that as the check to undo it here?
> 
>> +            /* reset dst of skb to NULL */
>> +            skb_dst_set(skb, NULL);
>> +    }
>> +#endif
>> +
>>      /* Adjust seqs after helper.  This is needed due to some helpers (e.g.,
>>       * FTP with NAT) adusting the TCP payload size when mangling IP
>>       * addresses and/or port numbers in the text-based control connection.
>> -- 
>> 2.7.4
>> 
>> 
>> On Wed, Jan 4, 2017 at 5:50 PM, John Hurley <john.hur...@netronome.com 
>> <mailto:john.hur...@netronome.com>> wrote:
>> 
>> 
>> On Wed, Jan 4, 2017 at 12:53 AM, Jarno Rajahalme <ja...@ovn.org 
>> <mailto:ja...@ovn.org>> wrote:
>> 
>> > On Dec 28, 2016, at 3:05 PM, John Hurley <john.hur...@netronome.com 
>> > <mailto:john.hur...@netronome.com>> wrote:
>> >
>> > sorry, updated patch….
>> 
>> This patch is still whitespace damaged. Maybe use git format-patch and git 
>> send-email to send it?
>> 
>> >
>> > --------------------------------
>> >
>> > Fix for a bug when sending a NATed packet to helper function in kernels
>> > <4.6.
>> >
>> > Setting CHECKSUM_PARTIAL flag can lead to L4 checksum corruption.
>> >
>> > Corruption can occur in datapath.c/queue_userspace_packet().
>> >
>> 
>> You mean this code:
>> 
>>         /* Complete checksum if needed */
>>         if (skb->ip_summed == CHECKSUM_PARTIAL &&
>>             (err = skb_checksum_help(skb)))
>>                 goto out;
>> 
>> 
>> 
>> 
>> 
>> Yes, my finding was that if a packet was sent to conntrack NAT and an FTP 
>> helper, but had only IP addresses/ports translated (no payload changes), 
>> that the TCP checksum would be correct prior to the call of 
>> skb_checksum_help and corrupt after.
>>  
>> If the packet payload was mangled then the checksum was incorrect both 
>> before and after this call.
>> 
>> 
>>  
>> Would you care detailing why the corruption happens? Does it always happen 
>> if ip_summed is CHECKSUM_PARTIAL?
>> 
>> 
>> 
>> If ip_summed was not set to CHECKSUM_PARTIAL then the non payload mangled 
>> FTP packets on the tests I was running would have the correct checksum but 
>> mangled payload packets would not.
>> 
>> My take on the CHECKSUM_PARTIAL flag from the documentation is that it 
>> signals that the checksum had been partially complete and that the remaining 
>> (payload section) section needs to be calculated and combined to the 
>> existing field. However, the NAT kernel modules seem to handle this update 
>> so we end up doing the 'remainder' of the calculation without needing to.  
>> It may also expect this to be corrected in the network card but the cards I 
>> was testing on were not running checksum offloading.
>> 
>> 
>>  
>> > Giving the packet an skb_dst allows the kernel to correct the checksum.
>> >
>> > Verified with packets mangled by Conntrack/NAT helpers.
>> >
>> 
>> It would also be helpful to add the reference to the patch that this fixes 
>> (“Fixes:”)
>> 
>> > Signed-off-by: John Hurley <john.hur...@netronome.com 
>> > <mailto:john.hur...@netronome.com>>
>> > ---
>> >
>> > diff --git a/datapath/conntrack.c b/datapath/conntrack.c
>> > index d942884..fef67ba 100644
>> > --- a/datapath/conntrack.c
>> > +++ b/datapath/conntrack.c
>> > @@ -314,6 +314,10 @@ static int ovs_ct_helper(struct sk_buff *skb, u16
>> > proto)
>> >  u8 nexthdr;
>> >  int err;
>> >
>> > +#if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
>> > + struct rtable *rt = NULL;
>> > +#endif
>> > +
>> >  ct = nf_ct_get(skb, &ctinfo);
>> >  if (!ct || ctinfo == IP_CT_RELATED_REPLY)
>> >  return NF_ACCEPT;
>> > @@ -352,43 +356,28 @@ static int ovs_ct_helper(struct sk_buff *skb, u16
>> > proto)
>> > #if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
>> >  /* Linux 4.5 and older depend on skb_dst being set when recalculating
>> >  * checksums after NAT helper has mangled TCP or UDP packet payload.
>> > - * This dependency is avoided when skb is CHECKSUM_PARTIAL or when UDP
>> > - * has no checksum.
>> >  *
>> > - * The dependency is not triggered when the main NAT code updates
>> > - * checksums after translating the IP header (address, port), so this
>> > - * fix only needs to be executed on packets that are both being NATted
>> > - * and that have a helper assigned.
>> > + * skb_dst is cast to a rtable struct and the flags examined.
>> > + * Forcing these flags to have RTCF_LOCAL set allows checksum mod
>> > + * to be carried out in the same way as kernel versions > 4.5
>> 
>> Are kernels > 4.5 treating all these packets as RTCF_LOCAL, or are some of 
>> them possibly CHECKSUM_PARTIAL?
>> 
>> 
>> 
>> 
>> yes, it appears so - assuming CHECKSUM_PARTIAL is not set already.
>> 
>> The following are the nat checksum calculations from kernel 4.5:
>> 126 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L126>
>>  static void nf_nat_ipv4_csum_recalc 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=nf_nat_ipv4_csum_recalc>(struct 
>> sk_buff <http://lxr.free-electrons.com/ident?v=4.5;i=sk_buff> *skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>,
>> 127 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L127>
>>                                      u8 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=u8> proto 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=proto>, void *data 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=data>, __sum16 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=__sum16> *check 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=check>,
>> 128 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L128>
>>                                      int datalen, int oldlen)
>> 129 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L129>
>>  {
>> 130 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L130>
>>          const struct iphdr 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=iphdr> *iph = ip_hdr 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=ip_hdr>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>);
>> 131 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L131>
>>          struct rtable <http://lxr.free-electrons.com/ident?v=4.5;i=rtable> 
>> *rt <http://lxr.free-electrons.com/ident?v=4.5;i=rt> = skb_rtable 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb_rtable>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>);
>> 132 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L132>
>>  
>> 133 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L133>
>>          if (skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>->ip_summed != 
>> CHECKSUM_PARTIAL 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=CHECKSUM_PARTIAL>) {
>> 134 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L134>
>>                  if (!(rt 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=rt>->rt_flags & RTCF_LOCAL 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=RTCF_LOCAL>) &&
>> 135 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L135>
>>                      (!skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>->dev 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=dev> || skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>->dev 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=dev>->features 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=features> &
>> 136 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L136>
>>                       (NETIF_F_IP_CSUM 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=NETIF_F_IP_CSUM> | 
>> NETIF_F_HW_CSUM 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=NETIF_F_HW_CSUM>))) {
>> 137 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L137>
>>                          skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>->ip_summed = 
>> CHECKSUM_PARTIAL 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=CHECKSUM_PARTIAL>;
>> 138 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L138>
>>                          skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>->csum_start = skb_headroom 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb_headroom>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>) +
>> 139 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L139>
>>                                            skb_network_offset 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb_network_offset>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>) +
>> 140 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L140>
>>                                            ip_hdrlen 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=ip_hdrlen>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>);
>> 141 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L141>
>>                          skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>->csum_offset = (void 
>> *)check <http://lxr.free-electrons.com/ident?v=4.5;i=check> - data 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=data>;
>> 142 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L142>
>>                          *check 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=check> = 
>> ~csum_tcpudp_magic(iph->saddr 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=saddr>, iph->daddr 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=daddr>,
>> 143 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L143>
>>                                                      datalen, proto 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=proto>, 0);
>> 144 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L144>
>>                  } else {
>> 145 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L145>
>>                          *check 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=check> = 0;
>> 146 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L146>
>>                          *check 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=check> = csum_tcpudp_magic 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=csum_tcpudp_magic>(iph->saddr 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=saddr>, iph->daddr 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=daddr>,
>> 147 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L147>
>>                                                     datalen, proto 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=proto>,
>> 148 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L148>
>>                                                     csum_partial 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=csum_partial>(data 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=data>, datalen,
>> 149 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L149>
>>                                                                  0));
>> 150 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L150>
>>                          if (proto 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=proto> == IPPROTO_UDP 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=IPPROTO_UDP> && !*check 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=check>)
>> 151 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L151>
>>                                  *check 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=check> = CSUM_MANGLED_0 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=CSUM_MANGLED_0>;
>> 152 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L152>
>>                  }
>> 153 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L153>
>>          } else
>> 154 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L154>
>>                  inet_proto_csum_replace2 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=inet_proto_csum_replace2>(check 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=check>, skb 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=skb>,
>> 155 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L155>
>>                                           htons 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=htons>(oldlen), htons 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=htons>(datalen), true 
>> <http://lxr.free-electrons.com/ident?v=4.5;i=true>);
>> 156 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.5#L156>
>>  }
>> and version 4.6:
>> 126 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L126>
>>  static void nf_nat_ipv4_csum_recalc 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=nf_nat_ipv4_csum_recalc>(struct 
>> sk_buff <http://lxr.free-electrons.com/ident?v=4.6;i=sk_buff> *skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>,
>> 127 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L127>
>>                                      u8 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=u8> proto 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=proto>, void *data 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=data>, __sum16 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=__sum16> *check 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=check>,
>> 128 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L128>
>>                                      int datalen, int oldlen)
>> 129 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L129>
>>  {
>> 130 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L130>
>>          if (skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>->ip_summed != 
>> CHECKSUM_PARTIAL 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=CHECKSUM_PARTIAL>) {
>> 131 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L131>
>>                  const struct iphdr 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=iphdr> *iph = ip_hdr 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=ip_hdr>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>);
>> 132 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L132>
>>  
>> 133 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L133>
>>                  skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>->ip_summed = 
>> CHECKSUM_PARTIAL 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=CHECKSUM_PARTIAL>;
>> 134 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L134>
>>                  skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>->csum_start = skb_headroom 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb_headroom>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>) + skb_network_offset 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb_network_offset>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>) +
>> 135 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L135>
>>                          ip_hdrlen 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=ip_hdrlen>(skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>);
>> 136 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L136>
>>                  skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>->csum_offset = (void 
>> *)check <http://lxr.free-electrons.com/ident?v=4.6;i=check> - data 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=data>;
>> 137 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L137>
>>                  *check <http://lxr.free-electrons.com/ident?v=4.6;i=check> 
>> = ~csum_tcpudp_magic(iph->saddr 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=saddr>, iph->daddr 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=daddr>, datalen,
>> 138 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L138>
>>                                              proto 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=proto>, 0);
>> 139 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L139>
>>          } else
>> 140 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L140>
>>                  inet_proto_csum_replace2 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=inet_proto_csum_replace2>(check 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=check>, skb 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=skb>,
>> 141 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L141>
>>                                           htons 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=htons>(oldlen), htons 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=htons>(datalen), true 
>> <http://lxr.free-electrons.com/ident?v=4.6;i=true>);
>> 142 
>> <http://lxr.free-electrons.com/source/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c?v=4.6#L142>
>>  } 
>> 
>> >  */
>> >  if (ct->status & IPS_NAT_MASK && skb->ip_summed != CHECKSUM_PARTIAL) {
>> > - u8 ipproto = (proto == NFPROTO_IPV4)
>> > - ? ip_hdr(skb)->protocol : nexthdr;
>> > - u16 offset = 0;
>> > -
>> > - switch (ipproto) {
>> > - case IPPROTO_TCP:
>> > - offset = offsetof(struct tcphdr, check);
>> > - break;
>> > - case IPPROTO_UDP:
>> > - /* Skip if no csum. */
>> > - if (udp_hdr(skb)->check)
>> > - offset = offsetof(struct udphdr, check);
>> > - break;
>> > - }
>> > - if (offset) {
>> > - if (unlikely(!pskb_may_pull(skb, protoff + offset + 2)))
>> > - return NF_DROP;
>> > -
>> > - skb->csum_start = skb_headroom(skb) + protoff;
>> > - skb->csum_offset = offset;
>> > - skb->ip_summed = CHECKSUM_PARTIAL;
>> > - }
>> > + rt = kmalloc(sizeof(struct rtable), GFP_KERNEL);
>> 
>> Could the struct rtable be allocated from stack instead? Or could it be a 
>> static variable in the file scope? In either case we would avoid dynamic 
>> memory allocation/free. I recall setting ip_summed to CHECKSUM_PARTIAL was 
>> deemed a simpler way to backport as it (also) avoids the dynamic memory 
>> allocations.
>> 
>> 
>> 
>> yes, it could be added as a stack variable. As it is only added to the skb 
>> temporarily to force the checksum change then this would make more sense!
>> I'll forward on a further email after this with a patch using a stack based 
>> variable - I tested this patch using an FTP session across OVS. 
>> 
>> 
>>  
>> > + rt->rt_flags = RTCF_LOCAL;
>> > + skb_dst_set(skb, (struct dst_entry *)rt);
>> >  }
>> > #endif
>> >  err = helper->help(skb, protoff, ct, ctinfo);
>> >  if (err != NF_ACCEPT)
>> >  return err;
>> >
>> > +#if LINUX_VERSION_CODE < KERNEL_VERSION(4,6,0)
>> > + if (rt) {
>> > + skb_dst_set(skb, NULL);
>> > + kfree(rt);
>> > + }
>> > +#endif
>> > +
>> >  /* Adjust seqs after helper.  This is needed due to some helpers (e.g.,
>> >  * FTP with NAT) adusting the TCP payload size when mangling IP
>> >  * addresses and/or port numbers in the text-based control connection.
>> >
>> > --
>> >
>> > On Wed, Dec 28, 2016 at 10:08 PM, Ben Pfaff <b...@ovn.org 
>> > <mailto:b...@ovn.org>> wrote:
>> >
>> >> On Wed, Dec 28, 2016 at 09:37:30PM +0000, John Hurley wrote:
>> >>> Fix for a bug when sending a NATed packet to helper function in kernels
>> >>> <4.6.
>> >>>
>> >>> Setting CHECKSUM_PARTIAL flag means packets could have L4 checksum
>> >>> corrupted in
>> >>>
>> >>> datapath.c/queue_userspace_packet().
>> >>>
>> >>> Giving the packet an skb_dst allows the kernel to correct the checksum if
>> >>> packet
>> >>> mangling happens in Conntrack/NAT helpers.
>> >>>
>> >>> Signed-off-by: John Hurley <john.hur...@netronome.com 
>> >>> <mailto:john.hur...@netronome.com>>
>> >>
>> >> I'm not the right person to review this but I did notice that the patch
>> >> is wordwrapped and otherwise space-damaged.
>> >>
>> > _______________________________________________
>> > dev mailing list
>> > d...@openvswitch.org <mailto:d...@openvswitch.org>
>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev 
>> > <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
>> 
>> 
>> 
> 
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to