Re: [PATCH 1/1] openvswitch: fix infoleak in conntrack

2020-06-16 Thread Pravin Shelar
On Mon, Jun 15, 2020 at 7:13 PM Xidong Wang  wrote:
>
> From: xidongwang 
>
> The stack object “zone_limit” has 3 members. In function
> ovs_ct_limit_get_default_limit(), the member "count" is
> not initialized and sent out via “nla_put_nohdr”.
>
> Signed-off-by: xidongwang 

Looks good.
Acked-by: Pravin B Shelar 


Re: [PATCH] net: openvswitch: free vport unless register_netdevice() succeeds

2019-08-10 Thread Pravin Shelar
On Thu, Aug 8, 2019 at 8:55 PM Hillf Danton  wrote:
>
>
> syzbot found the following crash on:
>
> HEAD commit:1e78030e Merge tag 'mmc-v5.3-rc1' of git://git.kernel.org/..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=148d3d1a60
> kernel config:  https://syzkaller.appspot.com/x/.config?x=30cef20daf3e9977
> dashboard link: https://syzkaller.appspot.com/bug?extid=13210896153522fe1ee5
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=136aa8c460
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=109ba79260
>
> =
> BUG: memory leak
> unreferenced object 0x8881207e4100 (size 128):
>comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
>hex dump (first 32 bytes):
>  00 70 16 18 81 88 ff ff 80 af 8c 22 81 88 ff ff  .p."
>  00 b6 23 17 81 88 ff ff 00 00 00 00 00 00 00 00  ..#.
>backtrace:
>  [<0eb78212>] kmemleak_alloc_recursive  
> include/linux/kmemleak.h:43 [inline]
>  [<0eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
>  [<0eb78212>] slab_alloc mm/slab.c:3319 [inline]
>  [<0eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
>  [<006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
>  [<006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
>  [<006ea6c6>] ovs_vport_alloc+0x37/0xf0  
> net/openvswitch/vport.c:130
>  [] internal_dev_create+0x24/0x1d0  
> net/openvswitch/vport-internal_dev.c:164
>  [<56ee7c13>] ovs_vport_add+0x81/0x190  
> net/openvswitch/vport.c:199
>  [<5434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
>  [] ovs_dp_cmd_new+0x22f/0x410  
> net/openvswitch/datapath.c:1614
>  [] genl_family_rcv_msg+0x2ab/0x5b0  
> net/netlink/genetlink.c:629
>  [] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
>  [<6694b647>] netlink_rcv_skb+0x61/0x170  
> net/netlink/af_netlink.c:2477
>  [<88381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
>  [] netlink_unicast_kernel  
> net/netlink/af_netlink.c:1302 [inline]
>  [] netlink_unicast+0x1ec/0x2d0  
> net/netlink/af_netlink.c:1328
>  [<67e6b079>] netlink_sendmsg+0x270/0x480  
> net/netlink/af_netlink.c:1917
>  [] sock_sendmsg_nosec net/socket.c:637 [inline]
>  [] sock_sendmsg+0x54/0x70 net/socket.c:657
>  [<4cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
>  [] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
>  [] __do_sys_sendmsg net/socket.c:2365 [inline]
>  [] __se_sys_sendmsg net/socket.c:2363 [inline]
>  [] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
>
> BUG: memory leak
> unreferenced object 0x88811723b600 (size 64):
>comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
>hex dump (first 32 bytes):
>  01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  
>  00 00 00 00 00 00 00 00 02 00 00 00 05 35 82 c1  .5..
>backtrace:
>  [<352f46d8>] kmemleak_alloc_recursive  
> include/linux/kmemleak.h:43 [inline]
>  [<352f46d8>] slab_post_alloc_hook mm/slab.h:522 [inline]
>  [<352f46d8>] slab_alloc mm/slab.c:3319 [inline]
>  [<352f46d8>] __do_kmalloc mm/slab.c:3653 [inline]
>  [<352f46d8>] __kmalloc+0x169/0x300 mm/slab.c:3664
>  [<8e48f3d1>] kmalloc include/linux/slab.h:557 [inline]
>  [<8e48f3d1>] ovs_vport_set_upcall_portids+0x54/0xd0  
> net/openvswitch/vport.c:343
>  [<541e4f4a>] ovs_vport_alloc+0x7f/0xf0  
> net/openvswitch/vport.c:139
>  [] internal_dev_create+0x24/0x1d0  
> net/openvswitch/vport-internal_dev.c:164
>  [<56ee7c13>] ovs_vport_add+0x81/0x190  
> net/openvswitch/vport.c:199
>  [<5434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
>  [] ovs_dp_cmd_new+0x22f/0x410  
> net/openvswitch/datapath.c:1614
>  [] genl_family_rcv_msg+0x2ab/0x5b0  
> net/netlink/genetlink.c:629
>  [] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
>  [<6694b647>] netlink_rcv_skb+0x61/0x170  
> net/netlink/af_netlink.c:2477
>  [<88381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
>  [] netlink_unicast_kernel  
> net/netlink/af_netlink.c:1302 [inline]
>  [] netlink_unicast+0x1ec/0x2d0  
> net/netlink/af_netlink.c:1328
>  [<67e6b079>] netlink_sendmsg+0x270/0x480  
> net/netlink/af_netlink.c:1917
>  

Re: memory leak in internal_dev_create

2019-08-07 Thread Pravin Shelar
On Tue, Aug 6, 2019 at 5:00 AM Hillf Danton  wrote:
>
>
> On Tue, 06 Aug 2019 01:58:05 -0700
> > Hello,
> >
> > syzbot found the following crash on:
> >

...
> > BUG: memory leak
> > unreferenced object 0x8881228ca500 (size 128):
> >comm "syz-executor032", pid 7015, jiffies 4294944622 (age 7.880s)
> >hex dump (first 32 bytes):
> >  00 f0 27 18 81 88 ff ff 80 ac 8c 22 81 88 ff ff  ..'"
> >  40 b7 23 17 81 88 ff ff 00 00 00 00 00 00 00 00  @.#.
> >backtrace:
> >  [<0eb78212>] kmemleak_alloc_recursive  
> > include/linux/kmemleak.h:43 [inline]
> >  [<0eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
> >  [<0eb78212>] slab_alloc mm/slab.c:3319 [inline]
> >  [<0eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
> >  [<006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
> >  [<006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
> >  [<006ea6c6>] ovs_vport_alloc+0x37/0xf0  
> > net/openvswitch/vport.c:130
> >  [] internal_dev_create+0x24/0x1d0  
> > net/openvswitch/vport-internal_dev.c:164
> >  [<56ee7c13>] ovs_vport_add+0x81/0x190  
> > net/openvswitch/vport.c:199
> >  [<5434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
> >  [] ovs_dp_cmd_new+0x22f/0x410  
> > net/openvswitch/datapath.c:1614
> >  [] genl_family_rcv_msg+0x2ab/0x5b0  
> > net/netlink/genetlink.c:629
> >  [] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
> >  [<6694b647>] netlink_rcv_skb+0x61/0x170  
> > net/netlink/af_netlink.c:2477
> >  [<88381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
> >  [] netlink_unicast_kernel  
> > net/netlink/af_netlink.c:1302 [inline]
> >  [] netlink_unicast+0x1ec/0x2d0  
> > net/netlink/af_netlink.c:1328
> >  [<67e6b079>] netlink_sendmsg+0x270/0x480  
> > net/netlink/af_netlink.c:1917
> >  [] sock_sendmsg_nosec net/socket.c:637 [inline]
> >  [] sock_sendmsg+0x54/0x70 net/socket.c:657
> >  [<4cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
> >  [] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
> >  [] __do_sys_sendmsg net/socket.c:2365 [inline]
> >  [] __se_sys_sendmsg net/socket.c:2363 [inline]
> >  [] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
>
>
> Always free vport manually unless register_netdevice() succeeds.
>
> --- a/net/openvswitch/vport-internal_dev.c
> +++ b/net/openvswitch/vport-internal_dev.c
> @@ -137,7 +137,7 @@ static void do_setup(struct net_device *
> netdev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_OPENVSWITCH |
>   IFF_NO_QUEUE;
> netdev->needs_free_netdev = true;
> -   netdev->priv_destructor = internal_dev_destructor;
> +   netdev->priv_destructor = NULL;
> netdev->ethtool_ops = _dev_ethtool_ops;
> netdev->rtnl_link_ops = _dev_link_ops;
>
> @@ -159,7 +159,6 @@ static struct vport *internal_dev_create
> struct internal_dev *internal_dev;
> struct net_device *dev;
> int err;
> -   bool free_vport = true;
>
> vport = ovs_vport_alloc(0, _internal_vport_ops, parms);
> if (IS_ERR(vport)) {
> @@ -190,10 +189,9 @@ static struct vport *internal_dev_create
>
> rtnl_lock();
> err = register_netdevice(vport->dev);
> -   if (err) {
> -   free_vport = false;
> +   if (err)
> goto error_unlock;
> -   }
> +   vport->dev->priv_destructor = internal_dev_destructor;
>
I am not sure why have you moved this assignment out of do_setup().

Otherwise patch looks good to me.

Thanks.
> dev_set_promiscuity(vport->dev, 1);
> rtnl_unlock();
> @@ -207,8 +205,7 @@ error_unlock:
>  error_free_netdev:
> free_netdev(dev);
>  error_free_vport:
> -   if (free_vport)
> -   ovs_vport_free(vport);
> +   ovs_vport_free(vport);
>  error:
> return ERR_PTR(err);
>  }
> --
>


Re: [ovs-dev] [PATCH 7/8] net: ovs: remove unused hardirq.h

2017-12-07 Thread Pravin Shelar
On Fri, Nov 17, 2017 at 3:02 PM, Yang Shi <yan...@alibaba-inc.com> wrote:
> Preempt counter APIs have been split out, currently, hardirq.h just
> includes irq_enter/exit APIs which are not used by openvswitch at all.
>
> So, remove the unused hardirq.h.
>
> Signed-off-by: Yang Shi <yan...@alibaba-inc.com>
> Cc: Pravin Shelar <pshe...@nicira.com>
> Cc: "David S. Miller" <da...@davemloft.net>
> Cc: d...@openvswitch.org

Acked-by: Pravin B Shelar <pshe...@ovn.org>


Re: [ovs-dev] [PATCH 7/8] net: ovs: remove unused hardirq.h

2017-12-07 Thread Pravin Shelar
On Fri, Nov 17, 2017 at 3:02 PM, Yang Shi  wrote:
> Preempt counter APIs have been split out, currently, hardirq.h just
> includes irq_enter/exit APIs which are not used by openvswitch at all.
>
> So, remove the unused hardirq.h.
>
> Signed-off-by: Yang Shi 
> Cc: Pravin Shelar 
> Cc: "David S. Miller" 
> Cc: d...@openvswitch.org

Acked-by: Pravin B Shelar 


Re: [PATCH] openvswitch: use ktime_get_ts64() instead of ktime_get_ts()

2017-11-27 Thread Pravin Shelar
On Mon, Nov 27, 2017 at 5:11 PM, Arnd Bergmann  wrote:
> timespec is deprecated because of the y2038 overflow, so let's convert
> this one to ktime_get_ts64(). The code is already safe even on 32-bit
> architectures, since it uses monotonic times. On 64-bit architectures,
> nothing changes, while on 32-bit architectures this avoids one
> type conversion.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  net/openvswitch/flow.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> index dbe2379329c5..76d050aba7a4 100644
> --- a/net/openvswitch/flow.c
> +++ b/net/openvswitch/flow.c
> @@ -56,12 +56,12 @@
>
>  u64 ovs_flow_used_time(unsigned long flow_jiffies)
>  {
> -   struct timespec cur_ts;
> +   struct timespec64 cur_ts;
> u64 cur_ms, idle_ms;
>
> -   ktime_get_ts(_ts);
> +   ktime_get_ts64(_ts);
> idle_ms = jiffies_to_msecs(jiffies - flow_jiffies);
> -   cur_ms = (u64)cur_ts.tv_sec * MSEC_PER_SEC +
> +   cur_ms = (u64)(u32)cur_ts.tv_sec * MSEC_PER_SEC +

I am not sure why is tv_sec converted to u32.

>  cur_ts.tv_nsec / NSEC_PER_MSEC;
>
> return cur_ms - idle_ms;
> --
> 2.9.0
>


Re: [PATCH] openvswitch: use ktime_get_ts64() instead of ktime_get_ts()

2017-11-27 Thread Pravin Shelar
On Mon, Nov 27, 2017 at 5:11 PM, Arnd Bergmann  wrote:
> timespec is deprecated because of the y2038 overflow, so let's convert
> this one to ktime_get_ts64(). The code is already safe even on 32-bit
> architectures, since it uses monotonic times. On 64-bit architectures,
> nothing changes, while on 32-bit architectures this avoids one
> type conversion.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  net/openvswitch/flow.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> index dbe2379329c5..76d050aba7a4 100644
> --- a/net/openvswitch/flow.c
> +++ b/net/openvswitch/flow.c
> @@ -56,12 +56,12 @@
>
>  u64 ovs_flow_used_time(unsigned long flow_jiffies)
>  {
> -   struct timespec cur_ts;
> +   struct timespec64 cur_ts;
> u64 cur_ms, idle_ms;
>
> -   ktime_get_ts(_ts);
> +   ktime_get_ts64(_ts);
> idle_ms = jiffies_to_msecs(jiffies - flow_jiffies);
> -   cur_ms = (u64)cur_ts.tv_sec * MSEC_PER_SEC +
> +   cur_ms = (u64)(u32)cur_ts.tv_sec * MSEC_PER_SEC +

I am not sure why is tv_sec converted to u32.

>  cur_ts.tv_nsec / NSEC_PER_MSEC;
>
> return cur_ms - idle_ms;
> --
> 2.9.0
>


Re: [ovs-dev] [PATCH] openvswitch: add null pointer check on upcall

2017-11-09 Thread Pravin Shelar
On Thu, Nov 9, 2017 at 7:29 PM, Colin King  wrote:
> From: Colin Ian King 
>
> upcall may be assigned a NULL pointer as genlmsg_put can potentially
> return a NULL.  Add a null check to avoid a null pointer dereference
> on upcall.
>
> Detected by CoverityScan, CID#728404 ("Dereference null return value")
>
> Fixes: commit ccb1352e76cf ("net: Add Open vSwitch kernel components.")
> Signed-off-by: Colin Ian King 
> ---
>  net/openvswitch/datapath.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 4d38ac044cee..e8fb3d76fa6e 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -461,6 +461,10 @@ static int queue_userspace_packet(struct datapath *dp, 
> struct sk_buff *skb,
>
> upcall = genlmsg_put(user_skb, 0, 0, _packet_genl_family,
>  0, upcall_info->cmd);
> +   if (!upcall) {
> +   err = -ENOBUFS;
> +   goto out;
> +   }
user_skb is allocated according to all required attributes, so
genlmsg_put() can not return null value.


> upcall->dp_ifindex = dp_ifindex;
>
> err = ovs_nla_put_key(key, key, OVS_PACKET_ATTR_KEY, false, user_skb);
> --
> 2.14.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] openvswitch: add null pointer check on upcall

2017-11-09 Thread Pravin Shelar
On Thu, Nov 9, 2017 at 7:29 PM, Colin King  wrote:
> From: Colin Ian King 
>
> upcall may be assigned a NULL pointer as genlmsg_put can potentially
> return a NULL.  Add a null check to avoid a null pointer dereference
> on upcall.
>
> Detected by CoverityScan, CID#728404 ("Dereference null return value")
>
> Fixes: commit ccb1352e76cf ("net: Add Open vSwitch kernel components.")
> Signed-off-by: Colin Ian King 
> ---
>  net/openvswitch/datapath.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 4d38ac044cee..e8fb3d76fa6e 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -461,6 +461,10 @@ static int queue_userspace_packet(struct datapath *dp, 
> struct sk_buff *skb,
>
> upcall = genlmsg_put(user_skb, 0, 0, _packet_genl_family,
>  0, upcall_info->cmd);
> +   if (!upcall) {
> +   err = -ENOBUFS;
> +   goto out;
> +   }
user_skb is allocated according to all required attributes, so
genlmsg_put() can not return null value.


> upcall->dp_ifindex = dp_ifindex;
>
> err = ovs_nla_put_key(key, key, OVS_PACKET_ATTR_KEY, false, user_skb);
> --
> 2.14.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [PATCH v2] geneve: Fix setting ttl value in collect metadata mode

2017-09-13 Thread Pravin Shelar
On Wed, Sep 13, 2017 at 4:15 AM, 严海双 <yanhaishu...@cmss.chinamobile.com> wrote:
>
>
>> On 2017年9月13日, at 上午7:43, Pravin Shelar <pshe...@ovn.org> wrote:
>>
>> On Tue, Sep 12, 2017 at 12:05 AM, Haishuang Yan
>> <yanhaishu...@cmss.chinamobile.com> wrote:
>>> Similar to vxlan/ipip tunnel, if key->tos is zero in collect metadata
>>> mode, tos should also fallback to ip{4,6}_dst_hoplimit.
>>>
>>> Signed-off-by: Haishuang Yan <yanhaishu...@cmss.chinamobile.com>
>>>
>>> ---
>>> Changes since v2:
>>>  * Make the commit message more clearer.
>>> ---
>>> drivers/net/geneve.c | 6 ++
>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
>>> index f640407..d52a65f 100644
>>> --- a/drivers/net/geneve.c
>>> +++ b/drivers/net/geneve.c
>>> @@ -834,11 +834,10 @@ static int geneve_xmit_skb(struct sk_buff *skb, 
>>> struct net_device *dev,
>>>sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
>>>if (geneve->collect_md) {
>>>tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
>>> -   ttl = key->ttl;
>>>} else {
>>>tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb);
>>> -   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
>>>}
>>> +   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
>>>df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
>>>
>> This changes user API of Geneve collect-metadata mode. I do not see
>> good reason for this. Why user can not set right TTL for the flow?
>>
>
> For example, I test this case with script samples/bpf/test_tunnel_bpf.sh,
> and modify samples/bpf/tcbpf2_kern.c with following patch:
>
But user is suppose to specify the TTL in collect-md mode for Geneve
tunnel. That is the API.


Re: [PATCH v2] geneve: Fix setting ttl value in collect metadata mode

2017-09-13 Thread Pravin Shelar
On Wed, Sep 13, 2017 at 4:15 AM, 严海双  wrote:
>
>
>> On 2017年9月13日, at 上午7:43, Pravin Shelar  wrote:
>>
>> On Tue, Sep 12, 2017 at 12:05 AM, Haishuang Yan
>>  wrote:
>>> Similar to vxlan/ipip tunnel, if key->tos is zero in collect metadata
>>> mode, tos should also fallback to ip{4,6}_dst_hoplimit.
>>>
>>> Signed-off-by: Haishuang Yan 
>>>
>>> ---
>>> Changes since v2:
>>>  * Make the commit message more clearer.
>>> ---
>>> drivers/net/geneve.c | 6 ++
>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
>>> index f640407..d52a65f 100644
>>> --- a/drivers/net/geneve.c
>>> +++ b/drivers/net/geneve.c
>>> @@ -834,11 +834,10 @@ static int geneve_xmit_skb(struct sk_buff *skb, 
>>> struct net_device *dev,
>>>sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
>>>if (geneve->collect_md) {
>>>tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
>>> -   ttl = key->ttl;
>>>} else {
>>>tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb);
>>> -   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
>>>}
>>> +   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
>>>df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
>>>
>> This changes user API of Geneve collect-metadata mode. I do not see
>> good reason for this. Why user can not set right TTL for the flow?
>>
>
> For example, I test this case with script samples/bpf/test_tunnel_bpf.sh,
> and modify samples/bpf/tcbpf2_kern.c with following patch:
>
But user is suppose to specify the TTL in collect-md mode for Geneve
tunnel. That is the API.


Re: [PATCH v2] geneve: Fix setting ttl value in collect metadata mode

2017-09-12 Thread Pravin Shelar
On Tue, Sep 12, 2017 at 12:05 AM, Haishuang Yan
 wrote:
> Similar to vxlan/ipip tunnel, if key->tos is zero in collect metadata
> mode, tos should also fallback to ip{4,6}_dst_hoplimit.
>
> Signed-off-by: Haishuang Yan 
>
> ---
> Changes since v2:
>   * Make the commit message more clearer.
> ---
>  drivers/net/geneve.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index f640407..d52a65f 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -834,11 +834,10 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct 
> net_device *dev,
> sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
> if (geneve->collect_md) {
> tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
> -   ttl = key->ttl;
> } else {
> tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb);
> -   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
> }
> +   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
> df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
>
This changes user API of Geneve collect-metadata mode. I do not see
good reason for this. Why user can not set right TTL for the flow?


Re: [PATCH v2] geneve: Fix setting ttl value in collect metadata mode

2017-09-12 Thread Pravin Shelar
On Tue, Sep 12, 2017 at 12:05 AM, Haishuang Yan
 wrote:
> Similar to vxlan/ipip tunnel, if key->tos is zero in collect metadata
> mode, tos should also fallback to ip{4,6}_dst_hoplimit.
>
> Signed-off-by: Haishuang Yan 
>
> ---
> Changes since v2:
>   * Make the commit message more clearer.
> ---
>  drivers/net/geneve.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index f640407..d52a65f 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -834,11 +834,10 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct 
> net_device *dev,
> sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
> if (geneve->collect_md) {
> tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
> -   ttl = key->ttl;
> } else {
> tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb);
> -   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
> }
> +   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
> df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
>
This changes user API of Geneve collect-metadata mode. I do not see
good reason for this. Why user can not set right TTL for the flow?


Re: [PATCH v4 1/2] ip_tunnel: fix ip tunnel lookup in collect_md mode

2017-09-12 Thread Pravin Shelar
On Tue, Sep 12, 2017 at 2:47 AM, Haishuang Yan
 wrote:
> In collect_md mode, if the tun dev is down, it still can call
> ip_tunnel_rcv to receive on packets, and the rx statistics increase
> improperly.
>
> When the md tunnel is down, it's not neccessary to increase RX drops
> for the tunnel device, packets would be recieved on fallback tunnel,
> and the RX drops on fallback device will be increased as expected.
>
> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
> Cc: Pravin B Shelar 
> Signed-off-by: Haishuang Yan 
Acked-by: Pravin B Shelar 


Re: [PATCH v4 1/2] ip_tunnel: fix ip tunnel lookup in collect_md mode

2017-09-12 Thread Pravin Shelar
On Tue, Sep 12, 2017 at 2:47 AM, Haishuang Yan
 wrote:
> In collect_md mode, if the tun dev is down, it still can call
> ip_tunnel_rcv to receive on packets, and the rx statistics increase
> improperly.
>
> When the md tunnel is down, it's not neccessary to increase RX drops
> for the tunnel device, packets would be recieved on fallback tunnel,
> and the RX drops on fallback device will be increased as expected.
>
> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
> Cc: Pravin B Shelar 
> Signed-off-by: Haishuang Yan 
Acked-by: Pravin B Shelar 


Re: [PATCH v2] openvswitch: Fix an error handling path in 'ovs_nla_init_match_and_action()'

2017-09-11 Thread Pravin Shelar
On Mon, Sep 11, 2017 at 12:56 PM, Christophe JAILLET
 wrote:
> All other error handling paths in this function go through the 'error'
> label. This one should do the same.
>
> Fixes: 9cc9a5cb176c ("datapath: Avoid using stack larger than 1024.")
> Signed-off-by: Christophe JAILLET 
> ---
> I think that the comment above the function could be improved. It looks
> like the commit log which has introduced this function.
>
> I'm also not sure that commit 9cc9a5cb176c is of any help. It is
> supposed to remove a warning, and I guess it does. But 
> 'ovs_nla_init_match_and_action()'
> is called unconditionnaly from 'ovs_flow_cmd_set()'. So even if the stack
> used by each function is reduced, the overall stack should be the same, if
> not larger.
>
It depends on which function stack depth are are looking at. for some
function it remains same. For nested function it goes down.

> So this commit sounds like adding a bug where the code was fine and states
> to fix an issue but, at the best, only hides it.
>
> Instead of fixing the code with the proposed patch, reverting the initial
> commit could also be considered.
>
> V2: update Subject line
> ---
>  net/openvswitch/datapath.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 76cf273a56c7..c3aec6227c91 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -1112,7 +1112,8 @@ static int ovs_nla_init_match_and_action(struct net 
> *net,
> if (!a[OVS_FLOW_ATTR_KEY]) {
> OVS_NLERR(log,
>   "Flow key attribute not present in set 
> flow.");
> -   return -EINVAL;
> +   error = -EINVAL;
> +   goto error;
> }
>
Patch looks good to me.

Acked-by: Pravin B Shelar 


Re: [PATCH v2] openvswitch: Fix an error handling path in 'ovs_nla_init_match_and_action()'

2017-09-11 Thread Pravin Shelar
On Mon, Sep 11, 2017 at 12:56 PM, Christophe JAILLET
 wrote:
> All other error handling paths in this function go through the 'error'
> label. This one should do the same.
>
> Fixes: 9cc9a5cb176c ("datapath: Avoid using stack larger than 1024.")
> Signed-off-by: Christophe JAILLET 
> ---
> I think that the comment above the function could be improved. It looks
> like the commit log which has introduced this function.
>
> I'm also not sure that commit 9cc9a5cb176c is of any help. It is
> supposed to remove a warning, and I guess it does. But 
> 'ovs_nla_init_match_and_action()'
> is called unconditionnaly from 'ovs_flow_cmd_set()'. So even if the stack
> used by each function is reduced, the overall stack should be the same, if
> not larger.
>
It depends on which function stack depth are are looking at. for some
function it remains same. For nested function it goes down.

> So this commit sounds like adding a bug where the code was fine and states
> to fix an issue but, at the best, only hides it.
>
> Instead of fixing the code with the proposed patch, reverting the initial
> commit could also be considered.
>
> V2: update Subject line
> ---
>  net/openvswitch/datapath.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 76cf273a56c7..c3aec6227c91 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -1112,7 +1112,8 @@ static int ovs_nla_init_match_and_action(struct net 
> *net,
> if (!a[OVS_FLOW_ATTR_KEY]) {
> OVS_NLERR(log,
>   "Flow key attribute not present in set 
> flow.");
> -   return -EINVAL;
> +   error = -EINVAL;
> +   goto error;
> }
>
Patch looks good to me.

Acked-by: Pravin B Shelar 


Re: [PATCH] geneve: Fix setting ttl value in collect metadata mode

2017-09-05 Thread Pravin Shelar
On Sun, Sep 3, 2017 at 5:49 AM, Haishuang Yan
 wrote:
> If key->tos is zero in collect metadata mode, tos should fallback to
> ip{4,6}_dst_hoplimit, same as normal mode.
>
> Signed-off-by: Haishuang Yan 
> ---
>  drivers/net/geneve.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index f640407..d52a65f 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -834,11 +834,10 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct 
> net_device *dev,
> sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
> if (geneve->collect_md) {
> tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
> -   ttl = key->ttl;
> } else {
> tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb);
> -   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
> }
> +   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
In collect md mode, geneve has to set TTL value from tunnel metadata.
That is the API exposed to userspace. is there reason for this change?


> df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
>
> err = geneve_build_skb(>dst, skb, info, xnet, sizeof(struct 
> iphdr));
> @@ -873,12 +872,11 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct 
> net_device *dev,
> sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
> if (geneve->collect_md) {
> prio = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
> -   ttl = key->ttl;
> } else {
> prio = ip_tunnel_ecn_encap(ip6_tclass(fl6.flowlabel),
>ip_hdr(skb), skb);
> -   ttl = key->ttl ? : ip6_dst_hoplimit(dst);
> }
> +   ttl = key->ttl ? : ip6_dst_hoplimit(dst);
> err = geneve_build_skb(dst, skb, info, xnet, sizeof(struct ipv6hdr));
> if (unlikely(err))
> return err;
> --
> 1.8.3.1
>
>
>


Re: [PATCH] geneve: Fix setting ttl value in collect metadata mode

2017-09-05 Thread Pravin Shelar
On Sun, Sep 3, 2017 at 5:49 AM, Haishuang Yan
 wrote:
> If key->tos is zero in collect metadata mode, tos should fallback to
> ip{4,6}_dst_hoplimit, same as normal mode.
>
> Signed-off-by: Haishuang Yan 
> ---
>  drivers/net/geneve.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index f640407..d52a65f 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -834,11 +834,10 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct 
> net_device *dev,
> sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
> if (geneve->collect_md) {
> tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
> -   ttl = key->ttl;
> } else {
> tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb);
> -   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
> }
> +   ttl = key->ttl ? : ip4_dst_hoplimit(>dst);
In collect md mode, geneve has to set TTL value from tunnel metadata.
That is the API exposed to userspace. is there reason for this change?


> df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
>
> err = geneve_build_skb(>dst, skb, info, xnet, sizeof(struct 
> iphdr));
> @@ -873,12 +872,11 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct 
> net_device *dev,
> sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
> if (geneve->collect_md) {
> prio = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
> -   ttl = key->ttl;
> } else {
> prio = ip_tunnel_ecn_encap(ip6_tclass(fl6.flowlabel),
>ip_hdr(skb), skb);
> -   ttl = key->ttl ? : ip6_dst_hoplimit(dst);
> }
> +   ttl = key->ttl ? : ip6_dst_hoplimit(dst);
> err = geneve_build_skb(dst, skb, info, xnet, sizeof(struct ipv6hdr));
> if (unlikely(err))
> return err;
> --
> 1.8.3.1
>
>
>


Re: [PATCH v2 1/2] ip_tunnel: fix ip tunnel lookup in collect_md mode

2017-06-19 Thread Pravin Shelar
On Mon, Jun 19, 2017 at 6:13 AM, 严海双 <yanhaishu...@cmss.chinamobile.com> wrote:
>
>
>> On 19 Jun 2017, at 1:43 PM, Pravin Shelar <pshe...@ovn.org> wrote:
>>
>> On Fri, Jun 16, 2017 at 8:27 PM, Haishuang Yan
>> <yanhaishu...@cmss.chinamobile.com> wrote:
>>> In collect_md mode, if the tun dev is down, it still can call
>>> ip_tunnel_rcv to receive on packets, and the rx statistics increase
>>> improperly.
>>>
>>> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
>>> Cc: Pravin B Shelar <pshe...@nicira.com>
>>> Signed-off-by: Haishuang Yan <yanhaishu...@cmss.chinamobile.com>
>>>
>>> ---
>>> Change since v2:
>>>  * Fix wrong recipient addresss
>>> ---
>>> net/ipv4/ip_tunnel.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
>>> index 0f1d876..a3caba1 100644
>>> --- a/net/ipv4/ip_tunnel.c
>>> +++ b/net/ipv4/ip_tunnel.c
>>> @@ -176,7 +176,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net 
>>> *itn,
>>>return cand;
>>>
>>>t = rcu_dereference(itn->collect_md_tun);
>>> -   if (t)
>>> +   if (t && (t->dev->flags & IFF_UP))
>>>return t;
>>>
>> It would be nice if we could increment drop count if tunnel device is not up.
>>
> Hi Pravin
>
> I think it’s not necessary, for example as gre tunnel, if ipgre_rcv fails, it 
> would trigger send an icmp unreachable
> message:
>
> if (ipgre_rcv(skb, , hdr_len) == PACKET_RCVD)
> return 0;
>
> icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
>
> Since the tunnel device didn’t touch the packets, so increase drop statistics 
> is not necessary.
>
icmp err packets are not reliable on all networks. device stats are
much more convenient during debugging connectivity issues.


Re: [PATCH v2 1/2] ip_tunnel: fix ip tunnel lookup in collect_md mode

2017-06-19 Thread Pravin Shelar
On Mon, Jun 19, 2017 at 6:13 AM, 严海双  wrote:
>
>
>> On 19 Jun 2017, at 1:43 PM, Pravin Shelar  wrote:
>>
>> On Fri, Jun 16, 2017 at 8:27 PM, Haishuang Yan
>>  wrote:
>>> In collect_md mode, if the tun dev is down, it still can call
>>> ip_tunnel_rcv to receive on packets, and the rx statistics increase
>>> improperly.
>>>
>>> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
>>> Cc: Pravin B Shelar 
>>> Signed-off-by: Haishuang Yan 
>>>
>>> ---
>>> Change since v2:
>>>  * Fix wrong recipient addresss
>>> ---
>>> net/ipv4/ip_tunnel.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
>>> index 0f1d876..a3caba1 100644
>>> --- a/net/ipv4/ip_tunnel.c
>>> +++ b/net/ipv4/ip_tunnel.c
>>> @@ -176,7 +176,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net 
>>> *itn,
>>>return cand;
>>>
>>>t = rcu_dereference(itn->collect_md_tun);
>>> -   if (t)
>>> +   if (t && (t->dev->flags & IFF_UP))
>>>return t;
>>>
>> It would be nice if we could increment drop count if tunnel device is not up.
>>
> Hi Pravin
>
> I think it’s not necessary, for example as gre tunnel, if ipgre_rcv fails, it 
> would trigger send an icmp unreachable
> message:
>
> if (ipgre_rcv(skb, , hdr_len) == PACKET_RCVD)
> return 0;
>
> icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
>
> Since the tunnel device didn’t touch the packets, so increase drop statistics 
> is not necessary.
>
icmp err packets are not reliable on all networks. device stats are
much more convenient during debugging connectivity issues.


Re: [PATCH v2 1/2] ip_tunnel: fix ip tunnel lookup in collect_md mode

2017-06-18 Thread Pravin Shelar
On Fri, Jun 16, 2017 at 8:27 PM, Haishuang Yan
 wrote:
> In collect_md mode, if the tun dev is down, it still can call
> ip_tunnel_rcv to receive on packets, and the rx statistics increase
> improperly.
>
> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
> Cc: Pravin B Shelar 
> Signed-off-by: Haishuang Yan 
>
> ---
> Change since v2:
>   * Fix wrong recipient addresss
> ---
>  net/ipv4/ip_tunnel.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index 0f1d876..a3caba1 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -176,7 +176,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net 
> *itn,
> return cand;
>
> t = rcu_dereference(itn->collect_md_tun);
> -   if (t)
> +   if (t && (t->dev->flags & IFF_UP))
> return t;
>
It would be nice if we could increment drop count if tunnel device is not up.


Re: [PATCH v2 1/2] ip_tunnel: fix ip tunnel lookup in collect_md mode

2017-06-18 Thread Pravin Shelar
On Fri, Jun 16, 2017 at 8:27 PM, Haishuang Yan
 wrote:
> In collect_md mode, if the tun dev is down, it still can call
> ip_tunnel_rcv to receive on packets, and the rx statistics increase
> improperly.
>
> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
> Cc: Pravin B Shelar 
> Signed-off-by: Haishuang Yan 
>
> ---
> Change since v2:
>   * Fix wrong recipient addresss
> ---
>  net/ipv4/ip_tunnel.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index 0f1d876..a3caba1 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -176,7 +176,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net 
> *itn,
> return cand;
>
> t = rcu_dereference(itn->collect_md_tun);
> -   if (t)
> +   if (t && (t->dev->flags & IFF_UP))
> return t;
>
It would be nice if we could increment drop count if tunnel device is not up.


Re: [PATCH v3 1/2] ip_tunnel: fix potential issue in ip_tunnel_rcv

2017-06-08 Thread Pravin Shelar
On Wed, Jun 7, 2017 at 9:32 PM, Haishuang Yan
 wrote:
> When ip_tunnel_rcv fails, the tun_dst won't be freed, so call
> dst_release to free it in error code path.
>
> CC: Pravin B Shelar 
> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
> Signed-off-by: Haishuang Yan 
>
> ---
> Changes in v2:
>   - Add the the missing Fixes information
> Changes in v3:
>   - Free tun_dst from error code path
> ---

Acked-by: Pravin B Shelar 


Re: [PATCH v3 1/2] ip_tunnel: fix potential issue in ip_tunnel_rcv

2017-06-08 Thread Pravin Shelar
On Wed, Jun 7, 2017 at 9:32 PM, Haishuang Yan
 wrote:
> When ip_tunnel_rcv fails, the tun_dst won't be freed, so call
> dst_release to free it in error code path.
>
> CC: Pravin B Shelar 
> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
> Signed-off-by: Haishuang Yan 
>
> ---
> Changes in v2:
>   - Add the the missing Fixes information
> Changes in v3:
>   - Free tun_dst from error code path
> ---

Acked-by: Pravin B Shelar 


Re: [PATCH v2 1/2] ip_tunnel: fix potential issue in ip_tunnel_rcv

2017-06-07 Thread Pravin Shelar
On Wed, Jun 7, 2017 at 8:15 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Wed, 2017-06-07 at 19:13 -0700, Pravin Shelar wrote:
>> On Wed, Jun 7, 2017 at 5:57 PM, Haishuang Yan
>> <yanhaishu...@cmss.chinamobile.com> wrote:
>> > When ip_tunnel_rcv fails, the tun_dst won't be freed, so move
>> > skb_dst_set to begin and tun_dst would be freed by kfree_skb.
>> >
>> > CC: Pravin B Shelar <pshe...@nicira.com>
>> > Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
>> > Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com390e>
>> > ---
>> >  net/ipv4/ip_tunnel.c | 6 +++---
>> >  1 file changed, 3 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
>> > index b878ecb..27fc20f 100644
>> > --- a/net/ipv4/ip_tunnel.c
>> > +++ b/net/ipv4/ip_tunnel.c
>> > @@ -386,6 +386,9 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct 
>> > sk_buff *skb,
>> > const struct iphdr *iph = ip_hdr(skb);
>> > int err;
>> >
>> > +   if (tun_dst)
>> > +   skb_dst_set(skb, (struct dst_entry *)tun_dst);
>> > +
>> If dst is set so early, skb_scrub_packet() would remove the tunnel dst
>> reference.
>> It is better to call skb_dst_drop() from error code path.
>
> Do we really want to keep a dst from another namespace if
> skb_scrub_packet() is called with xnet=true ?
>
tun_dst is allocated in same namespace. It is required for LWT to work.


Re: [PATCH v2 1/2] ip_tunnel: fix potential issue in ip_tunnel_rcv

2017-06-07 Thread Pravin Shelar
On Wed, Jun 7, 2017 at 8:15 PM, Eric Dumazet  wrote:
> On Wed, 2017-06-07 at 19:13 -0700, Pravin Shelar wrote:
>> On Wed, Jun 7, 2017 at 5:57 PM, Haishuang Yan
>>  wrote:
>> > When ip_tunnel_rcv fails, the tun_dst won't be freed, so move
>> > skb_dst_set to begin and tun_dst would be freed by kfree_skb.
>> >
>> > CC: Pravin B Shelar 
>> > Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
>> > Signed-off-by: Haishuang Yan 
>> > ---
>> >  net/ipv4/ip_tunnel.c | 6 +++---
>> >  1 file changed, 3 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
>> > index b878ecb..27fc20f 100644
>> > --- a/net/ipv4/ip_tunnel.c
>> > +++ b/net/ipv4/ip_tunnel.c
>> > @@ -386,6 +386,9 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct 
>> > sk_buff *skb,
>> > const struct iphdr *iph = ip_hdr(skb);
>> > int err;
>> >
>> > +   if (tun_dst)
>> > +   skb_dst_set(skb, (struct dst_entry *)tun_dst);
>> > +
>> If dst is set so early, skb_scrub_packet() would remove the tunnel dst
>> reference.
>> It is better to call skb_dst_drop() from error code path.
>
> Do we really want to keep a dst from another namespace if
> skb_scrub_packet() is called with xnet=true ?
>
tun_dst is allocated in same namespace. It is required for LWT to work.


Re: [PATCH v2 1/2] ip_tunnel: fix potential issue in ip_tunnel_rcv

2017-06-07 Thread Pravin Shelar
On Wed, Jun 7, 2017 at 5:57 PM, Haishuang Yan
 wrote:
> When ip_tunnel_rcv fails, the tun_dst won't be freed, so move
> skb_dst_set to begin and tun_dst would be freed by kfree_skb.
>
> CC: Pravin B Shelar 
> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
> Signed-off-by: Haishuang Yan 
> ---
>  net/ipv4/ip_tunnel.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index b878ecb..27fc20f 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -386,6 +386,9 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct 
> sk_buff *skb,
> const struct iphdr *iph = ip_hdr(skb);
> int err;
>
> +   if (tun_dst)
> +   skb_dst_set(skb, (struct dst_entry *)tun_dst);
> +
If dst is set so early, skb_scrub_packet() would remove the tunnel dst
reference.
It is better to call skb_dst_drop() from error code path.

>  #ifdef CONFIG_NET_IPGRE_BROADCAST
> if (ipv4_is_multicast(iph->daddr)) {
> tunnel->dev->stats.multicast++;
> @@ -439,9 +442,6 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct 
> sk_buff *skb,
> skb->dev = tunnel->dev;
> }
>
> -   if (tun_dst)
> -   skb_dst_set(skb, (struct dst_entry *)tun_dst);
> -
> gro_cells_receive(>gro_cells, skb);
> return 0;
>
> --
> 1.8.3.1
>
>
>


Re: [PATCH v2 1/2] ip_tunnel: fix potential issue in ip_tunnel_rcv

2017-06-07 Thread Pravin Shelar
On Wed, Jun 7, 2017 at 5:57 PM, Haishuang Yan
 wrote:
> When ip_tunnel_rcv fails, the tun_dst won't be freed, so move
> skb_dst_set to begin and tun_dst would be freed by kfree_skb.
>
> CC: Pravin B Shelar 
> Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
> Signed-off-by: Haishuang Yan 
> ---
>  net/ipv4/ip_tunnel.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index b878ecb..27fc20f 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -386,6 +386,9 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct 
> sk_buff *skb,
> const struct iphdr *iph = ip_hdr(skb);
> int err;
>
> +   if (tun_dst)
> +   skb_dst_set(skb, (struct dst_entry *)tun_dst);
> +
If dst is set so early, skb_scrub_packet() would remove the tunnel dst
reference.
It is better to call skb_dst_drop() from error code path.

>  #ifdef CONFIG_NET_IPGRE_BROADCAST
> if (ipv4_is_multicast(iph->daddr)) {
> tunnel->dev->stats.multicast++;
> @@ -439,9 +442,6 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct 
> sk_buff *skb,
> skb->dev = tunnel->dev;
> }
>
> -   if (tun_dst)
> -   skb_dst_set(skb, (struct dst_entry *)tun_dst);
> -
> gro_cells_receive(>gro_cells, skb);
> return 0;
>
> --
> 1.8.3.1
>
>
>


Re: [ovs-dev] [PATCH 1/1] openvswitch: check return value of nla_nest_start

2017-04-23 Thread Pravin Shelar
On Sat, Apr 22, 2017 at 11:43 PM, Pan Bian  wrote:
> Function nla_nest_start() will return a NULL pointer on error, and its
> return value should be validated before it is used. However, in function
> queue_userspace_packet(), its return value is ignored. This may result
> in NULL dereference when calling nla_nest_end(). This patch fixes the
> bug.
>
> Signed-off-by: Pan Bian 
> ---
>  net/openvswitch/datapath.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 9c62b63..34c0fbd 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -489,7 +489,8 @@ static int queue_userspace_packet(struct datapath *dp, 
> struct sk_buff *skb,
> err = ovs_nla_put_tunnel_info(user_skb,
>   upcall_info->egress_tun_info);
> BUG_ON(err);
> -   nla_nest_end(user_skb, nla);
> +   if (nla)
> +   nla_nest_end(user_skb, nla);
There is enough memory allocated for the netlink message accounting
for all attributes beforehand. So it is not required to check retuned
'nla' after each attributes addition to the msg.


> }
>
> if (upcall_info->actions_len) {
> @@ -497,7 +498,7 @@ static int queue_userspace_packet(struct datapath *dp, 
> struct sk_buff *skb,
> err = ovs_nla_put_actions(upcall_info->actions,
>   upcall_info->actions_len,
>   user_skb);
> -   if (!err)
> +   if (!err && nla)
> nla_nest_end(user_skb, nla);
> else
> nla_nest_cancel(user_skb, nla);
> --
> 1.9.1
>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] openvswitch: check return value of nla_nest_start

2017-04-23 Thread Pravin Shelar
On Sat, Apr 22, 2017 at 11:43 PM, Pan Bian  wrote:
> Function nla_nest_start() will return a NULL pointer on error, and its
> return value should be validated before it is used. However, in function
> queue_userspace_packet(), its return value is ignored. This may result
> in NULL dereference when calling nla_nest_end(). This patch fixes the
> bug.
>
> Signed-off-by: Pan Bian 
> ---
>  net/openvswitch/datapath.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 9c62b63..34c0fbd 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -489,7 +489,8 @@ static int queue_userspace_packet(struct datapath *dp, 
> struct sk_buff *skb,
> err = ovs_nla_put_tunnel_info(user_skb,
>   upcall_info->egress_tun_info);
> BUG_ON(err);
> -   nla_nest_end(user_skb, nla);
> +   if (nla)
> +   nla_nest_end(user_skb, nla);
There is enough memory allocated for the netlink message accounting
for all attributes beforehand. So it is not required to check retuned
'nla' after each attributes addition to the msg.


> }
>
> if (upcall_info->actions_len) {
> @@ -497,7 +498,7 @@ static int queue_userspace_packet(struct datapath *dp, 
> struct sk_buff *skb,
> err = ovs_nla_put_actions(upcall_info->actions,
>   upcall_info->actions_len,
>   user_skb);
> -   if (!err)
> +   if (!err && nla)
> nla_nest_end(user_skb, nla);
> else
> nla_nest_cancel(user_skb, nla);
> --
> 1.9.1
>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [PATCH] openvswitch: add sanity check in queue_userspace_packet.

2016-11-28 Thread Pravin Shelar
On Mon, Nov 28, 2016 at 8:36 PM, Haishuang Yan
 wrote:
> kernel will crash in oops if genlmsg_put return NULL,
> so add the sanity check.
>
> Signed-off-by: Haishuang Yan 
> ---
>  net/openvswitch/datapath.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 2d4c4d3..ceb1b1e 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -474,6 +474,10 @@ static int queue_userspace_packet(struct datapath *dp, 
> struct sk_buff *skb,
>
> upcall = genlmsg_put(user_skb, 0, 0, _packet_genl_family,
>  0, upcall_info->cmd);
> +   if (!upcall) {
> +   err = -EMSGSIZE;
> +   goto out;
> +   }

user_skb has already enough space allocated, so there is no need to
check upcall pointer.


Re: [PATCH] openvswitch: add sanity check in queue_userspace_packet.

2016-11-28 Thread Pravin Shelar
On Mon, Nov 28, 2016 at 8:36 PM, Haishuang Yan
 wrote:
> kernel will crash in oops if genlmsg_put return NULL,
> so add the sanity check.
>
> Signed-off-by: Haishuang Yan 
> ---
>  net/openvswitch/datapath.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 2d4c4d3..ceb1b1e 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -474,6 +474,10 @@ static int queue_userspace_packet(struct datapath *dp, 
> struct sk_buff *skb,
>
> upcall = genlmsg_put(user_skb, 0, 0, _packet_genl_family,
>  0, upcall_info->cmd);
> +   if (!upcall) {
> +   err = -EMSGSIZE;
> +   goto out;
> +   }

user_skb has already enough space allocated, so there is no need to
check upcall pointer.


Re: [PATCH] geneve: fix ip_hdr_len reserved for geneve6 tunnel.

2016-11-27 Thread Pravin Shelar
On Sun, Nov 27, 2016 at 9:26 PM, Haishuang Yan
 wrote:
> It shold reserved sizeof(ipv6hdr) for geneve in ipv6 tunnel.
>
> Fixes: c3ef5aa5e5 ('geneve: Merge ipv4 and ipv6 geneve_build_skb()')
>
> Signed-off-by: Haishuang Yan 

Thanks for fix.

Acked-by: Pravin B Shelar 


Re: [PATCH] geneve: fix ip_hdr_len reserved for geneve6 tunnel.

2016-11-27 Thread Pravin Shelar
On Sun, Nov 27, 2016 at 9:26 PM, Haishuang Yan
 wrote:
> It shold reserved sizeof(ipv6hdr) for geneve in ipv6 tunnel.
>
> Fixes: c3ef5aa5e5 ('geneve: Merge ipv4 and ipv6 geneve_build_skb()')
>
> Signed-off-by: Haishuang Yan 

Thanks for fix.

Acked-by: Pravin B Shelar 


Re: [PATCH v3] openvswitch: allow management from inside user namespaces

2016-02-11 Thread pravin shelar
On Fri, Feb 5, 2016 at 5:20 PM, Tycho Andersen
 wrote:
> Operations with the GENL_ADMIN_PERM flag fail permissions checks because
> this flag means we call netlink_capable, which uses the init user ns.
>
> Instead, let's introduce a new flag, GENL_UNS_ADMIN_PERM for operations
> which should be allowed inside a user namespace.
>
> The motivation for this is to be able to run openvswitch in unprivileged
> containers. I've tested this and it seems to work, but I really have no
> idea about the security consequences of this patch, so thoughts would be
> much appreciated.
>
> v2: use the GENL_UNS_ADMIN_PERM flag instead of a check in each function
> v3: use separate ifs for UNS_ADMIN_PERM and ADMIN_PERM, instead of one
> massive one
>
> Reported-by: James Page 
> Signed-off-by: Tycho Andersen 
> CC: Eric Biederman 
> CC: Pravin Shelar 
> CC: Justin Pettit 
> CC: "David S. Miller" 
> ---
>  include/uapi/linux/genetlink.h |  1 +
>  net/netlink/genetlink.c|  4 
>  net/openvswitch/datapath.c | 20 ++--
>  3 files changed, 15 insertions(+), 10 deletions(-)
>
Looks good.

Acked-by: Pravin B Shelar 


Re: [PATCH v3] openvswitch: allow management from inside user namespaces

2016-02-11 Thread pravin shelar
On Fri, Feb 5, 2016 at 5:20 PM, Tycho Andersen
<tycho.ander...@canonical.com> wrote:
> Operations with the GENL_ADMIN_PERM flag fail permissions checks because
> this flag means we call netlink_capable, which uses the init user ns.
>
> Instead, let's introduce a new flag, GENL_UNS_ADMIN_PERM for operations
> which should be allowed inside a user namespace.
>
> The motivation for this is to be able to run openvswitch in unprivileged
> containers. I've tested this and it seems to work, but I really have no
> idea about the security consequences of this patch, so thoughts would be
> much appreciated.
>
> v2: use the GENL_UNS_ADMIN_PERM flag instead of a check in each function
> v3: use separate ifs for UNS_ADMIN_PERM and ADMIN_PERM, instead of one
> massive one
>
> Reported-by: James Page <james.p...@canonical.com>
> Signed-off-by: Tycho Andersen <tycho.ander...@canonical.com>
> CC: Eric Biederman <ebied...@xmission.com>
> CC: Pravin Shelar <pshe...@ovn.org>
> CC: Justin Pettit <jpet...@nicira.com>
> CC: "David S. Miller" <da...@davemloft.net>
> ---
>  include/uapi/linux/genetlink.h |  1 +
>  net/netlink/genetlink.c|  4 
>  net/openvswitch/datapath.c | 20 ++--
>  3 files changed, 15 insertions(+), 10 deletions(-)
>
Looks good.

Acked-by: Pravin B Shelar <pshe...@ovn.org>


Re: [PATCH] openvswitch: allow management from inside user namespaces

2016-02-01 Thread pravin shelar
On Fri, Jan 29, 2016 at 8:37 AM, Tycho Andersen
 wrote:
> Hi Eric,
>
> Thanks for the review.
>
> On Fri, Jan 29, 2016 at 08:29:55AM -0600, Eric W. Biederman wrote:
>> Tycho Andersen  writes:
>>
>> > Operations with the GENL_ADMIN_PERM flag fail permissions checks because
>> > this flag means we call netlink_capable, which uses the init user ns.
>> >
>> > Instead, let's do permissions checks in each function, but use the netlink
>> > socket's user ns instead of the initial one, to allow management of
>> > openvswitch resources from inside a user ns.
>> >
>> > The motivation for this is to be able to run openvswitch in unprivileged
>> > containers. I've tested this and it seems to work, but I really have no
>> > idea about the security consequences of this patch, so thoughts would be
>> > much appreciated.
>>
>> So at a quick look using ns_capable this way is probably buggy.
>>
>> netlink is goofy (because historically we got this wrong), and I forget
>> what the specific rules are.  The general rule is that you need to do
>> your permission checks on open/create/connect and not inside send/write
>> while processing data.  Otherwise there is a class of privileged
>> applications where you can set their stdout to some precreated file
>> descriptor and their output can be made to act as a command, bypassing
>> your permission checks.
>
> It's worth noting that this patch doesn't move the checks (i.e., they
> are still done at write time currently in the kernel), it just relaxes
> them to root in the user ns instead of real root. This means I can
> currently exploit netlink this way as an unprivileged, just not from
> within an unprivileged container.
>
> An alternate version of this patch below might be more favorable, as
> we may want to do something like this elsewhere in netlink. I think it
> also clarifies the situation a bit, at the cost of adding another
> flag.
>
> A third option would be to move this check to connect time, but that
> would force everything in the family (since that's the only thing you
> connect /to/ in netlink) to have the same required permissions, which
> might (probably?) break stuff; e.g. you can call OVS_FLOW_CMD_GET
> without CAP_NET_ADMIN, but if we changed everything in the family to
> require it, that would break.
>
> Tycho
> ---
>  include/uapi/linux/genetlink.h |  1 +
>  net/netlink/genetlink.c|  6 --
>  net/openvswitch/datapath.c | 20 ++--
>  3 files changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/include/uapi/linux/genetlink.h b/include/uapi/linux/genetlink.h
> index c3363ba..5512c90 100644
> --- a/include/uapi/linux/genetlink.h
> +++ b/include/uapi/linux/genetlink.h
> @@ -21,6 +21,7 @@ struct genlmsghdr {
>  #define GENL_CMD_CAP_DO0x02
>  #define GENL_CMD_CAP_DUMP  0x04
>  #define GENL_CMD_CAP_HASPOL0x08
> +#define GENL_UNS_ADMIN_PERM0x10
>
This approach looks good to me.


Re: [PATCH] openvswitch: allow management from inside user namespaces

2016-02-01 Thread pravin shelar
On Fri, Jan 29, 2016 at 8:37 AM, Tycho Andersen
 wrote:
> Hi Eric,
>
> Thanks for the review.
>
> On Fri, Jan 29, 2016 at 08:29:55AM -0600, Eric W. Biederman wrote:
>> Tycho Andersen  writes:
>>
>> > Operations with the GENL_ADMIN_PERM flag fail permissions checks because
>> > this flag means we call netlink_capable, which uses the init user ns.
>> >
>> > Instead, let's do permissions checks in each function, but use the netlink
>> > socket's user ns instead of the initial one, to allow management of
>> > openvswitch resources from inside a user ns.
>> >
>> > The motivation for this is to be able to run openvswitch in unprivileged
>> > containers. I've tested this and it seems to work, but I really have no
>> > idea about the security consequences of this patch, so thoughts would be
>> > much appreciated.
>>
>> So at a quick look using ns_capable this way is probably buggy.
>>
>> netlink is goofy (because historically we got this wrong), and I forget
>> what the specific rules are.  The general rule is that you need to do
>> your permission checks on open/create/connect and not inside send/write
>> while processing data.  Otherwise there is a class of privileged
>> applications where you can set their stdout to some precreated file
>> descriptor and their output can be made to act as a command, bypassing
>> your permission checks.
>
> It's worth noting that this patch doesn't move the checks (i.e., they
> are still done at write time currently in the kernel), it just relaxes
> them to root in the user ns instead of real root. This means I can
> currently exploit netlink this way as an unprivileged, just not from
> within an unprivileged container.
>
> An alternate version of this patch below might be more favorable, as
> we may want to do something like this elsewhere in netlink. I think it
> also clarifies the situation a bit, at the cost of adding another
> flag.
>
> A third option would be to move this check to connect time, but that
> would force everything in the family (since that's the only thing you
> connect /to/ in netlink) to have the same required permissions, which
> might (probably?) break stuff; e.g. you can call OVS_FLOW_CMD_GET
> without CAP_NET_ADMIN, but if we changed everything in the family to
> require it, that would break.
>
> Tycho
> ---
>  include/uapi/linux/genetlink.h |  1 +
>  net/netlink/genetlink.c|  6 --
>  net/openvswitch/datapath.c | 20 ++--
>  3 files changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/include/uapi/linux/genetlink.h b/include/uapi/linux/genetlink.h
> index c3363ba..5512c90 100644
> --- a/include/uapi/linux/genetlink.h
> +++ b/include/uapi/linux/genetlink.h
> @@ -21,6 +21,7 @@ struct genlmsghdr {
>  #define GENL_CMD_CAP_DO0x02
>  #define GENL_CMD_CAP_DUMP  0x04
>  #define GENL_CMD_CAP_HASPOL0x08
> +#define GENL_UNS_ADMIN_PERM0x10
>
This approach looks good to me.


Re: [PATCH] ip_tunnel: make ip6tunnel_xmit definition conditional

2016-01-01 Thread Pravin Shelar
On Fri, Jan 1, 2016 at 5:48 AM, Arnd Bergmann  wrote:
> From 433df301cf49624871346fa63f3fc65033caeda3 Mon Sep 17 00:00:00 2001
> From: Arnd Bergmann 
> Date: Fri, 1 Jan 2016 13:18:48 +0100
> Subject: [PATCH] net: make ip6tunnel_xmit definition conditional
>
> Moving the caller of iptunnel_xmit_stats causes a build error in
> randconfig builds that disable CONFIG_INET:
>
> In file included from ../net/xfrm/xfrm_input.c:17:0:
> ../include/net/ip6_tunnel.h: In function 'ip6tunnel_xmit':
> ../include/net/ip6_tunnel.h:93:2: error: implicit declaration of function 
> 'iptunnel_xmit_stats' [-Werror=implicit-function-declaration]
>   iptunnel_xmit_stats(dev, pkt_len);
>
> The reason is that the iptunnel_xmit_stats definition is hidden
> inside #ifdef CONFIG_INET but the caller is not. We can change
> one or the other to fix it, and this patch adds a second #ifdef
> around ip6tunnel_xmit() to avoid seeing the invalid call.
>
> Signed-off-by: Arnd Bergmann 
> Fixes: 039f50629b7f ("ip_tunnel: Move stats update to iptunnel_xmit()")

Thanks for the fix.
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ip_tunnel: make ip6tunnel_xmit definition conditional

2016-01-01 Thread Pravin Shelar
On Fri, Jan 1, 2016 at 5:48 AM, Arnd Bergmann  wrote:
> From 433df301cf49624871346fa63f3fc65033caeda3 Mon Sep 17 00:00:00 2001
> From: Arnd Bergmann 
> Date: Fri, 1 Jan 2016 13:18:48 +0100
> Subject: [PATCH] net: make ip6tunnel_xmit definition conditional
>
> Moving the caller of iptunnel_xmit_stats causes a build error in
> randconfig builds that disable CONFIG_INET:
>
> In file included from ../net/xfrm/xfrm_input.c:17:0:
> ../include/net/ip6_tunnel.h: In function 'ip6tunnel_xmit':
> ../include/net/ip6_tunnel.h:93:2: error: implicit declaration of function 
> 'iptunnel_xmit_stats' [-Werror=implicit-function-declaration]
>   iptunnel_xmit_stats(dev, pkt_len);
>
> The reason is that the iptunnel_xmit_stats definition is hidden
> inside #ifdef CONFIG_INET but the caller is not. We can change
> one or the other to fix it, and this patch adds a second #ifdef
> around ip6tunnel_xmit() to avoid seeing the invalid call.
>
> Signed-off-by: Arnd Bergmann 
> Fixes: 039f50629b7f ("ip_tunnel: Move stats update to iptunnel_xmit()")

Thanks for the fix.
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ovs: do not allocate memory from offline numa node

2015-10-02 Thread Pravin Shelar
On Fri, Oct 2, 2015 at 3:18 AM, Konstantin Khlebnikov
 wrote:
> When openvswitch tries allocate memory from offline numa node 0:
> stats = kmem_cache_alloc_node(flow_stats_cache, GFP_KERNEL | __GFP_ZERO, 0)
> It catches VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
> [ replaced with VM_WARN_ON(!node_online(nid)) recently ] in linux/gfp.h
> This patch disables numa affinity in this case.
>
> Signed-off-by: Konstantin Khlebnikov 
>
> ---
>
> <4>[   24.368805] [ cut here ]
> <2>[   24.368846] kernel BUG at include/linux/gfp.h:325!
> <4>[   24.368868] invalid opcode:  [#1] SMP
> <4>[   24.368892] Modules linked in: openvswitch vxlan udp_tunnel 
> ip6_udp_tunnel gre libcrc32c kvm_amd kvm crc32_pclmul ghash_clmulni_intel 
> aesni_intel ablk_helper cryptd lrw mgag200 ttm drm_kms_helper drm gf128mul 
> glue_helper serio_raw aes_x86_64 sysimgblt sysfillrect syscopyarea sp5100_tco 
> amd64_edac_mod edac_core edac_mce_amd i2c_piix4 k10temp fam15h_power 
> microcode raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor 
> async_tx xor raid6_pq raid1 raid0 igb multipath i2c_algo_bit i2c_core linear 
> dca psmouse ptp ahci pata_atiixp pps_core libahci
> <4>[   24.369225] CPU: 22 PID: 987 Comm: ovs-vswitchd Not tainted 3.18.19-24 
> #1
> <4>[   24.369255] Hardware name: Supermicro H8DGU/H8DGU, BIOS 3.0b   
> 05/07/2013
> <4>[   24.369286] task: 8807f2433240 ti: 8807ec9a task.ti: 
> 8807ec9a
> <4>[   24.369317] RIP: 0010:[]  [] 
> new_slab+0x2d4/0x380
> <4>[   24.369359] RSP: 0018:8807ec9a35d8  EFLAGS: 00010246
> <4>[   24.369383] RAX:  RBX: 8807ff403c00 RCX: 
> 
> <4>[   24.369412] RDX:  RSI:  RDI: 
> 002012d0
> <4>[   24.369441] RBP: 8807ec9a3608 R08: 8807f193cfe0 R09: 
> 0001008a
> <4>[   24.369471] R10: f193cf01 R11: 00015f38 R12: 
> 
> <4>[   24.369501] R13: 0080 R14:  R15: 
> 00d0
> <4>[   24.369531] FS:  7febb0cbe980() GS:8807ffd8() 
> knlGS:
> <4>[   24.369563] CS:  0010 DS:  ES:  CR0: 80050033
> <4>[   24.369588] CR2: 7efc53abc1b8 CR3: 0007f213f000 CR4: 
> 000407e0
> <4>[   24.369618] Stack:
> <4>[   24.369630]  8807ec9a3618   
> 8807ffd958c0
> <4>[   24.369669]  8807ff403c00 80d0 8807ec9a36f8 
> 816cc548
> <4>[   24.370755]  8807ec9a3708 0296 0004 
> 
> <4>[   24.371777] Call Trace:
> <4>[   24.372929]  [] __slab_alloc+0x33b/0x459
> <4>[   24.374179]  [] ? ovs_flow_alloc+0x59/0x110 
> [openvswitch]
> <4>[   24.375390]  [] ? get_page_from_freelist+0x483/0x9f0
> <4>[   24.376623]  [] ? memzero_explicit+0xe/0x10
> <4>[   24.377767]  [] ? ovs_flow_alloc+0x59/0x110 
> [openvswitch]
> <4>[   24.378951]  [] kmem_cache_alloc_node+0x9c/0x1b0
> <4>[   24.379916]  [] ? kmem_cache_alloc+0x18b/0x1a0
> <4>[   24.390806]  [] ? ovs_flow_alloc+0x1d/0x110 
> [openvswitch]
> <4>[   24.391779]  [] ovs_flow_alloc+0x59/0x110 
> [openvswitch]
> <4>[   24.392875]  [] ovs_flow_cmd_new+0x5b/0x360 
> [openvswitch]
> <4>[   24.394004]  [] ? __alloc_pages_nodemask+0x16c/0xaf0
> <4>[   24.394973]  [] ? __alloc_skb+0x87/0x2a0
> <4>[   24.395926]  [] ? nla_parse+0x90/0x110
> <4>[   24.476276]  [] genl_family_rcv_msg+0x373/0x3d0
> <4>[   24.477704]  [] ? 
> __kmalloc_node_track_caller+0x6c/0x220
> <4>[   24.478859]  [] genl_rcv_msg+0x44/0x80
> <4>[   24.479987]  [] ? genl_family_rcv_msg+0x3d0/0x3d0
> <4>[   24.481325]  [] netlink_rcv_skb+0xb9/0xe0
> <4>[   24.482466]  [] genl_rcv+0x2c/0x40
> <4>[   24.483554]  [] netlink_unicast+0x12b/0x1c0
> <4>[   24.484739]  [] netlink_sendmsg+0x392/0x6d0
> <4>[   24.485942]  [] sock_sendmsg+0xaf/0xc0
> <4>[   24.486953]  [] ? netlink_sendmsg+0x202/0x6d0
> <4>[   24.487969]  [] ___sys_sendmsg.part.19+0x322/0x330
> <4>[   24.489167]  [] ? SYSC_sendto+0xf9/0x130
> <4>[   24.490217]  [] ___sys_sendmsg+0x4a/0x70
> <4>[   24.491162]  [] __sys_sendmsg+0x49/0x90
> <4>[   24.492082]  [] SyS_sendmsg+0x19/0x20
> <4>[   24.493181]  [] system_call_fastpath+0x12/0x17
> <4>[   24.494124] Code: 40 e9 ea fe ff ff 90 e8 6b 69 ff ff 49 89 c4 e9 07 fe 
> ff ff 4c 89 f7 ff d0 e9 26 ff ff ff 49 c7 04 06 00 00 00 00 e9 3c ff ff ff 
> <0f> 0b ba 00 10 00 00 be 5a 00 00 00 4c 89 ef 48 d3 e2 e8 65 2a
> <1>[   24.496071] RIP  [] new_slab+0x2d4/0x380
> <4>[   24.497152]  RSP 
> <4>[   24.498945] ---[ end trace 6f97360ff4a9ee45 ]---
> ---
>  net/openvswitch/flow_table.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index f2ea83ba4763..c7f74aab34b9 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -93,7 +93,8 @@ struct sw_flow *ovs_flow_alloc(void)
>
> /* Initialize the default 

Re: [PATCH] ovs: do not allocate memory from offline numa node

2015-10-02 Thread Pravin Shelar
On Fri, Oct 2, 2015 at 3:18 AM, Konstantin Khlebnikov
 wrote:
> When openvswitch tries allocate memory from offline numa node 0:
> stats = kmem_cache_alloc_node(flow_stats_cache, GFP_KERNEL | __GFP_ZERO, 0)
> It catches VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
> [ replaced with VM_WARN_ON(!node_online(nid)) recently ] in linux/gfp.h
> This patch disables numa affinity in this case.
>
> Signed-off-by: Konstantin Khlebnikov 
>
> ---
>
> <4>[   24.368805] [ cut here ]
> <2>[   24.368846] kernel BUG at include/linux/gfp.h:325!
> <4>[   24.368868] invalid opcode:  [#1] SMP
> <4>[   24.368892] Modules linked in: openvswitch vxlan udp_tunnel 
> ip6_udp_tunnel gre libcrc32c kvm_amd kvm crc32_pclmul ghash_clmulni_intel 
> aesni_intel ablk_helper cryptd lrw mgag200 ttm drm_kms_helper drm gf128mul 
> glue_helper serio_raw aes_x86_64 sysimgblt sysfillrect syscopyarea sp5100_tco 
> amd64_edac_mod edac_core edac_mce_amd i2c_piix4 k10temp fam15h_power 
> microcode raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor 
> async_tx xor raid6_pq raid1 raid0 igb multipath i2c_algo_bit i2c_core linear 
> dca psmouse ptp ahci pata_atiixp pps_core libahci
> <4>[   24.369225] CPU: 22 PID: 987 Comm: ovs-vswitchd Not tainted 3.18.19-24 
> #1
> <4>[   24.369255] Hardware name: Supermicro H8DGU/H8DGU, BIOS 3.0b   
> 05/07/2013
> <4>[   24.369286] task: 8807f2433240 ti: 8807ec9a task.ti: 
> 8807ec9a
> <4>[   24.369317] RIP: 0010:[]  [] 
> new_slab+0x2d4/0x380
> <4>[   24.369359] RSP: 0018:8807ec9a35d8  EFLAGS: 00010246
> <4>[   24.369383] RAX:  RBX: 8807ff403c00 RCX: 
> 
> <4>[   24.369412] RDX:  RSI:  RDI: 
> 002012d0
> <4>[   24.369441] RBP: 8807ec9a3608 R08: 8807f193cfe0 R09: 
> 0001008a
> <4>[   24.369471] R10: f193cf01 R11: 00015f38 R12: 
> 
> <4>[   24.369501] R13: 0080 R14:  R15: 
> 00d0
> <4>[   24.369531] FS:  7febb0cbe980() GS:8807ffd8() 
> knlGS:
> <4>[   24.369563] CS:  0010 DS:  ES:  CR0: 80050033
> <4>[   24.369588] CR2: 7efc53abc1b8 CR3: 0007f213f000 CR4: 
> 000407e0
> <4>[   24.369618] Stack:
> <4>[   24.369630]  8807ec9a3618   
> 8807ffd958c0
> <4>[   24.369669]  8807ff403c00 80d0 8807ec9a36f8 
> 816cc548
> <4>[   24.370755]  8807ec9a3708 0296 0004 
> 
> <4>[   24.371777] Call Trace:
> <4>[   24.372929]  [] __slab_alloc+0x33b/0x459
> <4>[   24.374179]  [] ? ovs_flow_alloc+0x59/0x110 
> [openvswitch]
> <4>[   24.375390]  [] ? get_page_from_freelist+0x483/0x9f0
> <4>[   24.376623]  [] ? memzero_explicit+0xe/0x10
> <4>[   24.377767]  [] ? ovs_flow_alloc+0x59/0x110 
> [openvswitch]
> <4>[   24.378951]  [] kmem_cache_alloc_node+0x9c/0x1b0
> <4>[   24.379916]  [] ? kmem_cache_alloc+0x18b/0x1a0
> <4>[   24.390806]  [] ? ovs_flow_alloc+0x1d/0x110 
> [openvswitch]
> <4>[   24.391779]  [] ovs_flow_alloc+0x59/0x110 
> [openvswitch]
> <4>[   24.392875]  [] ovs_flow_cmd_new+0x5b/0x360 
> [openvswitch]
> <4>[   24.394004]  [] ? __alloc_pages_nodemask+0x16c/0xaf0
> <4>[   24.394973]  [] ? __alloc_skb+0x87/0x2a0
> <4>[   24.395926]  [] ? nla_parse+0x90/0x110
> <4>[   24.476276]  [] genl_family_rcv_msg+0x373/0x3d0
> <4>[   24.477704]  [] ? 
> __kmalloc_node_track_caller+0x6c/0x220
> <4>[   24.478859]  [] genl_rcv_msg+0x44/0x80
> <4>[   24.479987]  [] ? genl_family_rcv_msg+0x3d0/0x3d0
> <4>[   24.481325]  [] netlink_rcv_skb+0xb9/0xe0
> <4>[   24.482466]  [] genl_rcv+0x2c/0x40
> <4>[   24.483554]  [] netlink_unicast+0x12b/0x1c0
> <4>[   24.484739]  [] netlink_sendmsg+0x392/0x6d0
> <4>[   24.485942]  [] sock_sendmsg+0xaf/0xc0
> <4>[   24.486953]  [] ? netlink_sendmsg+0x202/0x6d0
> <4>[   24.487969]  [] ___sys_sendmsg.part.19+0x322/0x330
> <4>[   24.489167]  [] ? SYSC_sendto+0xf9/0x130
> <4>[   24.490217]  [] ___sys_sendmsg+0x4a/0x70
> <4>[   24.491162]  [] __sys_sendmsg+0x49/0x90
> <4>[   24.492082]  [] SyS_sendmsg+0x19/0x20
> <4>[   24.493181]  [] system_call_fastpath+0x12/0x17
> <4>[   24.494124] Code: 40 e9 ea fe ff ff 90 e8 6b 69 ff ff 49 89 c4 e9 07 fe 
> ff ff 4c 89 f7 ff d0 e9 26 ff ff ff 49 c7 04 06 00 00 00 00 e9 3c ff ff ff 
> <0f> 0b ba 00 10 00 00 be 5a 00 00 00 4c 89 ef 48 d3 e2 e8 65 2a
> <1>[   24.496071] RIP  [] new_slab+0x2d4/0x380
> <4>[   24.497152]  RSP 
> <4>[   24.498945] ---[ end trace 6f97360ff4a9ee45 ]---
> ---
>  net/openvswitch/flow_table.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index f2ea83ba4763..c7f74aab34b9 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -93,7 +93,8 @@ struct sw_flow 

Re: [PATCHv2 7/7] openvswitch: Change CT_ATTR_FLAGS to CT_ATTR_COMMIT

2015-10-01 Thread Pravin Shelar
On Thu, Oct 1, 2015 at 1:53 PM, Joe Stringer  wrote:
> Previously, the CT_ATTR_FLAGS attribute, when nested under the
> OVS_ACTION_ATTR_CT, encoded a 32-bit bitmask of flags that modify the
> semantics of the ct action. It's more extensible to just represent each
> flag as a nested attribute, and this requires no additional error
> checking to reject flags that aren't currently supported.
>
> Suggested-by: Ben Pfaff 
> Signed-off-by: Joe Stringer 
> ---
> v2: Use bitmask for internal representation of COMMIT.
> ---
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2 3/7] openvswitch: Fix skb leak in ovs_fragment()

2015-10-01 Thread Pravin Shelar
On Thu, Oct 1, 2015 at 1:53 PM, Joe Stringer  wrote:
> If ovs_fragment() was unable to fragment the skb due to an L2 header
> that exceeds the supported length, skbs would be leaked. Fix the bug.
>
> Fixes: 7f8a436 "openvswitch: Add conntrack action"
> Signed-off-by: Joe Stringer 
> ---
> v2: Drop if condition, return in success case.
> ---
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv3] openvswitch: Rename LABEL->LABELS

2015-10-01 Thread Pravin Shelar
On Thu, Oct 1, 2015 at 3:00 PM, Joe Stringer  wrote:
> Conntrack LABELS (plural) are exposed by conntrack; rename the OVS name
> for these to be consistent with conntrack.
>
> Fixes: c2ac667 "openvswitch: Allow matching on conntrack label"
> Signed-off-by: Joe Stringer 
> ---
> v3: Fix build with !CONFIG_NF_CONNTRACK
> v2: Change ct_label struct names as well as constants.
> ---
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2 3/7] openvswitch: Fix skb leak in ovs_fragment()

2015-10-01 Thread Pravin Shelar
On Thu, Oct 1, 2015 at 1:53 PM, Joe Stringer  wrote:
> If ovs_fragment() was unable to fragment the skb due to an L2 header
> that exceeds the supported length, skbs would be leaked. Fix the bug.
>
> Fixes: 7f8a436 "openvswitch: Add conntrack action"
> Signed-off-by: Joe Stringer 
> ---
> v2: Drop if condition, return in success case.
> ---
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv3] openvswitch: Rename LABEL->LABELS

2015-10-01 Thread Pravin Shelar
On Thu, Oct 1, 2015 at 3:00 PM, Joe Stringer  wrote:
> Conntrack LABELS (plural) are exposed by conntrack; rename the OVS name
> for these to be consistent with conntrack.
>
> Fixes: c2ac667 "openvswitch: Allow matching on conntrack label"
> Signed-off-by: Joe Stringer 
> ---
> v3: Fix build with !CONFIG_NF_CONNTRACK
> v2: Change ct_label struct names as well as constants.
> ---
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2 7/7] openvswitch: Change CT_ATTR_FLAGS to CT_ATTR_COMMIT

2015-10-01 Thread Pravin Shelar
On Thu, Oct 1, 2015 at 1:53 PM, Joe Stringer  wrote:
> Previously, the CT_ATTR_FLAGS attribute, when nested under the
> OVS_ACTION_ATTR_CT, encoded a 32-bit bitmask of flags that modify the
> semantics of the ct action. It's more extensible to just represent each
> flag as a nested attribute, and this requires no additional error
> checking to reject flags that aren't currently supported.
>
> Suggested-by: Ben Pfaff 
> Signed-off-by: Joe Stringer 
> ---
> v2: Use bitmask for internal representation of COMMIT.
> ---
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 5/7] openvswitch: Reject ct_state unsupported bits

2015-09-30 Thread Pravin Shelar
On Wed, Sep 30, 2015 at 6:20 PM, Joe Stringer  wrote:
> On 30 September 2015 at 17:31, Pravin Shelar  wrote:
>> On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
>>> Previously, if userspace specified ct_state bits in the flow key which
>>> are currently undefined (and therefore unsupported), then they would be
>>> ignored. This could cause unexpected behaviour in future if userspace is
>>> extended to support additional bits but attempts to communicate with the
>>> current version of the kernel. This patch rectifies the situation by
>>> rejecting such ct_state bits.
>>>
>>> Fixes: 7f8a436 "openvswitch: Add conntrack action"
>>> Signed-off-by: Joe Stringer 
>>> ---
>>>  net/openvswitch/conntrack.h| 12 
>>>  net/openvswitch/flow_netlink.c |  6 ++
>>>  2 files changed, 18 insertions(+)
>>>
>>> diff --git a/net/openvswitch/conntrack.h b/net/openvswitch/conntrack.h
>>> index 43f5dd7..c658d95 100644
>>> --- a/net/openvswitch/conntrack.h
>>> +++ b/net/openvswitch/conntrack.h
>>> @@ -34,6 +34,13 @@ int ovs_ct_execute(struct net *, struct sk_buff *, 
>>> struct sw_flow_key *,
>>>  void ovs_ct_fill_key(const struct sk_buff *skb, struct sw_flow_key *key);
>>>  int ovs_ct_put_key(const struct sw_flow_key *key, struct sk_buff *skb);
>>>  void ovs_ct_free_action(const struct nlattr *a);
>>> +
>>> +static inline bool ovs_ct_state_supported(u8 state)
>>> +{
>>> +   return !(state & ~(OVS_CS_F_NEW | OVS_CS_F_ESTABLISHED |
>>> +OVS_CS_F_RELATED | OVS_CS_F_REPLY_DIR |
>>> +OVS_CS_F_INVALID | OVS_CS_F_TRACKED));
>>> +}
>>>  #else
>>>  #include 
>>>
>>> @@ -46,6 +53,11 @@ static inline bool ovs_ct_verify(struct net *net, int 
>>> attr)
>>> return false;
>>>  }
>>>
>>> +static inline bool ovs_ct_state_supported(u8 state)
>>> +{
>>> +   return false;
>>> +}
>>> +
>>>  static inline int ovs_ct_copy_action(struct net *net, const struct nlattr 
>>> *nla,
>>>  const struct sw_flow_key *key,
>>>  struct sw_flow_actions **acts, bool 
>>> log)
>>> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
>>> index ea82cd5..c4917c9 100644
>>> --- a/net/openvswitch/flow_netlink.c
>>> +++ b/net/openvswitch/flow_netlink.c
>>> @@ -816,6 +816,12 @@ static int metadata_from_nlattrs(struct net *net, 
>>> struct sw_flow_match *match,
>>> ovs_ct_verify(net, OVS_KEY_ATTR_CT_STATE)) {
>>> u8 ct_state = nla_get_u8(a[OVS_KEY_ATTR_CT_STATE]);
>>>
>> We also need to return error if kernel does not support the feature.
>
> Already handled by ovs_ct_verify() in conntrack.h.

Right. Patch looks good then.

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 4/7] openvswitch: Ensure flow is valid before executing ct

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> The ct action uses parts of the flow key, so we need to ensure that it
> is valid before executing that action.
>
> Fixes: 7f8a436 "openvswitch: Add conntrack action"
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 6/7] openvswitch: Extend ct_state match field to 32 bits

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> The ct_state field was initially added as an 8-bit field, however six of
> the bits are already being used and use cases are already starting to
> appear that may push the limits of this field. This patch extends the
> field to 32 bits while retaining the internal representation of 8 bits.
> This should cover forward compatibility of the ABI for the foreseeable
> future.
>
> This patch also reorders the OVS_CS_F_* bits to be sequential.
>
> Suggested-by: Jarno Rajahalme 
> Signed-off-by: Joe Stringer 

Looks good.
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 7/7] openvswitch: Change CT_ATTR_FLAGS to CT_ATTR_COMMIT

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> Previously, the CT_ATTR_FLAGS attribute, when nested under the
> OVS_ACTION_ATTR_CT, encoded a 32-bit bitmask of flags that modify the
> semantics of the ct action. It's more extensible to just represent each
> flag as a nested attribute, and this requires no additional error
> checking to reject flags that aren't currently supported.
>
> Suggested-by: Ben Pfaff 
> Signed-off-by: Joe Stringer 
> ---
...
...

> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 167cf43..effa78c 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -42,12 +42,18 @@ struct md_label {
> struct ovs_key_ct_label mask;
>  };
>
> +/* Flags for performing connection tracking.
> + *
> + * CT_F_COMMIT: Commits the flow to the conntrack table.
> + */
> +#define CT_F_COMMITBIT(0)
> +
>  /* Conntrack action context for execution. */
>  struct ovs_conntrack_info {
> struct nf_conntrack_helper *helper;
> struct nf_conntrack_zone zone;
> struct nf_conn *ct;
> -   u32 flags;
> +   u8 flags;   /* bitmask of CT_F_*. */

I think bit field has better readability in such use case.
Otherwise looks good.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 2/7] openvswitch: Fix typos in CT headers

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> These comments hadn't caught up to their implementations, fix them.
>
> Fixes: 7f8a436 "openvswitch: Add conntrack action"
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 5/7] openvswitch: Reject ct_state unsupported bits

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> Previously, if userspace specified ct_state bits in the flow key which
> are currently undefined (and therefore unsupported), then they would be
> ignored. This could cause unexpected behaviour in future if userspace is
> extended to support additional bits but attempts to communicate with the
> current version of the kernel. This patch rectifies the situation by
> rejecting such ct_state bits.
>
> Fixes: 7f8a436 "openvswitch: Add conntrack action"
> Signed-off-by: Joe Stringer 
> ---
>  net/openvswitch/conntrack.h| 12 
>  net/openvswitch/flow_netlink.c |  6 ++
>  2 files changed, 18 insertions(+)
>
> diff --git a/net/openvswitch/conntrack.h b/net/openvswitch/conntrack.h
> index 43f5dd7..c658d95 100644
> --- a/net/openvswitch/conntrack.h
> +++ b/net/openvswitch/conntrack.h
> @@ -34,6 +34,13 @@ int ovs_ct_execute(struct net *, struct sk_buff *, struct 
> sw_flow_key *,
>  void ovs_ct_fill_key(const struct sk_buff *skb, struct sw_flow_key *key);
>  int ovs_ct_put_key(const struct sw_flow_key *key, struct sk_buff *skb);
>  void ovs_ct_free_action(const struct nlattr *a);
> +
> +static inline bool ovs_ct_state_supported(u8 state)
> +{
> +   return !(state & ~(OVS_CS_F_NEW | OVS_CS_F_ESTABLISHED |
> +OVS_CS_F_RELATED | OVS_CS_F_REPLY_DIR |
> +OVS_CS_F_INVALID | OVS_CS_F_TRACKED));
> +}
>  #else
>  #include 
>
> @@ -46,6 +53,11 @@ static inline bool ovs_ct_verify(struct net *net, int attr)
> return false;
>  }
>
> +static inline bool ovs_ct_state_supported(u8 state)
> +{
> +   return false;
> +}
> +
>  static inline int ovs_ct_copy_action(struct net *net, const struct nlattr 
> *nla,
>  const struct sw_flow_key *key,
>  struct sw_flow_actions **acts, bool log)
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index ea82cd5..c4917c9 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -816,6 +816,12 @@ static int metadata_from_nlattrs(struct net *net, struct 
> sw_flow_match *match,
> ovs_ct_verify(net, OVS_KEY_ATTR_CT_STATE)) {
> u8 ct_state = nla_get_u8(a[OVS_KEY_ATTR_CT_STATE]);
>
We also need to return error if kernel does not support the feature.

> +   if (!is_mask && !ovs_ct_state_supported(ct_state)) {
> +   OVS_NLERR(log, "ct_state flags %02x unsupported",
> + ct_state);
> +   return -EINVAL;
> +   }
> +
> SW_FLOW_KEY_PUT(match, ct.state, ct_state, is_mask);
> *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_STATE);
> }
> --
> 2.1.4
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 1/7] openvswitch: Make LABELS name more consistent

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> Conntrack LABELS (plural) are exposed by conntrack; rename the OVS name
> for these to be consistent with conntrack.
>
> Fixes: c2ac667 "openvswitch: Allow matching on conntrack label"
> Signed-off-by: Joe Stringer 
> ---
>  include/uapi/linux/openvswitch.h |  4 ++--
>  net/openvswitch/actions.c|  2 +-
>  net/openvswitch/conntrack.c  | 10 +-
>  net/openvswitch/flow_netlink.c   | 14 +++---
>  4 files changed, 15 insertions(+), 15 deletions(-)
>
...

> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 002a755..8c5d482c 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -179,7 +179,7 @@ int ovs_ct_put_key(const struct sw_flow_key *key, struct 
> sk_buff *skb)
> return -EMSGSIZE;
>
> if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
> -   nla_put(skb, OVS_KEY_ATTR_CT_LABEL, sizeof(key->ct.label),
> +   nla_put(skb, OVS_KEY_ATTR_CT_LABELS, sizeof(key->ct.label),
> >ct.label))
> return -EMSGSIZE;
>
> @@ -545,7 +545,7 @@ static const struct ovs_ct_len_tbl 
> ovs_ct_attr_lens[OVS_CT_ATTR_MAX + 1] = {
> .maxlen = sizeof(u16) },
> [OVS_CT_ATTR_MARK]  = { .minlen = sizeof(struct md_mark),
> .maxlen = sizeof(struct md_mark) },
> -   [OVS_CT_ATTR_LABEL] = { .minlen = sizeof(struct md_label),
> +   [OVS_CT_ATTR_LABELS]= { .minlen = sizeof(struct md_label),
> .maxlen = sizeof(struct md_label) },
> [OVS_CT_ATTR_HELPER]= { .minlen = 1,
> .maxlen = NF_CT_HELPER_NAME_LEN }
> @@ -593,7 +593,7 @@ static int parse_ct(const struct nlattr *attr, struct 
> ovs_conntrack_info *info,
> }
>  #endif
>  #ifdef CONFIG_NF_CONNTRACK_LABELS
> -   case OVS_CT_ATTR_LABEL: {
> +   case OVS_CT_ATTR_LABELS: {
> struct md_label *label = nla_data(a);
>
> info->label = *label;
> @@ -633,7 +633,7 @@ bool ovs_ct_verify(struct net *net, enum ovs_key_attr 
> attr)
> attr == OVS_KEY_ATTR_CT_MARK)
> return true;
> if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
> -   attr == OVS_KEY_ATTR_CT_LABEL) {
> +   attr == OVS_KEY_ATTR_CT_LABELS) {
> struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
>
> return ovs_net->xt_label;
> @@ -711,7 +711,7 @@ int ovs_ct_action_to_attr(const struct ovs_conntrack_info 
> *ct_info,
> _info->mark))
> return -EMSGSIZE;
> if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
> -   nla_put(skb, OVS_CT_ATTR_LABEL, sizeof(ct_info->label),
> +   nla_put(skb, OVS_CT_ATTR_LABELS, sizeof(ct_info->label),
> _info->label))
> return -EMSGSIZE;
> if (ct_info->helper) {
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index 5c030a4..ea82cd5 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -294,7 +294,7 @@ size_t ovs_key_attr_size(void)
> + nla_total_size(1)   /* OVS_KEY_ATTR_CT_STATE */
> + nla_total_size(2)   /* OVS_KEY_ATTR_CT_ZONE */
> + nla_total_size(4)   /* OVS_KEY_ATTR_CT_MARK */
> -   + nla_total_size(16)  /* OVS_KEY_ATTR_CT_LABEL */
> +   + nla_total_size(16)  /* OVS_KEY_ATTR_CT_LABELS */
> + nla_total_size(12)  /* OVS_KEY_ATTR_ETHERNET */
> + nla_total_size(2)   /* OVS_KEY_ATTR_ETHERTYPE */
> + nla_total_size(4)   /* OVS_KEY_ATTR_VLAN */
> @@ -352,7 +352,7 @@ static const struct ovs_len_tbl 
> ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = {
> [OVS_KEY_ATTR_CT_STATE]  = { .len = sizeof(u8) },
> [OVS_KEY_ATTR_CT_ZONE]   = { .len = sizeof(u16) },
> [OVS_KEY_ATTR_CT_MARK]   = { .len = sizeof(u32) },
> -   [OVS_KEY_ATTR_CT_LABEL]  = { .len = sizeof(struct ovs_key_ct_label) },
> +   [OVS_KEY_ATTR_CT_LABELS] = { .len = sizeof(struct ovs_key_ct_label) },
>  };
After this change, struct ovs_key_ct_label and other reference to
label also should be renamed to labels. is there reason you did only
this one?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression] freezing usbip list by commit 6ae459bdaaeebc632

2015-09-30 Thread Pravin Shelar
On Wed, Sep 30, 2015 at 3:18 AM, Igor Kotrasinski
 wrote:
> Commit 6ae459bdaaeebc632 (skbuff: Fix skb checksum flag on skb pull)
> introduces a regression when using usbip userspace tools.
> Running usbipd and attempting to list remote devices on localhost causes
> usbip to freeze. Stopping usbip then causes usbipd to exit.
>
> Steps to reproduce:
> 1. Compile usbip tools and drivers.
> 2. Run:
> usbipd &
> usbip list -r localhost

This should be already fixed on latest net branch by commit
31b33dfb0a1444 ("skbuff: Fix skb checksum partial check"). Can you try
it?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression] freezing usbip list by commit 6ae459bdaaeebc632

2015-09-30 Thread Pravin Shelar
On Wed, Sep 30, 2015 at 3:18 AM, Igor Kotrasinski
 wrote:
> Commit 6ae459bdaaeebc632 (skbuff: Fix skb checksum flag on skb pull)
> introduces a regression when using usbip userspace tools.
> Running usbipd and attempting to list remote devices on localhost causes
> usbip to freeze. Stopping usbip then causes usbipd to exit.
>
> Steps to reproduce:
> 1. Compile usbip tools and drivers.
> 2. Run:
> usbipd &
> usbip list -r localhost

This should be already fixed on latest net branch by commit
31b33dfb0a1444 ("skbuff: Fix skb checksum partial check"). Can you try
it?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 1/7] openvswitch: Make LABELS name more consistent

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> Conntrack LABELS (plural) are exposed by conntrack; rename the OVS name
> for these to be consistent with conntrack.
>
> Fixes: c2ac667 "openvswitch: Allow matching on conntrack label"
> Signed-off-by: Joe Stringer 
> ---
>  include/uapi/linux/openvswitch.h |  4 ++--
>  net/openvswitch/actions.c|  2 +-
>  net/openvswitch/conntrack.c  | 10 +-
>  net/openvswitch/flow_netlink.c   | 14 +++---
>  4 files changed, 15 insertions(+), 15 deletions(-)
>
...

> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 002a755..8c5d482c 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -179,7 +179,7 @@ int ovs_ct_put_key(const struct sw_flow_key *key, struct 
> sk_buff *skb)
> return -EMSGSIZE;
>
> if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
> -   nla_put(skb, OVS_KEY_ATTR_CT_LABEL, sizeof(key->ct.label),
> +   nla_put(skb, OVS_KEY_ATTR_CT_LABELS, sizeof(key->ct.label),
> >ct.label))
> return -EMSGSIZE;
>
> @@ -545,7 +545,7 @@ static const struct ovs_ct_len_tbl 
> ovs_ct_attr_lens[OVS_CT_ATTR_MAX + 1] = {
> .maxlen = sizeof(u16) },
> [OVS_CT_ATTR_MARK]  = { .minlen = sizeof(struct md_mark),
> .maxlen = sizeof(struct md_mark) },
> -   [OVS_CT_ATTR_LABEL] = { .minlen = sizeof(struct md_label),
> +   [OVS_CT_ATTR_LABELS]= { .minlen = sizeof(struct md_label),
> .maxlen = sizeof(struct md_label) },
> [OVS_CT_ATTR_HELPER]= { .minlen = 1,
> .maxlen = NF_CT_HELPER_NAME_LEN }
> @@ -593,7 +593,7 @@ static int parse_ct(const struct nlattr *attr, struct 
> ovs_conntrack_info *info,
> }
>  #endif
>  #ifdef CONFIG_NF_CONNTRACK_LABELS
> -   case OVS_CT_ATTR_LABEL: {
> +   case OVS_CT_ATTR_LABELS: {
> struct md_label *label = nla_data(a);
>
> info->label = *label;
> @@ -633,7 +633,7 @@ bool ovs_ct_verify(struct net *net, enum ovs_key_attr 
> attr)
> attr == OVS_KEY_ATTR_CT_MARK)
> return true;
> if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
> -   attr == OVS_KEY_ATTR_CT_LABEL) {
> +   attr == OVS_KEY_ATTR_CT_LABELS) {
> struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
>
> return ovs_net->xt_label;
> @@ -711,7 +711,7 @@ int ovs_ct_action_to_attr(const struct ovs_conntrack_info 
> *ct_info,
> _info->mark))
> return -EMSGSIZE;
> if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) &&
> -   nla_put(skb, OVS_CT_ATTR_LABEL, sizeof(ct_info->label),
> +   nla_put(skb, OVS_CT_ATTR_LABELS, sizeof(ct_info->label),
> _info->label))
> return -EMSGSIZE;
> if (ct_info->helper) {
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index 5c030a4..ea82cd5 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -294,7 +294,7 @@ size_t ovs_key_attr_size(void)
> + nla_total_size(1)   /* OVS_KEY_ATTR_CT_STATE */
> + nla_total_size(2)   /* OVS_KEY_ATTR_CT_ZONE */
> + nla_total_size(4)   /* OVS_KEY_ATTR_CT_MARK */
> -   + nla_total_size(16)  /* OVS_KEY_ATTR_CT_LABEL */
> +   + nla_total_size(16)  /* OVS_KEY_ATTR_CT_LABELS */
> + nla_total_size(12)  /* OVS_KEY_ATTR_ETHERNET */
> + nla_total_size(2)   /* OVS_KEY_ATTR_ETHERTYPE */
> + nla_total_size(4)   /* OVS_KEY_ATTR_VLAN */
> @@ -352,7 +352,7 @@ static const struct ovs_len_tbl 
> ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = {
> [OVS_KEY_ATTR_CT_STATE]  = { .len = sizeof(u8) },
> [OVS_KEY_ATTR_CT_ZONE]   = { .len = sizeof(u16) },
> [OVS_KEY_ATTR_CT_MARK]   = { .len = sizeof(u32) },
> -   [OVS_KEY_ATTR_CT_LABEL]  = { .len = sizeof(struct ovs_key_ct_label) },
> +   [OVS_KEY_ATTR_CT_LABELS] = { .len = sizeof(struct ovs_key_ct_label) },
>  };
After this change, struct ovs_key_ct_label and other reference to
label also should be renamed to labels. is there reason you did only
this one?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 2/7] openvswitch: Fix typos in CT headers

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> These comments hadn't caught up to their implementations, fix them.
>
> Fixes: 7f8a436 "openvswitch: Add conntrack action"
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 5/7] openvswitch: Reject ct_state unsupported bits

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> Previously, if userspace specified ct_state bits in the flow key which
> are currently undefined (and therefore unsupported), then they would be
> ignored. This could cause unexpected behaviour in future if userspace is
> extended to support additional bits but attempts to communicate with the
> current version of the kernel. This patch rectifies the situation by
> rejecting such ct_state bits.
>
> Fixes: 7f8a436 "openvswitch: Add conntrack action"
> Signed-off-by: Joe Stringer 
> ---
>  net/openvswitch/conntrack.h| 12 
>  net/openvswitch/flow_netlink.c |  6 ++
>  2 files changed, 18 insertions(+)
>
> diff --git a/net/openvswitch/conntrack.h b/net/openvswitch/conntrack.h
> index 43f5dd7..c658d95 100644
> --- a/net/openvswitch/conntrack.h
> +++ b/net/openvswitch/conntrack.h
> @@ -34,6 +34,13 @@ int ovs_ct_execute(struct net *, struct sk_buff *, struct 
> sw_flow_key *,
>  void ovs_ct_fill_key(const struct sk_buff *skb, struct sw_flow_key *key);
>  int ovs_ct_put_key(const struct sw_flow_key *key, struct sk_buff *skb);
>  void ovs_ct_free_action(const struct nlattr *a);
> +
> +static inline bool ovs_ct_state_supported(u8 state)
> +{
> +   return !(state & ~(OVS_CS_F_NEW | OVS_CS_F_ESTABLISHED |
> +OVS_CS_F_RELATED | OVS_CS_F_REPLY_DIR |
> +OVS_CS_F_INVALID | OVS_CS_F_TRACKED));
> +}
>  #else
>  #include 
>
> @@ -46,6 +53,11 @@ static inline bool ovs_ct_verify(struct net *net, int attr)
> return false;
>  }
>
> +static inline bool ovs_ct_state_supported(u8 state)
> +{
> +   return false;
> +}
> +
>  static inline int ovs_ct_copy_action(struct net *net, const struct nlattr 
> *nla,
>  const struct sw_flow_key *key,
>  struct sw_flow_actions **acts, bool log)
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index ea82cd5..c4917c9 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -816,6 +816,12 @@ static int metadata_from_nlattrs(struct net *net, struct 
> sw_flow_match *match,
> ovs_ct_verify(net, OVS_KEY_ATTR_CT_STATE)) {
> u8 ct_state = nla_get_u8(a[OVS_KEY_ATTR_CT_STATE]);
>
We also need to return error if kernel does not support the feature.

> +   if (!is_mask && !ovs_ct_state_supported(ct_state)) {
> +   OVS_NLERR(log, "ct_state flags %02x unsupported",
> + ct_state);
> +   return -EINVAL;
> +   }
> +
> SW_FLOW_KEY_PUT(match, ct.state, ct_state, is_mask);
> *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_STATE);
> }
> --
> 2.1.4
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 4/7] openvswitch: Ensure flow is valid before executing ct

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> The ct action uses parts of the flow key, so we need to ensure that it
> is valid before executing that action.
>
> Fixes: 7f8a436 "openvswitch: Add conntrack action"
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 6/7] openvswitch: Extend ct_state match field to 32 bits

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> The ct_state field was initially added as an 8-bit field, however six of
> the bits are already being used and use cases are already starting to
> appear that may push the limits of this field. This patch extends the
> field to 32 bits while retaining the internal representation of 8 bits.
> This should cover forward compatibility of the ABI for the foreseeable
> future.
>
> This patch also reorders the OVS_CS_F_* bits to be sequential.
>
> Suggested-by: Jarno Rajahalme 
> Signed-off-by: Joe Stringer 

Looks good.
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 7/7] openvswitch: Change CT_ATTR_FLAGS to CT_ATTR_COMMIT

2015-09-30 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer  wrote:
> Previously, the CT_ATTR_FLAGS attribute, when nested under the
> OVS_ACTION_ATTR_CT, encoded a 32-bit bitmask of flags that modify the
> semantics of the ct action. It's more extensible to just represent each
> flag as a nested attribute, and this requires no additional error
> checking to reject flags that aren't currently supported.
>
> Suggested-by: Ben Pfaff 
> Signed-off-by: Joe Stringer 
> ---
...
...

> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 167cf43..effa78c 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -42,12 +42,18 @@ struct md_label {
> struct ovs_key_ct_label mask;
>  };
>
> +/* Flags for performing connection tracking.
> + *
> + * CT_F_COMMIT: Commits the flow to the conntrack table.
> + */
> +#define CT_F_COMMITBIT(0)
> +
>  /* Conntrack action context for execution. */
>  struct ovs_conntrack_info {
> struct nf_conntrack_helper *helper;
> struct nf_conntrack_zone zone;
> struct nf_conn *ct;
> -   u32 flags;
> +   u8 flags;   /* bitmask of CT_F_*. */

I think bit field has better readability in such use case.
Otherwise looks good.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net 5/7] openvswitch: Reject ct_state unsupported bits

2015-09-30 Thread Pravin Shelar
On Wed, Sep 30, 2015 at 6:20 PM, Joe Stringer <joestrin...@nicira.com> wrote:
> On 30 September 2015 at 17:31, Pravin Shelar <pshe...@nicira.com> wrote:
>> On Tue, Sep 29, 2015 at 3:39 PM, Joe Stringer <joestrin...@nicira.com> wrote:
>>> Previously, if userspace specified ct_state bits in the flow key which
>>> are currently undefined (and therefore unsupported), then they would be
>>> ignored. This could cause unexpected behaviour in future if userspace is
>>> extended to support additional bits but attempts to communicate with the
>>> current version of the kernel. This patch rectifies the situation by
>>> rejecting such ct_state bits.
>>>
>>> Fixes: 7f8a436 "openvswitch: Add conntrack action"
>>> Signed-off-by: Joe Stringer <joestrin...@nicira.com>
>>> ---
>>>  net/openvswitch/conntrack.h| 12 
>>>  net/openvswitch/flow_netlink.c |  6 ++
>>>  2 files changed, 18 insertions(+)
>>>
>>> diff --git a/net/openvswitch/conntrack.h b/net/openvswitch/conntrack.h
>>> index 43f5dd7..c658d95 100644
>>> --- a/net/openvswitch/conntrack.h
>>> +++ b/net/openvswitch/conntrack.h
>>> @@ -34,6 +34,13 @@ int ovs_ct_execute(struct net *, struct sk_buff *, 
>>> struct sw_flow_key *,
>>>  void ovs_ct_fill_key(const struct sk_buff *skb, struct sw_flow_key *key);
>>>  int ovs_ct_put_key(const struct sw_flow_key *key, struct sk_buff *skb);
>>>  void ovs_ct_free_action(const struct nlattr *a);
>>> +
>>> +static inline bool ovs_ct_state_supported(u8 state)
>>> +{
>>> +   return !(state & ~(OVS_CS_F_NEW | OVS_CS_F_ESTABLISHED |
>>> +OVS_CS_F_RELATED | OVS_CS_F_REPLY_DIR |
>>> +OVS_CS_F_INVALID | OVS_CS_F_TRACKED));
>>> +}
>>>  #else
>>>  #include 
>>>
>>> @@ -46,6 +53,11 @@ static inline bool ovs_ct_verify(struct net *net, int 
>>> attr)
>>> return false;
>>>  }
>>>
>>> +static inline bool ovs_ct_state_supported(u8 state)
>>> +{
>>> +   return false;
>>> +}
>>> +
>>>  static inline int ovs_ct_copy_action(struct net *net, const struct nlattr 
>>> *nla,
>>>  const struct sw_flow_key *key,
>>>  struct sw_flow_actions **acts, bool 
>>> log)
>>> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
>>> index ea82cd5..c4917c9 100644
>>> --- a/net/openvswitch/flow_netlink.c
>>> +++ b/net/openvswitch/flow_netlink.c
>>> @@ -816,6 +816,12 @@ static int metadata_from_nlattrs(struct net *net, 
>>> struct sw_flow_match *match,
>>> ovs_ct_verify(net, OVS_KEY_ATTR_CT_STATE)) {
>>> u8 ct_state = nla_get_u8(a[OVS_KEY_ATTR_CT_STATE]);
>>>
>> We also need to return error if kernel does not support the feature.
>
> Already handled by ovs_ct_verify() in conntrack.h.

Right. Patch looks good then.

Acked-by: Pravin B Shelar <pshe...@nicira.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 4.3-rc3 Regression: NFS access stall by commit 6ae459bdaaee

2015-09-29 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:33 AM, Takashi Iwai  wrote:
> On Tue, 29 Sep 2015 02:35:04 +0200,
> Pravin Shelar wrote:
>>
>> On Mon, Sep 28, 2015 at 6:12 AM, Takashi Iwai  wrote:
>> > [I resent this since the previous mail didn't go out properly, as it
>> >  seems; apologies if you already read it, please disregard]
>> >
>> > Hi,
>> >
>> > I noticed that NFS access from my workstation slowed down drastically,
>> > almost stalls, with the fresh 4.3-rc3.  There are no particular kernel
>> > errors / warnings.
>> >
>> > Then I performed git section, and it leaded to the commit:
>> > 6ae459bdaaeebc632b16e54dcbabb490c6931d61
>> > skbuff: Fix skb checksum flag on skb pull
>> >
>> > Reverting this commit from 4.3-rc3 fixed the issue indeed.
>> >
>> > Could you take a look at this?  I added Trond to Cc in case he might
>> > already know of it.
>> >
>> I send out fix for similar issue. Can you try the posted patch.
>> https://patchwork.ozlabs.org/patch/523632/
>
> Yes, the patch fixes the problem, thanks.  Feel free to take my
> tested-by tag:
>   Tested-by: Takashi Iwai 
>
Thanks for testing the patch.

> But I guess the real fix is only the first chunk and the latter is
> nothing but a cleanup?  If so, it'd be better to split it.
>

Both chunks are required for the fix.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 4.3-rc3 Regression: NFS access stall by commit 6ae459bdaaee

2015-09-29 Thread Pravin Shelar
On Tue, Sep 29, 2015 at 3:33 AM, Takashi Iwai <ti...@suse.de> wrote:
> On Tue, 29 Sep 2015 02:35:04 +0200,
> Pravin Shelar wrote:
>>
>> On Mon, Sep 28, 2015 at 6:12 AM, Takashi Iwai <ti...@suse.de> wrote:
>> > [I resent this since the previous mail didn't go out properly, as it
>> >  seems; apologies if you already read it, please disregard]
>> >
>> > Hi,
>> >
>> > I noticed that NFS access from my workstation slowed down drastically,
>> > almost stalls, with the fresh 4.3-rc3.  There are no particular kernel
>> > errors / warnings.
>> >
>> > Then I performed git section, and it leaded to the commit:
>> > 6ae459bdaaeebc632b16e54dcbabb490c6931d61
>> > skbuff: Fix skb checksum flag on skb pull
>> >
>> > Reverting this commit from 4.3-rc3 fixed the issue indeed.
>> >
>> > Could you take a look at this?  I added Trond to Cc in case he might
>> > already know of it.
>> >
>> I send out fix for similar issue. Can you try the posted patch.
>> https://patchwork.ozlabs.org/patch/523632/
>
> Yes, the patch fixes the problem, thanks.  Feel free to take my
> tested-by tag:
>   Tested-by: Takashi Iwai <ti...@suse.de>
>
Thanks for testing the patch.

> But I guess the real fix is only the first chunk and the latter is
> nothing but a cleanup?  If so, it'd be better to split it.
>

Both chunks are required for the fix.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 4.3-rc3 Regression: NFS access stall by commit 6ae459bdaaee

2015-09-28 Thread Pravin Shelar
On Mon, Sep 28, 2015 at 6:12 AM, Takashi Iwai  wrote:
> [I resent this since the previous mail didn't go out properly, as it
>  seems; apologies if you already read it, please disregard]
>
> Hi,
>
> I noticed that NFS access from my workstation slowed down drastically,
> almost stalls, with the fresh 4.3-rc3.  There are no particular kernel
> errors / warnings.
>
> Then I performed git section, and it leaded to the commit:
> 6ae459bdaaeebc632b16e54dcbabb490c6931d61
> skbuff: Fix skb checksum flag on skb pull
>
> Reverting this commit from 4.3-rc3 fixed the issue indeed.
>
> Could you take a look at this?  I added Trond to Cc in case he might
> already know of it.
>
I send out fix for similar issue. Can you try the posted patch.
https://patchwork.ozlabs.org/patch/523632/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 4.3-rc3 Regression: NFS access stall by commit 6ae459bdaaee

2015-09-28 Thread Pravin Shelar
On Mon, Sep 28, 2015 at 6:12 AM, Takashi Iwai  wrote:
> [I resent this since the previous mail didn't go out properly, as it
>  seems; apologies if you already read it, please disregard]
>
> Hi,
>
> I noticed that NFS access from my workstation slowed down drastically,
> almost stalls, with the fresh 4.3-rc3.  There are no particular kernel
> errors / warnings.
>

I have seen error reports related to IPv6 traffic which I am
debugging. Are you trying access NFS over IPv6?


> Then I performed git section, and it leaded to the commit:
> 6ae459bdaaeebc632b16e54dcbabb490c6931d61
> skbuff: Fix skb checksum flag on skb pull
>
> Reverting this commit from 4.3-rc3 fixed the issue indeed.
>
> Could you take a look at this?  I added Trond to Cc in case he might
> already know of it.
>
>
> thanks,
>
> Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 4.3-rc3 Regression: NFS access stall by commit 6ae459bdaaee

2015-09-28 Thread Pravin Shelar
On Mon, Sep 28, 2015 at 6:12 AM, Takashi Iwai  wrote:
> [I resent this since the previous mail didn't go out properly, as it
>  seems; apologies if you already read it, please disregard]
>
> Hi,
>
> I noticed that NFS access from my workstation slowed down drastically,
> almost stalls, with the fresh 4.3-rc3.  There are no particular kernel
> errors / warnings.
>

I have seen error reports related to IPv6 traffic which I am
debugging. Are you trying access NFS over IPv6?


> Then I performed git section, and it leaded to the commit:
> 6ae459bdaaeebc632b16e54dcbabb490c6931d61
> skbuff: Fix skb checksum flag on skb pull
>
> Reverting this commit from 4.3-rc3 fixed the issue indeed.
>
> Could you take a look at this?  I added Trond to Cc in case he might
> already know of it.
>
>
> thanks,
>
> Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 4.3-rc3 Regression: NFS access stall by commit 6ae459bdaaee

2015-09-28 Thread Pravin Shelar
On Mon, Sep 28, 2015 at 6:12 AM, Takashi Iwai  wrote:
> [I resent this since the previous mail didn't go out properly, as it
>  seems; apologies if you already read it, please disregard]
>
> Hi,
>
> I noticed that NFS access from my workstation slowed down drastically,
> almost stalls, with the fresh 4.3-rc3.  There are no particular kernel
> errors / warnings.
>
> Then I performed git section, and it leaded to the commit:
> 6ae459bdaaeebc632b16e54dcbabb490c6931d61
> skbuff: Fix skb checksum flag on skb pull
>
> Reverting this commit from 4.3-rc3 fixed the issue indeed.
>
> Could you take a look at this?  I added Trond to Cc in case he might
> already know of it.
>
I send out fix for similar issue. Can you try the posted patch.
https://patchwork.ozlabs.org/patch/523632/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 08/38] openvswitch: fix handling result of ipv6_skip_exthdr

2015-09-21 Thread Pravin Shelar
On Mon, Sep 21, 2015 at 6:33 AM, Andrzej Hajda  wrote:
> The function can return negative value.
>
> The problem has been detected using proposed semantic patch
> scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci [1].
>
> [1]: http://permalink.gmane.org/gmane.linux.kernel/2038576
>
> Signed-off-by: Andrzej Hajda 
> ---
>  net/openvswitch/conntrack.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 002a755..fde3391 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -253,7 +253,7 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
> const struct nf_conntrack_helper *helper;
> const struct nf_conn_help *help;
> enum ip_conntrack_info ctinfo;
> -   unsigned int protoff;
> +   int protoff;
> struct nf_conn *ct;
>

A patch is already pushed to the net tree to fix this issue.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 08/38] openvswitch: fix handling result of ipv6_skip_exthdr

2015-09-21 Thread Pravin Shelar
On Mon, Sep 21, 2015 at 6:33 AM, Andrzej Hajda  wrote:
> The function can return negative value.
>
> The problem has been detected using proposed semantic patch
> scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci [1].
>
> [1]: http://permalink.gmane.org/gmane.linux.kernel/2038576
>
> Signed-off-by: Andrzej Hajda 
> ---
>  net/openvswitch/conntrack.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 002a755..fde3391 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -253,7 +253,7 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto)
> const struct nf_conntrack_helper *helper;
> const struct nf_conn_help *help;
> enum ip_conntrack_info ctinfo;
> -   unsigned int protoff;
> +   int protoff;
> struct nf_conn *ct;
>

A patch is already pushed to the net tree to fix this issue.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net] openvswitch: Fix IPv6 exthdr handling with ct helpers.

2015-09-14 Thread Pravin Shelar
On Mon, Sep 14, 2015 at 11:14 AM, Joe Stringer  wrote:
> Static code analysis reveals the following bug:
>
> net/openvswitch/conntrack.c:281 ovs_ct_helper()
> warn: unsigned 'protoff' is never less than zero.
>
> This signedness bug breaks error handling for IPv6 extension headers when
> using conntrack helpers. Fix the error by using a local signed variable.
>
> Fixes:  cae3a2627520: "openvswitch: Allow attaching helpers to ct
> action"
> Reported-by: Dan Carpenter 
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net] openvswitch: Fix IPv6 exthdr handling with ct helpers.

2015-09-14 Thread Pravin Shelar
On Mon, Sep 14, 2015 at 11:14 AM, Joe Stringer  wrote:
> Static code analysis reveals the following bug:
>
> net/openvswitch/conntrack.c:281 ovs_ct_helper()
> warn: unsigned 'protoff' is never less than zero.
>
> This signedness bug breaks error handling for IPv6 extension headers when
> using conntrack helpers. Fix the error by using a local signed variable.
>
> Fixes:  cae3a2627520: "openvswitch: Allow attaching helpers to ct
> action"
> Reported-by: Dan Carpenter 
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net] openvswitch: Fix dependency on IPv6 defrag.

2015-09-11 Thread Pravin Shelar
On Fri, Sep 11, 2015 at 3:01 PM, Joe Stringer  wrote:
> When NF_CONNTRACK is built-in, NF_DEFRAG_IPV6 is a module, and
> OPENVSWITCH is built-in, the following build error would occur:
>
> net/built-in.o: In function `ovs_ct_execute':
> (.text+0x10f587): undefined reference to `nf_ct_frag6_gather'
>
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Reported-by: Jim Davis 
> Signed-off-by: Joe Stringer 

Looks good
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net] openvswitch: Fix dependency on IPv6 defrag.

2015-09-11 Thread Pravin Shelar
On Fri, Sep 11, 2015 at 3:01 PM, Joe Stringer  wrote:
> When NF_CONNTRACK is built-in, NF_DEFRAG_IPV6 is a module, and
> OPENVSWITCH is built-in, the following build error would occur:
>
> net/built-in.o: In function `ovs_ct_execute':
> (.text+0x10f587): undefined reference to `nf_ct_frag6_gather'
>
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Reported-by: Jim Davis 
> Signed-off-by: Joe Stringer 

Looks good
Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net] openvswitch: Remove conntrack Kconfig option.

2015-09-04 Thread Pravin Shelar
On Fri, Sep 4, 2015 at 1:07 PM, Joe Stringer  wrote:
> There's no particular desire to have conntrack action support in Open
> vSwitch as an independently configurable bit, rather just to ensure
> there is not a hard dependency. This exposed option doesn't accurately
> reflect the conntrack dependency when enabled, so simplify this by
> removing the option. Compile the support if NF_CONNTRACK is enabled.
>
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net] openvswitch: Remove conntrack Kconfig option.

2015-09-04 Thread Pravin Shelar
On Fri, Sep 4, 2015 at 1:07 PM, Joe Stringer  wrote:
> There's no particular desire to have conntrack action support in Open
> vSwitch as an independently configurable bit, rather just to ensure
> there is not a hard dependency. This exposed option doesn't accurately
> reflect the conntrack dependency when enabled, so simplify this by
> removing the option. Compile the support if NF_CONNTRACK is enabled.
>
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv6 net-next 05/10] openvswitch: Add conntrack action

2015-08-26 Thread Pravin Shelar
On Wed, Aug 26, 2015 at 11:31 AM, Joe Stringer  wrote:
> Expose the kernel connection tracker via OVS. Userspace components can
> make use of the CT action to populate the connection state (ct_state)
> field for a flow. This state can be subsequently matched.
>
> Exposed connection states are OVS_CS_F_*:
> - NEW (0x01) - Beginning of a new connection.
> - ESTABLISHED (0x02) - Part of an existing connection.
> - RELATED (0x04) - Related to an established connection.
> - INVALID (0x20) - Could not track the connection for this packet.
> - REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
> - TRACKED (0x80) - This packet has been sent through conntrack.
>
> When the CT action is executed by itself, it will send the packet
> through the connection tracker and populate the ct_state field with one
> or more of the connection state flags above. The CT action will always
> set the TRACKED bit.
>
> When the COMMIT flag is passed to the conntrack action, this specifies
> that information about the connection should be stored. This allows
> subsequent packets for the same (or related) connections to be
> correlated with this connection. Sending subsequent packets for the
> connection through conntrack allows the connection tracker to consider
> the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.
>
> The CT action may optionally take a zone to track the flow within. This
> allows connections with the same 5-tuple to be kept logically separate
> from connections in other zones. If the zone is specified, then the
> "ct_zone" match field will be subsequently populated with the zone id.
>
> IP fragments are handled by transparently assembling them as part of the
> CT action. The maximum received unit (MRU) size is tracked so that
> refragmentation can occur during output.
>
> IP frag handling contributed by Andy Zhou.
>
> Signed-off-by: Joe Stringer 
> Signed-off-by: Justin Pettit 
> Signed-off-by: Andy Zhou 
> Acked-by: Thomas Graf 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv6 net-next 05/10] openvswitch: Add conntrack action

2015-08-26 Thread Pravin Shelar
On Wed, Aug 26, 2015 at 11:31 AM, Joe Stringer joestrin...@nicira.com wrote:
 Expose the kernel connection tracker via OVS. Userspace components can
 make use of the CT action to populate the connection state (ct_state)
 field for a flow. This state can be subsequently matched.

 Exposed connection states are OVS_CS_F_*:
 - NEW (0x01) - Beginning of a new connection.
 - ESTABLISHED (0x02) - Part of an existing connection.
 - RELATED (0x04) - Related to an established connection.
 - INVALID (0x20) - Could not track the connection for this packet.
 - REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
 - TRACKED (0x80) - This packet has been sent through conntrack.

 When the CT action is executed by itself, it will send the packet
 through the connection tracker and populate the ct_state field with one
 or more of the connection state flags above. The CT action will always
 set the TRACKED bit.

 When the COMMIT flag is passed to the conntrack action, this specifies
 that information about the connection should be stored. This allows
 subsequent packets for the same (or related) connections to be
 correlated with this connection. Sending subsequent packets for the
 connection through conntrack allows the connection tracker to consider
 the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.

 The CT action may optionally take a zone to track the flow within. This
 allows connections with the same 5-tuple to be kept logically separate
 from connections in other zones. If the zone is specified, then the
 ct_zone match field will be subsequently populated with the zone id.

 IP fragments are handled by transparently assembling them as part of the
 CT action. The maximum received unit (MRU) size is tracked so that
 refragmentation can occur during output.

 IP frag handling contributed by Andy Zhou.

 Signed-off-by: Joe Stringer joestrin...@nicira.com
 Signed-off-by: Justin Pettit jpet...@nicira.com
 Signed-off-by: Andy Zhou az...@nicira.com
 Acked-by: Thomas Graf tg...@suug.ch

Acked-by: Pravin B Shelar pshe...@nicira.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 10/10] openvswitch: Allow attaching helpers to ct action

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer  wrote:
> Add support for using conntrack helpers to assist protocol detection.
> The new OVS_CT_ATTR_HELPER attribute of the CT action specifies a helper
> to be used for this connection. If no helper is specified, then helpers
> will be automatically applied as per the sysctl configuration of
> net.netfilter.nf_conntrack_helper.
>
> The helper may be specified as part of the conntrack action, eg:
> ct(helper=ftp). Initial packets for related connections should be
> committed to allow later packets for the flow to be considered
> established.
>
> Example ovs-ofctl flows allowing FTP connections from ports 1->2:
> in_port=1,tcp,action=ct(helper=ftp,commit),2
> in_port=2,tcp,ct_state=-trk,action=ct(recirc)
> in_port=2,tcp,ct_state=+trk-new+est,action=1
> in_port=2,tcp,ct_state=+trk+rel,action=1
>
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 09/10] openvswitch: Allow matching on conntrack label

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer  wrote:
> Allow matching and setting the ct_label field. As with ct_mark, this is
> populated by executing the CT action. The label field may be modified by
> specifying a label and mask nested under the CT action. It is stored as
> metadata attached to the connection. Label modification occurs after
> lookup, and will only persist when the conntrack entry is committed by
> providing the COMMIT flag to the CT action. Labels are currently fixed
> to 128 bits in size.
>
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 06/10] openvswitch: Allow matching on conntrack mark

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer  wrote:
> Allow matching and setting the ct_mark field. As with ct_state and
> ct_zone, these fields are populated when the CT action is executed. To
> write to this field, a value and mask can be specified as a nested
> attribute under the CT action. This data is stored with the conntrack
> entry, and is executed after the lookup occurs for the CT action. The
> conntrack entry itself must be committed using the COMMIT flag in the CT
> action flags for this change to persist.
>
> Signed-off-by: Justin Pettit 
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 05/10] openvswitch: Add conntrack action

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer  wrote:
> Expose the kernel connection tracker via OVS. Userspace components can
> make use of the CT action to populate the connection state (ct_state)
> field for a flow. This state can be subsequently matched.
>
> Exposed connection states are OVS_CS_F_*:
> - NEW (0x01) - Beginning of a new connection.
> - ESTABLISHED (0x02) - Part of an existing connection.
> - RELATED (0x04) - Related to an established connection.
> - INVALID (0x20) - Could not track the connection for this packet.
> - REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
> - TRACKED (0x80) - This packet has been sent through conntrack.
>
> When the CT action is executed by itself, it will send the packet
> through the connection tracker and populate the ct_state field with one
> or more of the connection state flags above. The CT action will always
> set the TRACKED bit.
>
> When the COMMIT flag is passed to the conntrack action, this specifies
> that information about the connection should be stored. This allows
> subsequent packets for the same (or related) connections to be
> correlated with this connection. Sending subsequent packets for the
> connection through conntrack allows the connection tracker to consider
> the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.
>
> The CT action may optionally take a zone to track the flow within. This
> allows connections with the same 5-tuple to be kept logically separate
> from connections in other zones. If the zone is specified, then the
> "ct_zone" match field will be subsequently populated with the zone id.
>
> IP fragments are handled by transparently assembling them as part of the
> CT action. The maximum received unit (MRU) size is tracked so that
> refragmentation can occur during output.
>
> IP frag handling contributed by Andy Zhou.
>
> Signed-off-by: Joe Stringer 
> Signed-off-by: Justin Pettit 
> Signed-off-by: Andy Zhou 

Patch looks good except one issue.

When I turn off conntrack support (
CONFIG_OPENVSWITCH_CONNTRACK) I got compilation error:

net/openvswitch/datapath.o: In function `ovs_ct_fill_key':

/home/pravin/linux/net-next/net/openvswitch/conntrack.h:69: multiple
definition of `ovs_ct_fill_key'

net/openvswitch/actions.o:/home/pravin/linux/net-next/net/openvswitch/conntrack.h:69:
first defined here

net/openvswitch/dp_notify.o: In function `ovs_ct_fill_key':

/home/pravin/linux/net-next/net/openvswitch/conntrack.h:69: multiple
definition of `ovs_ct_fill_key'

net/openvswitch/actions.o:/home/pravin/linux/net-next/net/openvswitch/conntrack.h:69:
first defined here
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 03/10] ipv6: Export nf_ct_frag6_gather()

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer  wrote:
> Signed-off-by: Joe Stringer 
> Acked-by: Thomas Graf 
> Acked-by: Pravin B Shelar 

When I apply this patch I see empty commit msg. I think you need to
add atleast a blank line after the subject.

> ---
> v4: Add ack.
> v5: No change.
> ---
>  net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c 
> b/net/ipv6/netfilter/nf_conntrack_reasm.c
> index 6d02498..701cd2b 100644
> --- a/net/ipv6/netfilter/nf_conntrack_reasm.c
> +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
> @@ -633,6 +633,7 @@ ret_orig:
> kfree_skb(clone);
> return skb;
>  }
> +EXPORT_SYMBOL_GPL(nf_ct_frag6_gather);
>
>  void nf_ct_frag6_consume_orig(struct sk_buff *skb)
>  {
> --
> 2.1.4
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 03/10] ipv6: Export nf_ct_frag6_gather()

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer joestrin...@nicira.com wrote:
 Signed-off-by: Joe Stringer joestrin...@nicira.com
 Acked-by: Thomas Graf tg...@suug.ch
 Acked-by: Pravin B Shelar pshe...@nicira.com

When I apply this patch I see empty commit msg. I think you need to
add atleast a blank line after the subject.

 ---
 v4: Add ack.
 v5: No change.
 ---
  net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c 
 b/net/ipv6/netfilter/nf_conntrack_reasm.c
 index 6d02498..701cd2b 100644
 --- a/net/ipv6/netfilter/nf_conntrack_reasm.c
 +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
 @@ -633,6 +633,7 @@ ret_orig:
 kfree_skb(clone);
 return skb;
  }
 +EXPORT_SYMBOL_GPL(nf_ct_frag6_gather);

  void nf_ct_frag6_consume_orig(struct sk_buff *skb)
  {
 --
 2.1.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 05/10] openvswitch: Add conntrack action

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer joestrin...@nicira.com wrote:
 Expose the kernel connection tracker via OVS. Userspace components can
 make use of the CT action to populate the connection state (ct_state)
 field for a flow. This state can be subsequently matched.

 Exposed connection states are OVS_CS_F_*:
 - NEW (0x01) - Beginning of a new connection.
 - ESTABLISHED (0x02) - Part of an existing connection.
 - RELATED (0x04) - Related to an established connection.
 - INVALID (0x20) - Could not track the connection for this packet.
 - REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
 - TRACKED (0x80) - This packet has been sent through conntrack.

 When the CT action is executed by itself, it will send the packet
 through the connection tracker and populate the ct_state field with one
 or more of the connection state flags above. The CT action will always
 set the TRACKED bit.

 When the COMMIT flag is passed to the conntrack action, this specifies
 that information about the connection should be stored. This allows
 subsequent packets for the same (or related) connections to be
 correlated with this connection. Sending subsequent packets for the
 connection through conntrack allows the connection tracker to consider
 the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.

 The CT action may optionally take a zone to track the flow within. This
 allows connections with the same 5-tuple to be kept logically separate
 from connections in other zones. If the zone is specified, then the
 ct_zone match field will be subsequently populated with the zone id.

 IP fragments are handled by transparently assembling them as part of the
 CT action. The maximum received unit (MRU) size is tracked so that
 refragmentation can occur during output.

 IP frag handling contributed by Andy Zhou.

 Signed-off-by: Joe Stringer joestrin...@nicira.com
 Signed-off-by: Justin Pettit jpet...@nicira.com
 Signed-off-by: Andy Zhou az...@nicira.com

Patch looks good except one issue.

When I turn off conntrack support (
CONFIG_OPENVSWITCH_CONNTRACK) I got compilation error:

net/openvswitch/datapath.o: In function `ovs_ct_fill_key':

/home/pravin/linux/net-next/net/openvswitch/conntrack.h:69: multiple
definition of `ovs_ct_fill_key'

net/openvswitch/actions.o:/home/pravin/linux/net-next/net/openvswitch/conntrack.h:69:
first defined here

net/openvswitch/dp_notify.o: In function `ovs_ct_fill_key':

/home/pravin/linux/net-next/net/openvswitch/conntrack.h:69: multiple
definition of `ovs_ct_fill_key'

net/openvswitch/actions.o:/home/pravin/linux/net-next/net/openvswitch/conntrack.h:69:
first defined here
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 09/10] openvswitch: Allow matching on conntrack label

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer joestrin...@nicira.com wrote:
 Allow matching and setting the ct_label field. As with ct_mark, this is
 populated by executing the CT action. The label field may be modified by
 specifying a label and mask nested under the CT action. It is stored as
 metadata attached to the connection. Label modification occurs after
 lookup, and will only persist when the conntrack entry is committed by
 providing the COMMIT flag to the CT action. Labels are currently fixed
 to 128 bits in size.

 Signed-off-by: Joe Stringer joestrin...@nicira.com

Acked-by: Pravin B Shelar pshe...@nicira.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 06/10] openvswitch: Allow matching on conntrack mark

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer joestrin...@nicira.com wrote:
 Allow matching and setting the ct_mark field. As with ct_state and
 ct_zone, these fields are populated when the CT action is executed. To
 write to this field, a value and mask can be specified as a nested
 attribute under the CT action. This data is stored with the conntrack
 entry, and is executed after the lookup occurs for the CT action. The
 conntrack entry itself must be committed using the COMMIT flag in the CT
 action flags for this change to persist.

 Signed-off-by: Justin Pettit jpet...@nicira.com
 Signed-off-by: Joe Stringer joestrin...@nicira.com

Acked-by: Pravin B Shelar pshe...@nicira.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv5 net-next 10/10] openvswitch: Allow attaching helpers to ct action

2015-08-25 Thread Pravin Shelar
On Mon, Aug 24, 2015 at 5:32 PM, Joe Stringer joestrin...@nicira.com wrote:
 Add support for using conntrack helpers to assist protocol detection.
 The new OVS_CT_ATTR_HELPER attribute of the CT action specifies a helper
 to be used for this connection. If no helper is specified, then helpers
 will be automatically applied as per the sysctl configuration of
 net.netfilter.nf_conntrack_helper.

 The helper may be specified as part of the conntrack action, eg:
 ct(helper=ftp). Initial packets for related connections should be
 committed to allow later packets for the flow to be considered
 established.

 Example ovs-ofctl flows allowing FTP connections from ports 1-2:
 in_port=1,tcp,action=ct(helper=ftp,commit),2
 in_port=2,tcp,ct_state=-trk,action=ct(recirc)
 in_port=2,tcp,ct_state=+trk-new+est,action=1
 in_port=2,tcp,ct_state=+trk+rel,action=1

 Signed-off-by: Joe Stringer joestrin...@nicira.com

Acked-by: Pravin B Shelar pshe...@nicira.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 10/10] openvswitch: Allow attaching helpers to ct action

2015-08-21 Thread Pravin Shelar
On Thu, Aug 20, 2015 at 5:47 PM, Joe Stringer  wrote:
> On 19 August 2015 at 15:57, Pravin Shelar  wrote:
>> On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer  wrote:
>>> Add support for using conntrack helpers to assist protocol detection.
>>> The new OVS_CT_ATTR_HELPER attribute of the ct action specifies a helper
>>> to be used for this connection.
>>>
>>> Example ODP flows allowing FTP connections from ports 1->2:
>>> in_port=1,tcp,action=ct(helper=ftp,commit),2
>>> in_port=2,tcp,ct_state=-trk,action=ct(),recirc(1)
>>> recirc_id=1,in_port=2,tcp,ct_state=+trk-new+est,action=1
>>> recirc_id=1,in_port=2,tcp,ct_state=+trk+rel,action=1
>>>
>>> Signed-off-by: Joe Stringer 
>>> ---
>>> v2-v3: No change.
>>> v4: Change error code for unknown helper ENOENT->EINVAL.
>>
>> I got following compilation warning :
>>
>> net/openvswitch/conntrack.c:352:42: error: incompatible types in
>> comparison expression (different address spaces)
>
> Is this made available via another sparse flag? It looks like it's
> related to the __rcu as you've mentioned below, but I'm not seeing
> this (latest sparse, gcc-4.9.2)
>
You need to enable RCU space checker in kernel config.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 10/10] openvswitch: Allow attaching helpers to ct action

2015-08-21 Thread Pravin Shelar
On Thu, Aug 20, 2015 at 5:47 PM, Joe Stringer joestrin...@nicira.com wrote:
 On 19 August 2015 at 15:57, Pravin Shelar pshe...@nicira.com wrote:
 On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer joestrin...@nicira.com wrote:
 Add support for using conntrack helpers to assist protocol detection.
 The new OVS_CT_ATTR_HELPER attribute of the ct action specifies a helper
 to be used for this connection.

 Example ODP flows allowing FTP connections from ports 1-2:
 in_port=1,tcp,action=ct(helper=ftp,commit),2
 in_port=2,tcp,ct_state=-trk,action=ct(),recirc(1)
 recirc_id=1,in_port=2,tcp,ct_state=+trk-new+est,action=1
 recirc_id=1,in_port=2,tcp,ct_state=+trk+rel,action=1

 Signed-off-by: Joe Stringer joestrin...@nicira.com
 ---
 v2-v3: No change.
 v4: Change error code for unknown helper ENOENT-EINVAL.

 I got following compilation warning :

 net/openvswitch/conntrack.c:352:42: error: incompatible types in
 comparison expression (different address spaces)

 Is this made available via another sparse flag? It looks like it's
 related to the __rcu as you've mentioned below, but I'm not seeing
 this (latest sparse, gcc-4.9.2)

You need to enable RCU space checker in kernel config.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 09/10] openvswitch: Allow matching on conntrack label

2015-08-20 Thread Pravin Shelar
On Thu, Aug 20, 2015 at 12:13 PM, Joe Stringer  wrote:
> On 20 August 2015 at 08:45, Pravin Shelar  wrote:
>> On Wed, Aug 19, 2015 at 4:04 PM, Joe Stringer  wrote:
>>> Thanks for the review,
>>>
>>> On 19 August 2015 at 14:24, Pravin Shelar  wrote:
>>>> On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer  
>>>> wrote:
>>>>> Allow matching and setting the conntrack label field. As with ct_mark,
>>>>> this is populated by executing the CT action, and is a writable field.
>>>>> Specifying a label and optional mask allows the label to be modified,
>>>>> which takes effect on the entry found by the lookup of the CT action.
>>>>>
>>>>> E.g.: actions:ct(zone=1,label=1)
>>>>>
>>>>> This will perform conntrack lookup in zone 1, then modify the label for
>>>>> that entry. The conntrack entry itself must be committed using the
>>>>> "commit" flag in the conntrack action flags for this change to persist.
>>>>>
>>
>>>
>>>>> return false;
>>>>>  }
>>>>> @@ -508,8 +601,12 @@ void ovs_ct_free_action(const struct nlattr *a)
>>>>>
>>>>>  void ovs_ct_init(struct net *net, struct ovs_ct_perdp_data *data)
>>>>>  {
>>>>> +   unsigned int n_bits = sizeof(struct ovs_key_ct_label) * 
>>>>> BITS_PER_BYTE;
>>>>> +
>>>>> data->xt_v4 = !nf_ct_l3proto_try_module_get(PF_INET);
>>>>> data->xt_v6 = !nf_ct_l3proto_try_module_get(PF_INET6);
>>>>> +   if (nf_connlabels_get(net, n_bits);
>>>>> +   OVS_NLERR(true, "Failed to set connlabel length");
>>>>>  }
>>>>>
>>>> In case of error should we reject conntrack label actions? Otherwise
>>>> user will never see any error. But action could drop packets.
>>>
>>> I suspect that currently errors would be seen from ovs_ct_set_label():
>>>
>>>>...if (!cl || cl->words * sizeof(long) < OVS_CT_LABEL_LEN)
>>>>...>...return -ENOSPC;
>>>
>>> So, for cmd_execute, userspace would see this. For regular handling,
>>> pipeline processing would stop (so, drop).
>>>
>>> However, I agree it would be more friendly to have the attribute
>>> rejected up-front. Just means we'll pass the datapath all the way
>>> down:
>>> ovs_nla_get_match()
>>> --> ovs_key_from_nlattrs()
>>> --> metadata_from_nlattrs()
>>> --> ovs_ct_verify()
>
> Incidentally, we generally don't have the datapath by this point
> (ovs_nla_get_match()). There'd need to be a bit of rearranging in the
> ovs_flow_cmd_* functions, which would include holding the locks for
> longer. Given that the two most common cases are that either A) The
> kernel is configured with connlabel support, and built with support
> for at least 128 bits of label, or B) the kernel is configured without
> connlabel, and this is handled already in ovs_ct_verify(), I don't
> think it's worth making this particular change.

Actually I do not see need for this to be per datapath property.
infact there is no need to have struct ovs_ct_perdp_data.
ovs_ct_init can be called from ovs-module init.
nf-connlabel bit length is per net-namespace property. So
nf_connlabels_get()  should be called from ovs namespace init. This
way you can move xt_label flag to ovs_net. ovs_net can be accessed
from ovs_flow_cmd_* functions.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 09/10] openvswitch: Allow matching on conntrack label

2015-08-20 Thread Pravin Shelar
On Wed, Aug 19, 2015 at 4:04 PM, Joe Stringer  wrote:
> Thanks for the review,
>
> On 19 August 2015 at 14:24, Pravin Shelar  wrote:
>> On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer  wrote:
>>> Allow matching and setting the conntrack label field. As with ct_mark,
>>> this is populated by executing the CT action, and is a writable field.
>>> Specifying a label and optional mask allows the label to be modified,
>>> which takes effect on the entry found by the lookup of the CT action.
>>>
>>> E.g.: actions:ct(zone=1,label=1)
>>>
>>> This will perform conntrack lookup in zone 1, then modify the label for
>>> that entry. The conntrack entry itself must be committed using the
>>> "commit" flag in the conntrack action flags for this change to persist.
>>>

>
>>> return false;
>>>  }
>>> @@ -508,8 +601,12 @@ void ovs_ct_free_action(const struct nlattr *a)
>>>
>>>  void ovs_ct_init(struct net *net, struct ovs_ct_perdp_data *data)
>>>  {
>>> +   unsigned int n_bits = sizeof(struct ovs_key_ct_label) * 
>>> BITS_PER_BYTE;
>>> +
>>> data->xt_v4 = !nf_ct_l3proto_try_module_get(PF_INET);
>>> data->xt_v6 = !nf_ct_l3proto_try_module_get(PF_INET6);
>>> +   if (nf_connlabels_get(net, n_bits);
>>> +   OVS_NLERR(true, "Failed to set connlabel length");
>>>  }
>>>
>> In case of error should we reject conntrack label actions? Otherwise
>> user will never see any error. But action could drop packets.
>
> I suspect that currently errors would be seen from ovs_ct_set_label():
>
>>...if (!cl || cl->words * sizeof(long) < OVS_CT_LABEL_LEN)
>>...>...return -ENOSPC;
>
> So, for cmd_execute, userspace would see this. For regular handling,
> pipeline processing would stop (so, drop).
>
> However, I agree it would be more friendly to have the attribute
> rejected up-front. Just means we'll pass the datapath all the way
> down:
> ovs_nla_get_match()
> --> ovs_key_from_nlattrs()
> --> metadata_from_nlattrs()
> --> ovs_ct_verify()
>
> And rather than simply reporting the error in ovs_ct_init() there,
> we'll store the success condition something like:
>
> @@ -721,8 +721,12 @@ void ovs_ct_init(struct net *net, struct
> ovs_ct_perdp_data *data)
>
> data->xt_v4 = !nf_ct_l3proto_try_module_get(PF_INET);
> data->xt_v6 = !nf_ct_l3proto_try_module_get(PF_INET6);
> -   if (nf_connlabels_get(net, n_bits));
> +   if (nf_connlabels_get(net, n_bits)) {
> +   data->xt_label = false;
> OVS_NLERR(true, "Failed to set connlabel length");
> +   } else {
> +   data->xt_label = true;
> +   }
>  }
>
>  void ovs_ct_exit(struct net *net, struct ovs_ct_perdp_data *data)
>
> ovs_ct_exit() also needs to be updated to ensure that if this fails,
> we don't try to put the connlabel use back.

Looks good.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 09/10] openvswitch: Allow matching on conntrack label

2015-08-20 Thread Pravin Shelar
On Thu, Aug 20, 2015 at 12:13 PM, Joe Stringer joestrin...@nicira.com wrote:
 On 20 August 2015 at 08:45, Pravin Shelar pshe...@nicira.com wrote:
 On Wed, Aug 19, 2015 at 4:04 PM, Joe Stringer joestrin...@nicira.com wrote:
 Thanks for the review,

 On 19 August 2015 at 14:24, Pravin Shelar pshe...@nicira.com wrote:
 On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer joestrin...@nicira.com 
 wrote:
 Allow matching and setting the conntrack label field. As with ct_mark,
 this is populated by executing the CT action, and is a writable field.
 Specifying a label and optional mask allows the label to be modified,
 which takes effect on the entry found by the lookup of the CT action.

 E.g.: actions:ct(zone=1,label=1)

 This will perform conntrack lookup in zone 1, then modify the label for
 that entry. The conntrack entry itself must be committed using the
 commit flag in the conntrack action flags for this change to persist.



 return false;
  }
 @@ -508,8 +601,12 @@ void ovs_ct_free_action(const struct nlattr *a)

  void ovs_ct_init(struct net *net, struct ovs_ct_perdp_data *data)
  {
 +   unsigned int n_bits = sizeof(struct ovs_key_ct_label) * 
 BITS_PER_BYTE;
 +
 data-xt_v4 = !nf_ct_l3proto_try_module_get(PF_INET);
 data-xt_v6 = !nf_ct_l3proto_try_module_get(PF_INET6);
 +   if (nf_connlabels_get(net, n_bits);
 +   OVS_NLERR(true, Failed to set connlabel length);
  }

 In case of error should we reject conntrack label actions? Otherwise
 user will never see any error. But action could drop packets.

 I suspect that currently errors would be seen from ovs_ct_set_label():

...if (!cl || cl-words * sizeof(long)  OVS_CT_LABEL_LEN)
..return -ENOSPC;

 So, for cmd_execute, userspace would see this. For regular handling,
 pipeline processing would stop (so, drop).

 However, I agree it would be more friendly to have the attribute
 rejected up-front. Just means we'll pass the datapath all the way
 down:
 ovs_nla_get_match()
 -- ovs_key_from_nlattrs()
 -- metadata_from_nlattrs()
 -- ovs_ct_verify()

 Incidentally, we generally don't have the datapath by this point
 (ovs_nla_get_match()). There'd need to be a bit of rearranging in the
 ovs_flow_cmd_* functions, which would include holding the locks for
 longer. Given that the two most common cases are that either A) The
 kernel is configured with connlabel support, and built with support
 for at least 128 bits of label, or B) the kernel is configured without
 connlabel, and this is handled already in ovs_ct_verify(), I don't
 think it's worth making this particular change.

Actually I do not see need for this to be per datapath property.
infact there is no need to have struct ovs_ct_perdp_data.
ovs_ct_init can be called from ovs-module init.
nf-connlabel bit length is per net-namespace property. So
nf_connlabels_get()  should be called from ovs namespace init. This
way you can move xt_label flag to ovs_net. ovs_net can be accessed
from ovs_flow_cmd_* functions.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 09/10] openvswitch: Allow matching on conntrack label

2015-08-20 Thread Pravin Shelar
On Wed, Aug 19, 2015 at 4:04 PM, Joe Stringer joestrin...@nicira.com wrote:
 Thanks for the review,

 On 19 August 2015 at 14:24, Pravin Shelar pshe...@nicira.com wrote:
 On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer joestrin...@nicira.com wrote:
 Allow matching and setting the conntrack label field. As with ct_mark,
 this is populated by executing the CT action, and is a writable field.
 Specifying a label and optional mask allows the label to be modified,
 which takes effect on the entry found by the lookup of the CT action.

 E.g.: actions:ct(zone=1,label=1)

 This will perform conntrack lookup in zone 1, then modify the label for
 that entry. The conntrack entry itself must be committed using the
 commit flag in the conntrack action flags for this change to persist.



 return false;
  }
 @@ -508,8 +601,12 @@ void ovs_ct_free_action(const struct nlattr *a)

  void ovs_ct_init(struct net *net, struct ovs_ct_perdp_data *data)
  {
 +   unsigned int n_bits = sizeof(struct ovs_key_ct_label) * 
 BITS_PER_BYTE;
 +
 data-xt_v4 = !nf_ct_l3proto_try_module_get(PF_INET);
 data-xt_v6 = !nf_ct_l3proto_try_module_get(PF_INET6);
 +   if (nf_connlabels_get(net, n_bits);
 +   OVS_NLERR(true, Failed to set connlabel length);
  }

 In case of error should we reject conntrack label actions? Otherwise
 user will never see any error. But action could drop packets.

 I suspect that currently errors would be seen from ovs_ct_set_label():

...if (!cl || cl-words * sizeof(long)  OVS_CT_LABEL_LEN)
..return -ENOSPC;

 So, for cmd_execute, userspace would see this. For regular handling,
 pipeline processing would stop (so, drop).

 However, I agree it would be more friendly to have the attribute
 rejected up-front. Just means we'll pass the datapath all the way
 down:
 ovs_nla_get_match()
 -- ovs_key_from_nlattrs()
 -- metadata_from_nlattrs()
 -- ovs_ct_verify()

 And rather than simply reporting the error in ovs_ct_init() there,
 we'll store the success condition something like:

 @@ -721,8 +721,12 @@ void ovs_ct_init(struct net *net, struct
 ovs_ct_perdp_data *data)

 data-xt_v4 = !nf_ct_l3proto_try_module_get(PF_INET);
 data-xt_v6 = !nf_ct_l3proto_try_module_get(PF_INET6);
 -   if (nf_connlabels_get(net, n_bits));
 +   if (nf_connlabels_get(net, n_bits)) {
 +   data-xt_label = false;
 OVS_NLERR(true, Failed to set connlabel length);
 +   } else {
 +   data-xt_label = true;
 +   }
  }

  void ovs_ct_exit(struct net *net, struct ovs_ct_perdp_data *data)

 ovs_ct_exit() also needs to be updated to ensure that if this fails,
 we don't try to put the connlabel use back.

Looks good.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 10/10] openvswitch: Allow attaching helpers to ct action

2015-08-19 Thread Pravin Shelar
On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer  wrote:
> Add support for using conntrack helpers to assist protocol detection.
> The new OVS_CT_ATTR_HELPER attribute of the ct action specifies a helper
> to be used for this connection.
>
> Example ODP flows allowing FTP connections from ports 1->2:
> in_port=1,tcp,action=ct(helper=ftp,commit),2
> in_port=2,tcp,ct_state=-trk,action=ct(),recirc(1)
> recirc_id=1,in_port=2,tcp,ct_state=+trk-new+est,action=1
> recirc_id=1,in_port=2,tcp,ct_state=+trk+rel,action=1
>
> Signed-off-by: Joe Stringer 
> ---
> v2-v3: No change.
> v4: Change error code for unknown helper ENOENT->EINVAL.

I got following compilation warning :

net/openvswitch/conntrack.c:352:42: error: incompatible types in
comparison expression (different address spaces)

> ---
>  include/uapi/linux/openvswitch.h |   3 ++
>  net/openvswitch/conntrack.c  | 109 
> ++-
>  2 files changed, 110 insertions(+), 2 deletions(-)
>
...

> +static int ovs_ct_add_helper(struct ovs_conntrack_info *info, const char 
> *name,
> +const struct sw_flow_key *key, bool log)
> +{
> +   struct nf_conntrack_helper *helper;
> +   struct nf_conn_help *help;
> +
> +   helper = nf_conntrack_helper_try_module_get(name, info->family,
> +   key->ip.proto);
> +   if (!helper) {
> +   OVS_NLERR(log, "Unknown helper \"%s\"", name);
> +   return -EINVAL;
> +   }
> +
> +   help = nf_ct_helper_ext_add(info->ct, helper, GFP_KERNEL);
> +   if (!help) {
> +   module_put(helper->me);
> +   return -ENOMEM;
> +   }
> +
> +   help->helper = helper;
helper is rcu pointer so need to use rcu API to set the value. I know
it is not required here, but it is still cleaner to use the API.

> +   info->helper = helper;
> +   return 0;
> +}
> +
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 09/10] openvswitch: Allow matching on conntrack label

2015-08-19 Thread Pravin Shelar
On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer  wrote:
> Allow matching and setting the conntrack label field. As with ct_mark,
> this is populated by executing the CT action, and is a writable field.
> Specifying a label and optional mask allows the label to be modified,
> which takes effect on the entry found by the lookup of the CT action.
>
> E.g.: actions:ct(zone=1,label=1)
>
> This will perform conntrack lookup in zone 1, then modify the label for
> that entry. The conntrack entry itself must be committed using the
> "commit" flag in the conntrack action flags for this change to persist.
>
> Signed-off-by: Joe Stringer 
I got compilation error after applying this patch:
net/openvswitch/conntrack.c: In function ‘ovs_ct_init’:
net/openvswitch/conntrack.c:713: error: expected ‘)’ before ‘;’ token
net/openvswitch/conntrack.c:715: error: expected expression before ‘}’ token

> ---
> v2: Split out setting the connlabel size for the current namespace.
> v3: No change.
> v4: Only allow setting label via ct action.
> Update documentation.
> ---
...
>   type);
> @@ -432,6 +521,10 @@ bool ovs_ct_verify(enum ovs_key_attr attr)
> if (attr & OVS_KEY_ATTR_CT_MARK)
> return true;
>  #endif
> +#ifdef CONFIG_NF_CONNTRACK_LABELS
> +   if (attr & OVS_KEY_ATTR_CT_LABEL)
> +   return true;
> +#endif
>
OVS_KEY_ATTR_CT_LABEL is not bit field so bitwise AND operation does
not work here. This applies to all check done in this function.

> return false;
>  }
> @@ -508,8 +601,12 @@ void ovs_ct_free_action(const struct nlattr *a)
>
>  void ovs_ct_init(struct net *net, struct ovs_ct_perdp_data *data)
>  {
> +   unsigned int n_bits = sizeof(struct ovs_key_ct_label) * BITS_PER_BYTE;
> +
> data->xt_v4 = !nf_ct_l3proto_try_module_get(PF_INET);
> data->xt_v6 = !nf_ct_l3proto_try_module_get(PF_INET6);
> +   if (nf_connlabels_get(net, n_bits);
> +   OVS_NLERR(true, "Failed to set connlabel length");
>  }
>
In case of error should we reject conntrack label actions? Otherwise
user will never see any error. But action could drop packets.

>  void ovs_ct_exit(struct net *net, struct ovs_ct_perdp_data *data)
> @@ -518,4 +615,5 @@ void ovs_ct_exit(struct net *net, struct 
> ovs_ct_perdp_data *data)
> nf_ct_l3proto_module_put(PF_INET);
> if (data->xt_v6)
> nf_ct_l3proto_module_put(PF_INET6);
> +   nf_connlabels_put(net);
>  }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv4 net-next 08/10] netfilter: connlabels: Export setting connlabel length

2015-08-19 Thread Pravin Shelar
On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer  wrote:
> Add functions to change connlabel length into nf_conntrack_labels.c so
> they may be reused by other modules like OVS and nftables without
> needing to jump through xt_match_check() hoops.
>
> Suggested-by: Florian Westphal 
> Signed-off-by: Joe Stringer 
> Acked-by: Florian Westphal 
> ---
> v2: Protect connlabel modification with spinlock.
> Fix reference leak in error case.
> Style fixups.
> v3: No change.
> v4: Add ack.
> ---
Looks good to me.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   >