Re: [ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs reassemble
Thanks Joe. BR, Wang Zhike -Original Message- From: Joe Stringer [mailto:j...@ovn.org] Sent: Saturday, July 22, 2017 2:29 AM To: 王志克 Cc: ovs dev; Ben Pfaff Subject: Re: [ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs reassemble On 6 July 2017 at 13:57, Ben Pfaff wrote: > From: wangzhike > > Ovs and kernel stack would add frag_queue to same netns_frags list. > As result, ovs and kernel may access the fraq_queue without correct > lock. Also the struct ipq may be different on kernel(older than 4.3), > which leads to invalid pointer access. > > The fix creates specific netns_frags for ovs. > > Signed-off-by: wangzhike > --- Hi, Thanks a lot for your hard work on this. I know through the several revisions of private review we did, you considered several options for how to better structure this, and I don't see a straightforward way to improve it any further. I've run this patch dozens of times on a variety of kernel versions, and at this point I am at this point reasonably confident that it doesn't make the current fragmentation situation worse. Given that it fixes a problem that you have been hitting, I think that the best way for this to get more testing is for us to apply it. If there are subsequent future issues with this code, then we can always deal with that at a later time. Given the nature of bugs that affect this area of the code--that is to say, racy---I am not currently intending to backport this patch to earlier branches. It will however become part of the upcoming OVS 2.8 release. I applied this patch to master. Thanks, Joe ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs reassemble
On 6 July 2017 at 13:57, Ben Pfaff wrote: > From: wangzhike > > Ovs and kernel stack would add frag_queue to same netns_frags list. > As result, ovs and kernel may access the fraq_queue without correct > lock. Also the struct ipq may be different on kernel(older than 4.3), > which leads to invalid pointer access. > > The fix creates specific netns_frags for ovs. > > Signed-off-by: wangzhike > --- Hi, Thanks a lot for your hard work on this. I know through the several revisions of private review we did, you considered several options for how to better structure this, and I don't see a straightforward way to improve it any further. I've run this patch dozens of times on a variety of kernel versions, and at this point I am at this point reasonably confident that it doesn't make the current fragmentation situation worse. Given that it fixes a problem that you have been hitting, I think that the best way for this to get more testing is for us to apply it. If there are subsequent future issues with this code, then we can always deal with that at a later time. Given the nature of bugs that affect this area of the code--that is to say, racy---I am not currently intending to backport this patch to earlier branches. It will however become part of the upcoming OVS 2.8 release. I applied this patch to master. Thanks, Joe ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs reassemble
From: wangzhike Ovs and kernel stack would add frag_queue to same netns_frags list. As result, ovs and kernel may access the fraq_queue without correct lock. Also the struct ipq may be different on kernel(older than 4.3), which leads to invalid pointer access. The fix creates specific netns_frags for ovs. Signed-off-by: wangzhike --- This was originally posted at: https://github.com/openvswitch/ovs/pull/187 I'm reposting it to ovs-dev to make sure that it gets attention. datapath/datapath.c| 22 +++--- datapath/datapath.h| 6 ++ datapath/linux/compat/include/net/inet_frag.h | 18 - datapath/linux/compat/include/net/ip.h | 4 ++ .../include/net/netfilter/ipv6/nf_defrag_ipv6.h| 4 ++ datapath/linux/compat/inet_fragment.c | 83 -- datapath/linux/compat/ip_fragment.c| 66 ++--- datapath/linux/compat/nf_conntrack_reasm.c | 58 +-- 8 files changed, 138 insertions(+), 123 deletions(-) diff --git a/datapath/datapath.c b/datapath/datapath.c index c85029c067ca..82cad74b7972 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -2297,6 +2297,8 @@ static int __net_init ovs_init_net(struct net *net) INIT_LIST_HEAD(&ovs_net->dps); INIT_WORK(&ovs_net->dp_notify_work, ovs_dp_notify_wq); ovs_ct_init(net); + ovs_netns_frags_init(net); + ovs_netns_frags6_init(net); return 0; } @@ -2332,6 +2334,8 @@ static void __net_exit ovs_exit_net(struct net *dnet) struct net *net; LIST_HEAD(head); + ovs_netns_frags6_exit(dnet); + ovs_netns_frags_exit(dnet); ovs_ct_exit(dnet); ovs_lock(); list_for_each_entry_safe(dp, dp_next, &ovs_net->dps, list_node) @@ -2368,13 +2372,9 @@ static int __init dp_init(void) pr_info("Open vSwitch switching datapath %s\n", VERSION); - err = compat_init(); - if (err) - goto error; - err = action_fifos_init(); if (err) - goto error_compat_exit; + goto error; err = ovs_internal_dev_rtnl_link_register(); if (err) @@ -2392,10 +2392,14 @@ static int __init dp_init(void) if (err) goto error_vport_exit; - err = register_netdevice_notifier(&ovs_dp_device_notifier); + err = compat_init(); if (err) goto error_netns_exit; + err = register_netdevice_notifier(&ovs_dp_device_notifier); + if (err) + goto error_compat_exit; + err = ovs_netdev_init(); if (err) goto error_unreg_notifier; @@ -2410,6 +2414,8 @@ error_unreg_netdev: ovs_netdev_exit(); error_unreg_notifier: unregister_netdevice_notifier(&ovs_dp_device_notifier); +error_compat_exit: + compat_exit(); error_netns_exit: unregister_pernet_device(&ovs_net_ops); error_vport_exit: @@ -2420,8 +2426,6 @@ error_unreg_rtnl_link: ovs_internal_dev_rtnl_link_unregister(); error_action_fifos_exit: action_fifos_exit(); -error_compat_exit: - compat_exit(); error: return err; } @@ -2431,13 +2435,13 @@ static void dp_cleanup(void) dp_unregister_genl(ARRAY_SIZE(dp_genl_families)); ovs_netdev_exit(); unregister_netdevice_notifier(&ovs_dp_device_notifier); + compat_exit(); unregister_pernet_device(&ovs_net_ops); rcu_barrier(); ovs_vport_exit(); ovs_flow_exit(); ovs_internal_dev_rtnl_link_unregister(); action_fifos_exit(); - compat_exit(); } module_init(dp_init); diff --git a/datapath/datapath.h b/datapath/datapath.h index b835adac5bf9..88496257d486 100644 --- a/datapath/datapath.h +++ b/datapath/datapath.h @@ -141,6 +141,12 @@ struct ovs_net { /* Module reference for configuring conntrack. */ bool xt_label; + +#ifdef HAVE_INET_FRAG_LRU_MOVE + struct net *net; + struct netns_frags ipv4_frags; + struct netns_frags nf_frags; +#endif }; extern unsigned int ovs_net_id; diff --git a/datapath/linux/compat/include/net/inet_frag.h b/datapath/linux/compat/include/net/inet_frag.h index 01d79ad8147b..34078c80db0f 100644 --- a/datapath/linux/compat/include/net/inet_frag.h +++ b/datapath/linux/compat/include/net/inet_frag.h @@ -52,22 +52,4 @@ static inline int rpl_inet_frags_init(struct inet_frags *frags) #define inet_frags_init rpl_inet_frags_init #endif -#ifndef HAVE_CORRECT_MRU_HANDLING -/* We reuse the upstream inet_fragment.c common code for managing fragment - * stores, However we actually store the fragments within our own 'inet_frags' - * structures (in {ip_fragment,nf_conntrack_reasm}.c). When unloading the OVS - * kernel module, we need to flush all of the remaining fragments from these - * caches, or else we will panic with the following sequence of events: - * - * 1) A fragment for a packet a
Re: [ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs reassemble
On 21 June 2017 at 18:54, 王志克 wrote: > Ovs and kernel stack would add frag_queue to same netns_frags list. > As result, ovs and kernel may access the fraq_queue without correct > lock. Also the struct ipq may be different on kernel(older than 4.3), > which leads to invalid pointer access. > > The fix creates specific netns_frags for ovs. > > Signed-off-by: wangzhike > --- Hi, It looks like the whitespace has been corrupted in this version of the patch that you sent, I cannot apply it. Probably your email client mistreats it when sending the email out. A reliable method to send patches correctly via email is to use the commandline client 'git send-email'. This is the preferred method. If you are unable to set that up, consider attaching the patch to the email (or send a pull request on GitHub). Cheers, Joe ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs reassemble
Ovs and kernel stack would add frag_queue to same netns_frags list. As result, ovs and kernel may access the fraq_queue without correct lock. Also the struct ipq may be different on kernel(older than 4.3), which leads to invalid pointer access. The fix creates specific netns_frags for ovs. Signed-off-by: wangzhike --- datapath/datapath.c| 22 +++--- datapath/datapath.h| 6 ++ datapath/linux/compat/include/net/inet_frag.h | 18 - datapath/linux/compat/include/net/ip.h | 4 ++ .../include/net/netfilter/ipv6/nf_defrag_ipv6.h| 4 ++ datapath/linux/compat/inet_fragment.c | 83 -- datapath/linux/compat/ip_fragment.c| 66 ++--- datapath/linux/compat/nf_conntrack_reasm.c | 58 +-- 8 files changed, 138 insertions(+), 123 deletions(-) diff --git a/datapath/datapath.c b/datapath/datapath.c index c85029c..82cad74 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -2297,6 +2297,8 @@ static int __net_init ovs_init_net(struct net *net) INIT_LIST_HEAD(&ovs_net->dps); INIT_WORK(&ovs_net->dp_notify_work, ovs_dp_notify_wq); ovs_ct_init(net); + ovs_netns_frags_init(net); + ovs_netns_frags6_init(net); return 0; } @@ -2332,6 +2334,8 @@ static void __net_exit ovs_exit_net(struct net *dnet) struct net *net; LIST_HEAD(head); + ovs_netns_frags6_exit(dnet); + ovs_netns_frags_exit(dnet); ovs_ct_exit(dnet); ovs_lock(); list_for_each_entry_safe(dp, dp_next, &ovs_net->dps, list_node) @@ -2368,13 +2372,9 @@ static int __init dp_init(void) pr_info("Open vSwitch switching datapath %s\n", VERSION); -err = compat_init(); -if (err) - goto error; - err = action_fifos_init(); if (err) - goto error_compat_exit; +goto error; err = ovs_internal_dev_rtnl_link_register(); if (err) @@ -2392,10 +2392,14 @@ static int __init dp_init(void) if (err) goto error_vport_exit; -err = register_netdevice_notifier(&ovs_dp_device_notifier); + err = compat_init(); if (err) goto error_netns_exit; + err = register_netdevice_notifier(&ovs_dp_device_notifier); + if (err) +goto error_compat_exit; + err = ovs_netdev_init(); if (err) goto error_unreg_notifier; @@ -2410,6 +2414,8 @@ error_unreg_netdev: ovs_netdev_exit(); error_unreg_notifier: unregister_netdevice_notifier(&ovs_dp_device_notifier); +error_compat_exit: + compat_exit(); error_netns_exit: unregister_pernet_device(&ovs_net_ops); error_vport_exit: @@ -2420,8 +2426,6 @@ error_unreg_rtnl_link: ovs_internal_dev_rtnl_link_unregister(); error_action_fifos_exit: action_fifos_exit(); -error_compat_exit: -compat_exit(); error: return err; } @@ -2431,13 +2435,13 @@ static void dp_cleanup(void) dp_unregister_genl(ARRAY_SIZE(dp_genl_families)); ovs_netdev_exit(); unregister_netdevice_notifier(&ovs_dp_device_notifier); + compat_exit(); unregister_pernet_device(&ovs_net_ops); rcu_barrier(); ovs_vport_exit(); ovs_flow_exit(); ovs_internal_dev_rtnl_link_unregister(); action_fifos_exit(); -compat_exit(); } module_init(dp_init); diff --git a/datapath/datapath.h b/datapath/datapath.h index b835ada..8849625 100644 --- a/datapath/datapath.h +++ b/datapath/datapath.h @@ -141,6 +141,12 @@ struct ovs_net { /* Module reference for configuring conntrack. */ bool xt_label; + +#ifdef HAVE_INET_FRAG_LRU_MOVE + struct net *net; + struct netns_frags ipv4_frags; + struct netns_frags nf_frags; +#endif }; extern unsigned int ovs_net_id; diff --git a/datapath/linux/compat/include/net/inet_frag.h b/datapath/linux/compat/include/net/inet_frag.h index 01d79ad..34078c8 100644 --- a/datapath/linux/compat/include/net/inet_frag.h +++ b/datapath/linux/compat/include/net/inet_frag.h @@ -52,22 +52,4 @@ static inline int rpl_inet_frags_init(struct inet_frags *frags) #define inet_frags_init rpl_inet_frags_init #endif -#ifndef HAVE_CORRECT_MRU_HANDLING -/* We reuse the upstream inet_fragment.c common code for managing fragment - * stores, However we actually store the fragments within our own 'inet_frags' - * structures (in {ip_fragment,nf_conntrack_reasm}.c). When unloading the OVS - * kernel module, we need to flush all of the remaining fragments from these - * caches, or else we will panic with the following sequence of events: - * - * 1) A fragment for a packet arrives and is cached in inet_frags. This - *starts a timer to ensure the fragment does not hang around forever. - * 2) openvswitch module is unloaded. - * 3) The timer for the fragment fires, calling into backported OVS code - *to free the frag