Kir Kolyshkin wrote:
I am currently checking all of the ~80 patches that are not in the OpenVZ lenny kernel. It looks like most of them are really needed. Let me suggest some in a few emails that I will send as replies to this one.
Here is a fairly large set of netfilter patches. Some are critical (read: security-related), since they fix various container/host isolation issues; the others prevent kernel oopses.
http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=8562975430153848dd817a050133b53adda96910
nf: fix use after free
Fix use after free error, found by internal testing. Not an ABI breaker.
Attached as 0010*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=fa7ac0b2423dc741cd7016565545abb8e36c4af4
nf: fix call to kmem_cache_destroy from VEs
Found by internal testing. Not an ABI breaker.
Attached as 0011*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=17b09e1de42db77743ea9ae3dfd3a910ac57ee71
conntrack: prevent double allocate/free of protos
Found by internal testing. Not an ABI breaker.
Attached as 0022*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=7d3f10fc5d8e268f7572cfdd2287c049bce3af7c
conntrack: prevent call register_pernet_subsys() from VE context
Found by internal audit. Not an ABI breaker.
Attached as 0023*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=482dd20be37f61b2f94e6b3f3de1c1b9b4f9e6f1
conntrack: prevent call nf_register_hooks() from VE context
Found by internal audit. Not an ABI breaker.
Attached as 0024*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=5fff3eb60f78acaadcae8562de5d3e6504f4d4f9
conntrack: adjust context during freeing
Found by internal audit. Not an ABI breaker.
Attached as 0029*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=3cb8bc3781889ade74c02840b2eb8ddafb6d39c5
netfilter: NAT: assign nf_nat_seq_adjust_hook from VE0 context only
Found by internal audit. Not an ABI breaker.
Attached as 0033*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=490910232ebe61f65e5e5c03b7286f11291b6092
netfilter: call nf_register_hooks from VE0 context only
Found by internal audit. Not an ABI breaker.
Attached as 0034*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=1acba8533b788e95c52f827d06d9629d672c80fc
netfilter: Fix NULL dereference in nf_nat_setup_info.
OpenVZ Bug #1051 (http://bugzilla.openvz.org/1051). Might be an ABI breaker.
Attached as 0047*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=b405aed753ac48a46e66cccfd0a37006fd11feb8
netfilter: Add check to the nat hooks
OpenVZ Bug #1051 (http://bugzilla.openvz.org/1051). Might be an ABI breaker.
Attached as 0048*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=b5e1f74cee5bc2c45bdca53a7218fb8de89215dd
netlink: Fix oops in netlink conntrack module
OpenVZ bug #788 (http://bugzilla.openvz.org/788)
Attached as 0053*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=09686c184a2cb815cbd5af500fe468311887d746
Free skb->nf_bridge in veth_xmit() and venet_xmit()
OpenVZ bug #1146 (http://bugzilla.openvz.org/1146)
Attached as 0066*

From 8562975430153848dd817a050133b53adda96910 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Wed, 27 Aug 2008 19:36:28 +0400
Subject: [PATCH] nf: fix use after free

Fix use after free error: move freeing ve_nf_conntrack_l4proto_generic
to nf_ct_proto_generic_sysctl_cleanup().

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/netfilter/nf_conntrack_proto.c         |    4 ----
 net/netfilter/nf_conntrack_proto_generic.c |    2 ++
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
index 49fc01f..67c53a7 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -358,8 +358,4 @@ void nf_conntrack_proto_fini(void)
 	/* free l3proto protocol tables */
 	for (i = 0; i < PF_MAX; i++)
 		kfree(ve_nf_ct_protos[i]);
-#ifdef CONFIG_VE_IPTABLES
-	if (!ve_is_super(get_exec_env()))
-		kfree(ve_nf_conntrack_l4proto_generic);
-#endif
 }
diff --git a/net/netfilter/nf_conntrack_proto_generic.c b/net/netfilter/nf_conntrack_proto_generic.c
index e65f9a7..24b0e29 100644
--- a/net/netfilter/nf_conntrack_proto_generic.c
+++ b/net/netfilter/nf_conntrack_proto_generic.c
@@ -163,6 +163,8 @@ void nf_ct_proto_generic_sysctl_cleanup(void)
 		kfree(ve_nf_conntrack_l4proto_generic->ctl_compat_table);
 #endif
 		kfree(ve_nf_conntrack_l4proto_generic->ctl_table);
+
+		kfree(ve_nf_conntrack_l4proto_generic);
 	}
 }
 EXPORT_SYMBOL(nf_ct_proto_generic_sysctl_cleanup);
-- 
1.6.0.6

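For readers who want the ordering issue behind "nf: fix use after free" in isolation, here is a minimal user-space sketch. It is not the kernel code: `struct proto`, `proto_alloc()`, `proto_sysctl_cleanup()` and `live_protos` are hypothetical names, and `malloc`/`free` stand in for `kmalloc`/`kfree`. The idea it illustrates is that the members are released first and the containing object last, in a single cleanup routine, so that no later teardown step can read through a pointer that was already freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-container protocol descriptor. */
struct proto {
	int *ctl_table;		/* member owned by the descriptor */
};

/* Number of live descriptors, so a caller can check for leaks. */
static int live_protos;

struct proto *proto_alloc(void)
{
	struct proto *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->ctl_table = calloc(4, sizeof(int));
	if (!p->ctl_table) {
		free(p);
		return NULL;
	}
	live_protos++;
	return p;
}

/*
 * Free the members first, then the descriptor itself, all in one place.
 * If an earlier teardown step had already freed *p, the p->ctl_table
 * read below would be a use after free.
 */
void proto_sysctl_cleanup(struct proto **pp)
{
	struct proto *p;

	assert(pp != NULL);
	p = *pp;
	if (!p)
		return;
	free(p->ctl_table);
	free(p);
	live_protos--;
	*pp = NULL;		/* a repeated call becomes a harmless no-op */
}
```

A caller would allocate with `proto_alloc()` and hand the address of its pointer to `proto_sysctl_cleanup()`; resetting the pointer to NULL is a design choice of this sketch, not something the patch does.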
From fa7ac0b2423dc741cd7016565545abb8e36c4af4 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Wed, 27 Aug 2008 19:37:09 +0400
Subject: [PATCH] nf: fix call to kmem_cache_destroy from VEs

Free nf_conntrack_cachep only for VE0 as it is a global variable for all VEs.

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/netfilter/nf_conntrack_core.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index e811c0b..b4050b0 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1041,7 +1041,8 @@ void nf_conntrack_cleanup(void)
 
 	rcu_assign_pointer(nf_ct_destroy, NULL);
 
-	kmem_cache_destroy(nf_conntrack_cachep);
+	if (ve_is_super(ve))
+		kmem_cache_destroy(nf_conntrack_cachep);
 skip_ct_cache:
 	nf_conntrack_helper_fini();
 	nf_conntrack_expect_fini();
-- 
1.6.0.6

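The rule behind "nf: fix call to kmem_cache_destroy from VEs" can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct env`, `env_is_super()`, `cache_create()`, `conntrack_cleanup()` and `cache_alive()` are hypothetical names, id 0 plays the role of VE0, and a `malloc`ed buffer stands in for the slab cache. What it shows is that a resource shared by all containers is destroyed only when the host context tears down, while the per-container part of the cleanup runs for everyone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical execution environment; id 0 plays the role of VE0. */
struct env {
	int id;
};

static void *shared_cache;	/* one cache shared by every container */
static int per_env_cleanups;	/* counts per-container teardown runs */

static bool env_is_super(const struct env *e)
{
	return e->id == 0;
}

void cache_create(void)
{
	assert(shared_cache == NULL);
	shared_cache = malloc(64);
	assert(shared_cache != NULL);
}

bool cache_alive(void)
{
	return shared_cache != NULL;
}

void conntrack_cleanup(const struct env *e)
{
	per_env_cleanups++;	/* per-container teardown would go here */

	/* Only the host may destroy the resource shared by all containers. */
	if (env_is_super(e)) {
		free(shared_cache);
		shared_cache = NULL;
	}
}
```

Stopping a guest with `conntrack_cleanup(&guest)` leaves the cache in place for the remaining containers; only the host's cleanup releases it.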
From 17b09e1de42db77743ea9ae3dfd3a910ac57ee71 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Mon, 22 Sep 2008 13:53:27 +0400
Subject: [PATCH] conntrack: prevent double allocate/free of protos

Call nf_ct_proto_tcp_sysctl_xxx()/nf_ct_proto_tcp_sysctl_cleanup()
from nf_conntrack_init_ve()/nf_conntrack_cleanup_ve() to prevent
to be called twice from functions:
 - init_nf_ct_l3proto_ipv4()
 - init_nf_ct_l3proto_ipv6()
 - fini_nf_ct_l3proto_ipv4()
 - fini_nf_ct_l3proto_ipv6()

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |   12 ------------
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   12 ------------
 net/netfilter/nf_conntrack_standalone.c        |   13 +++++++++++++
 3 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index dca8da7..b4bb436 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -512,12 +512,6 @@ int init_nf_ct_l3proto_ipv4(void)
 	ret = nf_ct_proto_ipv4_sysctl_init();
 	if (ret < 0)
 		goto no_mem_ipv4;
-	ret = nf_ct_proto_tcp_sysctl_init();
-	if (ret < 0)
-		goto no_mem_tcp;
-	ret = nf_ct_proto_udp_sysctl_init();
-	if (ret < 0)
-		goto no_mem_udp;
 	ret = nf_ct_proto_icmp_sysctl_init();
 	if (ret < 0)
 		goto no_mem_icmp;
@@ -575,10 +569,6 @@ unreg_tcp:
 cleanup_sys:
 #ifdef CONFIG_VE_IPTABLES
 no_mem_icmp:
-	nf_ct_proto_udp_sysctl_cleanup();
-no_mem_udp:
-	nf_ct_proto_tcp_sysctl_cleanup();
-no_mem_tcp:
 	nf_ct_proto_ipv4_sysctl_cleanup();
 no_mem_ipv4:
 	nf_ct_proto_ipv4_fini();
@@ -606,8 +596,6 @@ void fini_nf_ct_l3proto_ipv4(void)
 
 #ifdef CONFIG_VE_IPTABLES
 	nf_ct_proto_icmp_sysctl_cleanup();
-	nf_ct_proto_udp_sysctl_cleanup();
-	nf_ct_proto_tcp_sysctl_cleanup();
 	nf_ct_proto_ipv4_sysctl_cleanup();
 	nf_ct_proto_ipv4_fini();
 	if (!ve_is_super(get_exec_env()))
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index e6f8f7d..cbfe1a2 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -368,12 +368,6 @@ int init_nf_ct_l3proto_ipv6(void)
 	if (!ve_is_super(get_exec_env()))
 		__module_get(THIS_MODULE);
 
-	ret = nf_ct_proto_tcp_sysctl_init();
-	if (ret < 0)
-		goto no_mem_tcp;
-	ret = nf_ct_proto_udp_sysctl_init();
-	if (ret < 0)
-		goto no_mem_udp;
 	ret = nf_ct_proto_icmpv6_sysctl_init();
 	if (ret < 0)
 		goto no_mem_icmp;
@@ -430,10 +424,6 @@ cleanup_frag6:
 cleanup_sys:
 #ifdef CONFIG_VE_IPTABLES
 no_mem_icmp:
-	nf_ct_proto_udp_sysctl_cleanup();
-no_mem_udp:
-	nf_ct_proto_tcp_sysctl_cleanup();
-no_mem_tcp:
 	if (!ve_is_super(get_exec_env()))
 		module_put(THIS_MODULE);
 #endif /* CONFIG_VE_IPTABLES */
@@ -452,8 +442,6 @@ void fini_nf_ct_l3proto_ipv6(void)
 
 #ifdef CONFIG_VE_IPTABLES
 	nf_ct_proto_icmpv6_sysctl_cleanup();
-	nf_ct_proto_udp_sysctl_cleanup();
-	nf_ct_proto_tcp_sysctl_cleanup();
 	if (!ve_is_super(get_exec_env()))
 		module_put(THIS_MODULE);
 #endif /* CONFIG_VE_IPTABLES */
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index c4d8ef2..0439df6 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -510,8 +510,19 @@ static int nf_conntrack_init_ve(void)
 	ret = nf_conntrack_standalone_init_sysctl();
 	if (ret < 0)
 		goto out_sysctl;
+	ret = nf_ct_proto_tcp_sysctl_init();
+	if (ret < 0)
+		goto out_tcp_sysctl;
+	ret = nf_ct_proto_udp_sysctl_init();
+	if (ret < 0)
+		goto out_udp_sysctl;
+
 	return 0;
 
+out_udp_sysctl:
+	nf_ct_proto_tcp_sysctl_cleanup();
+out_tcp_sysctl:
+	nf_conntrack_standalone_fini_sysctl();
 out_sysctl:
 	nf_conntrack_standalone_fini_proc();
 out_proc:
@@ -522,6 +533,8 @@ out:
 
 static void nf_conntrack_cleanup_ve(void)
 {
+	nf_ct_proto_udp_sysctl_cleanup();
+	nf_ct_proto_tcp_sysctl_cleanup();
 	nf_conntrack_standalone_fini_sysctl();
 	nf_conntrack_standalone_fini_proc();
 	nf_conntrack_cleanup();
-- 
1.6.0.6

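The shape of the fix in "conntrack: prevent double allocate/free of protos" is a single init path with goto-based unwinding and a matching cleanup in reverse order. The following is a user-space sketch of that idiom only; `step_init()`, `step_cleanup()`, `init_ve()`, `cleanup_ve()`, `active_steps()` and `fail_at` are hypothetical names, and the asserts model the double-allocate and double-free conditions the patch is meant to rule out.

```c
#include <assert.h>

/* Hypothetical setup steps, in the order they are initialised. */
enum { STEP_SYSCTL, STEP_TCP, STEP_UDP, STEP_COUNT };

static int active[STEP_COUNT];
int fail_at = -1;		/* step that should fail; -1 means none */

static int step_init(int step)
{
	if (step == fail_at)
		return -1;
	assert(!active[step]);	/* a second init would be a double allocate */
	active[step] = 1;
	return 0;
}

static void step_cleanup(int step)
{
	assert(active[step]);	/* a second cleanup would be a double free */
	active[step] = 0;
}

int active_steps(void)
{
	int i, n = 0;

	for (i = 0; i < STEP_COUNT; i++)
		n += active[i];
	return n;
}

/* Single init path: each failure undoes exactly the steps done so far. */
int init_ve(void)
{
	int ret;

	ret = step_init(STEP_SYSCTL);
	if (ret < 0)
		goto out;
	ret = step_init(STEP_TCP);
	if (ret < 0)
		goto out_tcp;
	ret = step_init(STEP_UDP);
	if (ret < 0)
		goto out_udp;
	return 0;

out_udp:
	step_cleanup(STEP_TCP);
out_tcp:
	step_cleanup(STEP_SYSCTL);
out:
	return ret;
}

/* Single cleanup path, in reverse order of init. */
void cleanup_ve(void)
{
	step_cleanup(STEP_UDP);
	step_cleanup(STEP_TCP);
	step_cleanup(STEP_SYSCTL);
}
```

Setting `fail_at` to one of the steps before calling `init_ve()` exercises the corresponding unwind label; after a failed init no step should remain active.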
From 7d3f10fc5d8e268f7572cfdd2287c049bce3af7c Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Mon, 22 Sep 2008 14:04:45 +0400
Subject: [PATCH] conntrack: prevent call register_pernet_subsys() from VE context

nf_ct_frag6_init calls register_pernet_subsys. So move nf_ct_frag6_init to
nf_conntrack_l3proto_ipv6_init() to prevent call from VE context.

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   24 +++++++++++++-----------
 1 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index cbfe1a2..b97914e 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -372,16 +372,10 @@ int init_nf_ct_l3proto_ipv6(void)
 	if (ret < 0)
 		goto no_mem_icmp;
 #endif /* CONFIG_VE_IPTABLES */
-	ret = nf_ct_frag6_init();
-	if (ret < 0) {
-		printk("nf_conntrack_ipv6: can't initialize frag6.\n");
-		goto cleanup_sys;
-	}
-
 	ret = nf_conntrack_l4proto_register(ve_nf_conntrack_l4proto_tcp6);
 	if (ret < 0) {
 		printk("nf_conntrack_ipv6: can't register tcp.\n");
-		goto cleanup_frag6;
+		goto cleanup_sys;
 	}
 
 	ret = nf_conntrack_l4proto_register(ve_nf_conntrack_l4proto_udp6);
@@ -419,8 +413,6 @@ unreg_udp:
 	nf_conntrack_l4proto_unregister(ve_nf_conntrack_l4proto_udp6);
 unreg_tcp:
 	nf_conntrack_l4proto_unregister(ve_nf_conntrack_l4proto_tcp6);
-cleanup_frag6:
-	nf_ct_frag6_cleanup();
 cleanup_sys:
 #ifdef CONFIG_VE_IPTABLES
 no_mem_icmp:
@@ -438,7 +430,6 @@ void fini_nf_ct_l3proto_ipv6(void)
 	nf_conntrack_l4proto_unregister(ve_nf_conntrack_l4proto_icmpv6);
 	nf_conntrack_l4proto_unregister(ve_nf_conntrack_l4proto_udp6);
 	nf_conntrack_l4proto_unregister(ve_nf_conntrack_l4proto_tcp6);
-	nf_ct_frag6_cleanup();
 
 #ifdef CONFIG_VE_IPTABLES
 	nf_ct_proto_icmpv6_sysctl_cleanup();
@@ -454,15 +445,25 @@ static int __init nf_conntrack_l3proto_ipv6_init(void)
 
 	need_conntrack();
 
+	ret = nf_ct_frag6_init();
+	if (ret < 0) {
+		printk("nf_conntrack_ipv6: can't initialize frag6.\n");
+		return ret;
+	}
+
 	ret = init_nf_ct_l3proto_ipv6();
 	if (ret < 0) {
 		printk(KERN_ERR "Unable to initialize netfilter protocols\n");
-		return ret;
+		goto cleanup_frag6;
 	}
 	KSYMRESOLVE(init_nf_ct_l3proto_ipv6);
 	KSYMRESOLVE(fini_nf_ct_l3proto_ipv6);
 	KSYMMODRESOLVE(nf_conntrack_ipv6);
 	return 0;
+
+cleanup_frag6:
+	nf_ct_frag6_cleanup();
+	return ret;
 }
 
 static void __exit nf_conntrack_l3proto_ipv6_fini(void)
@@ -472,6 +473,7 @@ static void __exit nf_conntrack_l3proto_ipv6_fini(void)
 	KSYMUNRESOLVE(init_nf_ct_l3proto_ipv6);
 	KSYMUNRESOLVE(fini_nf_ct_l3proto_ipv6);
 	fini_nf_ct_l3proto_ipv6();
+	nf_ct_frag6_cleanup();
 }
 
 module_init(nf_conntrack_l3proto_ipv6_init);
-- 
1.6.0.6

From 482dd20be37f61b2f94e6b3f3de1c1b9b4f9e6f1 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Mon, 22 Sep 2008 14:05:54 +0400
Subject: [PATCH] conntrack: prevent call nf_register_hooks() from VE context

Move nf_register_hooks from init_nf_ct_l3proto_ipv6() to
nf_conntrack_l3proto_ipv6_init() to prevent call from VE context.

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   19 +++++++++----------
 1 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index b97914e..71b15ab 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -396,17 +396,8 @@ int init_nf_ct_l3proto_ipv6(void)
 		goto unreg_icmpv6;
 	}
 
-	ret = nf_register_hooks(ipv6_conntrack_ops,
-				ARRAY_SIZE(ipv6_conntrack_ops));
-	if (ret < 0) {
-		printk("nf_conntrack_ipv6: can't register pre-routing defrag "
-		       "hook.\n");
-		goto unreg_ipv6;
-	}
 	return 0;
 
-unreg_ipv6:
-	nf_conntrack_l3proto_unregister(ve_nf_conntrack_l3proto_ipv6);
 unreg_icmpv6:
 	nf_conntrack_l4proto_unregister(ve_nf_conntrack_l4proto_icmpv6);
 unreg_udp:
@@ -425,7 +416,6 @@ EXPORT_SYMBOL(init_nf_ct_l3proto_ipv6);
 
 void fini_nf_ct_l3proto_ipv6(void)
 {
-	nf_unregister_hooks(ipv6_conntrack_ops, ARRAY_SIZE(ipv6_conntrack_ops));
 	nf_conntrack_l3proto_unregister(ve_nf_conntrack_l3proto_ipv6);
 	nf_conntrack_l4proto_unregister(ve_nf_conntrack_l4proto_icmpv6);
 	nf_conntrack_l4proto_unregister(ve_nf_conntrack_l4proto_udp6);
@@ -456,6 +446,14 @@ static int __init nf_conntrack_l3proto_ipv6_init(void)
 		printk(KERN_ERR "Unable to initialize netfilter protocols\n");
 		goto cleanup_frag6;
 	}
+
+	ret = nf_register_hooks(ipv6_conntrack_ops,
+				ARRAY_SIZE(ipv6_conntrack_ops));
+	if (ret < 0) {
+		printk(KERN_ERR "nf_conntrack_ipv6: can't register pre-routing "
+		       "defrag hook.\n");
+		return ret;
+	}
 	KSYMRESOLVE(init_nf_ct_l3proto_ipv6);
 	KSYMRESOLVE(fini_nf_ct_l3proto_ipv6);
 	KSYMMODRESOLVE(nf_conntrack_ipv6);
@@ -472,6 +470,7 @@ static void __exit nf_conntrack_l3proto_ipv6_fini(void)
 	KSYMMODUNRESOLVE(nf_conntrack_ipv6);
 	KSYMUNRESOLVE(init_nf_ct_l3proto_ipv6);
 	KSYMUNRESOLVE(fini_nf_ct_l3proto_ipv6);
+	nf_unregister_hooks(ipv6_conntrack_ops, ARRAY_SIZE(ipv6_conntrack_ops));
 	fini_nf_ct_l3proto_ipv6();
 	nf_ct_frag6_cleanup();
 }
-- 
1.6.0.6

From 5fff3eb60f78acaadcae8562de5d3e6504f4d4f9 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Fri, 26 Sep 2008 19:06:41 +0400
Subject: [PATCH] conntrack: adjust context during freeing

rcu callback are called from VE0 context, so we must specify context
when accessing to virtualized variables (ve_nf_conntrack_count)

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/netfilter/nf_conntrack_core.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index b4050b0..b38699c 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -539,10 +539,16 @@ EXPORT_SYMBOL_GPL(nf_conntrack_alloc);
 static void nf_conntrack_free_rcu(struct rcu_head *head)
 {
 	struct nf_conn *ct = container_of(head, struct nf_conn, rcu);
+#ifdef CONFIG_VE_IPTABLES
+	struct ve_struct *ve = set_exec_env(ct->ct_owner_env);
+#endif
 
 	nf_ct_ext_free(ct);
 	kmem_cache_free(nf_conntrack_cachep, ct);
 	atomic_dec(&ve_nf_conntrack_count);
+#ifdef CONFIG_VE_IPTABLES
+	set_exec_env(ve);
+#endif
 }
 
 void nf_conntrack_free(struct nf_conn *ct)
-- 
1.6.0.6

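The save/switch/restore pattern used in "conntrack: adjust context during freeing" can be modelled in user space like this. It is a sketch under assumptions, not the OpenVZ implementation: `struct env`, `set_exec_env_sketch()`, `get_exec_env_sketch()`, `struct conn` and `conn_free_deferred()` are hypothetical names, and a plain global pointer stands in for the per-task execution environment. The point is that a deferred callback, which always starts in the host context, must switch to the owner's context before touching per-container state, and must switch back afterwards.

```c
#include <assert.h>

/* Hypothetical container environment with one virtualized counter. */
struct env {
	int id;
	int conn_count;
};

static struct env host = { 0, 0 };
static struct env *current_env = &host;

/* Switch the current environment and return the previous one. */
struct env *set_exec_env_sketch(struct env *e)
{
	struct env *old = current_env;

	current_env = e;
	return old;
}

struct env *get_exec_env_sketch(void)
{
	return current_env;
}

struct conn {
	struct env *owner;	/* environment that created the connection */
};

/*
 * Deferred free callback: it is always entered in the host context,
 * so it switches to the owner's context before touching the counter
 * and restores the previous context before returning.
 */
void conn_free_deferred(struct conn *ct)
{
	struct env *saved = set_exec_env_sketch(ct->owner);

	get_exec_env_sketch()->conn_count--;
	set_exec_env_sketch(saved);
}
```

Without the switch, the decrement would land on the host's counter and the guest's count would never drop.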
From 3cb8bc3781889ade74c02840b2eb8ddafb6d39c5 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Wed, 1 Oct 2008 12:10:51 +0400
Subject: [PATCH] netfilter: NAT: assign nf_nat_seq_adjust_hook from VE0 context only

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/ipv4/netfilter/nf_nat_core.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/netfilter/nf_nat_core.c b/net/ipv4/netfilter/nf_nat_core.c
index f7f832b..ac9319d 100644
--- a/net/ipv4/netfilter/nf_nat_core.c
+++ b/net/ipv4/netfilter/nf_nat_core.c
@@ -645,12 +645,12 @@ int nf_nat_init(void)
 	if (ve_is_super(get_exec_env())) {
 		/* Initialize fake conntrack so that NAT will skip it */
 		nf_conntrack_untracked.status |= IPS_NAT_DONE_MASK;
+		BUG_ON(nf_nat_seq_adjust_hook != NULL);
+		rcu_assign_pointer(nf_nat_seq_adjust_hook, nf_nat_seq_adjust);
 	}
 
 	ve_nf_nat_l3proto = nf_ct_l3proto_find_get((u_int16_t)AF_INET);
 
-	BUG_ON(nf_nat_seq_adjust_hook != NULL);
-	rcu_assign_pointer(nf_nat_seq_adjust_hook, nf_nat_seq_adjust);
 	return 0;
 
 #ifdef CONFIG_VE_IPTABLES
@@ -683,9 +683,10 @@ void nf_nat_cleanup(void)
 #ifdef CONFIG_VE_IPTABLES
 	kfree(ve_nf_nat_protos);
 #endif
-	if (ve_is_super(get_exec_env()))
+	if (ve_is_super(get_exec_env())) {
 		nf_ct_extend_unregister(&nat_extend);
-	rcu_assign_pointer(nf_nat_seq_adjust_hook, NULL);
+		rcu_assign_pointer(nf_nat_seq_adjust_hook, NULL);
+	}
 	synchronize_net();
 }
 
-- 
1.6.0.6

From 490910232ebe61f65e5e5c03b7286f11291b6092 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Wed, 1 Oct 2008 12:12:36 +0400
Subject: [PATCH] netfilter: call nf_register_hooks from VE0 context only

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/ipv4/netfilter/nf_nat_standalone.c |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/netfilter/nf_nat_standalone.c b/net/ipv4/netfilter/nf_nat_standalone.c
index 9aec464..72f45db 100644
--- a/net/ipv4/netfilter/nf_nat_standalone.c
+++ b/net/ipv4/netfilter/nf_nat_standalone.c
@@ -295,10 +295,13 @@ int init_nftable_nat(void)
 		printk("nf_nat_init: can't setup rules.\n");
 		goto out_modput;
 	}
-	ret = nf_register_hooks(nf_nat_ops, ARRAY_SIZE(nf_nat_ops));
-	if (ret < 0) {
-		printk("nf_nat_init: can't register hooks.\n");
-		goto cleanup_rule_init;
+
+	if (ve_is_super(get_exec_env())) {
+		ret = nf_register_hooks(nf_nat_ops, ARRAY_SIZE(nf_nat_ops));
+		if (ret < 0) {
+			printk("nf_nat_init: can't register hooks.\n");
+			goto cleanup_rule_init;
+		}
 	}
 	return 0;
 
@@ -312,7 +315,8 @@ out_modput:
 
 void fini_nftable_nat(void)
 {
-	nf_unregister_hooks(nf_nat_ops, ARRAY_SIZE(nf_nat_ops));
+	if (ve_is_super(get_exec_env()))
+		nf_unregister_hooks(nf_nat_ops, ARRAY_SIZE(nf_nat_ops));
 	nf_nat_rule_cleanup();
 	if (!ve_is_super(get_exec_env()))
 		module_put(THIS_MODULE);
-- 
1.6.0.6

From 1acba8533b788e95c52f827d06d9629d672c80fc Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Wed, 19 Nov 2008 20:50:25 +0300
Subject: [PATCH] netfilter: Fix NULL dereference in nf_nat_setup_info

If conntrack is allowed in VE but iptable_nat is not allowed and loaded
then Oops occurs:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffa0123df6>] :nf_nat:nf_nat_setup_info+0x343/0x489
Oops: 0000 [1] SMP DEBUG_PAGEALLOC
CPU: 1
 [<ffffffff8028c277>] ? poison_obj+0x27/0x32
 [<ffffffffa012a084>] :iptable_nat:alloc_null_binding+0x44/0x46
 [<ffffffffa012a1f7>] :iptable_nat:nf_nat_rule_find+0x62/0x6b
 [<ffffffffa012a4e5>] :iptable_nat:nf_nat_fn+0x11d/0x149
 [<ffffffffa012a551>] :iptable_nat:nf_nat_local_fn+0x40/0xbf
 [<ffffffff80476ad5>] nf_iterate+0x43/0x80
 [<ffffffff8047efa0>] ? dst_output+0x0/0xd
 [<ffffffff80476de9>] nf_hook_slow+0x5e/0xc1
 [<ffffffff8047efa0>] ? dst_output+0x0/0xd
 [<ffffffff80480314>] __ip_local_out+0x9f/0xa1
 [<ffffffff80480327>] ip_local_out+0x11/0x24
 [<ffffffff80480600>] ip_push_pending_frames+0x2c6/0x345
 [<ffffffff8049b668>] raw_sendmsg+0x6a9/0x739
 [<ffffffff804a3750>] inet_sendmsg+0x46/0x53
 [<ffffffff80455ffa>] sock_sendmsg+0xdf/0xf8
RIP  [<ffffffffa0123df6>] :nf_nat:nf_nat_setup_info+0x343/0x489

So create/use iptable_nat to check was nat table initialized in VE or not.

Bug #1051

http://bugzilla.openvz.org/show_bug.cgi?id=1051

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 include/linux/netfilter.h        |   15 +++++++++++++++
 include/linux/ve.h               |    1 -
 include/net/netns/ipv4.h         |    1 +
 net/ipv4/netfilter/nf_nat_rule.c |   25 +++++++++++--------------
 4 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 8d41ea4..63c92ad 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -394,6 +394,21 @@ static inline struct net *nf_post_routing_net(const struct net_device *in,
 #endif
 }
 
+static inline struct net *nf_net(unsigned hook,
+				 const struct net_device *in,
+				 const struct net_device *out)
+{
+	switch (hook) {
+	case NF_INET_PRE_ROUTING:
+	case NF_INET_LOCAL_IN:
+	case NF_INET_FORWARD:
+		return dev_net(in);
+	case NF_INET_POST_ROUTING:
+	case NF_INET_LOCAL_OUT:
+		return dev_net(out);
+	}
+}
+
 #ifdef CONFIG_VE_IPTABLES
 #include <linux/vziptable_defs.h>
 
diff --git a/include/linux/ve.h b/include/linux/ve.h
index 2180c1f..f55f43e 100644
--- a/include/linux/ve.h
+++ b/include/linux/ve.h
@@ -56,7 +56,6 @@ struct ve_nf_conntrack {
 	struct hlist_head *_bysource;
 	struct nf_nat_protocol **_nf_nat_protos;
 	int _nf_nat_vmalloced;
-	struct xt_table *_nf_nat_table;
 	struct nf_conntrack_l3proto *_nf_nat_l3proto;
 	atomic_t _nf_conntrack_count;
 	int _nf_conntrack_max;
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index d8588d5..31add33 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -34,6 +34,7 @@ struct netns_ipv4 {
 	struct netns_frags frags;
 #ifdef CONFIG_NETFILTER
 	struct xt_table *iptable_filter;
+	struct xt_table *iptable_nat;
 	struct xt_table *iptable_mangle;
 	struct xt_table *iptable_raw;
 	struct xt_table *arptable_filter;
diff --git a/net/ipv4/netfilter/nf_nat_rule.c b/net/ipv4/netfilter/nf_nat_rule.c
index f301178..505c1cd 100644
--- a/net/ipv4/netfilter/nf_nat_rule.c
+++ b/net/ipv4/netfilter/nf_nat_rule.c
@@ -66,12 +66,6 @@ static struct xt_table __nat_table = {
 	.me = THIS_MODULE,
 	.af = AF_INET,
 };
-#ifdef CONFIG_VE_IPTABLES
-#define nat_table \
-	(get_exec_env()->_nf_conntrack->_nf_nat_table)
-#else
-static struct xt_table *nat_table;
-#endif
 
 /* Source NAT */
 static unsigned int ipt_snat_target(struct sk_buff *skb,
@@ -202,7 +196,8 @@ int nf_nat_rule_find(struct sk_buff *skb,
 {
 	int ret;
 
-	ret = ipt_do_table(skb, hooknum, in, out, nat_table);
+	ret = ipt_do_table(skb, hooknum, in, out,
+			   nf_net(hooknum, in, out)->ipv4.iptable_nat);
 
 	if (ret == NF_ACCEPT) {
 		if (!nf_nat_initialized(ct, HOOK2MANIP(hooknum)))
@@ -237,10 +232,10 @@ int nf_nat_rule_init(void)
 	int ret;
 	struct net *net = get_exec_env()->ve_netns;
 
-	nat_table = ipt_register_table(net, &__nat_table,
+	net->ipv4.iptable_nat = ipt_register_table(net, &__nat_table,
 				       &nat_initial_table.repl);
-	if (IS_ERR(nat_table))
-		return PTR_ERR(nat_table);
+	if (IS_ERR(net->ipv4.iptable_nat))
+		return PTR_ERR(net->ipv4.iptable_nat);
 	ret = 0;
 
 	if (!ve_is_super(get_exec_env()))
@@ -260,20 +255,22 @@ done:
 unregister_snat:
 	xt_unregister_target(&ipt_snat_reg);
 unregister_table:
-	ipt_unregister_table(nat_table);
-	nat_table = NULL;
+	ipt_unregister_table(net->ipv4.iptable_nat);
+	net->ipv4.iptable_nat = NULL;
 	return ret;
 }
 
 void nf_nat_rule_cleanup(void)
 {
+	struct net *net = get_exec_env()->ve_netns;
+
 	if (!ve_is_super(get_exec_env()))
 		goto skip;
 
 	xt_unregister_target(&ipt_dnat_reg);
 	xt_unregister_target(&ipt_snat_reg);
 
 skip:
-	ipt_unregister_table(nat_table);
-	nat_table = NULL;
+	ipt_unregister_table(net->ipv4.iptable_nat);
+	net->ipv4.iptable_nat = NULL;
 }
-- 
1.6.0.6

From b405aed753ac48a46e66cccfd0a37006fd11feb8 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Wed, 19 Nov 2008 20:39:51 +0300
Subject: [PATCH] netfilter: Add check to the nat hooks

Pass skb if VE wasn't granded to have nat table.

Related to bug #1051

http://bugzilla.openvz.org/show_bug.cgi?id=1051

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 net/ipv4/netfilter/nf_nat_standalone.c |   24 +++++++++++++++++++++++-
 1 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/netfilter/nf_nat_standalone.c b/net/ipv4/netfilter/nf_nat_standalone.c
index 72f45db..17d7527 100644
--- a/net/ipv4/netfilter/nf_nat_standalone.c
+++ b/net/ipv4/netfilter/nf_nat_standalone.c
@@ -157,6 +157,19 @@ nf_nat_fn(unsigned int hooknum,
 }
 
 static unsigned int
+nf_nat_local_in(unsigned int hooknum,
+		struct sk_buff *skb,
+		const struct net_device *in,
+		const struct net_device *out,
+		int (*okfn)(struct sk_buff *))
+{
+	if (!dev_net(in)->ipv4.iptable_nat)
+		return NF_ACCEPT;
+
+	return nf_nat_fn(hooknum, skb, in, out, okfn);
+}
+
+static unsigned int
 nf_nat_in(unsigned int hooknum,
 	  struct sk_buff *skb,
 	  const struct net_device *in,
@@ -166,6 +179,9 @@ nf_nat_in(unsigned int hooknum,
 	unsigned int ret;
 	__be32 daddr = ip_hdr(skb)->daddr;
 
+	if (!dev_net(in)->ipv4.iptable_nat)
+		return NF_ACCEPT;
+
 	ret = nf_nat_fn(hooknum, skb, in, out, okfn);
 	if (ret != NF_DROP && ret != NF_STOLEN &&
 	    daddr != ip_hdr(skb)->daddr) {
@@ -188,6 +204,9 @@ nf_nat_out(unsigned int hooknum,
 #endif
 	unsigned int ret;
 
+	if (!dev_net(out)->ipv4.iptable_nat)
+		return NF_ACCEPT;
+
 	/* root is playing with raw sockets. */
 	if (skb->len < sizeof(struct iphdr) ||
 	    ip_hdrlen(skb) < sizeof(struct iphdr))
@@ -221,6 +240,9 @@ nf_nat_local_fn(unsigned int hooknum,
 	enum ip_conntrack_info ctinfo;
 	unsigned int ret;
 
+	if (!dev_net(out)->ipv4.iptable_nat)
+		return NF_ACCEPT;
+
 	/* root is playing with raw sockets. */
 	if (skb->len < sizeof(struct iphdr) ||
 	    ip_hdrlen(skb) < sizeof(struct iphdr))
@@ -275,7 +297,7 @@ static struct nf_hook_ops nf_nat_ops[] __read_mostly = {
 	},
 	/* After packet filtering, change source */
 	{
-		.hook = nf_nat_fn,
+		.hook = nf_nat_local_in,
 		.owner = THIS_MODULE,
 		.pf = PF_INET,
 		.hooknum = NF_INET_LOCAL_IN,
-- 
1.6.0.6

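The guard added by "netfilter: Add check to the nat hooks" boils down to an early accept when the namespace has no NAT table. Here is a minimal user-space sketch of that check, with hypothetical names throughout (`struct netns`, `struct table`, `nat_fn()`, `nat_hook()`, and the `VERDICT_*` values, which only loosely mirror `NF_DROP`/`NF_ACCEPT`). Since hooks are registered once for all containers, every hook entry point needs such a check before it reaches code that dereferences the table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts, loosely modelled on NF_DROP / NF_ACCEPT. */
enum verdict { VERDICT_DROP = 0, VERDICT_ACCEPT = 1 };

struct table {
	enum verdict policy;
};

/* Hypothetical per-container namespace; NULL means NAT was never set up. */
struct netns {
	struct table *nat_table;
};

/* Core NAT processing: dereferences the table unconditionally. */
static enum verdict nat_fn(const struct netns *ns)
{
	return ns->nat_table->policy;
}

/*
 * Hook entry point: the hook is registered globally, so it can run for
 * a container that has no NAT table. Accept the packet in that case
 * instead of reaching the NULL dereference in nat_fn().
 */
enum verdict nat_hook(const struct netns *ns)
{
	if (!ns->nat_table)
		return VERDICT_ACCEPT;

	return nat_fn(ns);
}
```

A namespace whose `nat_table` is NULL passes every packet through `nat_hook()`; one with a table gets whatever verdict the table produces.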
From b5e1f74cee5bc2c45bdca53a7218fb8de89215dd Mon Sep 17 00:00:00 2001
From: Pavel Emelyanov <[email protected]>
Date: Fri, 28 Nov 2008 12:46:11 +0300
Subject: [PATCH] netlink: Fix oops in netlink conntrack module

If we load conntrack modules after ve start one pointer on
ve_struct is NULL and accessing it causes an oops.

This is handled in most of the places, but the netlink
interface. Fix this one as well.

http://bugzilla.openvz.org/show_bug.cgi?id=788

Signed-off-by: Pavel Emelyanov <[email protected]>
---
 include/net/netfilter/nf_conntrack_l4proto.h   |    3 +++
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |    3 +++
 net/netfilter/nf_conntrack_netlink.c           |   18 ++++++++++++++++++
 3 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
index 43ecaf7..43ca754 100644
--- a/include/net/netfilter/nf_conntrack_l4proto.h
+++ b/include/net/netfilter/nf_conntrack_l4proto.h
@@ -126,6 +126,9 @@ extern unsigned int nf_ct_log_invalid;
 #ifdef CONFIG_VE_IPTABLES
 #include <linux/sched.h>
 #define ve_nf_ct4 (get_exec_env()->_nf_conntrack)
+#define ve_nf_ct_initialized() (get_exec_env()->_nf_conntrack != NULL)
+#else
+#define ve_nf_ct_initialized() 1
 #endif
 
 #if defined(CONFIG_VE_IPTABLES) && defined(CONFIG_SYSCTL)
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index b4bb436..c3c22dd 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -304,6 +304,9 @@ getorigdst(struct sock *sk, int optval, void __user *user, int *len)
 	const struct nf_conntrack_tuple_hash *h;
 	struct nf_conntrack_tuple tuple;
 
+	if (!ve_nf_ct_initialized())
+		return -ENOPROTOOPT;
+
 	memset(&tuple, 0, sizeof(tuple));
 	tuple.src.u3.ip = inet->rcv_saddr;
 	tuple.src.u.tcp.port = inet->sport;
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index e9bee13..f15c4ba 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -790,6 +790,9 @@ ctnetlink_del_conntrack(struct sock *ctnl, struct sk_buff *skb,
 	u_int8_t u3 = nfmsg->nfgen_family;
 	int err = 0;
 
+	if (!ve_nf_ct_initialized())
+		return -ENOPROTOOPT;
+
 	if (cda[CTA_TUPLE_ORIG])
 		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_ORIG, u3);
 	else if (cda[CTA_TUPLE_REPLY])
@@ -836,6 +839,9 @@ ctnetlink_get_conntrack(struct sock *ctnl, struct sk_buff *skb,
 	u_int8_t u3 = nfmsg->nfgen_family;
 	int err = 0;
 
+	if (!ve_nf_ct_initialized())
+		return -ENOPROTOOPT;
+
 	if (nlh->nlmsg_flags & NLM_F_DUMP) {
 #ifndef CONFIG_NF_CT_ACCT
 		if (NFNL_MSG_TYPE(nlh->nlmsg_type) == IPCTNL_MSG_CT_GET_CTRZERO)
@@ -1203,6 +1209,9 @@ ctnetlink_new_conntrack(struct sock *ctnl, struct sk_buff *skb,
 	u_int8_t u3 = nfmsg->nfgen_family;
 	int err = 0;
 
+	if (!ve_nf_ct_initialized())
+		return -ENOPROTOOPT;
+
 	if (cda[CTA_TUPLE_ORIG]) {
 		err = ctnetlink_parse_tuple(cda, &otuple, CTA_TUPLE_ORIG, u3);
 		if (err < 0)
@@ -1527,6 +1536,9 @@ ctnetlink_get_expect(struct sock *ctnl, struct sk_buff *skb,
 	u_int8_t u3 = nfmsg->nfgen_family;
 	int err = 0;
 
+	if (!ve_nf_ct_initialized())
+		return -ENOPROTOOPT;
+
 	if (nlh->nlmsg_flags & NLM_F_DUMP) {
 		return netlink_dump_start(ctnl, skb, nlh,
 					  ctnetlink_exp_dump_table,
@@ -1588,6 +1600,9 @@ ctnetlink_del_expect(struct sock *ctnl, struct sk_buff *skb,
 	unsigned int i;
 	int err;
 
+	if (!ve_nf_ct_initialized())
+		return -ENOPROTOOPT;
+
 	if (cda[CTA_EXPECT_TUPLE]) {
 		/* delete a single expect by tuple */
 		err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE, u3);
@@ -1726,6 +1741,9 @@ ctnetlink_new_expect(struct sock *ctnl, struct sk_buff *skb,
 	u_int8_t u3 = nfmsg->nfgen_family;
 	int err = 0;
 
+	if (!ve_nf_ct_initialized())
+		return -ENOPROTOOPT;
+
 	if (!cda[CTA_EXPECT_TUPLE]
 	    || !cda[CTA_EXPECT_MASK]
 	    || !cda[CTA_EXPECT_MASTER])
-- 
1.6.0.6

From 09686c184a2cb815cbd5af500fe468311887d746 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <[email protected]>
Date: Mon, 26 Jan 2009 15:48:02 +0300
Subject: [PATCH] Free skb->nf_bridge in veth_xmit() and venet_xmit()

We free skb->nfct in veth_xmit, but also have to free skb->nf_bridge.

Note: Why it works in 2.6.24-ovz but doesn't work in 2.6.26-ovz ?

1. It issue is only if BRIDGE_NETFILTER=y
2. nf_hook_register() has effect to all VEs in 2.6.26-ovz (in 2.6.24-ovz
   doesn't). Thus bridge hook ip_sabotage_in is not called for 2.6.24-ovz,
   but is called for 2.6.26-ovz.

http://bugzilla.openvz.org/show_bug.cgi?id=1146

Signed-off-by: Vitaliy Gusev <[email protected]>
Signed-off-by: Pavel Emelyanov <[email protected]>
---
 drivers/net/venet_core.c |    5 +----
 drivers/net/vzethdev.c   |    5 +----
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/net/venet_core.c b/drivers/net/venet_core.c
index 6b21630..8770255 100644
--- a/drivers/net/venet_core.c
+++ b/drivers/net/venet_core.c
@@ -272,10 +272,7 @@ static int venet_xmit(struct sk_buff *skb, struct net_device *dev)
 	dst_release(skb->dst);
 	skb->dst = NULL;
 
-#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
-	nf_conntrack_put(skb->nfct);
-	skb->nfct = NULL;
-#endif
+	nf_reset(skb);
 	length = skb->len;
 
 	netif_rx(skb);
diff --git a/drivers/net/vzethdev.c b/drivers/net/vzethdev.c
index 1414618..dd2b693 100644
--- a/drivers/net/vzethdev.c
+++ b/drivers/net/vzethdev.c
@@ -311,10 +311,7 @@ out:
 	dst_release(skb->dst);
 	skb->dst = NULL;
 
-#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
-	nf_conntrack_put(skb->nfct);
-	skb->nfct = NULL;
-#endif
+	nf_reset(skb);
 	length = skb->len;
 
 	netif_rx(skb);
-- 
1.6.0.6

