From: Stanislav Kinsburskiy <[email protected]>

Patchset description:
Create conntrack structures only if they are really needed

Allocate conntracks only after there is a rule which uses them.

v2: Allow after there is a rule and never prohibit.

khorenko@: the idea behind all of this:
we want to give Containers the possibility to use iptables rules which
require conntracks. At the same time we'd like to avoid the problems we
would have if we just enabled conntrack allocation for all Containers
and the Hardware Node by default:
1) if conntracks are not really used by a CT, the structures are still
   allocated, decreasing performance
2) the number of conntracks in the system is limited => DDoS is possible

So we decided to implement a feature: do not allocate conntracks until
there are rules in the netspace which require them.

Disadvantage: if a user on a live system loads an iptables rule which
requires conntracks, connections which are already alive may be handled
less precisely. I believe this is OK.

Once conntrack allocation is enabled, it cannot be disabled until
reboot/CT restart. This is done in order to:
a) simplify the code
b) have a possibility to unconditionally enable conntracks, for example
   for userspace conntrack users
   (http://conntrack-tools.netfilter.org/manual.html)
c) adding a new iptables rule is implemented in the following way:
   - all rules are unloaded
   - the new rule is added to the bunch of rules
   - all rules (including the new one) are uploaded to the kernel
   => each new rule addition would result in conntrack allocation
      disable/enable => race window for unhandled connections

=======================
This patch description:

Allocations are allowed only when there are conntrack users.
By default they are prohibited.

https://jira.sw.ru/browse/PSBM-51050

Signed-off-by: Kirill Tkhai <[email protected]>
Reviewed-by: Andrei Vagin <[email protected]>

+++
ve/net: Move net->ct.can_alloc check up to resolve_normal_ct()

Move it up the stack to stop the creation of a CT earlier.
This saves us a search in the CT hashes and speeds things up.
So now nf_conntrack_alloc() certainly creates a CT, __nf_conntrack_alloc()
doesn't return NULL, and it does not need to be external.

Signed-off-by: Kirill Tkhai <[email protected]>
Reviewed-by: Pavel Tikhomirov <[email protected]>

To be merged to commit 874e7b5c6eb9
"net: Primitives to enable conntrack allocation"

https://jira.sw.ru/browse/PSBM-54823

Signed-off-by: Kirill Tkhai <[email protected]>

+++
ve/net: Do not initialize netns_ct::can_alloc twice

It's already initialized to zero during net creation in net_alloc(),
so do not do that twice. Also, some modules that allow conntracks do not
depend on nf_conntrack.ko, so the second initialization rewrites
can_alloc to zero if nf_conntrack.ko is loaded later.

(This may be merged with commit af2b974e4755
"net: Primitives to enable conntrack allocation")

https://jira.sw.ru/browse/PSBM-56500

Signed-off-by: Kirill Tkhai <[email protected]>

=======================
net: Do not allow conntrack if netlink conntrack is requested

The scheme for allowing conntracks suggests allowing conntrack only after
a rule is inserted. But this place is not inserting a rule, it's a manual
conntrack creation.

Signed-off-by: Kirill Tkhai <[email protected]>
Reviewed-by: Pavel Tikhomirov <[email protected]>

(cherry picked from commit 550b98d291cb0fb0b0270ab83dfc0fb6f48aadfe)

VZ 8 rebase part https://jira.sw.ru/browse/PSBM-127783

Signed-off-by: Alexander Mikhalitsyn <[email protected]>
---
 include/net/net_namespace.h       | 10 ++++++++++
 include/net/netns/conntrack.h     |  1 +
 net/netfilter/nf_conntrack_core.c |  6 ++++++
 net/netfilter/nf_synproxy_core.c  |  1 +
 4 files changed, 18 insertions(+)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 93838c430818..634d107dff8b 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -344,6 +344,16 @@ static inline struct net *read_pnet(const possible_net_t *pnet)
 #define __net_initconst	__initconst
 #endif
 
+#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
+static inline void allow_conntrack_allocation(struct net *net)
+{
+	net->ct.can_alloc = true;
+	smp_wmb(); /* Pairs with rmb in resolve_normal_ct() */
+}
+#else
+static inline void allow_conntrack_allocation(struct net *net) { }
+#endif
+
 int peernet2id_alloc(struct net *net, struct net *peer, gfp_t gfp);
 int peernet2id(struct net *net, struct net *peer);
 bool peernet_has_id(struct net *net, struct net *peer);
diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
index 19bcf4173ccb..1094ad116224 100644
--- a/include/net/netns/conntrack.h
+++ b/include/net/netns/conntrack.h
@@ -106,6 +106,7 @@ struct ct_pcpu {
 
 struct netns_ct {
 	atomic_t		count;
+	bool			can_alloc; /* Initialized in 0 by net_alloc */
 	unsigned int		max;
 	unsigned int		expect_count;
 #ifdef CONFIG_NF_CONNTRACK_EVENTS
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 6ac5168d6c84..3a1057d8c368 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1660,6 +1660,12 @@ resolve_normal_ct(struct nf_conn *tmpl,
 	struct nf_conn *ct;
 	u32 hash;
 
+	if (!state->net->ct.can_alloc) {
+		/* No rules loaded */
+		return 0;
+	}
+	smp_rmb(); /* Pairs with wmb in allow_conntrack_allocation() */
+
 	if (!nf_ct_get_tuple(skb, skb_network_offset(skb),
 			     dataoff, state->pf, protonum, state->net,
 			     &tuple)) {
diff --git a/net/netfilter/nf_synproxy_core.c b/net/netfilter/nf_synproxy_core.c
index 3996ca086ec2..eae42e67af47 100644
--- a/net/netfilter/nf_synproxy_core.c
+++ b/net/netfilter/nf_synproxy_core.c
@@ -340,6 +340,7 @@ static int __net_init synproxy_net_init(struct net *net)
 	struct nf_conn *ct;
 	int err = -ENOMEM;
 
+	allow_conntrack_allocation(net);
 	ct = nf_ct_tmpl_alloc(net, &nf_ct_zone_dflt, GFP_KERNEL);
 	if (!ct)
 		goto err1;
-- 
2.28.0

_______________________________________________
Devel mailing list
[email protected]
https://lists.openvz.org/mailman/listinfo/devel
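
For readers who want to experiment with the gating scheme outside the kernel, here is a minimal userspace sketch in C11. It is not the patch's code: struct ct_ns, ct_ns_allow(), ct_ns_track() and ct_ns_demo() are hypothetical names, and the kernel's plain store plus smp_wmb()/smp_rmb() pair is swapped for C11 release/acquire atomics, an assumption made only so the example builds standalone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-namespace state (not struct netns_ct). */
struct ct_ns {
	atomic_bool can_alloc;	/* zero-initialized: allocation prohibited */
	unsigned int count;	/* entries "allocated" so far */
};

/* One-way switch: called when a rule that needs conntrack is loaded. */
static void ct_ns_allow(struct ct_ns *ns)
{
	atomic_store_explicit(&ns->can_alloc, true, memory_order_release);
}

/*
 * Packet path: returns true if an entry was created, false if the
 * packet was left untracked because no rule has enabled allocation.
 */
static bool ct_ns_track(struct ct_ns *ns)
{
	if (!atomic_load_explicit(&ns->can_alloc, memory_order_acquire))
		return false;	/* no rules loaded */
	ns->count++;		/* single-threaded demo; not atomic */
	return true;
}

/* Usage: nothing is tracked before ct_ns_allow(), everything after it. */
static void ct_ns_demo(void)
{
	struct ct_ns ns = { 0 };

	assert(!ct_ns_track(&ns));
	ct_ns_allow(&ns);
	assert(ct_ns_track(&ns) && ns.count == 1);
}
```

The flag is only ever set and never cleared, mirroring the "allow and never prohibit" rule from v2 of the patchset.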
