Re: RFC: Designing per chain rule cache support in libnftnl
Phil Sutter wrote: > In order to improve performance in 'nft -f' as well as xtables-restore > with very large rulesets, we need to store rules by chain they belong > to. In order to avoid pointless code duplication, this should be > supported by libnftnl. Unfortunately we still need to change lookup algorithm as well (hash, tree?), linear list scan is too expensive. We might even need multiple internal ways to keep track of the chains, e.g. to accelerate insert/delete-by-index :-/ > Looking into the topic, it seems like extending struct nftnl_chain is > the most straightforward way to go. My idea is to embed an > nftnl_rule_list in there, though I'm unsure how to best do that in > practice: > > We could either add a field of type struct nftnl_rule_list which would > have to be initialized/cleared in nftnl_chain_alloc() and > nftnl_chain_free(). This would be accompanied by a function to retrieve > the pointer to that field so the existing rule_list routines may be used > with it. > > Another option would be to add a pointer to a struct nftnl_rule_list. > Having a function to retrieve a pointer to that pointer, the rule_list > could be initialized/cleared by users on demand. > > What do you consider more practical? Is there a third option I didn't > think of yet? I'd vote for the former (embed nftnl_rule_list). If user doesn't want it cleared at nftnl_chain_free() time they can always allocate a new nftnl_rule_list and splice to that list.
Re: [iptables PATCH] arptables: Support --set-counters option
Phil Sutter wrote: > Relevant code for this was already present (short option '-c'), just the > long option definition was missing. Applied, thanks.
[iptables PATCH] arptables: Support --set-counters option
Relevant code for this was already present (short option '-c'), just the long option definition was missing. While being at it, add '-c' to help text. Signed-off-by: Phil Sutter --- iptables/xtables-arp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/iptables/xtables-arp.c b/iptables/xtables-arp.c index 5a9924ca56442..2f369d9aadb01 100644 --- a/iptables/xtables-arp.c +++ b/iptables/xtables-arp.c @@ -144,6 +144,7 @@ static struct option original_opts[] = { { "help", 2, 0, 'h' }, { "line-numbers", 0, 0, '0' }, { "modprobe", 1, 0, 'M' }, + { "set-counters", 1, 0, 'c' }, { 0 } }; @@ -481,7 +482,7 @@ exit_printhelp(void) " --line-numbers print line numbers when listing\n" " --exact -x expand numbers (display exact values)\n" " --modprobe=try to insert modules using this command\n" -" --set-counters PKTS BYTES set the counter during insert/append\n" +" --set-counters -c PKTS BYTESset the counter during insert/append\n" "[!] --version -V print package version.\n"); printf(" opcode strings: \n"); for (i = 0; i < NUMOPCODES; i++) -- 2.19.0
[PATCH nf v2 2/2] netfilter: nat: fix double register in masquerade modules
masquerade modules register notifier and that should not be double-registered. so that these modules manage reference counter. If already notifiers are registered, it just return success. But there is unsafe scenario. test commands: while : do modprobe ip6t_MASQUERADE & modprobe nft_masq_ipv6 & modprobe -rv ip6t_MASQUERADE & modprobe -rv nft_masq_ipv6 & done numbers are reference count. CPU0CPU1CPU2CPU3CPU4 [insmod][insmod][rmmod] [rmmod] [insmod] 0->1 register1->2 returns 2->1 returns 1->0 0->1 register <-- unregister The unregistation of CPU3 should be processed before the registration of CPU4. In order to fix this, mutex can be used. So that this patch uses it. splat looks like: [ 323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381] [ 323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n] [ 323.869574] irq event stamp: 194074 [ 323.898930] hardirqs last enabled at (194073): [] trace_hardirqs_on_thunk+0x1a/0x1c [ 323.898930] hardirqs last disabled at (194074): [] trace_hardirqs_off_thunk+0x1a/0x1c [ 323.898930] softirqs last enabled at (182132): [] __do_softirq+0x6ec/0xa3b [ 323.898930] softirqs last disabled at (182109): [] irq_exit+0x1a6/0x1e0 [ 323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27 [ 323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240 [ 323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 00 00 fc ff df eb 22 48 8d 7b 10 488 [ 323.898930] RSP: 0018:888101597218 EFLAGS: 0206 ORIG_RAX: ff13 [ 323.898930] RAX: RBX: c04361c0 RCX: [ 323.898930] RDX: 126132ae RSI: c04aa3c0 RDI: c04361d0 [ 323.898930] RBP: c04361c8 R08: R09: 0001 [ 323.898930] R10: 8881015972b0 R11: fbfff26132c4 R12: dc00 [ 323.898930] R13: R14: 1110202b2e44 R15: c04aa3c0 [ 323.898930] FS: 7f813ed41540() GS:88811ae0() knlGS: [ 323.898930] CS: 0010 DS: ES: CR0: 80050033 [ 323.898930] CR2: 559bf2c9f120 CR3: 00010bc8 CR4: 001006f0 [ 323.898930] Call Trace: [ 323.898930] ? atomic_notifier_chain_register+0x2d0/0x2d0 [ 323.898930] ? down_read+0x150/0x150 [ 323.898930] ? sched_clock_cpu+0x126/0x170 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] register_netdevice_notifier+0xbb/0x790 [ 323.898930] ? __dev_close_many+0x2d0/0x2d0 [ 323.898930] ? __mutex_unlock_slowpath+0x17f/0x740 [ 323.898930] ? wait_for_completion+0x710/0x710 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] ? up_write+0x6c/0x210 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 324.127073] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 324.127073] nft_chain_filter_init+0x1e/0xe8a [nf_tables] [ 324.127073] nf_tables_module_init+0x37/0x92 [nf_tables] [ ... ] Fixes: 8dd33cc93ec9 ("netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables") Fixes: be6b635cd674 ("netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables") Signed-off-by: Taehee Yoo --- v2: - Add second patch - return success when notifier is already registered. (Florian Westphal) v1: Initial patch net/ipv4/netfilter/nf_nat_masquerade_ipv4.c | 23 ++--- net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 23 ++--- 2 files changed, 32 insertions(+), 14 deletions(-) diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c index c7d7fa4fc369..41327bb99093 100644 --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c @@ -147,15 +147,17 @@ static struct notifier_block masq_inet_notifier = { .notifier_call = masq_inet_event, }; -static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0); +static int masq_refcnt; +static DEFINE_MUTEX(masq_mutex); int nf_nat_masquerade_ipv4_register_notifier(void) { - int ret; + int ret = 0; + mutex_lock(_mutex); /* check if the notifier was already set */ - if (atomic_inc_return(_notifier_refcount) > 1) - return 0; + if (++masq_refcnt > 1) + goto out_unlock; /* Register for device down reports */ ret
[PATCH nf v2 0/2] netfilter: fix notifier registration bugs
This patch series fix notifier registration bugs. First patch adds error handling code for failure of notifier registration. notifier registration can be failed. so that error handling code are needed. Second patch fixes double-register bug in masqerade modules. In order to protect double-register, masquerade modules manage reference count. but it's not enough. So that, this patch uses mutex instead of atomic value. v2: - Add second patch - return success when notifier is already registered. (Florian Westphal) v1: Initial patch Taehee Yoo (2): netfilter: add missing error handling code for register functions netfilter: nat: fix double register in masquerade modules .../net/netfilter/ipv4/nf_nat_masquerade.h| 2 +- .../net/netfilter/ipv6/nf_nat_masquerade.h| 2 +- net/ipv4/netfilter/ipt_MASQUERADE.c | 7 ++- net/ipv4/netfilter/nf_nat_masquerade_ipv4.c | 38 +++--- net/ipv4/netfilter/nft_masq_ipv4.c| 4 +- net/ipv6/netfilter/ip6t_MASQUERADE.c | 8 ++- net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 49 ++- net/ipv6/netfilter/nft_masq_ipv6.c| 4 +- net/netfilter/nft_flow_offload.c | 5 +- 9 files changed, 89 insertions(+), 30 deletions(-) -- 2.17.1
[PATCH nf v2 1/2] netfilter: add missing error handling code for register functions
register_{netdevice/inetaddr/inet6addr}_notifier returns value that could be error value. so that error handling code are needed. Signed-off-by: Taehee Yoo --- v2: - Add second patch - return success when notifier is already registered. (Florian Westphal) v1: Initial patch .../net/netfilter/ipv4/nf_nat_masquerade.h| 2 +- .../net/netfilter/ipv6/nf_nat_masquerade.h| 2 +- net/ipv4/netfilter/ipt_MASQUERADE.c | 7 ++-- net/ipv4/netfilter/nf_nat_masquerade_ipv4.c | 21 +--- net/ipv4/netfilter/nft_masq_ipv4.c| 4 ++- net/ipv6/netfilter/ip6t_MASQUERADE.c | 8 +++-- net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 32 +-- net/ipv6/netfilter/nft_masq_ipv6.c| 4 ++- net/netfilter/nft_flow_offload.c | 5 ++- 9 files changed, 63 insertions(+), 22 deletions(-) diff --git a/include/net/netfilter/ipv4/nf_nat_masquerade.h b/include/net/netfilter/ipv4/nf_nat_masquerade.h index cd24be4c4a99..13d55206bb9f 100644 --- a/include/net/netfilter/ipv4/nf_nat_masquerade.h +++ b/include/net/netfilter/ipv4/nf_nat_masquerade.h @@ -9,7 +9,7 @@ nf_nat_masquerade_ipv4(struct sk_buff *skb, unsigned int hooknum, const struct nf_nat_range2 *range, const struct net_device *out); -void nf_nat_masquerade_ipv4_register_notifier(void); +int nf_nat_masquerade_ipv4_register_notifier(void); void nf_nat_masquerade_ipv4_unregister_notifier(void); #endif /*_NF_NAT_MASQUERADE_IPV4_H_ */ diff --git a/include/net/netfilter/ipv6/nf_nat_masquerade.h b/include/net/netfilter/ipv6/nf_nat_masquerade.h index 0c3b5ebf0bb8..2917bf95c437 100644 --- a/include/net/netfilter/ipv6/nf_nat_masquerade.h +++ b/include/net/netfilter/ipv6/nf_nat_masquerade.h @@ -5,7 +5,7 @@ unsigned int nf_nat_masquerade_ipv6(struct sk_buff *skb, const struct nf_nat_range2 *range, const struct net_device *out); -void nf_nat_masquerade_ipv6_register_notifier(void); +int nf_nat_masquerade_ipv6_register_notifier(void); void nf_nat_masquerade_ipv6_unregister_notifier(void); #endif /* _NF_NAT_MASQUERADE_IPV6_H_ */ diff --git a/net/ipv4/netfilter/ipt_MASQUERADE.c b/net/ipv4/netfilter/ipt_MASQUERADE.c index ce1512b02cb2..fd3f9e8a74da 100644 --- a/net/ipv4/netfilter/ipt_MASQUERADE.c +++ b/net/ipv4/netfilter/ipt_MASQUERADE.c @@ -81,9 +81,12 @@ static int __init masquerade_tg_init(void) int ret; ret = xt_register_target(_tg_reg); + if (ret) + return ret; - if (ret == 0) - nf_nat_masquerade_ipv4_register_notifier(); + ret = nf_nat_masquerade_ipv4_register_notifier(); + if (ret) + xt_unregister_target(_tg_reg); return ret; } diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c index a9d5e013e555..c7d7fa4fc369 100644 --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c @@ -149,16 +149,29 @@ static struct notifier_block masq_inet_notifier = { static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0); -void nf_nat_masquerade_ipv4_register_notifier(void) +int nf_nat_masquerade_ipv4_register_notifier(void) { + int ret; + /* check if the notifier was already set */ if (atomic_inc_return(_notifier_refcount) > 1) - return; + return 0; /* Register for device down reports */ - register_netdevice_notifier(_dev_notifier); + ret = register_netdevice_notifier(_dev_notifier); + if (ret) + goto err_dec; /* Register IP address change reports */ - register_inetaddr_notifier(_inet_notifier); + ret = register_inetaddr_notifier(_inet_notifier); + if (ret) + goto err_unregister; + + return ret; +err_unregister: + unregister_netdevice_notifier(_dev_notifier); +err_dec: + atomic_dec(_notifier_refcount); + return ret; } EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_register_notifier); diff --git a/net/ipv4/netfilter/nft_masq_ipv4.c b/net/ipv4/netfilter/nft_masq_ipv4.c index f1193e1e928a..6847de1d1db8 100644 --- a/net/ipv4/netfilter/nft_masq_ipv4.c +++ b/net/ipv4/netfilter/nft_masq_ipv4.c @@ -69,7 +69,9 @@ static int __init nft_masq_ipv4_module_init(void) if (ret < 0) return ret; - nf_nat_masquerade_ipv4_register_notifier(); + ret = nf_nat_masquerade_ipv4_register_notifier(); + if (ret) + nft_unregister_expr(_masq_ipv4_type); return ret; } diff --git a/net/ipv6/netfilter/ip6t_MASQUERADE.c b/net/ipv6/netfilter/ip6t_MASQUERADE.c index 491f808e356a..29c7f1915a96 100644 --- a/net/ipv6/netfilter/ip6t_MASQUERADE.c +++ b/net/ipv6/netfilter/ip6t_MASQUERADE.c @@ -58,8 +58,12 @@ static int __init masquerade_tg6_init(void) int err; err = xt_register_target(_tg6_reg); - if (err == 0) -