Re: [PATCH 00/28] Reenable maybe-uninitialized warnings
On Tue, Oct 18, 2016 at 12:03:28AM +0200, Arnd Bergmann wrote:
> This is a set of patches that I hope to get into v4.9 in some form
> in order to turn on the -Wmaybe-uninitialized warnings again.

Hi Arnd,

I just complained to Geert that I was introducing way too many bugs or
pointless warnings for some compilers lately, but gcc didn't warn me
about them. From a little research, the lack of -Wmaybe-uninitialized
seems to be the reason for it, so I'm all for re-enabling it.

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH nftables] Fix register allocation for EXPR_SET_ELEM
From: Anders K. Pedersen

I noticed that while

 # nft add rule ip6 filter postrouting \
     flow table acct_out \{ meta iif . ip6 saddr timeout 600s counter \}

works, the opposite order for the concatenated expressions fails:

 # nft add rule ip6 filter postrouting \
     flow table acct_out \{ ip6 saddr . meta iif timeout 600s counter \}
 nft: netlink_linearize.c:634: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed.

I traced this down to get_register() and release_register(), where the
EXPR_CONCAT handling isn't hit, when it's embedded in EXPR_SET_ELEM, and
fixed it similarly to how EXPR_SET_ELEM is handled in netlink_gen_expr().

Signed-off-by: Anders K. Pedersen
---
 src/netlink_linearize.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/netlink_linearize.c b/src/netlink_linearize.c
--- a/src/netlink_linearize.c
+++ b/src/netlink_linearize.c
@@ -73,6 +73,9 @@ static void __release_register(struct netlink_linearize_ctx *ctx,
 static enum nft_registers get_register(struct netlink_linearize_ctx *ctx,
 				       const struct expr *expr)
 {
+	if (expr && expr->ops->type == EXPR_SET_ELEM)
+		return get_register(ctx, expr->key);
+
 	if (expr && expr->ops->type == EXPR_CONCAT)
 		return __get_register(ctx, expr->len);
 	else
@@ -82,6 +85,9 @@ static enum nft_registers get_register(struct netlink_linearize_ctx *ctx,
 static void release_register(struct netlink_linearize_ctx *ctx,
 			     const struct expr *expr)
 {
+	if (expr && expr->ops->type == EXPR_SET_ELEM)
+		return release_register(ctx, expr->key);
+
 	if (expr && expr->ops->type == EXPR_CONCAT)
 		__release_register(ctx, expr->len);
 	else
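
For readers who want to see the idea in isolation, here is a minimal
sketch of "look through the set element wrapper before deciding how many
registers are needed". It is not nftables code: the cut-down struct expr,
the REG_BITS constant and the regs_needed() helper are all assumptions
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the nftables expression types. */
enum expr_type { EXPR_VALUE, EXPR_CONCAT, EXPR_SET_ELEM };

struct expr {
	enum expr_type type;
	unsigned int len;        /* size in bits */
	const struct expr *key;  /* wrapped expression, EXPR_SET_ELEM only */
};

#define REG_BITS 128 /* assumed size of one register in this model */

/* Number of registers an expression occupies in this toy model. */
unsigned int regs_needed(const struct expr *e)
{
	/* Unwrap the set element first, so a wrapped concatenation is
	 * sized the same way as a bare one. */
	if (e && e->type == EXPR_SET_ELEM)
		return regs_needed(e->key);

	if (e && e->type == EXPR_CONCAT)
		return (e->len + REG_BITS - 1) / REG_BITS;

	return 1;
}
```

With a 160-bit concatenation (128 bits of address plus 32 bits of
interface index), both the bare concatenation and the same concatenation
wrapped in a set element come out as two registers in this model.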
[PATCH 28/28] Kbuild: bring back -Wmaybe-uninitialized warning
Traditionally, we have always had warnings about uninitialized variables
enabled, as this is part of -Wall, and generally a good idea [1], but it
also always produced false positives, mainly because this is a variation
of the halting problem and provably impossible to get right in all cases
[2].

Various people have identified cases that are particularly bad for false
positives, and in commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized
when building with -Os"), I turned off the warning for any build that was
done with CC_OPTIMIZE_FOR_SIZE. This drastically reduced the number of
false positive warnings in the default build but unfortunately had the
side effect of turning the warning off completely in 'allmodconfig'
builds, which in turn let a lot of warnings (both actual bugs, and
remaining false positives) go in unnoticed.

Commit 877417e6ffb9 ("Kbuild: change CC_OPTIMIZE_FOR_SIZE definition")
enabled the warning again for allmodconfig builds in v4.7, and in
v4.8-rc1 I had finally managed to address all warnings I get in an ARM
allmodconfig build and most other maybe-uninitialized warnings for ARM
randconfig builds.

However, commit 6e8d666e9253 ("Disable "maybe-uninitialized" warning
globally") was merged at the same time and disabled it completely for
all configurations, because of false-positive warnings on x86 that I had
not addressed until then. This caused a lot of actual bugs to get merged
into mainline, and I sent several dozen patches for these during the
v4.9 development cycle. Most of these are actual bugs, some are for
correct code that is safe because it is only called under external
constraints that make it impossible to run into the case that gcc sees,
and in a few cases gcc is just stupid and finds something that can
obviously never happen.

I have now done a few thousand randconfig builds on x86 and collected
all patches that I needed to address every single warning I got (I can
provide the combined patch for the other warnings if anyone is
interested), so I hope we can get the warning back and let people catch
the actual bugs earlier.

Note that the majority of the patches I created are for the third kind
of problem (stupid false-positives), for one of two reasons:

- some of them only get triggered in certain combinations of config
  options, so we don't always run into them, and
- the actual bugs tend to get addressed much quicker as they also lead
  to incorrect runtime behavior.

These 27 patches address the warnings that either occur in one of the
more common configurations (defconfig, allmodconfig, or something built
by the kbuild robot or kernelci.org), or they are about a real bug. It
would be good to get these all into v4.9 if we want to turn on the
warning again.

I have tested these extensively with gcc-4.9 and gcc-6 and done a bit of
testing with gcc-5, and all of these should now be fine. gcc-4.8 is much
worse about the false-positive warnings and is also fairly old now, so
I'm leaving the warning disabled with that version. gcc-4.7 and older
don't understand the -Wno-maybe-uninitialized option and are not
affected by this patch either way.

I have another (smaller) series of patches for warnings that are both
harmless and not as easy to trigger, and I will send them for inclusion
in v4.10.

Link: https://rusty.ozlabs.org/?p=232 [1]
Link: https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings [2]
Signed-off-by: Arnd Bergmann
---
 Makefile               | 10 ++
 arch/arc/Makefile      | 4 +++-
 scripts/Makefile.ubsan | 4
 3 files changed, 13 insertions(+), 5 deletions(-)

Cc: x...@kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Mauro Carvalho Chehab
Cc: Martin Schwidefsky
Cc: linux-s...@vger.kernel.org
Cc: Ilya Dryomov
Cc: dri-de...@lists.freedesktop.org
Cc: linux-...@lists.infradead.org
Cc: Herbert Xu
Cc: linux-cry...@vger.kernel.org
Cc: "David S. Miller"
Cc: net...@vger.kernel.org
Cc: Greg Kroah-Hartman
Cc: ceph-de...@vger.kernel.org
Cc: linux-f2fs-de...@lists.sourceforge.net
Cc: linux-e...@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org

diff --git a/Makefile b/Makefile
index 512e47a..43cd3d9 100644
--- a/Makefile
+++ b/Makefile
@@ -370,7 +370,7 @@ LDFLAGS_MODULE =
 CFLAGS_KERNEL =
 AFLAGS_KERNEL =
 LDFLAGS_vmlinux =
-CFLAGS_GCOV= -fprofile-arcs -ftest-coverage -fno-tree-loop-im
+CFLAGS_GCOV= -fprofile-arcs -ftest-coverage -fno-tree-loop-im -Wno-maybe-uninitialized
 CFLAGS_KCOV:= $(call cc-option,-fsanitize-coverage=trace-pc,)
@@ -620,7 +620,6 @@ ARCH_CFLAGS :=
 include arch/$(SRCARCH)/Makefile
 KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,)
-KBUILD_CFLAGS += $(call cc-disable-warning,maybe-uninitialized,)
 KBUILD_CFLAGS += $(call cc-disable-warning,frame-address,)
 ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
@@ -629,15 +628,18 @@
[PATCH 01/28] [v2] netfilter: nf_tables: avoid uninitialized variable warning
The newly added nft_range_eval() function handles the two possible
nft range operations, but as the compiler warning points out,
any unexpected value would lead to the 'mismatch' variable being
used without being initialized:

net/netfilter/nft_range.c: In function 'nft_range_eval':
net/netfilter/nft_range.c:45:5: error: 'mismatch' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This removes the variable in question and instead moves the
condition into the switch itself, which is potentially more
efficient than adding a bogus 'default' clause as in my
first approach, and is nicer than using the 'uninitialized_var'
macro.

Fixes: 0f3cd9b36977 ("netfilter: nf_tables: add range expression")
Link: http://patchwork.ozlabs.org/patch/677114/
Signed-off-by: Arnd Bergmann
---
 net/netfilter/nft_range.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Cc: Pablo Neira Ayuso

diff --git a/net/netfilter/nft_range.c b/net/netfilter/nft_range.c
index c6d5358..2dd80f4 100644
--- a/net/netfilter/nft_range.c
+++ b/net/netfilter/nft_range.c
@@ -28,22 +28,20 @@ static void nft_range_eval(const struct nft_expr *expr,
 			   const struct nft_pktinfo *pkt)
 {
 	const struct nft_range_expr *priv = nft_expr_priv(expr);
-	bool mismatch;
 	int d1, d2;
 
 	d1 = memcmp(&regs->data[priv->sreg], &priv->data_from, priv->len);
 	d2 = memcmp(&regs->data[priv->sreg], &priv->data_to, priv->len);
 	switch (priv->op) {
 	case NFT_RANGE_EQ:
-		mismatch = (d1 < 0 || d2 > 0);
+		if (d1 < 0 || d2 > 0)
+			regs->verdict.code = NFT_BREAK;
 		break;
 	case NFT_RANGE_NEQ:
-		mismatch = (d1 >= 0 && d2 <= 0);
+		if (d1 >= 0 && d2 <= 0)
+			regs->verdict.code = NFT_BREAK;
 		break;
 	}
-
-	if (mismatch)
-		regs->verdict.code = NFT_BREAK;
 }
 
 static const struct nla_policy nft_range_policy[NFTA_RANGE_MAX + 1] = {
--
2.9.0
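
The same shape can be tried outside the kernel. The sketch below is a
userspace illustration only: enum range_op and range_mismatch() are
hypothetical names, not kernel symbols. Each case of the switch decides
the result directly, so there is no flag variable left for gcc to worry
about, and an unexpected operation simply reports "no mismatch", which
corresponds to the patched code leaving the verdict alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace mirror of the two range operations. */
enum range_op { RANGE_EQ, RANGE_NEQ };

/*
 * d1 is the comparison of the value against the lower bound, d2 the
 * comparison against the upper bound (memcmp-style sign convention).
 * Returns true when rule evaluation should stop.
 */
bool range_mismatch(enum range_op op, int d1, int d2)
{
	switch (op) {
	case RANGE_EQ:
		return d1 < 0 || d2 > 0;    /* value outside [from, to] */
	case RANGE_NEQ:
		return d1 >= 0 && d2 <= 0;  /* value inside [from, to] */
	}
	/* Unexpected op: nothing to flag, and nothing left uninitialized. */
	return false;
}
```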
[PATCH 00/28] Reenable maybe-uninitialized warnings
This is a set of patches that I hope to get into v4.9 in some form
in order to turn on the -Wmaybe-uninitialized warnings again.

After talking to Linus in person at Linaro Connect about this,
I spent some time on finding all the remaining warnings, and this
is the resulting patch series. More details are in the description
of the last patch that actually enables the warning.

Let me know if there are other warnings that I missed, and whether
you think these are still appropriate for v4.9 or not.
A couple of patches are non-obvious, and could use some more
detailed review.

	Arnd

Arnd Bergmann (28):
  [v2] netfilter: nf_tables: avoid uninitialized variable warning
  [v2] mtd: mtk: avoid warning in mtk_ecc_encode
  [v2] infiniband: shut up a maybe-uninitialized warning
  f2fs: replace a build-time warning with runtime WARN_ON
  ext2: avoid bogus -Wmaybe-uninitialized warning
  NFSv4.1: work around -Wmaybe-uninitialized warning
  ceph: avoid false positive maybe-uninitialized warning
  staging: lustre: restore initialization of return code
  staging: lustre: remove broken dead code in cfs_cpt_table_create_pattern
  UBI: fix uninitialized access of vid_hdr pointer
  block: rdb: false-postive gcc-4.9 -Wmaybe-uninitialized
  [media] rc: print correct variable for z8f0811
  [media] dib0700: fix uninitialized data on 'repeat' event
  iio: accel: sca3000_core: avoid potentially uninitialized variable
  crypto: aesni: avoid -Wmaybe-uninitialized warning
  pcmcia: fix return value of soc_pcmcia_regulator_set
  spi: fsl-espi: avoid processing uninitalized data on error
  drm: avoid uninitialized timestamp use in wait_vblank
  brcmfmac: avoid maybe-uninitialized warning in brcmf_cfg80211_start_ap
  net: bcm63xx: avoid referencing uninitialized variable
  net/hyperv: avoid uninitialized variable
  x86: apm: avoid uninitialized data
  x86: mark target address as output in 'insb' asm
  x86: math-emu: possible uninitialized variable use
  s390: pci: don't print uninitialized data for debugging
  nios2: fix timer initcall return value
  rocker: fix maybe-uninitialized warning
  Kbuild: bring back -Wmaybe-uninitialized warning

 Makefile                                           | 10 +-
 arch/arc/Makefile                                  | 4 +-
 arch/nios2/kernel/time.c                           | 1 +
 arch/s390/pci/pci_dma.c                            | 2 +-
 arch/x86/crypto/aesni-intel_glue.c                 | 121 +
 arch/x86/include/asm/io.h                          | 4 +-
 arch/x86/kernel/apm_32.c                           | 5 +-
 arch/x86/math-emu/Makefile                         | 4 +-
 arch/x86/math-emu/reg_compare.c                    | 16 +--
 drivers/block/rbd.c                                | 1 +
 drivers/gpu/drm/drm_irq.c                          | 4 +-
 drivers/infiniband/core/cma.c                      | 56 +-
 drivers/media/i2c/ir-kbd-i2c.c                     | 2 +-
 drivers/media/usb/dvb-usb/dib0700_core.c           | 10 +-
 drivers/mtd/nand/mtk_ecc.c                         | 19 ++--
 drivers/mtd/ubi/eba.c                              | 2 +-
 drivers/net/ethernet/broadcom/bcm63xx_enet.c       | 3 +-
 drivers/net/ethernet/rocker/rocker_ofdpa.c         | 4 +-
 drivers/net/hyperv/netvsc_drv.c                    | 2 +-
 .../broadcom/brcm80211/brcmfmac/cfg80211.c         | 2 +-
 drivers/pcmcia/soc_common.c                        | 2 +-
 drivers/spi/spi-fsl-espi.c                         | 2 +-
 drivers/staging/iio/accel/sca3000_core.c           | 2 +
 .../staging/lustre/lnet/libcfs/linux/linux-cpu.c   | 7 --
 drivers/staging/lustre/lustre/lov/lov_pack.c       | 2 +
 fs/ceph/super.c                                    | 3 +-
 fs/ext2/inode.c                                    | 7 +-
 fs/f2fs/data.c                                     | 7 ++
 fs/nfs/nfs4session.c                               | 10 +-
 net/netfilter/nft_range.c                          | 10 +-
 scripts/Makefile.ubsan                             | 4 +
 31 files changed, 187 insertions(+), 141 deletions(-)

Cc: x...@kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Mauro Carvalho Chehab
Cc: Martin Schwidefsky
Cc: linux-s...@vger.kernel.org
Cc: Ilya Dryomov
Cc: dri-de...@lists.freedesktop.org
Cc: linux-...@lists.infradead.org
Cc: Herbert Xu
Cc: linux-cry...@vger.kernel.org
Cc: "David S. Miller"
Cc: net...@vger.kernel.org
Cc: Greg Kroah-Hartman
Cc: ceph-de...@vger.kernel.org
Cc: linux-f2fs-de...@lists.sourceforge.net
Cc: linux-e...@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
--
2.9.0
[PATCH nf] netfilter: x_tables: suppress kmemcheck warning
Markus Trippelsdorf reports:

WARNING: kmemcheck: Caught 64-bit read from uninitialized memory (88001e605480)
4055601e008890686d81
 u u u u u u u u u u u u u u u u i i i i i i i i u u u u u u u u
 ^
|RIP: 0010:[] [] nf_register_net_hook+0x51/0x160
[..]
 [] nf_register_net_hook+0x51/0x160
 [] nf_register_net_hooks+0x3f/0xa0
 [] ipt_register_table+0xe5/0x110
[..]

This warning is harmless; we copy 'uninitialized' data from the hook ops
but it will not be used.
Long term the structures keeping run-time data should be disentangled
from those only containing config-time data (such as where in the list
to insert a hook), but that's -next material.

Reported-by: Markus Trippelsdorf
Suggested-by: Al Viro
Signed-off-by: Florian Westphal
---
 net/netfilter/x_tables.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index e0aa7c1d0224..fc4977456c30 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -1513,7 +1513,7 @@ xt_hook_ops_alloc(const struct xt_table *table, nf_hookfn *fn)
 	if (!num_hooks)
 		return ERR_PTR(-EINVAL);
 
-	ops = kmalloc(sizeof(*ops) * num_hooks, GFP_KERNEL);
+	ops = kcalloc(num_hooks, sizeof(*ops), GFP_KERNEL);
 	if (ops == NULL)
 		return ERR_PTR(-ENOMEM);
 
--
2.7.3
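
As a rough userspace analogy (not the kernel code): calloc() plays the
role of kcalloc() here, handing back zero-filled memory for an array, so
fields that are never explicitly set read as zero instead of whatever
was in the heap. The cut-down struct hook_ops and the hook_ops_alloc()
helper are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for a hook ops structure. */
struct hook_ops {
	int hooknum;
	int priority;
	int runtime_state;  /* never filled in by the allocator's caller */
};

/* Allocate a zero-filled array of num_hooks entries, or NULL. */
struct hook_ops *hook_ops_alloc(size_t num_hooks)
{
	if (num_hooks == 0)
		return NULL;

	/* calloc() zeroes the memory and also guards the
	 * count * size multiplication against overflow. */
	return calloc(num_hooks, sizeof(struct hook_ops));
}
```

A caller that only fills in hooknum and priority can then copy the whole
structure around without reading indeterminate bytes.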
Re: [PATCH nf,v2] netfilter: nf_queue: don't re-enter same hook on packet reinjection
Pablo Neira Ayuso writes:

> On Mon, Oct 17, 2016 at 11:23:01AM -0400, Aaron Conole wrote:
>> Pablo Neira Ayuso writes:
>>
>> > Make sure we skip the current hook from where the packet was enqueued,
>> > otherwise the packets gets enqueued over and over again.
>> >
>> > Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked
>> > list")
>> > Signed-off-by: Pablo Neira Ayuso
>> > ---
>> > v2: Make sure next hook is non-null, otherwise we are at the end of the
>> >     hook list and we can skip nf_iterate().
>> >
>> >  net/netfilter/nf_queue.c | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
>> > index 96964a0070e1..691e713d70f5 100644
>> > --- a/net/netfilter/nf_queue.c
>> > +++ b/net/netfilter/nf_queue.c
>> > @@ -185,8 +185,9 @@ void nf_reinject(struct nf_queue_entry *entry,
>> > unsigned int verdict)
>> >    }
>> >
>> >    entry->state.thresh = INT_MIN;
>> > +  hook_entry = rcu_dereference(hook_entry->next);
>> >
>> > -  if (verdict == NF_ACCEPT) {
>> > +  if (hook_entry && verdict == NF_ACCEPT) {
>> >    next_hook:
>> >            verdict = nf_iterate(skb, &entry->state, &hook_entry);
>> >    }
>>
>> ACK.  I thought switch case below could have a problem, but re-checked
>> the first nf_queue leg, and it seems okay.
>
> Argh, still not right. If we get a NF_QUEUE verdict to re-enqueue
> again, then hook_entry may become NULL.
>
>         switch (verdict & NF_VERDICT_MASK) {
>         case NF_ACCEPT:
>         case NF_STOP:
>                 local_bh_disable();
>                 entry->state.okfn(entry->state.net, entry->state.sk, skb);
>                 local_bh_enable();
>                 break;
>         case NF_QUEUE:
>                 RCU_INIT_POINTER(entry->state.hook_entries, hook_entry); <--
>
> Attaching new patch.
>
> From c1a731c68791bcd504a7fe5d28f5f0fd59d66118 Mon Sep 17 00:00:00 2001
> From: Pablo Neira Ayuso
> Date: Thu, 13 Oct 2016 08:14:03 +0200
> Subject: [PATCH nf,v3] netfilter: nf_queue: don't re-enter same hook on packet
>  reinjection
>
> If the packet is accepted, we have to skip the current hook from where
> the packet was enqueued. Thus, we can emulate the previous
> list_for_each_entry_continue() behaviour happening from nf_reinject(),
> otherwise the packets gets enqueued over and over again.
>
> Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
> Signed-off-by: Pablo Neira Ayuso
> ---
>  net/netfilter/nf_queue.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 96964a0070e1..0b5ac3c9c2bc 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -187,8 +187,10 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned
> int verdict)
>       entry->state.thresh = INT_MIN;
>
>       if (verdict == NF_ACCEPT) {
> -     next_hook:
> -             verdict = nf_iterate(skb, &entry->state, &hook_entry);
> +             hook_entry = rcu_dereference(hook_entry->next);
> +             if (hook_entry)
> +next_hook:

Should the above two lines be transposed to this?

  next_hook:
          if (hook_entry)

Sorry if I'm misunderstanding it.  Too many special cases for my tiny
brain...

-Aaron
Re: [PATCH nf-next 0/2] netfilter: autoload NAT support for non-builtin L4 protocols
On Thu, Oct 06, 2016 at 07:09:27PM +0200, Davide Caratti wrote:
> this series fixes SNAT/DNAT rules where port number translation is
> explicitly configured, but only the L3 address is translated:
>
> # iptables -t nat -A POSTROUTING -o eth1 -p stcp -j SNAT --to-source
> 10.0.0.1:61000
> # tcpdump -s46 -tni eth1 sctp
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 46 bytes
> IP 10.0.0.1.37788 > 10.0.0.2.2000: sctp
> ^
> IP 10.0.0.2.2000 > 10.0.0.1.37788: sctp
> IP 10.0.0.1.37788 > 10.0.0.2.2000: sctp
> IP 10.0.0.2.2000 > 10.0.0.1.37788: sctp
> IP 10.0.0.2.2000 > 10.0.0.1.37788: sctp
> IP 10.0.0.1.37788 > 10.0.0.2.2000: sctp
> IP 10.0.0.2.2000 > 10.0.0.1.37788: sctp
>
> This happens for all protocols that don't have L4 NAT support built into
> nf_nat.ko, such as DCCP, SCTP and UDPLite: unless the user modprobes
> nf_nat_proto_{dccp,sctp,udplite}.ko, port translation as specified in the
> above rule will not be done.
> The first patch provides persistent and generic aliases for the above
> modules; the second patch autoloads nf_nat_proto_{dccp,sctp,udplite} when a
> SNAT/DNAT rule matching one of the above protocols is created.

I would really like to see DCCP, SCTP and UDPlite built-in, just like
other protocol trackers (TCP, UDP...). This may require a bit of review
work on your/our side, but it would be greatly appreciated.

We discussed this during the last Netfilter Workshop: the current
situation is not good, and we're in some way responsible for breaking
the deployment of new protocols on the Internet. Many vendors rely on
default configurations, not even looking into modprobing things, so
these protocols are hopeless in the current situation since routers
running Netfilter will likely not support them. This is worse since
nf_conntrack drops packets for protocols like SCTP and DCCP since the
generic protocol can no longer be used.

Once these protocols are supported built-in, users can configure from
our control plane, i.e. iptables/nft, if they explicitly don't want to
allow them by dropping protocols of this kind. But in that case we would
not be responsible anymore for the current situation at least.

Moreover, following this approach, we would also avoid the new attribute
in nft_nat to indicate the layer 4 protocol that you have mentioned
already.

Thanks!
Re: [PATCH nf,v2] netfilter: nf_queue: don't re-enter same hook on packet reinjection
On Mon, Oct 17, 2016 at 11:23:01AM -0400, Aaron Conole wrote:
> Pablo Neira Ayuso writes:
>
> > Make sure we skip the current hook from where the packet was enqueued,
> > otherwise the packets gets enqueued over and over again.
> >
> > Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
> > Signed-off-by: Pablo Neira Ayuso
> > ---
> > v2: Make sure next hook is non-null, otherwise we are at the end of the
> >     hook list and we can skip nf_iterate().
> >
> >  net/netfilter/nf_queue.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> > index 96964a0070e1..691e713d70f5 100644
> > --- a/net/netfilter/nf_queue.c
> > +++ b/net/netfilter/nf_queue.c
> > @@ -185,8 +185,9 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned
> > int verdict)
> >     }
> >
> >     entry->state.thresh = INT_MIN;
> > +   hook_entry = rcu_dereference(hook_entry->next);
> >
> > -   if (verdict == NF_ACCEPT) {
> > +   if (hook_entry && verdict == NF_ACCEPT) {
> >     next_hook:
> >             verdict = nf_iterate(skb, &entry->state, &hook_entry);
> >     }
>
> ACK.  I thought switch case below could have a problem, but re-checked
> the first nf_queue leg, and it seems okay.

Argh, still not right. If we get a NF_QUEUE verdict to re-enqueue
again, then hook_entry may become NULL.

        switch (verdict & NF_VERDICT_MASK) {
        case NF_ACCEPT:
        case NF_STOP:
                local_bh_disable();
                entry->state.okfn(entry->state.net, entry->state.sk, skb);
                local_bh_enable();
                break;
        case NF_QUEUE:
                RCU_INIT_POINTER(entry->state.hook_entries, hook_entry); <--

Attaching new patch.

From c1a731c68791bcd504a7fe5d28f5f0fd59d66118 Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso
Date: Thu, 13 Oct 2016 08:14:03 +0200
Subject: [PATCH nf,v3] netfilter: nf_queue: don't re-enter same hook on packet
 reinjection

If the packet is accepted, we have to skip the current hook from where
the packet was enqueued. Thus, we can emulate the previous
list_for_each_entry_continue() behaviour happening from nf_reinject(),
otherwise the packets gets enqueued over and over again.

Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
Signed-off-by: Pablo Neira Ayuso
---
 net/netfilter/nf_queue.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 96964a0070e1..0b5ac3c9c2bc 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -187,8 +187,10 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 	entry->state.thresh = INT_MIN;
 
 	if (verdict == NF_ACCEPT) {
-	next_hook:
-		verdict = nf_iterate(skb, &entry->state, &hook_entry);
+		hook_entry = rcu_dereference(hook_entry->next);
+		if (hook_entry)
+next_hook:
+			verdict = nf_iterate(skb, &entry->state, &hook_entry);
 	}
 
 	switch (verdict & NF_VERDICT_MASK) {
--
2.1.4
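
To make the "skip the hook we were queued from" step easier to follow,
here is a toy model on a plain singly linked list. Nothing in it is
kernel API: struct hook_entry, run_hooks() (a stand-in for nf_iterate()
that just counts the hooks it visits) and reinject() are hypothetical,
and RCU and the NF_QUEUE/NF_REPEAT paths are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked hook list. */
struct hook_entry {
	int id;
	struct hook_entry *next;
};

enum { VERDICT_DROP = 0, VERDICT_ACCEPT = 1 };

/* Stand-in for nf_iterate(): visits every hook from *entry onwards,
 * counts them in *ran, and accepts. */
int run_hooks(struct hook_entry **entry, int *ran)
{
	for (; *entry; *entry = (*entry)->next)
		(*ran)++;
	return VERDICT_ACCEPT;
}

/* Resume traversal after the hook that queued the packet. */
int reinject(struct hook_entry *queued_from, int verdict, int *ran)
{
	struct hook_entry *entry = queued_from;

	if (verdict == VERDICT_ACCEPT) {
		entry = entry->next;  /* do not re-enter the same hook */
		if (entry)            /* NULL means end of the list */
			verdict = run_hooks(&entry, ran);
	}
	return verdict;
}
```

With three hooks a, b, c, reinjecting from a runs b and c only, and
reinjecting from c runs nothing and keeps the accept verdict.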
Re: [PATCH 00/10, nf-next] Netfilter core updates
On Mon, Oct 17, 2016 at 09:52:14AM -0400, Aaron Conole wrote:
> Florian Westphal writes:
>
> > Pablo Neira Ayuso wrote:
> >> Let me know if you have any comment, otherwise I'll place this in the
> >> nf-next tree so we can follow up working on top of these.
> >
> > Please do, thanks!
>
> +1.  Some of this work was in my back burner, so thanks Pablo :)

Thanks. I still need the fix for nf_queue to propagate to David's net
tree. Will request him to pull net into net-next. May take a little
while.
Re: [PATCH nft] src: support ct l3proto/protocol without direction syntax
On Thu, Sep 22, 2016 at 10:34:52PM +0800, Liping Zhang wrote:
> From: Liping Zhang
>
> Actually, ct l3proto and ct protocol are unrelated to direction, so
> it's unnecessary that we must specify dir if we want to use them.
>
> Now add support that we can match ct l3proto/protocol without direction:
>   # nft add rule filter input ct l3proto ipv4
>   # nft add rule filter output ct protocol 17
>
> Note: existing syntax is still preserved, so "ct reply l3proto ipv6"
> is still fine.

Applied, thanks.

Sorry, it seems I accidentally left this patch behind.
Re: [libnftnl PATCH] libnftnl: update Arturo Borrero Gonzalez email
On Mon, Oct 10, 2016 at 12:26:34PM +0200, Arturo Borrero Gonzalez wrote:
> Update Arturo Borrero Gonzalez email address.

Applied, thanks Arturo.
[PATCH libnftnl] set_elem: don't add NFTA_SET_ELEM_LIST_ELEMENTS attribute if set is empty
If the set is empty, don't send an empty NFTA_SET_ELEM_LIST_ELEMENTS
netlink attribute with no elements.

Signed-off-by: Pablo Neira Ayuso
---
 src/set_elem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/set_elem.c b/src/set_elem.c
index 46fb7c6e424b..4d2b4f6074b7 100644
--- a/src/set_elem.c
+++ b/src/set_elem.c
@@ -304,6 +304,9 @@ void nftnl_set_elems_nlmsg_build_payload(struct nlmsghdr *nlh, struct nftnl_set
 
 	nftnl_set_elem_nlmsg_build_def(nlh, s);
 
+	if (list_empty(&s->element_list))
+		return;
+
 	nest1 = mnl_attr_nest_start(nlh, NFTA_SET_ELEM_LIST_ELEMENTS);
 	list_for_each_entry(elem, &s->element_list, head)
 		nftnl_set_elem_build(nlh, elem, ++i);
--
2.1.4
Re: [PATCH ulogd2] ulogd: fix crash when ipv4 packet is truncated
On Tue, Oct 11, 2016 at 10:22:27PM +0800, Liping Zhang wrote:
> From: Liping Zhang
>
> If ipv4 packet is truncated, we should not try to dereference the
> iph pointer. Otherwise, if the user add such iptables rules
> "-j NFLOG --nflog-size 0", we will dereference the NULL pointer
> and crash may happen.

With Eric's permission, I'm applying this. Thanks.
Re: [patch v2] netfilter: nf_tables: underflow in nft_parse_u32_check()
On Wed, Oct 12, 2016 at 12:14:29PM +0300, Dan Carpenter wrote:
> We don't want to allow negatives here.

Applied, thanks.
Re: [patch] netfilter: nft_exthdr: fix error handling in nft_exthdr_init()
On Wed, Oct 12, 2016 at 09:09:12AM +0300, Dan Carpenter wrote:
> "err" needs to be signed for the error handling to work.

Applied, thanks Dan.
Re: [PATCH net 1/2] conntrack: remove obsolete sysctl (nf_conntrack_events_retry_timeout)
On Mon, Oct 10, 2016 at 03:57:37PM +0200, Florian Westphal wrote:
> Nicolas Dichtel wrote:
> > This entry has been removed in commit 9500507c6138.
> >
> > Fixes: 9500507c6138 ("netfilter: conntrack: remove timer from ecache
> > extension")
> > Signed-off-by: Nicolas Dichtel
>
> Acked-by: Florian Westphal

Applied, thanks Nicolas.
Re: [PATCH nf] netfilter: xt_NFLOG: fix unexpected truncated packet
On Tue, Oct 11, 2016 at 10:26:27PM +0800, Liping Zhang wrote:
> From: Liping Zhang
>
> Justin and Chris spotted that iptables NFLOG target was broken when they
> upgraded the kernel to 4.8: "ulogd-2.0.5- IPs are no longer logged" or
> "results in segfaults in ulogd-2.0.5".
>
> Because "struct nf_loginfo li;" is a local variable, and flags will be
> filled with garbage value, not inited to zero. So if it contains 0x1,
> packets will not be logged to the userspace anymore.

Applied and enqueued for -stable, thanks.
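
The class of bug described above is easy to reproduce in userspace. The
sketch below is only an illustration and does not claim to show how the
kernel patch fixes it: struct loginfo and make_loginfo() are
hypothetical, cut-down stand-ins. It shows one common C idiom, where a
designated initializer guarantees that every member not named is
zeroed, so a flags field can never start out with stack garbage.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for a log info structure. */
struct loginfo {
	unsigned int type;
	unsigned int flags;  /* a stray 0x1 here would change behaviour */
	unsigned int group;
};

/* Build a loginfo on the stack with all unnamed members zeroed. */
struct loginfo make_loginfo(unsigned int group)
{
	/* Naming one member in the initializer zero-initializes the
	 * rest; a bare "struct loginfo li;" would not. */
	struct loginfo li = { .type = 1 };

	li.group = group;
	return li;
}
```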
Re: [PATCH nf] netfilter: xt_ipcomp: add "ip[6]t_ipcomp" module alias name
On Wed, Oct 12, 2016 at 09:09:22PM +0800, Liping Zhang wrote:
> From: Liping Zhang
>
> Otherwise, user cannot add related rules if xt_ipcomp.ko is not loaded:
>   # iptables -A OUTPUT -p 108 -m ipcomp --ipcompspi 1
>   iptables: No chain/target/match by that name.

Applied, thanks.
Re: [PATCH nf] netfilter: nft_hash: add missing NFTA_HASH_OFFSET's nla_policy
On Wed, Oct 12, 2016 at 09:10:45PM +0800, Liping Zhang wrote:
> From: Liping Zhang
>
> Missing the nla_policy description will also miss the validation check
> in kernel.

Also applied, thanks Liping.
Re: [PATCH nf,v2] netfilter: nf_queue: don't re-enter same hook on packet reinjection
Pablo Neira Ayuso writes:

> Make sure we skip the current hook from where the packet was enqueued,
> otherwise the packets gets enqueued over and over again.
>
> Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
> Signed-off-by: Pablo Neira Ayuso
> ---
> v2: Make sure next hook is non-null, otherwise we are at the end of the
>     hook list and we can skip nf_iterate().
>
>  net/netfilter/nf_queue.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 96964a0070e1..691e713d70f5 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -185,8 +185,9 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned
> int verdict)
>       }
>
>       entry->state.thresh = INT_MIN;
> +     hook_entry = rcu_dereference(hook_entry->next);
>
> -     if (verdict == NF_ACCEPT) {
> +     if (hook_entry && verdict == NF_ACCEPT) {
>       next_hook:
>               verdict = nf_iterate(skb, &entry->state, &hook_entry);
>       }

ACK.  I thought switch case below could have a problem, but re-checked
the first nf_queue leg, and it seems okay.

-Aaron
[ANNOUNCE] ipset 6.30 released
Hi,

I'm happy to announce ipset 6.30 which introduces a new set type,
hash:ip,mac, and brings a couple of small corrections and backports
from the most recent kernel tree.

Userspace changes:
  - Drop extra comma from error message (Neutron Soutmun)
  - Fix the incorrect dynamic/static modules list (Neutron Soutmun)
  - Correct tests to check the number of entries too
  - hash:ipmac type support added to ipset, userspace part
    (Tomasz Chilinski)

Kernel part changes:
  - netfilter: ipset: hash: fix boolreturn.cocci warnings
    (Fengguang Wu)
  - Fix the nla_put_net64() API changes backport
  - netfilter: ipset: Fixing unnamed union init (Elad Raz)
  - netfilter: x_tables: Use par->net instead of computing from the
    passed net devices (Eric W. Biederman)
  - Correct the reported memory size for bitmap:* types
  - Fix coding styles reported by checkpatch.pl, already in kernel
  - netfilter: x_tables: Pass struct net in xt_action_param
    (Eric W. Biederman)
  - net: sched: fix skb->protocol use in case of accelerated vlan
    path (Jiri Pirko)
  - Check IPSET_ATTR_ETHER netlink attribute length in hash:ipmac too
  - netfilter: fix include files for compilation (Mikko Rapeli)
  - ipset: Backports for the nla_put_net64() API changes
    (Neutron Soutmun)
  - netfilter: ipset: use setup_timer() and mod_timer().
    (Muhammad Falak R Wani)
  - hash:ipmac type support added to ipset (Tomasz Chilinski)

You can download the source code of ipset from:
        http://ipset.netfilter.org
        ftp://ftp.netfilter.org/pub/ipset/
        git://git.netfilter.org/ipset.git

Best regards,
Jozsef
-
E-mail  : kad...@blackhole.kfki.hu, kadlecsik.joz...@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary
Re: [PATCH 00/10, nf-next] Netfilter core updates
Florian Westphal writes:

> Pablo Neira Ayuso wrote:
>> Let me know if you have any comment, otherwise I'll place this in the
>> nf-next tree so we can follow up working on top of these.
>
> Please do, thanks!

+1. Some of this work was on my back burner, so thanks Pablo :)
[PATCH 18/22] netfilter: ipset: hash:ipmac type support added to ipset
From: Tomasz Chilinski

Signed-off-by: Tomasz Chiliński
Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/Kconfig             |   9 +
 net/netfilter/ipset/Makefile            |   1 +
 net/netfilter/ipset/ip_set_hash_ipmac.c | 315
 3 files changed, 325 insertions(+)
 create mode 100644 net/netfilter/ipset/ip_set_hash_ipmac.c

diff --git a/net/netfilter/ipset/Kconfig b/net/netfilter/ipset/Kconfig
index 234a8ec..4083a80 100644
--- a/net/netfilter/ipset/Kconfig
+++ b/net/netfilter/ipset/Kconfig
@@ -99,6 +99,15 @@ config IP_SET_HASH_IPPORTNET

 	  To compile it as a module, choose M here.  If unsure, say N.

+config IP_SET_HASH_IPMAC
+	tristate "hash:ip,mac set support"
+	depends on IP_SET
+	help
+	  This option adds the hash:ip,mac set type support, by which
+	  one can store IPv4/IPv6 address and MAC (ethernet address) pairs in a set.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config IP_SET_HASH_MAC
 	tristate "hash:mac set support"
 	depends on IP_SET
diff --git a/net/netfilter/ipset/Makefile b/net/netfilter/ipset/Makefile
index 3dbd5e9..28ec148 100644
--- a/net/netfilter/ipset/Makefile
+++ b/net/netfilter/ipset/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_IP_SET_BITMAP_PORT) += ip_set_bitmap_port.o

 # hash types
 obj-$(CONFIG_IP_SET_HASH_IP) += ip_set_hash_ip.o
+obj-$(CONFIG_IP_SET_HASH_IPMAC) += ip_set_hash_ipmac.o
 obj-$(CONFIG_IP_SET_HASH_IPMARK) += ip_set_hash_ipmark.o
 obj-$(CONFIG_IP_SET_HASH_IPPORT) += ip_set_hash_ipport.o
 obj-$(CONFIG_IP_SET_HASH_IPPORTIP) += ip_set_hash_ipportip.o
diff --git a/net/netfilter/ipset/ip_set_hash_ipmac.c b/net/netfilter/ipset/ip_set_hash_ipmac.c
new file mode 100644
index 000..d9eb144
--- /dev/null
+++ b/net/netfilter/ipset/ip_set_hash_ipmac.c
@@ -0,0 +1,315 @@
+/* Copyright (C) 2016 Tomasz Chilinski
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/* Kernel module implementing an IP set type: the hash:ip,mac type */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+#define IPSET_TYPE_REV_MIN	0
+#define IPSET_TYPE_REV_MAX	0
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Tomasz Chilinski ");
+IP_SET_MODULE_DESC("hash:ip,mac", IPSET_TYPE_REV_MIN, IPSET_TYPE_REV_MAX);
+MODULE_ALIAS("ip_set_hash:ip,mac");
+
+/* Type specific function prefix */
+#define HTYPE		hash_ipmac
+
+/* Zero valued element is not supported */
+static const unsigned char invalid_ether[ETH_ALEN] = { 0 };
+
+/* IPv4 variant */
+
+/* Member elements */
+struct hash_ipmac4_elem {
+	/* Zero valued IP addresses cannot be stored */
+	__be32 ip;
+	union {
+		unsigned char ether[ETH_ALEN];
+		__be32 foo[2];
+	};
+};
+
+/* Common functions */
+
+static inline bool
+hash_ipmac4_data_equal(const struct hash_ipmac4_elem *e1,
+		       const struct hash_ipmac4_elem *e2,
+		       u32 *multi)
+{
+	return e1->ip == e2->ip && ether_addr_equal(e1->ether, e2->ether);
+}
+
+static bool
+hash_ipmac4_data_list(struct sk_buff *skb, const struct hash_ipmac4_elem *e)
+{
+	if (nla_put_ipaddr4(skb, IPSET_ATTR_IP, e->ip) ||
+	    nla_put(skb, IPSET_ATTR_ETHER, ETH_ALEN, e->ether))
+		goto nla_put_failure;
+	return 0;
+
+nla_put_failure:
+	return 1;
+}
+
+static inline void
+hash_ipmac4_data_next(struct hash_ipmac4_elem *next,
+		      const struct hash_ipmac4_elem *e)
+{
+	next->ip = e->ip;
+}
+
+#define MTYPE		hash_ipmac4
+#define PF		4
+#define HOST_MASK	32
+#define HKEY_DATALEN	sizeof(struct hash_ipmac4_elem)
+#include "ip_set_hash_gen.h"
+
+static int
+hash_ipmac4_kadt(struct ip_set *set, const struct sk_buff *skb,
+		 const struct xt_action_param *par,
+		 enum ipset_adt adt, struct ip_set_adt_opt *opt)
+{
+	ipset_adtfn adtfn = set->variant->adt[adt];
+	struct hash_ipmac4_elem e = { .ip = 0, { .foo[0] = 0, .foo[1] = 0 } };
+	struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
+
+	/* MAC can be src only */
+	if (!(opt->flags & IPSET_DIM_TWO_SRC))
+		return 0;
+
+	if (skb_mac_header(skb) < skb->head ||
+	    (skb_mac_header(skb) + ETH_HLEN) > skb->data)
+		return -EINVAL;
+
+	memcpy(e.ether, eth_hdr(skb)->h_source, ETH_ALEN);
+	if (ether_addr_equal(e.ether, invalid_ether))
+		return -EINVAL;
+
+	ip4addrptr(skb, opt->flags & IPSET_DIM_ONE_SRC, &e.ip);
+
+	return adtfn(set, &e, &ext,
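The element handling in this patch (an IPv4 address plus a source MAC, with the all-zero MAC rejected) can be sketched in plain userspace C. This is an illustration under assumptions, not the kernel code: struct sketch_ipmac4 and the sketch_* helpers are hypothetical names, and memcmp() stands in for ether_addr_equal().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6	/* length of an Ethernet address in bytes */

/* Hypothetical userspace stand-in for an IPv4 hash:ip,mac element. */
struct sketch_ipmac4 {
	uint32_t ip;
	uint8_t ether[SKETCH_ETH_ALEN];
};

static bool sketch_mac_is_zero(const uint8_t mac[SKETCH_ETH_ALEN])
{
	static const uint8_t zero[SKETCH_ETH_ALEN] = { 0 };

	return memcmp(mac, zero, SKETCH_ETH_ALEN) == 0;
}

/* Build a key from source IP and MAC; -1 if the MAC is all zero. */
static int sketch_ipmac4_key(struct sketch_ipmac4 *e, uint32_t src_ip,
			     const uint8_t src_mac[SKETCH_ETH_ALEN])
{
	if (sketch_mac_is_zero(src_mac))
		return -1;	/* zero valued element is not supported */

	memset(e, 0, sizeof(*e));	/* keep padding bytes deterministic */
	e->ip = src_ip;
	memcpy(e->ether, src_mac, SKETCH_ETH_ALEN);
	return 0;
}

/* Two elements match only when both the address and the MAC match. */
static bool sketch_ipmac4_equal(const struct sketch_ipmac4 *a,
				const struct sketch_ipmac4 *b)
{
	return a->ip == b->ip &&
	       memcmp(a->ether, b->ether, SKETCH_ETH_ALEN) == 0;
}
```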
[PATCH 07/22] netfilter: ipset: Regroup ip_set_put_extensions and add extern
Signed-off-by: Jozsef Kadlecsik
---
 include/linux/netfilter/ipset/ip_set.h | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index b5bd0fb3..7a218eb 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -331,6 +331,8 @@ extern size_t ip_set_elem_len(struct ip_set *set, struct nlattr *tb[],
 			      size_t len, size_t align);
 extern int ip_set_get_extensions(struct ip_set *set, struct nlattr *tb[],
 				 struct ip_set_ext *ext);
+extern int ip_set_put_extensions(struct sk_buff *skb, const struct ip_set *set,
+				 const void *e, bool active);

 static inline int
 ip_set_get_hostipaddr4(struct nlattr *nla, u32 *ipaddr)
@@ -449,10 +451,6 @@ static inline int nla_put_ipaddr6(struct sk_buff *skb, int type,
 #include
 #include

-int
-ip_set_put_extensions(struct sk_buff *skb, const struct ip_set *set,
-		      const void *e, bool active);
-
 #define IP_SET_INIT_KEXT(skb, opt, set)	\
 	{ .bytes = (skb)->len, .packets = 1,	\
 	  .timeout = ip_set_adt_opt_timeout(opt, set) }
-- 
1.8.5.1
[PATCH 10/22] netfilter: ipset: Count non-static extension memory for userspace
Non-static (i.e. comment) extension was not counted into the memory size.
A new internal counter is introduced for this. In the case of the hash
types the sizes of the arrays are counted there as well so that we can
avoid scanning the whole set when just the header data is requested.

Signed-off-by: Jozsef Kadlecsik
---
 include/linux/netfilter/ipset/ip_set.h         |  8 ++--
 include/linux/netfilter/ipset/ip_set_comment.h |  7 +--
 net/netfilter/ipset/ip_set_bitmap_gen.h        |  5 +++--
 net/netfilter/ipset/ip_set_core.c              |  2 +-
 net/netfilter/ipset/ip_set_hash_gen.h          | 26 ++
 net/netfilter/ipset/ip_set_list_set.c          |  5 +++--
 6 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 4671d74..8e42253 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -79,10 +79,12 @@ enum ip_set_ext_id {
 	IPSET_EXT_ID_MAX,
 };

+struct ip_set;
+
 /* Extension type */
 struct ip_set_ext_type {
 	/* Destroy extension private data (can be NULL) */
-	void (*destroy)(void *ext);
+	void (*destroy)(struct ip_set *set, void *ext);
 	enum ip_set_extension type;
 	enum ipset_cadt_flags flag;
 	/* Size and minimal alignment */
@@ -252,6 +254,8 @@ struct ip_set {
 	u32 timeout;
 	/* Number of elements (vs timeout) */
 	u32 elements;
+	/* Size of the dynamic extensions (vs timeout) */
+	size_t ext_size;
 	/* Element data size */
 	size_t dsize;
 	/* Offsets to extensions in elements */
@@ -268,7 +272,7 @@ struct ip_set {
 	 */
 	if (SET_WITH_COMMENT(set))
 		ip_set_extensions[IPSET_EXT_ID_COMMENT].destroy(
-			ext_comment(data, set));
+			set, ext_comment(data, set));
 }

 static inline int
diff --git a/include/linux/netfilter/ipset/ip_set_comment.h b/include/linux/netfilter/ipset/ip_set_comment.h
index 5444b1b..8e2bab1 100644
--- a/include/linux/netfilter/ipset/ip_set_comment.h
+++ b/include/linux/netfilter/ipset/ip_set_comment.h
@@ -20,13 +20,14 @@
  * The kadt functions don't use the comment extensions in any way.
  */
 static inline void
-ip_set_init_comment(struct ip_set_comment *comment,
+ip_set_init_comment(struct ip_set *set, struct ip_set_comment *comment,
 		    const struct ip_set_ext *ext)
 {
 	struct ip_set_comment_rcu *c = rcu_dereference_protected(comment->c, 1);
 	size_t len = ext->comment ? strlen(ext->comment) : 0;

 	if (unlikely(c)) {
+		set->ext_size -= sizeof(*c) + strlen(c->str) + 1;
 		kfree_rcu(c, rcu);
 		rcu_assign_pointer(comment->c, NULL);
 	}
@@ -38,6 +39,7 @@
 	if (unlikely(!c))
 		return;
 	strlcpy(c->str, ext->comment, len + 1);
+	set->ext_size += sizeof(*c) + strlen(c->str) + 1;
 	rcu_assign_pointer(comment->c, c);
 }

@@ -58,13 +60,14 @@
  * of the set data anymore.
  */
 static inline void
-ip_set_comment_free(struct ip_set_comment *comment)
+ip_set_comment_free(struct ip_set *set, struct ip_set_comment *comment)
 {
 	struct ip_set_comment_rcu *c;

 	c = rcu_dereference_protected(comment->c, 1);
 	if (unlikely(!c))
 		return;
+	set->ext_size -= sizeof(*c) + strlen(c->str) + 1;
 	kfree_rcu(c, rcu);
 	rcu_assign_pointer(comment->c, NULL);
 }
diff --git a/net/netfilter/ipset/ip_set_bitmap_gen.h b/net/netfilter/ipset/ip_set_bitmap_gen.h
index 13a7021..5a9fa61 100644
--- a/net/netfilter/ipset/ip_set_bitmap_gen.h
+++ b/net/netfilter/ipset/ip_set_bitmap_gen.h
@@ -84,6 +84,7 @@
 		mtype_ext_cleanup(set);
 	memset(map->members, 0, map->memsize);
 	set->elements = 0;
+	set->ext_size = 0;
 }

 /* Calculate the actual memory size of the set data */
@@ -101,7 +102,7 @@
 {
 	const struct mtype *map = set->data;
 	struct nlattr *nested;
-	size_t memsize = mtype_memsize(map, set->dsize);
+	size_t memsize = mtype_memsize(map, set->dsize) + set->ext_size;

 	nested = ipset_nest_start(skb, IPSET_ATTR_DATA);
 	if (!nested)
@@ -175,7 +176,7 @@
 	if (SET_WITH_COUNTER(set))
 		ip_set_init_counter(ext_counter(x, set), ext);
 	if (SET_WITH_COMMENT(set))
-		ip_set_init_comment(ext_comment(x, set), ext);
+		ip_set_init_comment(set, ext_comment(x, set), ext);
 	if (SET_WITH_SKBINFO(set))
 		ip_set_init_skbinfo(ext_skbinfo(x, set), ext);

diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 3bca341..cd8961e 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -324,7 +324,7 @@ static inline struct ip_set_net *ip_set_pernet(struct net *net)
 }
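The accounting rule in this patch (add sizeof(*c) + strlen(c->str) + 1 when a comment is attached, subtract the same amount when it is freed or replaced) can be sketched in userspace. This is a simplified illustration under assumptions: struct sketch_set, struct sketch_comment and the helper names are hypothetical, and malloc()/free() stand in for the kernel allocation and kfree_rcu().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the dynamically allocated comment. */
struct sketch_comment {
	size_t hdr;	/* placeholder for the fixed header of the allocation */
	char str[];	/* NUL-terminated comment text */
};

/* Hypothetical stand-in for the set, reduced to what the sketch needs. */
struct sketch_set {
	size_t ext_size;		/* bytes held by dynamic extensions */
	struct sketch_comment *c;
};

/* Bytes accounted for one comment: header, text and the trailing NUL. */
static size_t sketch_comment_bytes(const struct sketch_comment *c)
{
	return sizeof(*c) + strlen(c->str) + 1;
}

static void sketch_comment_free(struct sketch_set *set)
{
	if (!set->c)
		return;
	set->ext_size -= sketch_comment_bytes(set->c);	/* uncount first */
	free(set->c);
	set->c = NULL;
}

static int sketch_comment_init(struct sketch_set *set, const char *text)
{
	size_t len = text ? strlen(text) : 0;
	struct sketch_comment *c;

	sketch_comment_free(set);	/* replacing uncounts the old comment */
	if (len == 0)
		return 0;
	c = malloc(sizeof(*c) + len + 1);
	if (!c)
		return -1;
	c->hdr = 0;
	memcpy(c->str, text, len + 1);
	set->c = c;
	set->ext_size += sketch_comment_bytes(c);
	return 0;
}
```

Because the counter is updated at every allocation and release, reporting the memory size needs no walk over the elements.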
[PATCH 02/22] netfilter: ipset: Headers file cleanup
Remove extra whitespace, group counter helpers together. Mark some of the
helpers' arguments as const.

Ported from a patch proposed by Sergey Popovich.

Suggested-by: Sergey Popovich
Signed-off-by: Jozsef Kadlecsik
---
 include/linux/netfilter/ipset/ip_set.h         | 57 +-
 include/linux/netfilter/ipset/ip_set_comment.h |  2 +-
 include/linux/netfilter/ipset/ip_set_timeout.h |  4 +-
 3 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 83b9a2e..1ea28e3 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -334,18 +334,40 @@ struct ip_set {
 	}
 }

+static inline bool
+ip_set_put_counter(struct sk_buff *skb, const struct ip_set_counter *counter)
+{
+	return nla_put_net64(skb, IPSET_ATTR_BYTES,
+			     cpu_to_be64(ip_set_get_bytes(counter)),
+			     IPSET_ATTR_PAD) ||
+	       nla_put_net64(skb, IPSET_ATTR_PACKETS,
+			     cpu_to_be64(ip_set_get_packets(counter)),
+			     IPSET_ATTR_PAD);
+}
+
+static inline void
+ip_set_init_counter(struct ip_set_counter *counter,
+		    const struct ip_set_ext *ext)
+{
+	if (ext->bytes != ULLONG_MAX)
+		atomic64_set(&(counter)->bytes, (long long)(ext->bytes));
+	if (ext->packets != ULLONG_MAX)
+		atomic64_set(&(counter)->packets, (long long)(ext->packets));
+}
+
 static inline void
 ip_set_get_skbinfo(struct ip_set_skbinfo *skbinfo,
-		   const struct ip_set_ext *ext,
-		   struct ip_set_ext *mext, u32 flags)
+		   const struct ip_set_ext *ext,
+		   struct ip_set_ext *mext, u32 flags)
 {
-	mext->skbmark = skbinfo->skbmark;
-	mext->skbmarkmask = skbinfo->skbmarkmask;
-	mext->skbprio = skbinfo->skbprio;
-	mext->skbqueue = skbinfo->skbqueue;
+	mext->skbmark = skbinfo->skbmark;
+	mext->skbmarkmask = skbinfo->skbmarkmask;
+	mext->skbprio = skbinfo->skbprio;
+	mext->skbqueue = skbinfo->skbqueue;
 }
+
 static inline bool
-ip_set_put_skbinfo(struct sk_buff *skb, struct ip_set_skbinfo *skbinfo)
+ip_set_put_skbinfo(struct sk_buff *skb, const struct ip_set_skbinfo *skbinfo)
 {
 	/* Send nonzero parameters only */
 	return ((skbinfo->skbmark || skbinfo->skbmarkmask) &&
@@ -371,27 +393,6 @@ struct ip_set {
 	skbinfo->skbqueue = ext->skbqueue;
 }

-static inline bool
-ip_set_put_counter(struct sk_buff *skb, struct ip_set_counter *counter)
-{
-	return nla_put_net64(skb, IPSET_ATTR_BYTES,
-			     cpu_to_be64(ip_set_get_bytes(counter)),
-			     IPSET_ATTR_PAD) ||
-	       nla_put_net64(skb, IPSET_ATTR_PACKETS,
-			     cpu_to_be64(ip_set_get_packets(counter)),
-			     IPSET_ATTR_PAD);
-}
-
-static inline void
-ip_set_init_counter(struct ip_set_counter *counter,
-		    const struct ip_set_ext *ext)
-{
-	if (ext->bytes != ULLONG_MAX)
-		atomic64_set(&(counter)->bytes, (long long)(ext->bytes));
-	if (ext->packets != ULLONG_MAX)
-		atomic64_set(&(counter)->packets, (long long)(ext->packets));
-}
-
 /* Netlink CB args */
 enum {
 	IPSET_CB_NET = 0,	/* net namespace */
diff --git a/include/linux/netfilter/ipset/ip_set_comment.h b/include/linux/netfilter/ipset/ip_set_comment.h
index 8d02485..bae5c76 100644
--- a/include/linux/netfilter/ipset/ip_set_comment.h
+++ b/include/linux/netfilter/ipset/ip_set_comment.h
@@ -43,7 +43,7 @@

 /* Used only when dumping a set, protected by rcu_read_lock_bh() */
 static inline int
-ip_set_put_comment(struct sk_buff *skb, struct ip_set_comment *comment)
+ip_set_put_comment(struct sk_buff *skb, const struct ip_set_comment *comment)
 {
 	struct ip_set_comment_rcu *c = rcu_dereference_bh(comment->c);

diff --git a/include/linux/netfilter/ipset/ip_set_timeout.h b/include/linux/netfilter/ipset/ip_set_timeout.h
index 1d6a935..bfb3531 100644
--- a/include/linux/netfilter/ipset/ip_set_timeout.h
+++ b/include/linux/netfilter/ipset/ip_set_timeout.h
@@ -40,7 +40,7 @@
 }

 static inline bool
-ip_set_timeout_expired(unsigned long *t)
+ip_set_timeout_expired(const unsigned long *t)
 {
 	return *t != IPSET_ELEM_PERMANENT && time_is_before_jiffies(*t);
 }
@@ -63,7 +63,7 @@
 }

 static inline u32
-ip_set_timeout_get(unsigned long *timeout)
+ip_set_timeout_get(const unsigned long *timeout)
 {
 	return *timeout == IPSET_ELEM_PERMANENT ? 0 :
 		jiffies_to_msecs(*timeout - jiffies)/MSEC_PER_SEC;
-- 
1.8.5.1
[PATCH 09/22] netfilter: ipset: Add element count to all set types header
It is better to report the number of set elements for all set types, so
that the header information is uniform. Element counts are therefore
added to the bitmap and list types.

Signed-off-by: Jozsef Kadlecsik
---
 include/linux/netfilter/ipset/ip_set.h        |  2 ++
 include/linux/netfilter/ipset/ip_set_bitmap.h |  2 +-
 net/netfilter/ipset/ip_set_bitmap_gen.h       | 10 +-
 net/netfilter/ipset/ip_set_hash_gen.h         | 21 ++---
 net/netfilter/ipset/ip_set_list_set.c         |  6 +-
 5 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 7a218eb..4671d74 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -250,6 +250,8 @@ struct ip_set {
 	u8 flags;
 	/* Default timeout value, if enabled */
 	u32 timeout;
+	/* Number of elements (vs timeout) */
+	u32 elements;
 	/* Element data size */
 	size_t dsize;
 	/* Offsets to extensions in elements */
diff --git a/include/linux/netfilter/ipset/ip_set_bitmap.h b/include/linux/netfilter/ipset/ip_set_bitmap.h
index 5e4662a..366d6c0 100644
--- a/include/linux/netfilter/ipset/ip_set_bitmap.h
+++ b/include/linux/netfilter/ipset/ip_set_bitmap.h
@@ -6,8 +6,8 @@
 #define IPSET_BITMAP_MAX_RANGE	0x

 enum {
+	IPSET_ADD_STORE_PLAIN_TIMEOUT = -1,
 	IPSET_ADD_FAILED = 1,
-	IPSET_ADD_STORE_PLAIN_TIMEOUT,
 	IPSET_ADD_START_STORED_TIMEOUT,
 };

diff --git a/net/netfilter/ipset/ip_set_bitmap_gen.h b/net/netfilter/ipset/ip_set_bitmap_gen.h
index c22cdde..13a7021 100644
--- a/net/netfilter/ipset/ip_set_bitmap_gen.h
+++ b/net/netfilter/ipset/ip_set_bitmap_gen.h
@@ -83,6 +83,7 @@
 	if (set->extensions & IPSET_EXT_DESTROY)
 		mtype_ext_cleanup(set);
 	memset(map->members, 0, map->memsize);
+	set->elements = 0;
 }

 /* Calculate the actual memory size of the set data */
@@ -107,7 +108,8 @@
 		goto nla_put_failure;
 	if (mtype_do_head(skb, map) ||
 	    nla_put_net32(skb, IPSET_ATTR_REFERENCES, htonl(set->ref)) ||
-	    nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)))
+	    nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)) ||
+	    nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(set->elements)))
 		goto nla_put_failure;
 	if (unlikely(ip_set_put_flags(skb, set)))
 		goto nla_put_failure;
@@ -151,6 +153,7 @@
 	if (ret == IPSET_ADD_FAILED) {
 		if (SET_WITH_TIMEOUT(set) &&
 		    ip_set_timeout_expired(ext_timeout(x, set))) {
+			set->elements--;
 			ret = 0;
 		} else if (!(flags & IPSET_FLAG_EXIST)) {
 			set_bit(e->id, map->members);
@@ -159,6 +162,8 @@
 		/* Element is re-added, cleanup extensions */
 		ip_set_ext_destroy(set, x);
 	}
+	if (ret > 0)
+		set->elements--;

 	if (SET_WITH_TIMEOUT(set))
 #ifdef IP_SET_BITMAP_STORED_TIMEOUT
@@ -176,6 +181,7 @@

 	/* Activate element */
 	set_bit(e->id, map->members);
+	set->elements++;

 	return 0;
 }
@@ -192,6 +198,7 @@
 		return -IPSET_ERR_EXIST;

 	ip_set_ext_destroy(set, x);
+	set->elements--;
 	if (SET_WITH_TIMEOUT(set) &&
 	    ip_set_timeout_expired(ext_timeout(x, set)))
 		return -IPSET_ERR_EXIST;
@@ -287,6 +294,7 @@
 			if (ip_set_timeout_expired(ext_timeout(x, set))) {
 				clear_bit(id, map->members);
 				ip_set_ext_destroy(set, x);
+				set->elements--;
 			}
 		}
 	spin_unlock_bh(&set->lock);
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 66a55a5..09465d1 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -277,7 +277,6 @@ struct net_prefixes {
 struct htype {
 	struct htable __rcu *table; /* the hash table */
 	u32 maxelem;		/* max elements in the hash */
-	u32 elements;		/* current element (vs timeout) */
 	u32 initval;		/* random jhash init value */
 #ifdef IP_SET_HASH_WITH_MARKMASK
 	u32 markmask;		/* markmask value for mark mask to store */
@@ -402,7 +401,7 @@ struct htype {
 #ifdef IP_SET_HASH_WITH_NETS
 	memset(h->nets, 0, sizeof(struct net_prefixes) * NLEN(set->family));
 #endif
-	h->elements = 0;
+	set->elements = 0;
 }

 /* Destroy the hashtable part of the set */
@@ -508,7 +507,7 @@ struct htype {
 					       nets_length, k);
 #endif
 				ip_set_ext_destroy(set, data);
-				h->elements--;
+
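The idea of keeping one element counter in the generic set structure, updated on add, delete and flush, can be sketched for a bitmap-like set. This is a minimal userspace illustration under assumptions: struct sketch_bitmap_set, SKETCH_RANGE and the sketch_* functions are hypothetical and ignore timeouts, extensions and locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RANGE 64	/* assumed number of ids the bitmap can hold */

/* Hypothetical bitmap set that carries its own element counter. */
struct sketch_bitmap_set {
	uint8_t members[SKETCH_RANGE / 8];
	uint32_t elements;	/* number of bits currently set */
};

static bool sketch_test(const struct sketch_bitmap_set *s, unsigned int id)
{
	return id < SKETCH_RANGE &&
	       (s->members[id / 8] & (1u << (id % 8))) != 0;
}

/* Add an id; -1 if it is out of range or already present. */
static int sketch_add(struct sketch_bitmap_set *s, unsigned int id)
{
	if (id >= SKETCH_RANGE || sketch_test(s, id))
		return -1;
	s->members[id / 8] |= (uint8_t)(1u << (id % 8));
	s->elements++;		/* counter follows every successful add */
	return 0;
}

/* Delete an id; -1 if it is not present. */
static int sketch_del(struct sketch_bitmap_set *s, unsigned int id)
{
	if (!sketch_test(s, id))
		return -1;
	s->members[id / 8] &= (uint8_t)~(1u << (id % 8));
	s->elements--;		/* and every successful delete */
	return 0;
}

static void sketch_flush(struct sketch_bitmap_set *s)
{
	memset(s->members, 0, sizeof(s->members));
	s->elements = 0;	/* flush resets the counter as well */
}
```

Reporting the header then only reads the counter instead of scanning the bitmap.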
[PATCH 00/22] ipset patches for nf-next
Hi Pablo,

Please consider applying the next bunch of patches for ipset. There is a
new set type in it (hash:ip,mac), element counts are reported to
userspace in the set header data, and there are a couple of small
cleanups and improvements:

* rcu_dereference_bh_nfnl() redefined to accept netfilter subsys id.
* Header files cleanup: counter helper functions are grouped together,
  some args are changed to const.
* struct ip_set_skbinfo is introduced instead of open coded fields
  in skbinfo get/init helper functions.
* In comment extension allocate area with kmalloc() rather than kzalloc().
* Split all extensions into separate files.
* Separate memsize calculation into dedicated functions.
* ip_set_put_extensions() is regrouped and extern is added.
* Add element count to hash headers by Eric B Munson.
* Add element count to all set types header for uniform output.
* Count non-static extension memory into memsize calculation for
  userspace.
* Simplify mtype_expire() for hash types by removing redundant parameters
  which can be derived from other ones.
* Make NLEN compile time constant for hash types.
* Make sure element data size is a multiple of u32.
* Optimize hash creation routine, exit as early as possible.
* Make struct htype per ipset family.
* Collapse same condition body into a single one.
* Fix reported memory size for hash:* types.
* hash:ipmac type support added to ipset by Tomasz Chilinski.
* Use setup_timer() and mod_timer() instead of init_timer()
  by Muhammad Falak R Wani, individually for the set type families.
* hash: fix boolreturn.cocci warnings about bool should use true/false
  by Fengguang Wu.

The following changes since commit 1b830996c1603225a96e233c3b09bf2b12607d78:

  Merge branch 's390-net' (2016-10-12 01:56:10 -0400)

are available in the git repository at:

  git://blackhole.kfki.hu/nf-next master

for you to fetch changes up to 214ee1d9a5e73f13a126849c69fdb29dfe2bdb3f:

  netfilter: ipset: hash: fix boolreturn.cocci warnings (2016-10-15 14:51:59 +0200)

Eric B Munson (1):
      netfilter: ipset: Add element count to hash headers

Jozsef Kadlecsik (16):
      netfilter: ipset: Correct rcu_dereference_bh_nfnl() usage
      netfilter: ipset: Headers file cleanup
      netfilter: ipset: Improve skbinfo get/init helpers
      netfilter: ipset: Improve comment extension helpers
      netfilter: ipset: Split extensions into separate files
      netfilter: ipset: Separate memsize calculation code into dedicated function
      netfilter: ipset: Regroup ip_set_put_extensions and add extern
      netfilter: ipset: Add element count to all set types header
      netfilter: ipset: Count non-static extension memory for userspace
      netfilter: ipset: Simplify mtype_expire() for hash types
      netfilter: ipset: Make NLEN compile time constant for hash types
      netfilter: ipset: Make sure element data size is a multiple of u32
      netfilter: ipset: Optimize hash creation routine
      netfilter: ipset: Make struct htype per ipset family
      netfilter: ipset: Collapse same condition body to a single one
      netfilter: ipset: Fix reported memory size for hash:* types

Muhammad Falak R Wani (3):
      netfilter: ipset: use setup_timer() and mod_timer().
      netfilter: ipset: use setup_timer() and mod_timer().
      netfilter: ipset: use setup_timer() and mod_timer().

Tomasz Chilinski (1):
      netfilter: ipset: hash:ipmac type support added to ipset

kbuild test robot (1):
      netfilter: ipset: hash: fix boolreturn.cocci warnings

 include/linux/netfilter/ipset/ip_set.h         | 136 ++-
 include/linux/netfilter/ipset/ip_set_bitmap.h  |   2 +-
 include/linux/netfilter/ipset/ip_set_comment.h |  11 +-
 include/linux/netfilter/ipset/ip_set_counter.h |  75 ++
 include/linux/netfilter/ipset/ip_set_skbinfo.h |  46
 include/linux/netfilter/ipset/ip_set_timeout.h |   4 +-
 net/netfilter/ipset/Kconfig                    |   9 +
 net/netfilter/ipset/Makefile                   |   1 +
 net/netfilter/ipset/ip_set_bitmap_gen.h        |  33 ++-
 net/netfilter/ipset/ip_set_core.c              |  14 +-
 net/netfilter/ipset/ip_set_hash_gen.h          | 264 ++---
 net/netfilter/ipset/ip_set_hash_ip.c           |  10 +-
 net/netfilter/ipset/ip_set_hash_ipmac.c        | 315 +
 net/netfilter/ipset/ip_set_hash_ipmark.c       |  10 +-
 net/netfilter/ipset/ip_set_hash_ipport.c       |   6 +-
 net/netfilter/ipset/ip_set_hash_ipportip.c     |   6 +-
 net/netfilter/ipset/ip_set_hash_ipportnet.c    |  10 +-
 net/netfilter/ipset/ip_set_hash_net.c          |   8 +-
 net/netfilter/ipset/ip_set_hash_netiface.c     |   8 +-
 net/netfilter/ipset/ip_set_hash_netnet.c       |   8 +-
 net/netfilter/ipset/ip_set_hash_netport.c      |  10 +-
 net/netfilter/ipset/ip_set_hash_netportnet.c   |  10 +-
 net/netfilter/ipset/ip_set_list_set.c          |  37 ++-
 net/netfilter/xt_set.c                         |  12 +-
[PATCH 17/22] netfilter: ipset: Fix reported memory size for hash:* types
The calculation of the full allocated memory did not take into account
the size of the base hash bucket structure at some places.

Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_hash_gen.h | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index f4b30b6..295ad84 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -87,6 +87,8 @@ struct htable {
 };

 #define hbucket(h, i)		((h)->bucket[i])
+#define ext_size(n, dsize)	\
+	(sizeof(struct hbucket) + (n) * (dsize))

 #ifndef IPSET_NET_COUNT
 #define IPSET_NET_COUNT		1
@@ -521,7 +523,7 @@ struct htype {
 				d++;
 			}
 			tmp->pos = d;
-			set->ext_size -= AHASH_INIT_SIZE * dsize;
+			set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
 			rcu_assign_pointer(hbucket(t, i), tmp);
 			kfree_rcu(n, rcu);
 		}
@@ -627,7 +629,7 @@ struct htype {
 				goto cleanup;
 			}
 			m->size = AHASH_INIT_SIZE;
-			extsize = sizeof(*m) + AHASH_INIT_SIZE * dsize;
+			extsize = ext_size(AHASH_INIT_SIZE, dsize);
 			RCU_INIT_POINTER(hbucket(t, key), m);
 		} else if (m->pos >= m->size) {
 			struct hbucket *ht;
@@ -647,7 +649,7 @@ struct htype {
 			memcpy(ht, m, sizeof(struct hbucket) +
 				      m->size * dsize);
 			ht->size = m->size + AHASH_INIT_SIZE;
-			extsize += AHASH_INIT_SIZE * dsize;
+			extsize += ext_size(AHASH_INIT_SIZE, dsize);
 			kfree(m);
 			m = ht;
 			RCU_INIT_POINTER(hbucket(t, key), ht);
@@ -729,7 +731,7 @@ struct htype {
 		if (!n)
 			return -ENOMEM;
 		n->size = AHASH_INIT_SIZE;
-		set->ext_size += sizeof(*n) + AHASH_INIT_SIZE * set->dsize;
+		set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
 		goto copy_elem;
 	}
 	for (i = 0; i < n->pos; i++) {
@@ -793,7 +795,7 @@ struct htype {
 		memcpy(n, old, sizeof(struct hbucket) +
 		       old->size * set->dsize);
 		n->size = old->size + AHASH_INIT_SIZE;
-		set->ext_size += AHASH_INIT_SIZE * set->dsize;
+		set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
 	}

 copy_elem:
@@ -885,7 +887,7 @@ struct htype {
 				k++;
 		}
 		if (n->pos == 0 && k == 0) {
-			set->ext_size -= sizeof(*n) + n->size * dsize;
+			set->ext_size -= ext_size(n->size, dsize);
 			rcu_assign_pointer(hbucket(t, key), NULL);
 			kfree_rcu(n, rcu);
 		} else if (k >= AHASH_INIT_SIZE) {
@@ -904,7 +906,7 @@ struct htype {
 				k++;
 			}
 			tmp->pos = k;
-			set->ext_size -= AHASH_INIT_SIZE * dsize;
+			set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
 			rcu_assign_pointer(hbucket(t, key), tmp);
 			kfree_rcu(n, rcu);
 		}
-- 
1.8.5.1
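The size formula behind the new ext_size() macro is the bucket header plus n elements of dsize bytes each. A minimal userspace sketch follows, assuming a hypothetical struct sketch_bucket in place of struct hbucket; the field layout is illustrative and not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash bucket with a flexible element array. */
struct sketch_bucket {
	unsigned int size;	/* allocated element slots */
	unsigned int pos;	/* next free slot */
	unsigned char value[];	/* 'size' elements of 'dsize' bytes each */
};

/* Bytes accounted for a bucket holding n slots: header plus payload. */
static size_t sketch_ext_size(size_t n, size_t dsize)
{
	return sizeof(struct sketch_bucket) + n * dsize;
}

/* Payload-only accounting, which leaves the bucket header out. */
static size_t sketch_payload_only(size_t n, size_t dsize)
{
	return n * dsize;
}
```

The difference between the two helpers is exactly sizeof(struct sketch_bucket), which is the amount the commit message describes as not taken into account.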
[PATCH 03/22] netfilter: ipset: Improve skbinfo get/init helpers
Use struct ip_set_skbinfo in struct ip_set_ext instead of open coded
fields and assign structure members in get/init helpers instead of
copying members one by one.

Ported from a patch proposed by Sergey Popovich.

Suggested-by: Sergey Popovich
Signed-off-by: Jozsef Kadlecsik
---
 include/linux/netfilter/ipset/ip_set.h | 30 +++---
 net/netfilter/ipset/ip_set_core.c      | 12 ++--
 net/netfilter/xt_set.c                 | 12 +++-
 3 files changed, 24 insertions(+), 30 deletions(-)

diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 1ea28e3..7802621 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -92,17 +92,6 @@ struct ip_set_ext_type {

 extern const struct ip_set_ext_type ip_set_extensions[];

-struct ip_set_ext {
-	u64 packets;
-	u64 bytes;
-	u32 timeout;
-	u32 skbmark;
-	u32 skbmarkmask;
-	u32 skbprio;
-	u16 skbqueue;
-	char *comment;
-};
-
 struct ip_set_counter {
 	atomic64_t bytes;
 	atomic64_t packets;
@@ -122,6 +111,15 @@ struct ip_set_skbinfo {
 	u32 skbmarkmask;
 	u32 skbprio;
 	u16 skbqueue;
+	u16 __pad;
+};
+
+struct ip_set_ext {
+	struct ip_set_skbinfo skbinfo;
+	u64 packets;
+	u64 bytes;
+	char *comment;
+	u32 timeout;
 };

 struct ip_set;
@@ -360,10 +358,7 @@ struct ip_set {
 		   const struct ip_set_ext *ext,
 		   struct ip_set_ext *mext, u32 flags)
 {
-	mext->skbmark = skbinfo->skbmark;
-	mext->skbmarkmask = skbinfo->skbmarkmask;
-	mext->skbprio = skbinfo->skbprio;
-	mext->skbqueue = skbinfo->skbqueue;
+	mext->skbinfo = *skbinfo;
 }

 static inline bool
@@ -387,10 +382,7 @@ struct ip_set {
 ip_set_init_skbinfo(struct ip_set_skbinfo *skbinfo,
 		    const struct ip_set_ext *ext)
 {
-	skbinfo->skbmark = ext->skbmark;
-	skbinfo->skbmarkmask = ext->skbmarkmask;
-	skbinfo->skbprio = ext->skbprio;
-	skbinfo->skbqueue = ext->skbqueue;
+	*skbinfo = ext->skbinfo;
 }

 /* Netlink CB args */
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index a748b0c..3bca341 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -426,20 +426,20 @@ static inline struct ip_set_net *ip_set_pernet(struct net *net)
 		if (!SET_WITH_SKBINFO(set))
 			return -IPSET_ERR_SKBINFO;
 		fullmark = be64_to_cpu(nla_get_be64(tb[IPSET_ATTR_SKBMARK]));
-		ext->skbmark = fullmark >> 32;
-		ext->skbmarkmask = fullmark & 0xffffffff;
+		ext->skbinfo.skbmark = fullmark >> 32;
+		ext->skbinfo.skbmarkmask = fullmark & 0xffffffff;
 	}
 	if (tb[IPSET_ATTR_SKBPRIO]) {
 		if (!SET_WITH_SKBINFO(set))
 			return -IPSET_ERR_SKBINFO;
-		ext->skbprio = be32_to_cpu(nla_get_be32(
-				tb[IPSET_ATTR_SKBPRIO]));
+		ext->skbinfo.skbprio =
+			be32_to_cpu(nla_get_be32(tb[IPSET_ATTR_SKBPRIO]));
 	}
 	if (tb[IPSET_ATTR_SKBQUEUE]) {
 		if (!SET_WITH_SKBINFO(set))
 			return -IPSET_ERR_SKBINFO;
-		ext->skbqueue = be16_to_cpu(nla_get_be16(
-				tb[IPSET_ATTR_SKBQUEUE]));
+		ext->skbinfo.skbqueue =
+			be16_to_cpu(nla_get_be16(tb[IPSET_ATTR_SKBQUEUE]));
 	}
 	return 0;
 }
diff --git a/net/netfilter/xt_set.c b/net/netfilter/xt_set.c
index 5669e5b..e6a8232 100644
--- a/net/netfilter/xt_set.c
+++ b/net/netfilter/xt_set.c
@@ -423,6 +423,8 @@ struct ip_set_adt_opt n = { \

 /* Revision 3 target */

+#define MOPT(opt, member)	((opt).ext.skbinfo.member)
+
 static unsigned int
 set_target_v3(struct sk_buff *skb, const struct xt_action_param *par)
 {
@@ -453,14 +455,14 @@ struct ip_set_adt_opt n = { \
 	if (!ret)
 		return XT_CONTINUE;
 	if (map_opt.cmdflags & IPSET_FLAG_MAP_SKBMARK)
-		skb->mark = (skb->mark & ~(map_opt.ext.skbmarkmask))
-			    ^ (map_opt.ext.skbmark);
+		skb->mark = (skb->mark & ~MOPT(map_opt,skbmarkmask))
+			    ^ MOPT(map_opt, skbmark);
 	if (map_opt.cmdflags & IPSET_FLAG_MAP_SKBPRIO)
-		skb->priority = map_opt.ext.skbprio;
+		skb->priority = MOPT(map_opt, skbprio);
 	if ((map_opt.cmdflags & IPSET_FLAG_MAP_SKBQUEUE) &&
 	    skb->dev &&
-	    skb->dev->real_num_tx_queues > map_opt.ext.skbqueue)
-
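Embedding the skbinfo structure lets the helpers copy all fields with one struct assignment. The sketch below is a userspace illustration under assumptions: the sketch_* types mirror only the field names visible in the patch, and sketch_map_mark() reproduces the mark expression from xt_set.c, (mark & ~mask) ^ value, on plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the skbinfo fields named in the patch. */
struct sketch_skbinfo {
	uint32_t skbmark;
	uint32_t skbmarkmask;
	uint32_t skbprio;
	uint16_t skbqueue;
	uint16_t pad;		/* explicit padding, as the patch adds __pad */
};

/* Hypothetical mirror of the extension structure embedding skbinfo. */
struct sketch_ext {
	struct sketch_skbinfo skbinfo;
	uint64_t packets;
	uint64_t bytes;
};

/* One struct assignment replaces four member-by-member copies. */
static void sketch_get_skbinfo(const struct sketch_skbinfo *stored,
			       struct sketch_ext *mext)
{
	mext->skbinfo = *stored;
}

/* Clear the masked bits of the mark, then xor in the stored value. */
static uint32_t sketch_map_mark(uint32_t mark, const struct sketch_ext *ext)
{
	return (mark & ~ext->skbinfo.skbmarkmask) ^ ext->skbinfo.skbmark;
}
```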
[PATCH 22/22] netfilter: ipset: hash: fix boolreturn.cocci warnings
From: kbuild test robot

net/netfilter/ipset/ip_set_hash_ipmac.c:70:8-9: WARNING: return of 0/1 in function 'hash_ipmac4_data_list' with return type bool
net/netfilter/ipset/ip_set_hash_ipmac.c:178:8-9: WARNING: return of 0/1 in function 'hash_ipmac6_data_list' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

CC: Tomasz Chilinski
Signed-off-by: Fengguang Wu
Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_hash_ipmac.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_ipmac.c b/net/netfilter/ipset/ip_set_hash_ipmac.c
index d9eb144..1ab5ed2 100644
--- a/net/netfilter/ipset/ip_set_hash_ipmac.c
+++ b/net/netfilter/ipset/ip_set_hash_ipmac.c
@@ -67,10 +67,10 @@ struct hash_ipmac4_elem {
 	if (nla_put_ipaddr4(skb, IPSET_ATTR_IP, e->ip) ||
 	    nla_put(skb, IPSET_ATTR_ETHER, ETH_ALEN, e->ether))
 		goto nla_put_failure;
-	return 0;
+	return false;

 nla_put_failure:
-	return 1;
+	return true;
 }

 static inline void
@@ -175,10 +175,10 @@ struct hash_ipmac6_elem {
 	if (nla_put_ipaddr6(skb, IPSET_ATTR_IP, &e->ip.in6) ||
 	    nla_put(skb, IPSET_ATTR_ETHER, ETH_ALEN, e->ether))
 		goto nla_put_failure;
-	return 0;
+	return false;

 nla_put_failure:
-	return 1;
+	return true;
 }

 static inline void
-- 
1.8.5.1
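The convention fixed here, a bool function that returns true on failure and false on success through a goto label, can be reproduced in a few lines. This is only a sketch: sketch_put() and sketch_data_list() are hypothetical names, with sketch_put() standing in for the nla_put*() helpers by consuming space from a byte budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical put helper: true means failure (not enough room left). */
static bool sketch_put(size_t *room, size_t len)
{
	if (*room < len)
		return true;
	*room -= len;
	return false;
}

/* Emit a 4-byte address and a 6-byte MAC; true signals failure. */
static bool sketch_data_list(size_t room)
{
	if (sketch_put(&room, 4) ||
	    sketch_put(&room, 6))
		goto put_failure;
	return false;	/* rather than 0 */

put_failure:
	return true;	/* rather than 1 */
}
```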
[PATCH 01/22] netfilter: ipset: Correct rcu_dereference_bh_nfnl() usage
If the rcu_dereference_bh_nfnl() macro is already defined on the target system, it accepts a pointer and a subsystem id. Define the fallback only when the macro is not already defined, and make it accept two arguments.

Ported from a patch proposed by Sergey Popovich.

Suggested-by: Sergey Popovich
Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_hash_gen.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index d32fd6b..bc54be4 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -17,7 +17,9 @@
 #define ipset_dereference_protected(p, set) \
 	__ipset_dereference_protected(p, spin_is_locked(&(set)->lock))
 
-#define rcu_dereference_bh_nfnl(p)	rcu_dereference_bh_check(p, 1)
+#ifndef rcu_dereference_bh_nfnl
+#define rcu_dereference_bh_nfnl(p, ss)	rcu_dereference_bh_check(p, 1)
+#endif
 
 /* Hashing which uses arrays to resolve clashing. The hash table is resized
  * (doubled) when searching becomes too long.
@@ -580,7 +582,7 @@ struct htype {
 		return -ENOMEM;
 #endif
 	rcu_read_lock_bh();
-	orig = rcu_dereference_bh_nfnl(h->table);
+	orig = rcu_dereference_bh_nfnl(h->table, NFNL_SUBSYS_IPSET);
 	htable_bits = orig->htable_bits;
 	rcu_read_unlock_bh();
 
@@ -1061,7 +1063,7 @@ struct htype {
 	u8 htable_bits;
 
 	rcu_read_lock_bh();
-	t = rcu_dereference_bh_nfnl(h->table);
+	t = rcu_dereference_bh_nfnl(h->table, NFNL_SUBSYS_IPSET);
 	memsize = mtype_ahash_memsize(h, t, NLEN(set->family), set->dsize);
 	htable_bits = t->htable_bits;
 	rcu_read_unlock_bh();
@@ -1103,7 +1105,7 @@ struct htype {
 
 	if (start) {
 		rcu_read_lock_bh();
-		t = rcu_dereference_bh_nfnl(h->table);
+		t = rcu_dereference_bh_nfnl(h->table, NFNL_SUBSYS_IPSET);
 		atomic_inc(&t->uref);
 		cb->args[IPSET_CB_PRIVATE] = (unsigned long)t;
 		rcu_read_unlock_bh();
-- 
1.8.5.1
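The guard pattern used above (provide a two-argument fallback only when the environment has not defined the macro) can be shown in isolation. This is a user-space sketch with made-up names: deref_checked(), SUBSYS_EXAMPLE, struct table and struct hash are assumptions for illustration only, and no RCU is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compatibility fallback: defined only if nothing else has
 * provided deref_checked() already. The second argument is accepted so
 * that every call site can pass two arguments; here it is evaluated and
 * discarded.
 */
#ifndef deref_checked
#define deref_checked(p, ss)	((void)(ss), (p))
#endif

enum { SUBSYS_EXAMPLE = 6 };	/* illustrative subsystem id */

struct table {
	unsigned char bits;
};

struct hash {
	struct table *table;
};

/* Call site written against the two-argument form of the macro. */
static unsigned char table_bits(const struct hash *h)
{
	const struct table *t = deref_checked(h->table, SUBSYS_EXAMPLE);

	assert(t != NULL);
	return t->bits;
}
```

If a header later supplies its own deref_checked(p, ss), the #ifndef block is skipped and the call sites compile unchanged, which is presumably the point of the change.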
[PATCH 11/22] netfilter: ipset: Simplify mtype_expire() for hash types
Remove the redundant parameters nets_length and dsize: they can be derived from the other parameters. Remove one level of indentation by using continue while iterating over the elements in a bucket.

Ported from a patch proposed by Sergey Popovich.

Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_hash_gen.h | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 37afa68..79e158d 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -467,14 +467,15 @@ struct htype {
 
 /* Delete expired elements from the hashtable */
 static void
-mtype_expire(struct ip_set *set, struct htype *h, u8 nets_length, size_t dsize)
+mtype_expire(struct ip_set *set, struct htype *h)
 {
 	struct htable *t;
 	struct hbucket *n, *tmp;
 	struct mtype_elem *data;
 	u32 i, j, d;
+	size_t dsize = set->dsize;
 #ifdef IP_SET_HASH_WITH_NETS
-	u8 k;
+	u8 k, nets_length = NLEN(set->family);
 #endif
 
 	t = ipset_dereference_protected(h->table, set);
@@ -488,21 +489,20 @@ struct htype {
 				continue;
 			}
 			data = ahash_data(n, j, dsize);
-			if (ip_set_timeout_expired(ext_timeout(data, set))) {
-				pr_debug("expired %u/%u\n", i, j);
-				clear_bit(j, n->used);
-				smp_mb__after_atomic();
+			if (!ip_set_timeout_expired(ext_timeout(data, set)))
+				continue;
+			pr_debug("expired %u/%u\n", i, j);
+			clear_bit(j, n->used);
+			smp_mb__after_atomic();
 #ifdef IP_SET_HASH_WITH_NETS
-				for (k = 0; k < IPSET_NET_COUNT; k++)
-					mtype_del_cidr(h,
-						NCIDR_PUT(DCIDR_GET(data->cidr,
-								    k)),
-						nets_length, k);
+			for (k = 0; k < IPSET_NET_COUNT; k++)
+				mtype_del_cidr(h,
+					NCIDR_PUT(DCIDR_GET(data->cidr, k)),
+					nets_length, k);
 #endif
-				ip_set_ext_destroy(set, data);
-				set->elements--;
-				d++;
-			}
+			ip_set_ext_destroy(set, data);
+			set->elements--;
+			d++;
 		}
 		if (d >= AHASH_INIT_SIZE) {
 			if (d >= n->size) {
@@ -541,7 +541,7 @@ struct htype {
 
 	pr_debug("called\n");
 	spin_lock_bh(&set->lock);
-	mtype_expire(set, h, NLEN(set->family), set->dsize);
+	mtype_expire(set, h);
 	spin_unlock_bh(&set->lock);
 
 	h->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ;
@@ -717,7 +717,7 @@ struct htype {
 	if (set->elements >= h->maxelem) {
 		if (SET_WITH_TIMEOUT(set))
 			/* FIXME: when set is full, we slow down here */
-			mtype_expire(set, h, NLEN(set->family), set->dsize);
+			mtype_expire(set, h);
 		if (set->elements >= h->maxelem && SET_WITH_FORCEADD(set))
 			forceadd = true;
 	}
-- 
1.8.5.1
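The "continue instead of nesting" restructuring in this patch may be easier to see on a stripped-down loop. The sketch below is user-space C under assumed names (struct slot and expire_slots() are hypothetical and far simpler than the real bucket layout); it only mirrors the control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element slot: in use or free, with an absolute expiry. */
struct slot {
	bool used;
	long expires;
};

/* Free every used slot whose expiry time has passed and return how many
 * were freed. Unused and still-valid slots are skipped with continue,
 * so the deletion work sits at a single indentation level.
 */
static size_t expire_slots(struct slot *slots, size_t n, long now)
{
	size_t i, deleted = 0;

	assert(slots != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!slots[i].used)
			continue;
		if (slots[i].expires > now)
			continue;
		slots[i].used = false;
		deleted++;
	}
	return deleted;
}
```

The behaviour is the same as wrapping the deletion in an if block; the early continue just keeps the body flat, which is all the second half of the commit message claims.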
[PATCH 19/22] netfilter: ipset: use setup_timer() and mod_timer().
From: Muhammad Falak R Wani

Use setup_timer() instead of init_timer(), being the preferred way of setting up a timer.

Also, quoting the mod_timer() function comment:
-> mod_timer() is a more efficient way to update the expire field of an active timer (if the timer is inactive it will be activated).

Use setup_timer() and mod_timer() to set up and arm a timer, making the code compact and easier to read.

Signed-off-by: Muhammad Falak R Wani
Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_bitmap_gen.h | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_bitmap_gen.h b/net/netfilter/ipset/ip_set_bitmap_gen.h
index 5a9fa61..77dd415 100644
--- a/net/netfilter/ipset/ip_set_bitmap_gen.h
+++ b/net/netfilter/ipset/ip_set_bitmap_gen.h
@@ -41,11 +41,8 @@
 {
 	struct mtype *map = set->data;
 
-	init_timer(&map->gc);
-	map->gc.data = (unsigned long)set;
-	map->gc.function = gc;
-	map->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ;
-	add_timer(&map->gc);
+	setup_timer(&map->gc, gc, (unsigned long)set);
+	mod_timer(&map->gc, jiffies + IPSET_GC_PERIOD(set->timeout) * HZ);
 }
 
 static void
-- 
1.8.5.1
[PATCH 12/22] netfilter: ipset: Make NLEN compile time constant for hash types
Hash types define HOST_MASK before inclusion of ip_set_hash_gen.h and the only place where NLEN needed to be calculated at runtime is *_create() method. Ported from a patch proposed by Sergey Popovich. Signed-off-by: Jozsef Kadlecsik --- net/netfilter/ipset/ip_set_hash_gen.h | 51 --- 1 file changed, 23 insertions(+), 28 deletions(-) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 79e158d..ab5b57c 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -152,20 +152,18 @@ struct net_prefixes { #define INIT_CIDR(cidr, host_mask) \ DCIDR_PUT(((cidr) ? NCIDR_GET(cidr) : host_mask)) -#define SET_HOST_MASK(family) (family == AF_INET ? 32 : 128) - #ifdef IP_SET_HASH_WITH_NET0 -/* cidr from 0 to SET_HOST_MASK() value and c = cidr + 1 */ -#define NLEN(family) (SET_HOST_MASK(family) + 1) +/* cidr from 0 to HOST_MASK value and c = cidr + 1 */ +#define NLEN (HOST_MASK + 1) #define CIDR_POS(c)((c) - 1) #else -/* cidr from 1 to SET_HOST_MASK() value and c = cidr + 1 */ -#define NLEN(family) SET_HOST_MASK(family) +/* cidr from 1 to HOST_MASK value and c = cidr + 1 */ +#define NLEN HOST_MASK #define CIDR_POS(c)((c) - 2) #endif #else -#define NLEN(family) 0 +#define NLEN 0 #endif /* IP_SET_HASH_WITH_NETS */ #endif /* _IP_SET_HASH_GEN_H */ @@ -300,12 +298,12 @@ struct htype { * sized networks. cidr == real cidr + 1 to support /0. 
*/ static void -mtype_add_cidr(struct htype *h, u8 cidr, u8 nets_length, u8 n) +mtype_add_cidr(struct htype *h, u8 cidr, u8 n) { int i, j; /* Add in increasing prefix order, so larger cidr first */ - for (i = 0, j = -1; i < nets_length && h->nets[i].cidr[n]; i++) { + for (i = 0, j = -1; i < NLEN && h->nets[i].cidr[n]; i++) { if (j != -1) { continue; } else if (h->nets[i].cidr[n] < cidr) { @@ -324,11 +322,11 @@ struct htype { } static void -mtype_del_cidr(struct htype *h, u8 cidr, u8 nets_length, u8 n) +mtype_del_cidr(struct htype *h, u8 cidr, u8 n) { - u8 i, j, net_end = nets_length - 1; + u8 i, j, net_end = NLEN - 1; - for (i = 0; i < nets_length; i++) { + for (i = 0; i < NLEN; i++) { if (h->nets[i].cidr[n] != cidr) continue; h->nets[CIDR_POS(cidr)].nets[n]--; @@ -344,13 +342,12 @@ struct htype { /* Calculate the actual memory size of the set data */ static size_t -mtype_ahash_memsize(const struct htype *h, const struct htable *t, - u8 nets_length) +mtype_ahash_memsize(const struct htype *h, const struct htable *t) { size_t memsize = sizeof(*h) + sizeof(*t); #ifdef IP_SET_HASH_WITH_NETS - memsize += sizeof(struct net_prefixes) * nets_length; + memsize += sizeof(struct net_prefixes) * NLEN; #endif return memsize; @@ -391,7 +388,7 @@ struct htype { kfree_rcu(n, rcu); } #ifdef IP_SET_HASH_WITH_NETS - memset(h->nets, 0, sizeof(struct net_prefixes) * NLEN(set->family)); + memset(h->nets, 0, sizeof(struct net_prefixes) * NLEN); #endif set->elements = 0; set->ext_size = 0; @@ -475,7 +472,7 @@ struct htype { u32 i, j, d; size_t dsize = set->dsize; #ifdef IP_SET_HASH_WITH_NETS - u8 k, nets_length = NLEN(set->family); + u8 k; #endif t = ipset_dereference_protected(h->table, set); @@ -498,7 +495,7 @@ struct htype { for (k = 0; k < IPSET_NET_COUNT; k++) mtype_del_cidr(h, NCIDR_PUT(DCIDR_GET(data->cidr, k)), - nets_length, k); + k); #endif ip_set_ext_destroy(set, data); set->elements--; @@ -778,7 +775,7 @@ struct htype { for (i = 0; i < IPSET_NET_COUNT; i++) mtype_del_cidr(h, 
NCIDR_PUT(DCIDR_GET(data->cidr, i)), - NLEN(set->family), i); + i); #endif ip_set_ext_destroy(set, data); set->elements--; @@ -814,8 +811,7 @@ struct htype { set->elements++; #ifdef IP_SET_HASH_WITH_NETS for (i = 0; i < IPSET_NET_COUNT; i++) - mtype_add_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), - NLEN(set->family), i); + mtype_add_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i); #endif memcpy(data, d, sizeof(struct mtype_elem)); overwrite_extensions: @@ -888,7 +884,7 @@ struct htype {
[PATCH 15/22] netfilter: ipset: Make struct htype per ipset family
Before this patch struct htype created at the first source of ip_set_hash_gen.h and it is common for both IPv4 and IPv6 set variants. Make struct htype per ipset family and use NLEN to make nets array fixed size to simplify struct htype allocation. Ported from a patch proposed by Sergey Popovich. Signed-off-by: Jozsef Kadlecsik --- net/netfilter/ipset/ip_set_hash_gen.h| 51 +++- net/netfilter/ipset/ip_set_hash_ip.c | 10 +++--- net/netfilter/ipset/ip_set_hash_ipmark.c | 10 +++--- net/netfilter/ipset/ip_set_hash_ipport.c | 6 ++-- net/netfilter/ipset/ip_set_hash_ipportip.c | 6 ++-- net/netfilter/ipset/ip_set_hash_ipportnet.c | 10 +++--- net/netfilter/ipset/ip_set_hash_net.c| 8 ++--- net/netfilter/ipset/ip_set_hash_netiface.c | 8 ++--- net/netfilter/ipset/ip_set_hash_netnet.c | 8 ++--- net/netfilter/ipset/ip_set_hash_netport.c| 10 +++--- net/netfilter/ipset/ip_set_hash_netportnet.c | 10 +++--- 11 files changed, 63 insertions(+), 74 deletions(-) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index cc9208b..0082ccf 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -168,6 +168,18 @@ struct net_prefixes { #endif /* _IP_SET_HASH_GEN_H */ +#ifndef MTYPE +#error "MTYPE is not defined!" +#endif + +#ifndef HTYPE +#error "HTYPE is not defined!" +#endif + +#ifndef HOST_MASK +#error "HOST_MASK is not defined!" 
+#endif + /* Family dependent templates */ #undef ahash_data @@ -191,7 +203,6 @@ struct net_prefixes { #undef mtype_same_set #undef mtype_kadt #undef mtype_uadt -#undef mtype #undef mtype_add #undef mtype_del @@ -207,6 +218,7 @@ struct net_prefixes { #undef mtype_variant #undef mtype_data_match +#undef htype #undef HKEY #define mtype_data_equal IPSET_TOKEN(MTYPE, _data_equal) @@ -233,7 +245,6 @@ struct net_prefixes { #define mtype_same_set IPSET_TOKEN(MTYPE, _same_set) #define mtype_kadt IPSET_TOKEN(MTYPE, _kadt) #define mtype_uadt IPSET_TOKEN(MTYPE, _uadt) -#define mtype MTYPE #define mtype_add IPSET_TOKEN(MTYPE, _add) #define mtype_del IPSET_TOKEN(MTYPE, _del) @@ -249,18 +260,12 @@ struct net_prefixes { #define mtype_variant IPSET_TOKEN(MTYPE, _variant) #define mtype_data_match IPSET_TOKEN(MTYPE, _data_match) -#ifndef MTYPE -#error "MTYPE is not defined!" -#endif - -#ifndef HOST_MASK -#error "HOST_MASK is not defined!" -#endif - #ifndef HKEY_DATALEN #define HKEY_DATALEN sizeof(struct mtype_elem) #endif +#define htype MTYPE + #define HKEY(data, initval, htable_bits) \ ({ \ const u32 *__k = (const u32 *)data; \ @@ -271,33 +276,26 @@ struct net_prefixes { jhash2(__k, __l, initval) & jhash_mask(htable_bits);\ }) -#ifndef htype -#ifndef HTYPE -#error "HTYPE is not defined!" 
-#endif /* HTYPE */ -#define htype HTYPE - /* The generic hash structure */ struct htype { struct htable __rcu *table; /* the hash table */ + struct timer_list gc; /* garbage collection when timeout enabled */ u32 maxelem;/* max elements in the hash */ u32 initval;/* random jhash init value */ #ifdef IP_SET_HASH_WITH_MARKMASK u32 markmask; /* markmask value for mark mask to store */ #endif - struct timer_list gc; /* garbage collection when timeout enabled */ - struct mtype_elem next; /* temporary storage for uadd */ #ifdef IP_SET_HASH_WITH_MULTI u8 ahash_max; /* max elements in an array block */ #endif #ifdef IP_SET_HASH_WITH_NETMASK u8 netmask; /* netmask value for subnets to store */ #endif + struct mtype_elem next; /* temporary storage for uadd */ #ifdef IP_SET_HASH_WITH_NETS - struct net_prefixes nets[0]; /* book-keeping of prefixes */ + struct net_prefixes nets[NLEN]; /* book-keeping of prefixes */ #endif }; -#endif /* htype */ #ifdef IP_SET_HASH_WITH_NETS /* Network cidr size book keeping when the hash stores different @@ -350,13 +348,7 @@ struct htype { static size_t mtype_ahash_memsize(const struct htype *h, const struct htable *t) { - size_t memsize = sizeof(*h) + sizeof(*t); - -#ifdef IP_SET_HASH_WITH_NETS - memsize += sizeof(struct net_prefixes) * NLEN; -#endif - - return memsize; + return sizeof(*h) + sizeof(*t); } /* Get the ith element from the array block n */ @@ -394,7 +386,7 @@ struct htype { kfree_rcu(n, rcu); } #ifdef IP_SET_HASH_WITH_NETS - memset(h->nets, 0, sizeof(struct net_prefixes) * NLEN); + memset(h->nets, 0, sizeof(h->nets)); #endif set->elements = 0;
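The effect of switching nets[0] to nets[NLEN] can be illustrated outside the kernel. In this sketch every name (NLEN_EXAMPLE, struct prefixes, struct hash_flex, struct hash_fixed and the helpers) is hypothetical, and calloc() stands in for the kernel allocator; it only shows why the allocation size and the memset() length no longer need to be computed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NLEN_EXAMPLE 32		/* illustrative compile-time prefix count */

struct prefixes {
	uint32_t nets;
	uint8_t cidr;
};

/* Old style: trailing flexible array, length chosen at allocation time. */
struct hash_flex {
	uint32_t maxelem;
	struct prefixes nets[];
};

/* New style: the array length is a compile-time constant. */
struct hash_fixed {
	uint32_t maxelem;
	struct prefixes nets[NLEN_EXAMPLE];
};

/* The caller must add the array size to the allocation manually. */
static struct hash_flex *hash_flex_alloc(size_t nlen)
{
	return calloc(1, sizeof(struct hash_flex) +
			 nlen * sizeof(struct prefixes));
}

/* sizeof() already covers the array, so no extra arithmetic is needed. */
static struct hash_fixed *hash_fixed_alloc(void)
{
	return calloc(1, sizeof(struct hash_fixed));
}

/* Clearing can use sizeof(h->nets) instead of a separate length. */
static void hash_fixed_clear(struct hash_fixed *h)
{
	assert(h != NULL);
	memset(h->nets, 0, sizeof(h->nets));
}
```

The trade-off is that each family variant gets its own structure type with its own array length, which appears to be why the patch makes struct htype per ipset family.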
[PATCH 04/22] netfilter: ipset: Improve comment extension helpers
Allocate memory with kmalloc() rather than kzalloc().

Ported from a patch proposed by Sergey Popovich.

Suggested-by: Sergey Popovich
Signed-off-by: Jozsef Kadlecsik
---
 include/linux/netfilter/ipset/ip_set_comment.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/netfilter/ipset/ip_set_comment.h b/include/linux/netfilter/ipset/ip_set_comment.h
index bae5c76..5444b1b 100644
--- a/include/linux/netfilter/ipset/ip_set_comment.h
+++ b/include/linux/netfilter/ipset/ip_set_comment.h
@@ -34,7 +34,7 @@
 		return;
 	if (unlikely(len > IPSET_MAX_COMMENT_SIZE))
 		len = IPSET_MAX_COMMENT_SIZE;
-	c = kzalloc(sizeof(*c) + len + 1, GFP_ATOMIC);
+	c = kmalloc(sizeof(*c) + len + 1, GFP_ATOMIC);
 	if (unlikely(!c))
 		return;
 	strlcpy(c->str, ext->comment, len + 1);
-- 
1.8.5.1
[PATCH 05/22] netfilter: ipset: Split extensions into separate files
Ported from a patch proposed by Sergey Popovich. Suggested-by: Sergey Popovich Signed-off-by: Jozsef Kadlecsik --- include/linux/netfilter/ipset/ip_set.h | 95 +- include/linux/netfilter/ipset/ip_set_counter.h | 75 include/linux/netfilter/ipset/ip_set_skbinfo.h | 46 + 3 files changed, 123 insertions(+), 93 deletions(-) create mode 100644 include/linux/netfilter/ipset/ip_set_counter.h create mode 100644 include/linux/netfilter/ipset/ip_set_skbinfo.h diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h index 7802621..b5bd0fb3 100644 --- a/include/linux/netfilter/ipset/ip_set.h +++ b/include/linux/netfilter/ipset/ip_set.h @@ -292,99 +292,6 @@ struct ip_set { return nla_put_net32(skb, IPSET_ATTR_CADT_FLAGS, htonl(cadt_flags)); } -static inline void -ip_set_add_bytes(u64 bytes, struct ip_set_counter *counter) -{ - atomic64_add((long long)bytes, &(counter)->bytes); -} - -static inline void -ip_set_add_packets(u64 packets, struct ip_set_counter *counter) -{ - atomic64_add((long long)packets, &(counter)->packets); -} - -static inline u64 -ip_set_get_bytes(const struct ip_set_counter *counter) -{ - return (u64)atomic64_read(&(counter)->bytes); -} - -static inline u64 -ip_set_get_packets(const struct ip_set_counter *counter) -{ - return (u64)atomic64_read(&(counter)->packets); -} - -static inline void -ip_set_update_counter(struct ip_set_counter *counter, - const struct ip_set_ext *ext, - struct ip_set_ext *mext, u32 flags) -{ - if (ext->packets != ULLONG_MAX && - !(flags & IPSET_FLAG_SKIP_COUNTER_UPDATE)) { - ip_set_add_bytes(ext->bytes, counter); - ip_set_add_packets(ext->packets, counter); - } - if (flags & IPSET_FLAG_MATCH_COUNTERS) { - mext->packets = ip_set_get_packets(counter); - mext->bytes = ip_set_get_bytes(counter); - } -} - -static inline bool -ip_set_put_counter(struct sk_buff *skb, const struct ip_set_counter *counter) -{ - return nla_put_net64(skb, IPSET_ATTR_BYTES, -cpu_to_be64(ip_set_get_bytes(counter)), 
-IPSET_ATTR_PAD) || - nla_put_net64(skb, IPSET_ATTR_PACKETS, -cpu_to_be64(ip_set_get_packets(counter)), -IPSET_ATTR_PAD); -} - -static inline void -ip_set_init_counter(struct ip_set_counter *counter, - const struct ip_set_ext *ext) -{ - if (ext->bytes != ULLONG_MAX) - atomic64_set(&(counter)->bytes, (long long)(ext->bytes)); - if (ext->packets != ULLONG_MAX) - atomic64_set(&(counter)->packets, (long long)(ext->packets)); -} - -static inline void -ip_set_get_skbinfo(struct ip_set_skbinfo *skbinfo, - const struct ip_set_ext *ext, - struct ip_set_ext *mext, u32 flags) -{ - mext->skbinfo = *skbinfo; -} - -static inline bool -ip_set_put_skbinfo(struct sk_buff *skb, const struct ip_set_skbinfo *skbinfo) -{ - /* Send nonzero parameters only */ - return ((skbinfo->skbmark || skbinfo->skbmarkmask) && - nla_put_net64(skb, IPSET_ATTR_SKBMARK, - cpu_to_be64((u64)skbinfo->skbmark << 32 | - skbinfo->skbmarkmask), - IPSET_ATTR_PAD)) || - (skbinfo->skbprio && - nla_put_net32(skb, IPSET_ATTR_SKBPRIO, - cpu_to_be32(skbinfo->skbprio))) || - (skbinfo->skbqueue && - nla_put_net16(skb, IPSET_ATTR_SKBQUEUE, -cpu_to_be16(skbinfo->skbqueue))); -} - -static inline void -ip_set_init_skbinfo(struct ip_set_skbinfo *skbinfo, - const struct ip_set_ext *ext) -{ - *skbinfo = ext->skbinfo; -} - /* Netlink CB args */ enum { IPSET_CB_NET = 0, /* net namespace */ @@ -539,6 +446,8 @@ static inline int nla_put_ipaddr6(struct sk_buff *skb, int type, #include #include +#include +#include int ip_set_put_extensions(struct sk_buff *skb, const struct ip_set *set, diff --git a/include/linux/netfilter/ipset/ip_set_counter.h b/include/linux/netfilter/ipset/ip_set_counter.h new file mode 100644 index 000..2b5e784 --- /dev/null +++ b/include/linux/netfilter/ipset/ip_set_counter.h @@ -0,0 +1,75 @@ +#ifndef _IP_SET_COUNTER_H +#define _IP_SET_COUNTER_H + +/* Copyright (C) 2015 Sergey Popovich + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public 
License version 2 as + * published by the Free Software Foundation. + */ + +#ifdef __KERNEL__ + +static inline void +ip_set_add_bytes(u64 bytes, struct
[PATCH 13/22] netfilter: ipset: Make sure element data size is a multiple of u32
The data to be hashed is required to be an array of u32. Make sure that the element data size is always a multiple of sizeof(u32).

Ported from a patch proposed by Sergey Popovich.

Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_hash_gen.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index ab5b57c..e2f4925 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -262,8 +262,14 @@ struct net_prefixes {
 #endif
 
 #define HKEY(data, initval, htable_bits)			\
-(jhash2((u32 *)(data), HKEY_DATALEN / sizeof(u32), initval)	\
-	& jhash_mask(htable_bits))
+({								\
+	const u32 *__k = (const u32 *)data;			\
+	u32 __l = HKEY_DATALEN / sizeof(u32);			\
+								\
+	BUILD_BUG_ON(HKEY_DATALEN % sizeof(u32) != 0);		\
+								\
+	jhash2(__k, __l, initval) & jhash_mask(htable_bits);	\
+})
 
 #ifndef htype
 #ifndef HTYPE
-- 
1.8.5.1
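The compile-time size check added to HKEY() can be approximated in plain C11 with _Static_assert. The sketch below is not the kernel code: struct elem, KEY_DATALEN, hash_words() and elem_bucket() are hypothetical, and the mixing function is a simple FNV-style loop chosen for brevity rather than jhash2().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical element layout, explicitly padded to 8 bytes. */
struct elem {
	uint32_t ip;
	uint16_t port;
	uint8_t proto;
	uint8_t padding;
};

#define KEY_DATALEN sizeof(struct elem)

/* Plays the role of BUILD_BUG_ON(): compilation fails if the key is not
 * a whole number of 32-bit words.
 */
_Static_assert(KEY_DATALEN % sizeof(uint32_t) == 0,
	       "hash key must be a multiple of 32 bits");

/* Word-wise hash over nwords 32-bit words (FNV-style, not jhash2). */
static uint32_t hash_words(const void *data, size_t nwords, uint32_t initval)
{
	const unsigned char *p = data;
	uint32_t h = initval, w;
	size_t i;

	assert(data != NULL);
	for (i = 0; i < nwords; i++) {
		memcpy(&w, p + i * sizeof(w), sizeof(w));
		h = (h ^ w) * 16777619u;
	}
	return h;
}

/* Map an element to a bucket index in a table of 2^bits buckets. */
static uint32_t elem_bucket(const struct elem *e, uint32_t initval,
			    unsigned int bits)
{
	return hash_words(e, KEY_DATALEN / sizeof(uint32_t), initval) &
	       ((1u << bits) - 1);
}
```

Dropping the padding member and packing the structure would make the key 7 bytes long and trip the static assertion, which is the class of mistake the patch is guarding against.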
[PATCH 21/22] netfilter: ipset: use setup_timer() and mod_timer().
From: Muhammad Falak R Wani

Use setup_timer() instead of init_timer(), being the preferred way of setting up a timer.

Also, quoting the mod_timer() function comment:
-> mod_timer() is a more efficient way to update the expire field of an active timer (if the timer is inactive it will be activated).

Use setup_timer() and mod_timer() to set up and arm a timer, making the code compact and easier to read.

Signed-off-by: Muhammad Falak R Wani
Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_list_set.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c
index dede343..51077c5 100644
--- a/net/netfilter/ipset/ip_set_list_set.c
+++ b/net/netfilter/ipset/ip_set_list_set.c
@@ -586,11 +586,8 @@ struct list_set {
 {
 	struct list_set *map = set->data;
 
-	init_timer(&map->gc);
-	map->gc.data = (unsigned long)set;
-	map->gc.function = gc;
-	map->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ;
-	add_timer(&map->gc);
+	setup_timer(&map->gc, gc, (unsigned long)set);
+	mod_timer(&map->gc, jiffies + IPSET_GC_PERIOD(set->timeout) * HZ);
 }
 
 /* Create list:set type of sets */
-- 
1.8.5.1
[PATCH 20/22] netfilter: ipset: use setup_timer() and mod_timer().
From: Muhammad Falak R Wani

Use setup_timer() instead of init_timer(), being the preferred way of setting up a timer.

Also, quoting the mod_timer() function comment:
-> mod_timer() is a more efficient way to update the expire field of an active timer (if the timer is inactive it will be activated).

Use setup_timer() and mod_timer() to set up and arm a timer, making the code compact and easier to read.

Signed-off-by: Muhammad Falak R Wani
Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_hash_gen.h | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 295ad84..0d5f83e 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -435,11 +435,8 @@ struct htype {
 {
 	struct htype *h = set->data;
 
-	init_timer(&h->gc);
-	h->gc.data = (unsigned long)set;
-	h->gc.function = gc;
-	h->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ;
-	add_timer(&h->gc);
+	setup_timer(&h->gc, gc, (unsigned long)set);
+	mod_timer(&h->gc, jiffies + IPSET_GC_PERIOD(set->timeout) * HZ);
 	pr_debug("gc initialized, run in every %u\n",
 		 IPSET_GC_PERIOD(set->timeout));
 }
-- 
1.8.5.1
[PATCH 06/22] netfilter: ipset: Separate memsize calculation code into dedicated function
Hash types already have their memsize calculation code in separate functions. Do the same for *bitmap* and *list* sets.

Ported from a patch proposed by Sergey Popovich.

Suggested-by: Sergey Popovich
Signed-off-by: Jozsef Kadlecsik
---
 net/netfilter/ipset/ip_set_bitmap_gen.h | 13 -
 net/netfilter/ipset/ip_set_list_set.c   | 23 +--
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_bitmap_gen.h b/net/netfilter/ipset/ip_set_bitmap_gen.h
index 2e8e7e5..c22cdde 100644
--- a/net/netfilter/ipset/ip_set_bitmap_gen.h
+++ b/net/netfilter/ipset/ip_set_bitmap_gen.h
@@ -22,6 +22,7 @@
 #define mtype_kadt		IPSET_TOKEN(MTYPE, _kadt)
 #define mtype_uadt		IPSET_TOKEN(MTYPE, _uadt)
 #define mtype_destroy		IPSET_TOKEN(MTYPE, _destroy)
+#define mtype_memsize		IPSET_TOKEN(MTYPE, _memsize)
 #define mtype_flush		IPSET_TOKEN(MTYPE, _flush)
 #define mtype_head		IPSET_TOKEN(MTYPE, _head)
 #define mtype_same_set		IPSET_TOKEN(MTYPE, _same_set)
@@ -84,12 +85,22 @@
 	memset(map->members, 0, map->memsize);
 }
 
+/* Calculate the actual memory size of the set data */
+static size_t
+mtype_memsize(const struct mtype *map, size_t dsize)
+{
+	size_t memsize = sizeof(*map) +
+			 map->memsize +
+			 map->elements * dsize;
+	return memsize;
+}
+
 static int
 mtype_head(struct ip_set *set, struct sk_buff *skb)
 {
 	const struct mtype *map = set->data;
 	struct nlattr *nested;
-	size_t memsize = sizeof(*map) + map->memsize;
+	size_t memsize = mtype_memsize(map, set->dsize);
 
 	nested = ipset_nest_start(skb, IPSET_ATTR_DATA);
 	if (!nested)
diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c
index a2a89e4..462b0b1 100644
--- a/net/netfilter/ipset/ip_set_list_set.c
+++ b/net/netfilter/ipset/ip_set_list_set.c
@@ -441,12 +441,12 @@ struct list_set {
 	set->data = NULL;
 }
 
-static int
-list_set_head(struct ip_set *set, struct sk_buff *skb)
+/* Calculate the actual memory size of the set data */
+static size_t
+list_set_memsize(const struct list_set *map, size_t dsize)
 {
-	const struct list_set *map = set->data;
-	struct nlattr *nested;
 	struct set_elem *e;
+	size_t memsize;
 	u32 n = 0;
 
 	rcu_read_lock();
@@ -454,13 +454,24 @@ struct list_set {
 		n++;
 	rcu_read_unlock();
 
+	memsize = sizeof(*map) + n * dsize;
+
+	return memsize;
+}
+
+static int
+list_set_head(struct ip_set *set, struct sk_buff *skb)
+{
+	const struct list_set *map = set->data;
+	struct nlattr *nested;
+	size_t memsize = list_set_memsize(map, set->dsize);
+
 	nested = ipset_nest_start(skb, IPSET_ATTR_DATA);
 	if (!nested)
 		goto nla_put_failure;
 	if (nla_put_net32(skb, IPSET_ATTR_SIZE, htonl(map->size)) ||
 	    nla_put_net32(skb, IPSET_ATTR_REFERENCES, htonl(set->ref)) ||
-	    nla_put_net32(skb, IPSET_ATTR_MEMSIZE,
-			  htonl(sizeof(*map) + n * set->dsize)))
+	    nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)))
 		goto nla_put_failure;
 	if (unlikely(ip_set_put_flags(skb, set)))
 		goto nla_put_failure;
-- 
1.8.5.1
Re: [PATCH 00/10, nf-next] Netfilter core updates
Pablo Neira Ayuso wrote:
> Let me know if you have any comment, otherwise I'll place this in the
> nf-next tree so we can follow up working on top of these.

Please do, thanks!
[PATCH 00/10, nf-next] Netfilter core updates
This is the second round of patches to improve Netfilter hooks performance, following several of the ideas that we discussed during NetDev 1.2.

This patchset implements the following:

1) Deprecate NF_STOP, as this is only used by br_netfilter.

2) Remove threshold handling, as this is also only used by br_netfilter.

3) Place the nf_hook_state pointer into the xt_action_param structure, so this structure fits into one single cacheline according to pahole. This also implicitly affects nftables, since it relies on the xt_action_param structure too.

4) Move state->hook_entries into the nf_queue entry. The hook_entries pointer is only required by nf_queue(), so we can store this in the queue entry instead.

5) Handle the queue bypass flag from nf_queue(), to keep this little nf_queue specific handling away from the core path.

6) Merge nf_iterate() into nf_hook_slow(), which results in a much simpler and more readable function.

I have kept back the patches that move NF_QUEUE handling away from the core and the nf_hook_slow() inlining; I would like to explore other options before following this path.

Using this simple drop-all-packets ruleset from ingress:

  nft add table netdev x
  nft add chain netdev x y { type filter hook ingress device eth0 priority 0\; }
  nft add rule netdev x y drop

I generated traffic through Jesper Brouer's samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh script using the -i option. perf report shows nf_tables calls in its top 10:

  17.30%  kpktgend_0  [nf_tables]         [k] nft_do_chain
  15.75%  kpktgend_0  [kernel.vmlinux]    [k] __netif_receive_skb_core
  10.39%  kpktgend_0  [nf_tables_netdev]  [k] nft_do_chain_netdev

I'm measuring an improvement of ~15% in performance with this patchset, so we got +2.5Mpps more. I have used my old laptop, an Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz with 4 cores.

Let me know if you have any comment, otherwise I'll place this in the nf-next tree so we can follow up working on top of these.

Thanks!
Pablo Neira Ayuso (10):
  netfilter: get rid of useless debugging from core
  netfilter: remove comments that predate rcu days
  netfilter: kill NF_HOOK_THRESH() and state->tresh
  netfilter: deprecate NF_STOP
  netfilter: x_tables: move hook state into xt_action_param structure
  netfilter: nf_tables: use hook state from xt_action_param structure
  netfilter: use switch() to handle verdict cases from nf_hook_slow()
  netfilter: remove hook_entries field from nf_hook_state
  netfilter: handle queue bypass flag from nf_queue
  netfilter: merge nf_iterate() into nf_hook_slow()

 include/linux/netfilter.h                  | 58 ++-
 include/linux/netfilter/x_tables.h         | 48
 include/linux/netfilter_ingress.h          |  4 +-
 include/net/netfilter/nf_queue.h           |  1 +
 include/net/netfilter/nf_tables.h          | 36
 include/uapi/linux/netfilter.h             |  2 +-
 net/bridge/br_netfilter_hooks.c            | 16 +++---
 net/bridge/netfilter/ebt_arpreply.c        |  3 +-
 net/bridge/netfilter/ebt_log.c             | 11 ++--
 net/bridge/netfilter/ebt_nflog.c           |  6 +-
 net/bridge/netfilter/ebt_redirect.c        |  6 +-
 net/bridge/netfilter/ebtable_broute.c      |  2 +-
 net/bridge/netfilter/ebtables.c            |  6 +-
 net/bridge/netfilter/nft_meta_bridge.c     |  2 +-
 net/bridge/netfilter/nft_reject_bridge.c   | 30 ++
 net/ipv4/netfilter/arp_tables.c            |  6 +-
 net/ipv4/netfilter/ip_tables.c             |  6 +-
 net/ipv4/netfilter/ipt_MASQUERADE.c        |  3 +-
 net/ipv4/netfilter/ipt_REJECT.c            |  4 +-
 net/ipv4/netfilter/ipt_SYNPROXY.c          |  4 +-
 net/ipv4/netfilter/ipt_rpfilter.c          |  2 +-
 net/ipv4/netfilter/nft_dup_ipv4.c          |  2 +-
 net/ipv4/netfilter/nft_masq_ipv4.c         |  4 +-
 net/ipv4/netfilter/nft_redir_ipv4.c        |  3 +-
 net/ipv4/netfilter/nft_reject_ipv4.c       |  4 +-
 net/ipv6/netfilter/ip6_tables.c            |  6 +-
 net/ipv6/netfilter/ip6t_MASQUERADE.c       |  2 +-
 net/ipv6/netfilter/ip6t_REJECT.c           | 23 +---
 net/ipv6/netfilter/ip6t_SYNPROXY.c         |  4 +-
 net/ipv6/netfilter/ip6t_rpfilter.c         |  3 +-
 net/ipv6/netfilter/nft_dup_ipv6.c          |  2 +-
 net/ipv6/netfilter/nft_masq_ipv6.c         |  3 +-
 net/ipv6/netfilter/nft_redir_ipv6.c        |  3 +-
 net/ipv6/netfilter/nft_reject_ipv6.c       |  6 +-
 net/netfilter/core.c                       | 92 ++
 net/netfilter/ipset/ip_set_core.c          |  6 +-
 net/netfilter/ipset/ip_set_hash_netiface.c |  2 +-
 net/netfilter/nf_dup_netdev.c              |  2 +-
 net/netfilter/nf_internals.h               |  9 +--
 net/netfilter/nf_queue.c                   | 70 +++
 net/netfilter/nf_tables_core.c             | 10 ++--
 net/netfilter/nf_tables_trace.c            |  8 +--
 net/netfilter/nfnetlink_queue.c            |  2 +-
 net/netfilter/nft_log.c
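Item 6 of the cover letter (folding the hook iteration into one loop with a switch on the verdict) can be modelled in user space. What follows is a simplified sketch and not the kernel function: the verdict values, struct hook_entry, run_hooks() and the three sample hooks are all hypothetical, and queueing, RCU and skb freeing are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts; the numeric values are not the kernel's. */
enum verdict { V_DROP, V_ACCEPT, V_STOLEN, V_REPEAT };

struct hook_entry {
	enum verdict (*hook)(void *priv, int *pkt);
	void *priv;
	struct hook_entry *next;
};

/* Walk the hook list in a single loop. Returns 1 if every hook accepted
 * the packet, -1 if one dropped it, and 0 for any other verdict.
 */
static int run_hooks(struct hook_entry *entry, int *pkt)
{
	assert(pkt != NULL);
	while (entry) {
		switch (entry->hook(entry->priv, pkt)) {
		case V_ACCEPT:
			entry = entry->next;	/* move on to the next hook */
			break;
		case V_REPEAT:
			continue;		/* call the same hook again */
		case V_DROP:
			return -1;
		default:
			return 0;		/* e.g. stolen packet */
		}
	}
	return 1;
}

/* Sample hooks used only to exercise the loop. */
static enum verdict hook_accept(void *priv, int *pkt)
{
	(void)priv;
	(void)pkt;
	return V_ACCEPT;
}

static enum verdict hook_drop_negative(void *priv, int *pkt)
{
	(void)priv;
	return *pkt < 0 ? V_DROP : V_ACCEPT;
}

/* Asks for one repeat, then accepts; priv points to an int counter. */
static enum verdict hook_retry_once(void *priv, int *pkt)
{
	int *tries = priv;

	(void)pkt;
	if (*tries == 0) {
		(*tries)++;
		return V_REPEAT;
	}
	return V_ACCEPT;
}
```

Keeping the repeat case as a plain continue inside the switch is what removes the need for a separate iterator with its own goto label, at least in this reduced model.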
[PATCH 10/10] netfilter: merge nf_iterate() into nf_hook_slow()
nf_iterate() has become rather simple, we can integrate this code into nf_hook_slow() to reduce the amount of LOC in the core path. However, we still need nf_iterate() around for nf_queue packet handling, so move this function there where we only need it. I think it should be possible to refactor nf_queue code to get rid of it definitely, but given this is slow path anyway, let's have a look this later. Signed-off-by: Pablo Neira Ayuso--- net/netfilter/core.c | 72 +--- net/netfilter/nf_internals.h | 5 --- net/netfilter/nf_queue.c | 20 3 files changed, 48 insertions(+), 49 deletions(-) diff --git a/net/netfilter/core.c b/net/netfilter/core.c index f299fbde150d..5f015b1948f7 100644 --- a/net/netfilter/core.c +++ b/net/netfilter/core.c @@ -302,26 +302,6 @@ void _nf_unregister_hooks(struct nf_hook_ops *reg, unsigned int n) } EXPORT_SYMBOL(_nf_unregister_hooks); -unsigned int nf_iterate(struct sk_buff *skb, - struct nf_hook_state *state, - struct nf_hook_entry **entryp) -{ - unsigned int verdict; - - do { -repeat: - verdict = (*entryp)->ops.hook((*entryp)->ops.priv, skb, state); - if (verdict != NF_ACCEPT) { - if (verdict != NF_REPEAT) - return verdict; - goto repeat; - } - *entryp = rcu_dereference((*entryp)->next); - } while (*entryp); - return NF_ACCEPT; -} - - /* Returns 1 if okfn() needs to be executed by the caller, * -EPERM for NF_DROP, 0 otherwise. Caller must hold rcu_read_lock. 
*/ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state, @@ -330,31 +310,35 @@ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state, unsigned int verdict; int ret; + do { + verdict = entry->ops.hook(entry->ops.priv, skb, state); + switch (verdict & NF_VERDICT_MASK) { + case NF_ACCEPT: next_hook: - verdict = nf_iterate(skb, state, ); - switch (verdict & NF_VERDICT_MASK) { - case NF_ACCEPT: - ret = 1; - break; - case NF_DROP: - kfree_skb(skb); - ret = NF_DROP_GETERR(verdict); - if (ret == 0) - ret = -EPERM; - break; - case NF_QUEUE: - ret = nf_queue(skb, state, entry, verdict); - if (ret == 1) - goto next_hook; - break; - default: - /* Implicit handling for NF_STOLEN, as well as any other non -* conventional verdicts. -*/ - ret = 0; - break; - } - return ret; + entry = rcu_dereference(entry->next); + break; + case NF_DROP: + kfree_skb(skb); + ret = NF_DROP_GETERR(verdict); + if (ret == 0) + ret = -EPERM; + return ret; + case NF_REPEAT: + continue; + case NF_QUEUE: + ret = nf_queue(skb, state, entry, verdict); + if (ret == 1) + goto next_hook; + return ret; + default: + /* Implicit handling for NF_STOLEN, as well as any other +* non conventional verdicts. +*/ + return 0; + } + } while (entry); + + return 1; } EXPORT_SYMBOL(nf_hook_slow); diff --git a/net/netfilter/nf_internals.h b/net/netfilter/nf_internals.h index a46f2635b71f..78a59a23421f 100644 --- a/net/netfilter/nf_internals.h +++ b/net/netfilter/nf_internals.h @@ -11,11 +11,6 @@ #define NFDEBUG(format, args...) 
#endif - -/* core.c */ -unsigned int nf_iterate(struct sk_buff *skb, struct nf_hook_state *state, - struct nf_hook_entry **entryp); - /* nf_queue.c */ int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state, struct nf_hook_entry *entry, unsigned int verdict); diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c index c5e0d534d352..25ad36f519f7 100644 --- a/net/netfilter/nf_queue.c +++ b/net/netfilter/nf_queue.c @@ -177,6 +177,26 @@ int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state, return 0; } +static unsigned int nf_iterate(struct sk_buff *skb, + struct nf_hook_state *state, + struct nf_hook_entry **entryp) +{ + unsigned int verdict; + + do { +repeat: + verdict = (*entryp)->ops.hook((*entryp)->ops.priv, skb, state); + if (verdict != NF_ACCEPT) { +
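For readers following the control flow, the merged loop can be approximated outside the kernel. What follows is a userspace sketch, not the patch itself: every model_ name, the local verdict constants, and the plain linked list (no RCU, no sk_buff, no NF_QUEUE case) are assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local stand-ins for the kernel verdict codes (assumed values). */
enum { MODEL_DROP = 0, MODEL_ACCEPT = 1, MODEL_STOLEN = 2, MODEL_REPEAT = 4 };
#define MODEL_VERDICT_MASK 0x000000ffu

/* Hypothetical hook entry: a callback plus a plain next pointer. */
struct model_hook_entry {
	unsigned int (*hook)(int *pkt);
	struct model_hook_entry *next;
};

/* Returns 1 if every hook accepted, -EPERM on drop, 0 otherwise.
 * The do/while form relies on the first entry being valid.
 */
static int model_hook_slow(int *pkt, struct model_hook_entry *entry)
{
	assert(entry != NULL);

	do {
		unsigned int verdict = entry->hook(pkt);

		switch (verdict & MODEL_VERDICT_MASK) {
		case MODEL_ACCEPT:
			entry = entry->next;	/* move on to the next hook */
			break;
		case MODEL_DROP:
			return -EPERM;
		case MODEL_REPEAT:
			continue;	/* entry unchanged: same hook runs again */
		default:
			return 0;	/* e.g. stolen: packet consumed */
		}
	} while (entry);

	return 1;
}

/* Sample hooks used to exercise the loop. */
static unsigned int model_accept(int *pkt) { (*pkt)++; return MODEL_ACCEPT; }
static unsigned int model_drop(int *pkt) { (void)pkt; return MODEL_DROP; }

/* Asks for one repeat, then accepts. */
static unsigned int model_repeat_once(int *pkt)
{
	return (*pkt)++ == 0 ? MODEL_REPEAT : MODEL_ACCEPT;
}
```

In this model the repeat case works because `continue` inside a do/while jumps to the loop condition, and the entry pointer is still non-NULL at that point, so the same hook is invoked again.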
[PATCH 02/10] netfilter: remove comments that predate rcu days
We cannot block or sleep in nf_iterate() because netfilter now runs under
the RCU read lock, where blocking is illegal. So let's remove these stale
comments.

Signed-off-by: Pablo Neira Ayuso
---
 net/netfilter/core.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 7b723bcd2522..b193bd46ac30 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -308,18 +308,11 @@ unsigned int nf_iterate(struct sk_buff *skb,
 {
 	unsigned int verdict;
 
-	/*
-	 * The caller must not block between calls to this
-	 * function because of risk of continuing from deleted element.
-	 */
 	while (*entryp) {
 		if (state->thresh > (*entryp)->ops.priority) {
 			*entryp = rcu_dereference((*entryp)->next);
 			continue;
 		}
-
-		/* Optimization: we don't need to hold module
-		   reference here, since function can't sleep. --RR */
 repeat:
 		verdict = (*entryp)->ops.hook((*entryp)->ops.priv, skb, state);
 		if (verdict != NF_ACCEPT) {
-- 
2.1.4
[PATCH 07/10] netfilter: use switch() to handle verdict cases from nf_hook_slow()
Use switch() for verdict handling and add explicit handling for NF_STOLEN
and other non-conventional verdicts.

Signed-off-by: Pablo Neira Ayuso
---
 net/netfilter/core.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 2a6ed7d29c6c..2b3b2f8e39c4 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -328,29 +328,37 @@ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state)
 {
 	struct nf_hook_entry *entry;
 	unsigned int verdict;
-	int ret = 0;
+	int ret;
 
 	entry = rcu_dereference(state->hook_entries);
 next_hook:
 	verdict = nf_iterate(skb, state, &entry);
-	if (verdict == NF_ACCEPT) {
+	switch (verdict & NF_VERDICT_MASK) {
+	case NF_ACCEPT:
 		ret = 1;
-	} else if ((verdict & NF_VERDICT_MASK) == NF_DROP) {
+		break;
+	case NF_DROP:
 		kfree_skb(skb);
 		ret = NF_DROP_GETERR(verdict);
 		if (ret == 0)
 			ret = -EPERM;
-	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
-		int err;
-
+		break;
+	case NF_QUEUE:
 		RCU_INIT_POINTER(state->hook_entries, entry);
-		err = nf_queue(skb, state, verdict >> NF_VERDICT_QBITS);
-		if (err < 0) {
-			if (err == -ESRCH &&
-			   (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
+		ret = nf_queue(skb, state, verdict >> NF_VERDICT_QBITS);
+		if (ret < 0) {
+			if (ret == -ESRCH &&
+			    (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
 				goto next_hook;
 			kfree_skb(skb);
 		}
+		/* Fall through. */
+	default:
+		/* Implicit handling for NF_STOLEN, as well as any other non
+		 * conventional verdicts.
+		 */
+		ret = 0;
+		break;
 	}
 	return ret;
 }
-- 
2.1.4
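The mapping from verdict to return value that this switch() implements can be tried in isolation. The following is a minimal userspace sketch under stated assumptions: the MODEL_ constants are local copies that are believed to mirror the kernel's verdict encoding (low byte is the verdict, upper 16 bits carry an errno or queue number), the function name is hypothetical, and the queueing side effects are left out.

```c
#include <errno.h>

/* Local copies of the verdict encoding; assumed to mirror the kernel headers. */
#define MODEL_NF_DROP		0u
#define MODEL_NF_ACCEPT		1u
#define MODEL_NF_STOLEN		2u
#define MODEL_NF_QUEUE		3u
#define MODEL_VERDICT_MASK	0x000000ffu
#define MODEL_VERDICT_QBITS	16

/* Map a verdict to the return convention documented for nf_hook_slow():
 * 1 = caller runs okfn(), negative errno = dropped, 0 = consumed.
 */
static int model_verdict_to_ret(unsigned int verdict)
{
	int ret;

	switch (verdict & MODEL_VERDICT_MASK) {
	case MODEL_NF_ACCEPT:
		ret = 1;
		break;
	case MODEL_NF_DROP:
		/* The upper 16 bits may carry an errno chosen by the hook. */
		ret = -(int)(verdict >> MODEL_VERDICT_QBITS);
		if (ret == 0)
			ret = -EPERM;
		break;
	default:
		/* Stolen, queued, and any unknown verdict: packet is consumed. */
		ret = 0;
		break;
	}
	return ret;
}
```

Masking before the comparison matters: a drop verdict with an errno in its upper bits would not compare equal to the bare drop code without it.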
[PATCH 08/10] netfilter: remove hook_entries field from nf_hook_state
This field is only useful for nf_queue, so store it in the nf_queue_entry structure instead, away from the core path. Pass hook_head to nf_hook_slow(). Since we always have a valid entry on the first iteration in nf_iterate(), we can use 'do { ... } while (entry)' loop instead. Signed-off-by: Pablo Neira Ayuso--- include/linux/netfilter.h | 10 -- include/linux/netfilter_ingress.h | 4 ++-- include/net/netfilter/nf_queue.h | 1 + net/bridge/br_netfilter_hooks.c | 4 ++-- net/bridge/netfilter/ebtable_broute.c | 2 +- net/netfilter/core.c | 13 ++--- net/netfilter/nf_internals.h | 2 +- net/netfilter/nf_queue.c | 16 ++-- net/netfilter/nfnetlink_queue.c | 2 +- 9 files changed, 24 insertions(+), 30 deletions(-) diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h index e0d000f6c9bf..69230140215b 100644 --- a/include/linux/netfilter.h +++ b/include/linux/netfilter.h @@ -54,7 +54,6 @@ struct nf_hook_state { struct net_device *out; struct sock *sk; struct net *net; - struct nf_hook_entry __rcu *hook_entries; int (*okfn)(struct net *, struct sock *, struct sk_buff *); }; @@ -81,7 +80,6 @@ struct nf_hook_entry { }; static inline void nf_hook_state_init(struct nf_hook_state *p, - struct nf_hook_entry *hook_entry, unsigned int hook, u_int8_t pf, struct net_device *indev, @@ -96,7 +94,6 @@ static inline void nf_hook_state_init(struct nf_hook_state *p, p->out = outdev; p->sk = sk; p->net = net; - RCU_INIT_POINTER(p->hook_entries, hook_entry); p->okfn = okfn; } @@ -150,7 +147,8 @@ void nf_unregister_sockopt(struct nf_sockopt_ops *reg); extern struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS]; #endif -int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state); +int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state, +struct nf_hook_entry *entry); /** * nf_hook - call a netfilter hook @@ -179,10 +177,10 @@ static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net, if (hook_head) { struct nf_hook_state state; - 
nf_hook_state_init(, hook_head, hook, pf, indev, outdev, + nf_hook_state_init(, hook, pf, indev, outdev, sk, net, okfn); - ret = nf_hook_slow(skb, ); + ret = nf_hook_slow(skb, , hook_head); } rcu_read_unlock(); diff --git a/include/linux/netfilter_ingress.h b/include/linux/netfilter_ingress.h index fd44e4131710..2dc3b49b804a 100644 --- a/include/linux/netfilter_ingress.h +++ b/include/linux/netfilter_ingress.h @@ -26,10 +26,10 @@ static inline int nf_hook_ingress(struct sk_buff *skb) if (unlikely(!e)) return 0; - nf_hook_state_init(, e, NF_NETDEV_INGRESS, + nf_hook_state_init(, NF_NETDEV_INGRESS, NFPROTO_NETDEV, skb->dev, NULL, NULL, dev_net(skb->dev), NULL); - return nf_hook_slow(skb, ); + return nf_hook_slow(skb, , e); } static inline void nf_hook_ingress_init(struct net_device *dev) diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h index 2280cfe86c56..09948d10e38e 100644 --- a/include/net/netfilter/nf_queue.h +++ b/include/net/netfilter/nf_queue.h @@ -12,6 +12,7 @@ struct nf_queue_entry { unsigned intid; struct nf_hook_statestate; + struct nf_hook_entry*hook; u16 size; /* sizeof(entry) + saved route keys */ /* extra space to store route keys */ diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c index 7e3645fa6339..8155bd2a5138 100644 --- a/net/bridge/br_netfilter_hooks.c +++ b/net/bridge/br_netfilter_hooks.c @@ -1018,10 +1018,10 @@ int br_nf_hook_thresh(unsigned int hook, struct net *net, /* We may already have this, but read-locks nest anyway */ rcu_read_lock(); - nf_hook_state_init(, elem, hook, NFPROTO_BRIDGE, indev, outdev, + nf_hook_state_init(, hook, NFPROTO_BRIDGE, indev, outdev, sk, net, okfn); - ret = nf_hook_slow(skb, ); + ret = nf_hook_slow(skb, , elem); rcu_read_unlock(); if (ret == 1) ret = okfn(net, sk, skb); diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c index 599679e3498d..8fe36dc3aab2 100644 --- a/net/bridge/netfilter/ebtable_broute.c +++ 
b/net/bridge/netfilter/ebtable_broute.c @@ -53,7 +53,7 @@ static int ebt_broute(struct sk_buff *skb) struct nf_hook_state state; int ret; -
[PATCH 06/10] netfilter: nf_tables: use hook state from xt_action_param structure
Don't copy relevant fields from hook state structure, instead use the one that is already available in struct xt_action_param. This patch also adds a set of new wrapper functions to fetch relevant hook state structure fields. Signed-off-by: Pablo Neira Ayuso--- include/net/netfilter/nf_tables.h| 35 +++- net/bridge/netfilter/nft_meta_bridge.c | 2 +- net/bridge/netfilter/nft_reject_bridge.c | 30 --- net/ipv4/netfilter/nft_dup_ipv4.c| 2 +- net/ipv4/netfilter/nft_masq_ipv4.c | 4 ++-- net/ipv4/netfilter/nft_redir_ipv4.c | 3 +-- net/ipv4/netfilter/nft_reject_ipv4.c | 4 ++-- net/ipv6/netfilter/nft_dup_ipv6.c| 2 +- net/ipv6/netfilter/nft_masq_ipv6.c | 3 ++- net/ipv6/netfilter/nft_redir_ipv6.c | 3 ++- net/ipv6/netfilter/nft_reject_ipv6.c | 6 +++--- net/netfilter/nf_dup_netdev.c| 2 +- net/netfilter/nf_tables_core.c | 10 - net/netfilter/nf_tables_trace.c | 8 net/netfilter/nft_log.c | 5 +++-- net/netfilter/nft_lookup.c | 5 ++--- net/netfilter/nft_meta.c | 6 +++--- net/netfilter/nft_queue.c| 2 +- net/netfilter/nft_reject_inet.c | 18 19 files changed, 86 insertions(+), 64 deletions(-) diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 44060344f958..3295fb85bff6 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -14,27 +14,42 @@ struct nft_pktinfo { struct sk_buff *skb; - struct net *net; - const struct net_device *in; - const struct net_device *out; - u8 pf; - u8 hook; booltprot_set; u8 tprot; /* for x_tables compatibility */ struct xt_action_param xt; }; +static inline struct net *nft_net(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->net; +} + +static inline unsigned int nft_hook(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->hook; +} + +static inline u8 nft_pf(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->pf; +} + +static inline const struct net_device *nft_in(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->in; +} + +static inline const struct net_device 
*nft_out(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->out; +} + static inline void nft_set_pktinfo(struct nft_pktinfo *pkt, struct sk_buff *skb, const struct nf_hook_state *state) { pkt->skb = skb; - pkt->net = state->net; - pkt->in = state->in; - pkt->out = state->out; - pkt->hook = state->hook; - pkt->pf = state->pf; pkt->xt.state = state; } diff --git a/net/bridge/netfilter/nft_meta_bridge.c b/net/bridge/netfilter/nft_meta_bridge.c index ad47a921b701..5974dbc1ea24 100644 --- a/net/bridge/netfilter/nft_meta_bridge.c +++ b/net/bridge/netfilter/nft_meta_bridge.c @@ -23,7 +23,7 @@ static void nft_meta_bridge_get_eval(const struct nft_expr *expr, const struct nft_pktinfo *pkt) { const struct nft_meta *priv = nft_expr_priv(expr); - const struct net_device *in = pkt->in, *out = pkt->out; + const struct net_device *in = nft_in(pkt), *out = nft_out(pkt); u32 *dest = >data[priv->dreg]; const struct net_bridge_port *p; diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c index 4b3df6b0e3b9..206dc266ecd2 100644 --- a/net/bridge/netfilter/nft_reject_bridge.c +++ b/net/bridge/netfilter/nft_reject_bridge.c @@ -315,17 +315,20 @@ static void nft_reject_bridge_eval(const struct nft_expr *expr, case htons(ETH_P_IP): switch (priv->type) { case NFT_REJECT_ICMP_UNREACH: - nft_reject_br_send_v4_unreach(pkt->net, pkt->skb, - pkt->in, pkt->hook, + nft_reject_br_send_v4_unreach(nft_net(pkt), pkt->skb, + nft_in(pkt), + nft_hook(pkt), priv->icmp_code); break; case NFT_REJECT_TCP_RST: - nft_reject_br_send_v4_tcp_reset(pkt->net, pkt->skb, - pkt->in, pkt->hook); + nft_reject_br_send_v4_tcp_reset(nft_net(pkt), pkt->skb, + nft_in(pkt), +
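The accessor pattern introduced by this patch can be shown with reduced types. This is only a sketch: the model_ structures are simplified stand-ins for struct nf_hook_state and struct nft_pktinfo (the real pointer lives at pkt->xt.state), and all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the hook state; only two fields are modelled. */
struct model_hook_state {
	unsigned int hook;
	unsigned char pf;
};

/* Simplified stand-in for the packet info: it keeps a pointer to the
 * hook state instead of copying individual fields out of it.
 */
struct model_pktinfo {
	const struct model_hook_state *state;
};

static inline unsigned int model_nft_hook(const struct model_pktinfo *pkt)
{
	assert(pkt->state != NULL);
	return pkt->state->hook;
}

static inline unsigned char model_nft_pf(const struct model_pktinfo *pkt)
{
	assert(pkt->state != NULL);
	return pkt->state->pf;
}

static inline void model_set_pktinfo(struct model_pktinfo *pkt,
				     const struct model_hook_state *state)
{
	pkt->state = state;	/* one pointer store instead of per-field copies */
}
```

Callers then read fields through the wrappers, so the layout of the packet info structure can change without touching every expression.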
[PATCH 09/10] netfilter: handle queue bypass flag from nf_queue
Move the queue bypass logic from nf_hook_slow() into nf_queue(), which
resides in net/netfilter/nf_queue.c, away from the core path.

Signed-off-by: Pablo Neira Ayuso
---
 net/netfilter/core.c         | 13 ++++---------
 net/netfilter/nf_internals.h |  4 ++--
 net/netfilter/nf_queue.c     | 39 ++++++++++++++++++++++++---------------
 3 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index fa5a3694c4b6..f299fbde150d 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -343,15 +343,10 @@ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state,
 			ret = -EPERM;
 		break;
 	case NF_QUEUE:
-		ret = nf_queue(skb, state, entry,
-			       verdict >> NF_VERDICT_QBITS);
-		if (ret < 0) {
-			if (ret == -ESRCH &&
-			    (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
-				goto next_hook;
-			kfree_skb(skb);
-		}
-		/* Fall through. */
+		ret = nf_queue(skb, state, entry, verdict);
+		if (ret == 1)
+			goto next_hook;
+		break;
 	default:
 		/* Implicit handling for NF_STOLEN, as well as any other non
 		 * conventional verdicts.
diff --git a/net/netfilter/nf_internals.h b/net/netfilter/nf_internals.h
index 301cc02257ad..a46f2635b71f 100644
--- a/net/netfilter/nf_internals.h
+++ b/net/netfilter/nf_internals.h
@@ -17,8 +17,8 @@ unsigned int nf_iterate(struct sk_buff *skb, struct nf_hook_state *state,
 			struct nf_hook_entry **entryp);
 
 /* nf_queue.c */
-int nf_queue(struct sk_buff *skb, struct nf_hook_state *state,
-	     struct nf_hook_entry *entry, unsigned int queuenum);
+int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
+	     struct nf_hook_entry *entry, unsigned int verdict);
 void nf_queue_nf_hook_drop(struct net *net, const struct nf_hook_entry *entry);
 int __init netfilter_queue_init(void);
 
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 091130bc890a..c5e0d534d352 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -107,12 +107,8 @@ void nf_queue_nf_hook_drop(struct net *net, const struct nf_hook_entry *entry)
 	rcu_read_unlock();
 }
 
-/*
- * Any packet that leaves via this function must come back
- * through nf_reinject().
- */
-int nf_queue(struct sk_buff *skb, struct nf_hook_state *state,
-	     struct nf_hook_entry *hook_entry, unsigned int queuenum)
+static int __nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
+		      struct nf_hook_entry *hook_entry, unsigned int queuenum)
 {
 	int status = -ENOENT;
 	struct nf_queue_entry *entry = NULL;
@@ -161,13 +157,32 @@ int nf_queue(struct sk_buff *skb, struct nf_hook_state *state,
 	return status;
 }
 
+/* Any packet that leaves via this function must come back through
+ * nf_reinject().
+ */
+int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
+	     struct nf_hook_entry *entry, unsigned int verdict)
+{
+	int ret;
+
+	ret = __nf_queue(skb, state, entry, verdict >> NF_VERDICT_QBITS);
+	if (ret < 0) {
+		if (ret == -ESRCH &&
+		    (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
+			return 1;
+
+		kfree_skb(skb);
+	}
+
+	return 0;
+}
+
 void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 {
 	struct nf_hook_entry *hook_entry = entry->hook;
 	struct nf_hook_ops *elem = &hook_entry->ops;
 	struct sk_buff *skb = entry->skb;
 	const struct nf_afinfo *afinfo;
-	int err;
 
 	nf_queue_entry_release_refs(entry);
 
@@ -196,14 +211,8 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 		local_bh_enable();
 		break;
 	case NF_QUEUE:
-		err = nf_queue(skb, &entry->state, hook_entry,
-			       verdict >> NF_VERDICT_QBITS);
-		if (err < 0) {
-			if (err == -ESRCH &&
-			    (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
-				goto next_hook;
-			kfree_skb(skb);
-		}
+		if (nf_queue(skb, &entry->state, hook_entry, verdict) == 1)
+			goto next_hook;
 		break;
 	case NF_STOLEN:
 		break;
-- 
2.1.4
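The new return convention of nf_queue() (1 means "bypass, continue with the next hook", 0 means "queued or freed") can be exercised with a small userspace model. This is a sketch under assumptions: model_enqueue() is a made-up stand-in for __nf_queue() in which only queue 0 has a listener, the packet is reduced to a flag recording whether it was freed, and the MODEL_ constants are local copies believed to match the kernel's encoding.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_VERDICT_QBITS		16
#define MODEL_FLAG_QUEUE_BYPASS		0x00008000u
#define MODEL_NF_QUEUE			3u

/* Hypothetical packet: only records whether it was freed. */
struct model_pkt {
	int freed;
};

/* Stand-in for __nf_queue(): queue 0 has a listener, the others do not. */
static int model_enqueue(struct model_pkt *pkt, unsigned int queuenum)
{
	(void)pkt;
	return queuenum == 0 ? 0 : -ESRCH;
}

/* Returns 1 when the caller should continue with the next hook,
 * 0 when the packet was queued or freed.
 */
static int model_nf_queue(struct model_pkt *pkt, unsigned int verdict)
{
	int ret;

	assert(pkt != NULL);

	ret = model_enqueue(pkt, verdict >> MODEL_VERDICT_QBITS);
	if (ret < 0) {
		if (ret == -ESRCH && (verdict & MODEL_FLAG_QUEUE_BYPASS))
			return 1;

		pkt->freed = 1;	/* kfree_skb() in the kernel code */
	}

	return 0;
}

/* Build a queue verdict from a queue number and optional flags. */
static unsigned int model_queue_verdict(unsigned int queuenum,
					unsigned int flags)
{
	return (queuenum << MODEL_VERDICT_QBITS) | flags | MODEL_NF_QUEUE;
}
```

With this shape both callers, nf_hook_slow() and nf_reinject(), only need to test for a return value of 1, which is what the last hunk of the patch relies on.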
[PATCH 03/10] netfilter: kill NF_HOOK_THRESH() and state->tresh
Patch c5136b15ea36 ("netfilter: bridge: add and use br_nf_hook_thresh") introduced br_nf_hook_thresh(). Replace NF_HOOK_THRESH() by br_nf_hook_thresh from br_nf_forward_finish(), so we have no more callers for this macro. As a result, state->thresh and explicit thresh parameter in the hook state structure is not required anymore. And we can get rid of skip-hook-under-thresh loop in nf_iterate() in the core path that is only used by br_netfilter to search for the filter hook. Suggested-by: Florian WestphalSigned-off-by: Pablo Neira Ayuso --- include/linux/netfilter.h | 50 +-- include/linux/netfilter_ingress.h | 2 +- net/bridge/br_netfilter_hooks.c | 8 +++--- net/bridge/netfilter/ebtable_broute.c | 2 +- net/netfilter/core.c | 4 --- net/netfilter/nf_queue.c | 1 - 6 files changed, 19 insertions(+), 48 deletions(-) diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h index abc7fdcb9eb1..e0d000f6c9bf 100644 --- a/include/linux/netfilter.h +++ b/include/linux/netfilter.h @@ -49,7 +49,6 @@ struct sock; struct nf_hook_state { unsigned int hook; - int thresh; u_int8_t pf; struct net_device *in; struct net_device *out; @@ -84,7 +83,7 @@ struct nf_hook_entry { static inline void nf_hook_state_init(struct nf_hook_state *p, struct nf_hook_entry *hook_entry, unsigned int hook, - int thresh, u_int8_t pf, + u_int8_t pf, struct net_device *indev, struct net_device *outdev, struct sock *sk, @@ -92,7 +91,6 @@ static inline void nf_hook_state_init(struct nf_hook_state *p, int (*okfn)(struct net *, struct sock *, struct sk_buff *)) { p->hook = hook; - p->thresh = thresh; p->pf = pf; p->in = indev; p->out = outdev; @@ -155,20 +153,16 @@ extern struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS]; int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state); /** - * nf_hook_thresh - call a netfilter hook + * nf_hook - call a netfilter hook * * Returns 1 if the hook has allowed the packet to pass. 
The function * okfn must be invoked by the caller in this case. Any other return * value indicates the packet has been consumed by the hook. */ -static inline int nf_hook_thresh(u_int8_t pf, unsigned int hook, -struct net *net, -struct sock *sk, -struct sk_buff *skb, -struct net_device *indev, -struct net_device *outdev, -int (*okfn)(struct net *, struct sock *, struct sk_buff *), -int thresh) +static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net, + struct sock *sk, struct sk_buff *skb, + struct net_device *indev, struct net_device *outdev, + int (*okfn)(struct net *, struct sock *, struct sk_buff *)) { struct nf_hook_entry *hook_head; int ret = 1; @@ -185,8 +179,8 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned int hook, if (hook_head) { struct nf_hook_state state; - nf_hook_state_init(, hook_head, hook, thresh, - pf, indev, outdev, sk, net, okfn); + nf_hook_state_init(, hook_head, hook, pf, indev, outdev, + sk, net, okfn); ret = nf_hook_slow(skb, ); } @@ -195,14 +189,6 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned int hook, return ret; } -static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net, - struct sock *sk, struct sk_buff *skb, - struct net_device *indev, struct net_device *outdev, - int (*okfn)(struct net *, struct sock *, struct sk_buff *)) -{ - return nf_hook_thresh(pf, hook, net, sk, skb, indev, outdev, okfn, INT_MIN); -} - /* Activate hook; either okfn or kfree_skb called, unless a hook returns NF_STOLEN (in which case, it's up to the hook to deal with the consequences). @@ -221,19 +207,6 @@ static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net, */ static inline int -NF_HOOK_THRESH(uint8_t pf, unsigned int hook, struct net *net, struct sock *sk, - struct sk_buff *skb, struct net_device *in, - struct net_device *out, - int (*okfn)(struct net *, struct
Re: [PATCH nf-next 2/5] netfilter: nft: basic routing expression
On 16 October 2016 at 15:42, Anders K. Pedersen | Cohaesiowrote: > From: Anders K. Pedersen > > Introduce basic infrastructure for nftables rt expression for routing > related data. Initially "rt classid" is implemented identical to "meta > rtclassid", since it is more logical to have this match in the routing > expression going forward. > > Signed-off-by: Anders K. Pedersen > --- > include/net/netfilter/nft_rt.h | 23 + > net/netfilter/Kconfig| 6 ++ > net/netfilter/Makefile | 1 + > net/netfilter/nft_rt.c | 145 ++ > 4 files changed, 175 insertions(+) > > diff --git a/include/net/netfilter/nft_rt.h b/include/net/netfilter/nft_rt.h > --- /dev/null > +++ b/include/net/netfilter/nft_rt.h > @@ -0,0 +1,23 @@ > +#ifndef _NFT_RT_H_ > +#define _NFT_RT_H_ > + > +struct nft_rt { > + enum nft_rt_keyskey:8; > + enum nft_registers dreg:8; > + u8 family; > +}; > + > +extern const struct nla_policy nft_rt_policy[]; > + > +int nft_rt_get_init(const struct nft_ctx *ctx, > + const struct nft_expr *expr, > + const struct nlattr * const tb[]); > + > +int nft_rt_get_dump(struct sk_buff *skb, > + const struct nft_expr *expr); > + > +void nft_rt_get_eval(const struct nft_expr *expr, > +struct nft_regs *regs, > +const struct nft_pktinfo *pkt); > + > +#endif > diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig > --- a/net/netfilter/Kconfig > +++ b/net/netfilter/Kconfig > @@ -474,6 +474,12 @@ config NFT_META > This option adds the "meta" expression that you can use to match and > to set packet metainformation such as the packet mark. > > +config NFT_RT > + tristate "Netfilter nf_tables routing module" > + help > + This option adds the "rt" expression that you can use to match > + packet routing information such as the packet nexthop. 
> + > config NFT_NUMGEN > tristate "Netfilter nf_tables number generator module" > help > diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile > --- a/net/netfilter/Makefile > +++ b/net/netfilter/Makefile > @@ -81,6 +81,7 @@ obj-$(CONFIG_NF_TABLES_NETDEV)+= nf_tables_netdev.o > obj-$(CONFIG_NFT_COMPAT) += nft_compat.o > obj-$(CONFIG_NFT_EXTHDR) += nft_exthdr.o > obj-$(CONFIG_NFT_META) += nft_meta.o > +obj-$(CONFIG_NFT_RT) += nft_rt.o > obj-$(CONFIG_NFT_NUMGEN) += nft_numgen.o > obj-$(CONFIG_NFT_CT) += nft_ct.o > obj-$(CONFIG_NFT_LIMIT)+= nft_limit.o > diff --git a/net/netfilter/nft_rt.c b/net/netfilter/nft_rt.c > --- /dev/null > +++ b/net/netfilter/nft_rt.c > @@ -0,0 +1,145 @@ > +/* > + * Copyright (c) 2016 Anders K. Pedersen > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +void nft_rt_get_eval(const struct nft_expr *expr, > +struct nft_regs *regs, > +const struct nft_pktinfo *pkt) > +{ > + const struct nft_rt *priv = nft_expr_priv(expr); > + const struct sk_buff *skb = pkt->skb; > + u32 *dest = >data[priv->dreg]; > + > + switch (priv->key) { > +#ifdef CONFIG_IP_ROUTE_CLASSID > + case NFT_RT_CLASSID: { > + const struct dst_entry *dst = skb_dst(skb); > + > + if (dst == NULL) > + goto err; > + *dest = dst->tclassid; > + break; > + } > +#endif > + default: > + WARN_ON(1); > + goto err; > + } > + return; > + > +err: > + regs->verdict.code = NFT_BREAK; > +} > +EXPORT_SYMBOL_GPL(nft_rt_get_eval); > + > +const struct nla_policy nft_rt_policy[NFTA_RT_MAX + 1] = { > + [NFTA_RT_DREG] = { .type = NLA_U32 }, > + [NFTA_RT_KEY] = { .type = NLA_U32 }, > + [NFTA_RT_FAMILY]= { .type = NLA_U32 }, > +}; > +EXPORT_SYMBOL_GPL(nft_rt_policy); > + > +int nft_rt_get_init(const struct nft_ctx 
*ctx, > + const struct nft_expr *expr, > + const struct nlattr * const tb[]) > +{ > + struct nft_rt *priv = nft_expr_priv(expr); > + unsigned int len; > + > + priv->key = ntohl(nla_get_be32(tb[NFTA_RT_KEY])); > + switch (priv->key) { > +#ifdef CONFIG_IP_ROUTE_CLASSID > + case NFT_RT_CLASSID: > + len = sizeof(u32); > + break; > +#endif > + default: > + return -EOPNOTSUPP; > + } > + > +
Re: [PATCH nf-next 1/5] netfilter: nft: UAPI headers for routing expression
On 16 October 2016 at 15:41, Anders K. Pedersen | Cohaesio wrote:
> diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
> --- a/include/uapi/linux/netfilter/nf_tables.h
> +++ b/include/uapi/linux/netfilter/nf_tables.h
> @@ -759,6 +759,16 @@ enum nft_meta_keys {
>  };
>
>  /**
> + * enum nft_rt_keys - nf_tables routing expression keys
> + *
> + * @NFT_META_NEXTHOP: routing nexthop
> + */
> +enum nft_rt_keys {
> +       NFT_RT_CLASSID,
> +       NFT_RT_NEXTHOP,
> +};
> +

The comment section looks like it requires a fix: it documents
@NFT_META_NEXTHOP, while the enum defines NFT_RT_CLASSID and
NFT_RT_NEXTHOP.
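One way the kernel-doc block could be brought in line with the enum is sketched below. This is only a suggestion, not the author's eventual fix: the description given for NFT_RT_CLASSID is assumed wording, derived from the patch 2/5 commit message, which says "rt classid" behaves like "meta rtclassid".

```c
/**
 * enum nft_rt_keys - nf_tables routing expression keys
 *
 * @NFT_RT_CLASSID: routing classid, as in "meta rtclassid" (assumed wording)
 * @NFT_RT_NEXTHOP: routing nexthop
 */
enum nft_rt_keys {
	NFT_RT_CLASSID,
	NFT_RT_NEXTHOP,
};
```

Each @name line now matches an enumerator of the same name, which is what kernel-doc expects when it pairs descriptions with members.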