Re: [PATCH nf] netfilter: ipv6: nf_defrag: drop skb dst before queueing

2018-07-09 Thread Eric Dumazet
On 07/09/2018 04:43 AM, Florian Westphal wrote: > Eric Dumazet reports: > Here is a reproducer of an annoying bug detected by syzkaller on our > production kernel > [..] > ./b78305423 enable_conntrack > Then : > sleep 60 > dmesg | tail -10 > [ 171.599093] un

Re: [PATCH v2] netfilter: properly initialize xt_table_info structure

2018-05-17 Thread Eric Dumazet
On 05/17/2018 02:34 AM, Greg Kroah-Hartman wrote: > When allocating a xt_table_info structure, we should be clearing out the > full amount of memory that was allocated, not just the "header" of the > structure. Otherwise odd values could be passed to userspace, which is > not a good thing. > >

Re: WARNING in __proc_create

2018-03-09 Thread Eric Dumazet
On 03/09/2018 03:32 PM, Cong Wang wrote: On Fri, Mar 9, 2018 at 3:21 PM, Eric Dumazet <eric.duma...@gmail.com> wrote: On 03/09/2018 03:05 PM, Cong Wang wrote: BTW, the warning itself is all about empty names, so perhaps it's better to fix them separately. Huh ? You want more

Re: WARNING in __proc_create

2018-03-09 Thread Eric Dumazet
On 03/09/2018 03:05 PM, Cong Wang wrote: BTW, the warning itself is all about empty names, so perhaps it's better to fix them separately. Huh ? You want more syzbot reports ? I do not. I unblocked this report today [1], you can be sure that as soon as syzbot gets the correct tag

Re: WARNING in __proc_create

2018-03-09 Thread Eric Dumazet
On 03/09/2018 02:56 PM, Eric Dumazet wrote: I sent a patch a while back, but Pablo/Florian wanted more than that simple fix. We also need to filter special characters like '/' Or maybe I am mixing with something else. Yes, Florian mentioned that we also had to reject

Re: WARNING in __proc_create

2018-03-09 Thread Eric Dumazet
On 03/09/2018 02:48 PM, Cong Wang wrote: On Fri, Mar 9, 2018 at 1:59 PM, syzbot wrote: Hello, syzbot hit the following crash on net-next commit 617aebe6a97efa539cc4b8a52adccd89596e6be0 (Sun Feb 4 00:25:42 2018 +) Merge tag

Re: [PATCH nf v5] netfilter: bridge: ebt_among: add more missing match size checks

2018-03-09 Thread Eric Dumazet
)sizeof() cast rather than use of temporary 'int minsize'. objdump shows identical output for v3/v4/v5. SGTM, thanks Florian ;) Reviewed-by: Eric Dumazet <eduma...@google.com> -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a mes

Re: [PATCH nf v4] netfilter: bridge: ebt_among: add more missing match size checks

2018-03-09 Thread Eric Dumazet
On 03/09/2018 02:03 AM, Florian Westphal wrote: ebt_among is special, it has a dynamic match size and is exempt from the central size checks. commit c4585a2823edf ("bridge: ebt_among: add missing match size checks") added validation for pool size, but missed fact that the macros

Re: [PATCH nf v3] netfilter: bridge: ebt_among: add more missing match size checks

2018-03-08 Thread Eric Dumazet
On 03/08/2018 04:24 PM, Florian Westphal wrote: Eric Dumazet <eric.duma...@gmail.com> wrote: Fixes: c4585a2823edf ("bridge: ebt_among: add missing match size checks") Reported-by: <syzbot+bdabab6f1983a03fc...@syzkaller.appspotmail.com> Signed-off-by: Florian We

Re: [PATCH nf-next 1/2] netfilter: SYNPROXY: set transport header properly

2018-03-08 Thread Eric Dumazet
On 03/08/2018 07:01 AM, Serhey Popovych wrote: Eric Dumazet wrote: On 03/08/2018 02:08 AM, Serhey Popovych wrote: We can't use skb_reset_transport_header() together with skb_put() to set skb->transport_header field because skb_put() does not touch skb->data. Do this same way as

Re: [PATCH nf-next 1/2] netfilter: SYNPROXY: set transport header properly

2018-03-08 Thread Eric Dumazet
On 03/08/2018 02:08 AM, Serhey Popovych wrote: We can't use skb_reset_transport_header() together with skb_put() to set skb->transport_header field because skb_put() does not touch skb->data. Do this same way as we did for csum_data in code: subtract skb->head from tcph. Signed-off-by:

Re: [Patch nf-next] netfilter: make xt_rateest hash table per net

2018-03-01 Thread Eric Dumazet
On Thu, 2018-03-01 at 18:58 -0800, Cong Wang wrote: > As suggested by Eric, we need to make the xt_rateest > hash table and its lock per netns to reduce lock > contentions. > > Cc: Florian Westphal <f...@strlen.de> > Cc: Eric Dumazet <eduma...@google.com>

[PATCH nf] netfilter: IDLETIMER: be syzkaller friendly

2018-02-16 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> We had one report from syzkaller [1] First issue is that INIT_WORK() should be done before mod_timer() or we risk timer being fired too soon, even with a 1 second timer. Second issue is that we need to reject too big info->timeout to avoid

Re: [PATCH net v2] netfilter: nat: cope with negative port range

2018-02-14 Thread Eric Dumazet
On Wed, 2018-02-14 at 13:30 +0100, Florian Westphal wrote: > Eric Dumazet <eric.duma...@gmail.com> wrote: > > On Wed, 2018-02-14 at 12:13 +0100, Paolo Abeni wrote: > > > syzbot reported a division by 0 bug in the netfilter nat code: > > > Adding the relevan

Re: [Patch net v2] xt_RATEEST: acquire xt_rateest_mutex for hash insert

2018-02-05 Thread Eric Dumazet
or internal use and keep the > locking one for external. > > Reported-by: <syzbot+5cb189720978275e4...@syzkaller.appspotmail.com> > Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target") > Cc: Pablo Neira Ayuso <pa...@netfilter.org> > Cc: Eric Duma

Re: [Patch net] xt_RATEEST: acquire xt_rateest_mutex for hash insert

2018-01-31 Thread Eric Dumazet
On Wed, 2018-01-31 at 16:26 -0800, Cong Wang wrote: > rateest_hash is supposed to be protected by xt_rateest_mutex. > > Reported-by: > Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target") > Cc: Pablo Neira Ayuso

Re: [patch 1/1] net/netfilter/x_tables.c: make allocation less aggressive

2018-01-30 Thread Eric Dumazet
On Tue, 2018-01-30 at 11:30 -0800, a...@linux-foundation.org wrote: > From: Michal Hocko > Subject: net/netfilter/x_tables.c: make allocation less aggressive > > syzbot has noticed that xt_alloc_table_info can allocate a lot of memory. > This is an admin only interface but an

[PATCH net] netfilter: xt_recent: do not accept / in table name

2018-01-28 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> proc_create_data() will issue a WARN() otherwise, lets avoid that. name 'syz/\xF5' WARNING: CPU: 1 PID: 3688 at fs/proc/generic.c:163 __xlate_proc_name+0xe6/0x110 fs/proc/generic.c:163 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID

[PATCH net] netfilter: xt_hashlimit: do not allow empty names

2018-01-28 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> Syzbot reported a WARN() in proc_create_data() [1] Issue here is that xt_hashlimit does not check that user space provided an empty table name. [1] name len 0 WARNING: CPU: 0 PID: 3680 at fs/proc/generic.c:354 __proc_create+0x696/0x880 fs/proc/gen
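The kind of name check described here can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel change itself: `proc_name_is_acceptable()` is a hypothetical helper, the caller is assumed to pass a NUL-terminated string, and the '/' rejection is borrowed from the xt_recent patch listed just above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for this sketch (not a kernel function).
 * The caller must pass a NUL-terminated string.
 */
static bool proc_name_is_acceptable(const char *name)
{
	assert(name != NULL);

	if (name[0] == '\0')    /* empty name: the case reported here */
		return false;
	if (strchr(name, '/'))  /* '/' would be read as a path separator */
		return false;
	return true;
}
```

A caller would reject the request when the helper returns false, before any /proc entry is created.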

[PATCH net] netfilter: x_tables: avoid out-of-bounds reads in xt_request_find_match()

2018-01-24 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> It looks like syzbot found its way into netfilter territory. Issue here is that @name comes from user space and might not be null terminated. Out-of-bound reads happen, KASAN is not happy. Signed-off-by: Eric Dumazet <eduma...@google.com>
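A minimal user-space sketch of the problem class: a fixed-size name field from an untrusted source must be checked for a NUL byte within its bounds before it is handed to string functions. `NAME_FIELD_LEN`, `name_is_terminated()` and `name_matches()` are hypothetical names chosen for this illustration; the actual kernel fix may be shaped differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical field size for this sketch. */
#define NAME_FIELD_LEN 8

/*
 * Return true if the fixed-size field contains a NUL byte, i.e. it is
 * safe to pass to string functions such as strcmp().
 */
static bool name_is_terminated(const char field[NAME_FIELD_LEN])
{
	assert(field != NULL);
	return memchr(field, '\0', NAME_FIELD_LEN) != NULL;
}

/* Compare only after the bounded check has passed. */
static bool name_matches(const char field[NAME_FIELD_LEN], const char *wanted)
{
	assert(wanted != NULL);

	if (!name_is_terminated(field))
		return false;
	return strcmp(field, wanted) == 0;
}
```

The bounded `memchr()` never reads past the field, so an unterminated name is rejected instead of being scanned beyond its buffer.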

Re: [PATCH net-next v2] net: move decnet to staging

2017-11-13 Thread Eric Dumazet
On Mon, 2017-11-13 at 11:32 -0800, Joe Perches wrote: > On Mon, 2017-11-13 at 09:11 -0800, Stephen Hemminger wrote: > > Support for Decnet has been orphaned for some time. > > In the interest of reducing the potential bug surface and pre-holiday > > cleaning, move the decnet protocol into staging

Re: [PATCH v3 nf-next 1/2] netfilter: x_tables: wait until old table isn't used anymore

2017-10-11 Thread Eric Dumazet
On Wed, Oct 11, 2017 at 11:18 AM, Florian Westphal <f...@strlen.de> wrote: > Eric Dumazet <eduma...@google.com> wrote: >> On Wed, Oct 11, 2017 at 10:48 AM, Florian Westphal <f...@strlen.de> wrote: >> > Eric Dumazet <eduma...@google.com> wrote: >

Re: [PATCH nf] netfilter: x_tables: ensure readers see new ->private value

2017-10-11 Thread Eric Dumazet
On Wed, Oct 11, 2017 at 11:03 AM, Florian Westphal <f...@strlen.de> wrote: > Eric Dumazet wrote: > But it seems we need an extra smp_wmb() after > smp_wmb(); > table->private = newinfo; > > Otherwise we have no guarantee other cpus actually see the new >

Re: [PATCH v3 nf-next 1/2] netfilter: x_tables: wait until old table isn't used anymore

2017-10-11 Thread Eric Dumazet
On Wed, Oct 11, 2017 at 10:48 AM, Florian Westphal <f...@strlen.de> wrote: > Eric Dumazet <eduma...@google.com> wrote: >> On Wed, Oct 11, 2017 at 7:26 AM, Florian Westphal <f...@strlen.de> wrote: >> > xt_replace_table relies on table replacement counter

Re: [PATCH v3 nf-next 1/2] netfilter: x_tables: wait until old table isn't used anymore

2017-10-11 Thread Eric Dumazet
eset > without any synchronization after xt_replace_table has completed. > > Cc: Dan Williams <d...@redhat.com> > Cc: Eric Dumazet <eduma...@google.com> > Signed-off-by: Florian Westphal <f...@strlen.de> > --- > v3: check for 'seq is uneven' OR 'has changed' since

[PATCH net] netfilter: x_tables: avoid stack-out-of-bounds read in xt_copy_counters_from_user

2017-10-05 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> syzkaller reports an out of bound read in strlcpy(), triggered by xt_copy_counters_from_user() Fix this by using memcpy(), then forcing a zero byte at the last position of the destination, as Florian did for the non COMPAT code. Fixes: d7591f
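The approach named in this message (copy with memcpy(), then force a zero byte at the last position of the destination) can be sketched as follows. This is a user-space illustration only; `NAME_FIELD_LEN` and `copy_name_field()` are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical field size for this sketch. */
#define NAME_FIELD_LEN 8

/*
 * Copy a fixed-size, possibly unterminated field. Exactly NAME_FIELD_LEN
 * bytes are read from src, so the read never runs past the source field,
 * and the last destination byte is then forced to NUL.
 */
static void copy_name_field(char dst[NAME_FIELD_LEN],
			    const char src[NAME_FIELD_LEN])
{
	assert(dst != NULL && src != NULL);

	memcpy(dst, src, NAME_FIELD_LEN);
	dst[NAME_FIELD_LEN - 1] = '\0';
}
```

Unlike a string copy that scans the source for its terminator, the number of bytes read here is fixed, so an unterminated source cannot cause an out-of-bounds read; the cost is that a name filling the whole field loses its last byte.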

Re: [PATCH v2] netfilter: xt_socket: Restore mark from full sockets only

2017-09-21 Thread Eric Dumazet
On Thu, 2017-09-21 at 16:08 -0600, Subash Abhinov Kasiviswanathan wrote: > An out of bounds error was detected on an ARM64 target with > Android based kernel 4.9. This occurs while trying to > restore mark on a skb from an inet request socket. > > BUG: KASAN: slab-out-of-bounds in

Re: [PATCH] netfilter: xt_socket: Restore mark from full sockets only

2017-09-21 Thread Eric Dumazet
On Thu, 2017-09-21 at 15:20 -0600, Subash Abhinov Kasiviswanathan wrote: > An out of bounds error was detected on an ARM64 target with > Android based kernel 4.9. This occurs while trying to > restore mark on a skb from an inet request socket. > > BUG: KASAN: slab-out-of-bounds in

Re: [PATCH nf] netfilter: xtables: add scheduling opportunity in get_counters

2017-09-01 Thread Eric Dumazet
: Florian Westphal <f...@strlen.de> > --- > net/ipv4/netfilter/arp_tables.c | 1 + > net/ipv4/netfilter/ip_tables.c | 1 + > net/ipv6/netfilter/ip6_tables.c | 1 + > 3 files changed, 3 insertions(+) SGTM, thanks Florian ! Acked-by: Eric Dumazet <eduma...@google.com>

Re: [PATCH nf-next 1/3] netfilter: convert hook list to an array

2017-08-23 Thread Eric Dumazet
On Wed, 2017-08-23 at 17:26 +0200, Florian Westphal wrote: > From: Aaron Conole ... > -static struct nf_hook_entry __rcu **nf_hook_entry_head(struct net *net, > const struct nf_hook_ops *reg) > +static struct nf_hook_entries *allocate_hook_entries_size(u16 num) > +{ > +

Re: [PATCH nf-next 4/4] netfilter: rt: add support to fetch path mss

2017-08-08 Thread Eric Dumazet
On Tue, 2017-08-08 at 15:15 +0200, Florian Westphal wrote: > to be used in combination with tcp option set support to mimic > iptables TCPMSS --clamp-mss-to-pmtu. > > Signed-off-by: Florian Westphal > --- > include/uapi/linux/netfilter/nf_tables.h | 2 + >

Re: [PATCH nf-next] netfilter: nft_set_rbtree: use seqcount to avoid lock in most cases

2017-07-26 Thread Eric Dumazet
On Wed, 2017-07-26 at 02:09 +0200, Florian Westphal wrote: > switch to lockless lookup. write side now also increments sequence > counter. On lookup, sample counter value and only take the lock > if we did not find a match and the counter has changed. > > This avoids need to write to private

Re: [PATCH] netfilter: nfnetlink: Improve input length sanitization in nfnetlink_rcv

2017-06-07 Thread Eric Dumazet
On Wed, 2017-06-07 at 14:35 +0200, Mateusz Jurczyk wrote: > Verify that the length of the socket buffer is sufficient to cover the > entire nlh->nlmsg_len field before accessing that field for further > input sanitization. If the client only supplies 1-3 bytes of data in > sk_buff, then

Re: [PATCH nf-next] netfilter: tcp: Use TCP_MAX_WSCALE instead of literal 14

2017-04-20 Thread Eric Dumazet
On Thu, 2017-04-20 at 08:44 +0800, Gao Feng wrote: > > On Wed, Apr 19, 2017 at 09:57:55PM +0200, Pablo Neira Ayuso wrote: > > > On Wed, Apr 19, 2017 at 09:22:08AM -0700, Eric Dumazet wrote: > > > > On Wed, 2017-04-19 at 17:58 +0200, Pablo Neira Ayuso wrote: > > >

Re: [PATCH nf-next] netfilter: tcp: Use TCP_MAX_WSCALE instead of literal 14

2017-04-19 Thread Eric Dumazet
On Wed, 2017-04-19 at 17:58 +0200, Pablo Neira Ayuso wrote: > On Wed, Apr 19, 2017 at 09:23:42AM +0800, gfree.w...@foxmail.com wrote: > > From: Gao Feng > > > > The window scale may be enlarged from 14 to 15 according to the IETF > > draft

[PATCH net] netfilter: xt_TCPMSS: add more sanity tests on tcph->doff

2017-04-03 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> Denys provided an awesome KASAN report pointing to a use-after-free in xt_TCPMSS. I have provided three patches to fix this issue, either in xt_TCPMSS or in xt_tcpudp.c. It seems the xt_TCPMSS patch has the smallest possible impact. Signed-off-by
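A hedged sketch of what a sanity test on the TCP data offset can look like, written as user-space C. `tcp_doff_is_sane()` and `TCP_MIN_HDR_LEN` are hypothetical names; the lower bound mirrors the `th->doff * 4 < sizeof(...)` test quoted later in this thread, while the upper bound against the available length is an assumption added for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_MIN_HDR_LEN 20u  /* fixed TCP header without options */

/*
 * doff is the 4-bit data offset field, counted in 32-bit words.
 * avail is the number of bytes actually present from the start of the
 * TCP header. Hypothetical helper, not kernel code.
 */
static bool tcp_doff_is_sane(uint8_t doff, size_t avail)
{
	size_t hdr_len = (size_t)doff * 4;

	assert(doff <= 15);             /* the field is only 4 bits wide */

	if (hdr_len < TCP_MIN_HDR_LEN)  /* shorter than the fixed header */
		return false;
	if (hdr_len > avail)            /* options would extend past the data */
		return false;
	return true;
}
```

Code that walks TCP options would run such a check first, so that loop bounds derived from the header length cannot underflow or point outside the packet.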

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-03 Thread Eric Dumazet
On Mon, 2017-04-03 at 15:14 +0300, Denys Fedoryshchenko wrote: > On 2017-04-03 15:09, Eric Dumazet wrote: > > On Mon, 2017-04-03 at 11:10 +0300, Denys Fedoryshchenko wrote: > > > >> I modified patch a little as: > >> if (th->doff * 4 < size

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-03 Thread Eric Dumazet
On Mon, 2017-04-03 at 11:10 +0300, Denys Fedoryshchenko wrote: > I modified patch a little as: > if (th->doff * 4 < sizeof(_tcph)) { > par->hotdrop = true; > WARN_ON_ONCE(!tcpinfo->option); > return false; > } > > And it did triggered WARN once at morning, and didn't hit KASAN. I will >

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Eric Dumazet
On Sun, 2017-04-02 at 10:14 -0700, Eric Dumazet wrote: > Could that be that netfilter does not abort earlier if TCP header is > completely wrong ? > Yes, I wonder if this patch would be better, unless we replicate the th->doff sanity check in all netfilter modules dissecting TCP f

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Eric Dumazet
On Sun, 2017-04-02 at 19:52 +0300, Denys Fedoryshchenko wrote: > On 2017-04-02 15:32, Eric Dumazet wrote: > > On Sun, 2017-04-02 at 15:25 +0300, Denys Fedoryshchenko wrote: > >> > */ > >> I will add also WARN_ON_ONCE(tcp_hdrlen >= 15 * 4) before, for

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Eric Dumazet
On Sun, 2017-04-02 at 15:25 +0300, Denys Fedoryshchenko wrote: > > */ > I will add also WARN_ON_ONCE(tcp_hdrlen >= 15 * 4) before, for > curiosity, if this condition are triggered. Is it fine like that? Sure.

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Eric Dumazet
On Sun, 2017-04-02 at 04:54 -0700, Eric Dumazet wrote: > On Sun, 2017-04-02 at 13:45 +0200, Florian Westphal wrote: > > Eric Dumazet <eric.duma...@gmail.com> wrote: > > > - for (i = sizeof(struct tcphdr); i <= tcp_hdrlen - TCPOLEN_MSS; i += > > > optlen(opt,

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Eric Dumazet
On Sun, 2017-04-02 at 13:45 +0200, Florian Westphal wrote: > Eric Dumazet <eric.duma...@gmail.com> wrote: > > - for (i = sizeof(struct tcphdr); i <= tcp_hdrlen - TCPOLEN_MSS; i += > > optlen(opt, i)) { > > + for (i = sizeof(struct tcphdr); i < tcp_hdrlen - TCP

Re: KASAN, xt_TCPMSS finally found nasty use-after-free bug? 4.10.8

2017-04-02 Thread Eric Dumazet
On Sun, 2017-04-02 at 10:43 +0300, Denys Fedoryshchenko wrote: > Repost, due being sleepy missed few important points. > > I am searching reasons of crashes for multiple conntrack enabled > servers, usually they point to conntrack, but i suspect use after free > might be somewhere else, > so i

Re: [PATCH nf-next 0/2] netfilter: untracked object removal

2017-03-08 Thread Eric Dumazet
On Wed, 2017-03-08 at 13:49 +0100, Florian Westphal wrote: > These patches remove the percpu untracked objects, they get replaced > with a new (kernel internal) ctinfo state. > > This avoids reference counter operations for untracked packets and > removes the need to check a conntrack for the

Re: net: suspicious RCU usage in nf_hook

2017-02-01 Thread Eric Dumazet
On Wed, 2017-02-01 at 15:48 -0800, Eric Dumazet wrote: > On Wed, Feb 1, 2017 at 3:29 PM, Cong Wang <xiyou.wangc...@gmail.com> wrote: > > > Not sure if it is better. The difference is caught up in > > net_enable_timestamp(), > > which is called setsockopt() path an

Re: net: suspicious RCU usage in nf_hook

2017-02-01 Thread Eric Dumazet
On Wed, Feb 1, 2017 at 3:29 PM, Cong Wang wrote: > Not sure if it is better. The difference is caught up in > net_enable_timestamp(), > which is called setsockopt() path and sk_clone() path, so we could be > in netstamp_needed state for a long time too until user-space

Re: net: suspicious RCU usage in nf_hook

2017-02-01 Thread Eric Dumazet
On Wed, 2017-02-01 at 13:16 -0800, Eric Dumazet wrote: > This would permanently leave the kernel in the netstamp_needed state. > > I would prefer the patch using a process context to perform the > cleanup ? Note there is a race window, but probably not a big deal. > > net

Re: net: suspicious RCU usage in nf_hook

2017-02-01 Thread Eric Dumazet
On Wed, 2017-02-01 at 12:51 -0800, Cong Wang wrote: > On Tue, Jan 31, 2017 at 7:44 AM, Eric Dumazet <eric.duma...@gmail.com> wrote: > > On Mon, 2017-01-30 at 22:19 -0800, Cong Wang wrote: > > > >> > >> The context is process context (TX path before hitting

Re: net: suspicious RCU usage in nf_hook

2017-01-31 Thread Eric Dumazet
On Mon, 2017-01-30 at 22:19 -0800, Cong Wang wrote: > > The context is process context (TX path before hitting qdisc), and > BH is not disabled, so in_interrupt() doesn't catch it. Hmm, this > makes me thinking maybe we really need to disable BH in this > case for nf_hook()? But it is called in

Re: net: suspicious RCU usage in nf_hook

2017-01-27 Thread Eric Dumazet
On Fri, 2017-01-27 at 17:00 -0800, Cong Wang wrote: > On Fri, Jan 27, 2017 at 3:35 PM, Eric Dumazet <eric.duma...@gmail.com> wrote: > > Oh well, I forgot to submit the official patch I think, Jan 9th. > > > > https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ

Re: [PATCH 02/14] tcp: fix mark propagation with fwmark_reflect enabled

2017-01-26 Thread Eric Dumazet
On Thu, 2017-01-26 at 20:19 +0100, Pablo Neira Ayuso wrote: > Right. This is not percpu as in IPv4. > > I can send a follow up patch to get this in sync with the way we do it > in IPv4, ie. add percpu socket. > > Fine with this approach? Thanks! Not really. percpu sockets are going to slow

Re: [PATCH 02/14] tcp: fix mark propagation with fwmark_reflect enabled

2017-01-26 Thread Eric Dumazet
On Thu, 2017-01-26 at 17:37 +0100, Pablo Neira Ayuso wrote: > From: Pau Espin Pedrol > > Otherwise, RST packets generated by the TCP stack for non-existing > sockets always have mark 0. > The mark from the original packet is assigned to the netns_ipv4/6 > socket used to

Re: ip_rcv_finish() NULL pointer kernel panic

2017-01-26 Thread Eric Dumazet
On Thu, 2017-01-26 at 10:00 -0800, Eric Dumazet wrote: > On Thu, 2017-01-26 at 17:24 +0100, Florian Westphal wrote: > > > I think it makes sense to set dst->incoming > > to a stub in br_netfilter_rtable_init() to just kfree_skb()+ > > WARN_ON_ONCE(), no need to ad

Re: ip_rcv_finish() NULL pointer kernel panic

2017-01-26 Thread Eric Dumazet
On Thu, 2017-01-26 at 17:24 +0100, Florian Westphal wrote: > I think it makes sense to set dst->incoming > to a stub in br_netfilter_rtable_init() to just kfree_skb()+ > WARN_ON_ONCE(), no need to add code to ip stack or crash kernel > due to brnf bug. Just kfree_skb() would hide bugs. Dropping

Re: ip_rcv_finish() NULL pointer kernel panic

2017-01-26 Thread Eric Dumazet
On Thu, 2017-01-26 at 09:32 -0600, Roy Keene wrote: > This bug appears to have existed for a long time: > > https://www.spinics.net/lists/netdev/msg222459.html > > http://www.kernelhub.org/?p=2=823752 > > Though possibly with different things not setting the "input" function >

Re: [PATCH net-next] netfilter: nft_counter: rework atomic dump and reset

2016-12-10 Thread Eric Dumazet
On Sat, 2016-12-10 at 15:25 +0100, Pablo Neira Ayuso wrote: > On Sat, Dec 10, 2016 at 03:16:55PM +0100, Pablo Neira Ayuso wrote: = > > - nft_counter_fetch(priv, , reset); > + nft_counter_fetch(priv, ); > + if (reset) > + nft_counter_reset(priv, ); > > if

Re: [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects

2016-12-09 Thread Eric Dumazet
On Fri, 2016-12-09 at 06:24 -0800, Eric Dumazet wrote: > It looks that you want a seqcount, even on 64bit arches, > so that CPU 2 can restart its loop, and more importantly you need > to not accumulate the values you read, because they might be old/invalid. Untested patch to give genera

Re: [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects

2016-12-09 Thread Eric Dumazet
On Fri, 2016-12-09 at 11:24 +0100, Pablo Neira Ayuso wrote: > Hi Paul, Hi Pablo Given that bytes/packets counters are modified without cmpxchg64() : static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv, struct nft_regs *regs,

Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf

2016-12-05 Thread Eric Dumazet
o pass the mode and path to the kernel to be > able to return it later for iptables dump and save. > > Signed-off-by: Willem de Bruijn <will...@google.com> > --- Assuming there is no simple way to get variable matchsize in iptables, this looks good to me, thanks. Revi

Re: [PATCN net-next] net_sched: gen_estimator: complete rewrite of rate estimators

2016-12-03 Thread Eric Dumazet
On Sat, 2016-12-03 at 23:07 -0800, Eric Dumazet wrote: > From: Eric Dumazet <eduma...@google.com> > > 1) Old code was hard to maintain, due to complex lock chains. >(We probably will be able to remove some kfree_rcu() in callers) > > 2) Using a single timer to up

[PATCN net-next] net_sched: gen_estimator: complete rewrite of rate estimators

2016-12-03 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> 1) Old code was hard to maintain, due to complex lock chains. (We probably will be able to remove some kfree_rcu() in callers) 2) Using a single timer to update all estimators does not scale. 3) Code was buggy on 32bit kernel (WRITE_ONCE() on

[PATCN v2 net-next] net_sched: gen_estimator: complete rewrite of rate estimators

2016-12-03 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> 1) Old code was hard to maintain, due to complex lock chains. (We probably will be able to remove some kfree_rcu() in callers) 2) Using a single timer to update all estimators does not scale. 3) Code was buggy on 32bit kernel (WRITE_ONCE() on

Re: net/sctp: vmalloc allocation failure in sctp_setsockopt/xt_alloc_table_info

2016-11-28 Thread Eric Dumazet
On Mon, 2016-11-28 at 19:09 +0100, Florian Westphal wrote: > We should prevent OOM killer from running in first place (GFP_NORETRY should > work). Make sure that a vmalloc(8) will succeed, even if few pages need to be swapped out. Otherwise, some scripts using iptables will die while they

Re: [PATCH net-next 1/1] netfilter: xt_multiport: Fix wrong unmatch result with multiple ports

2016-11-24 Thread Eric Dumazet
On Fri, 2016-11-25 at 11:58 +0800, f...@ikuai8.com wrote: > From: Gao Feng > > I lost one test case in the commit for xt_multiport. > For example, the rule is "-m multiport --dports 22,80,443". > When first port is unmatched and the second is matched, the current codes > could

Re: [PATCH v2 nf-next 3/3] netfilter: x_tables: pack percpu counter allocations

2016-11-21 Thread Eric Dumazet
loading because we reduce calls to the percpu > allocator. > > As Eric points out we can't use PAGE_SIZE, page_allocator would fail on > arches with 64k page size. > > Suggested-by: Eric Dumazet <eduma...@google.com> > Signed-off-by: Florian Westphal <f...@st

Re: [PATCH nf-next 3/3] netfilter: x_tables: pack percpu counter allocations

2016-11-21 Thread Eric Dumazet
On Mon, 2016-11-21 at 14:57 +0100, Florian Westphal wrote: ... > #define SMP_ALIGN(x) (((x) + SMP_CACHE_BYTES-1) & ~(SMP_CACHE_BYTES-1)) > +#define XT_PCPU_BLOCK_SIZE 4096 > > struct compat_delta { > unsigned int offset; /* offset in kernel */ > @@ -1618,6 +1619,7 @@

Re: [PATCH nf-next 2/3] netfilter: x_tables: pass xt_counters struct to counter allocator

2016-11-21 Thread Eric Dumazet
On Mon, 2016-11-21 at 14:57 +0100, Florian Westphal wrote: > Keeps some noise away from a followup patch. > > Signed-off-by: Florian Westphal <f...@strlen.de> > --- Acked-by: Eric Dumazet <eduma...@google.com>

Re: [PATCH nf-next 1/3] netfilter: x_tables: pass xt_counters struct instead of packet counter

2016-11-21 Thread Eric Dumazet
batch > chunks. > > Signed-off-by: Florian Westphal <f...@strlen.de> > --- Acked-by: Eric Dumazet <eduma...@google.com>

Re: netfilter question

2016-11-20 Thread Eric Dumazet
On Sun, 2016-11-20 at 09:31 -0800, Eric Dumazet wrote: > Thanks Eric, I will test the patch myself, because I believe we need it > asap ;) Current net-next without Florian patch : lpaa24:~# time for f in `seq 1 2000` ; do iptables -A FORWARD ; done real 0m12.856s user 0m0.59

Re: netfilter question

2016-11-20 Thread Eric Dumazet
On Sun, 2016-11-20 at 12:22 -0500, Eric D wrote: > I'm currently abroad for work and will come back home soon. I will > test the solution and provide feedback to Florian by end of week. > > Thanks for jumping on this quickly. > > Eric > > > On Nov 20, 2016 7:33 AM

Re: netfilter question

2016-11-19 Thread Eric Dumazet
On Thu, 2016-11-17 at 01:07 +0100, Florian Westphal wrote: > + if (state->mem == NULL) { > + state->mem = __alloc_percpu(PAGE_SIZE, PAGE_SIZE); > + if (!state->mem) > + return false; > + } This will fail on arches where PAGE_SIZE=65536 percpu

Re: netfilter question

2016-11-16 Thread Eric Dumazet
On Thu, 2016-11-17 at 01:07 +0100, Florian Westphal wrote: Seems very nice ! > + > +void xt_percpu_counter_free(struct xt_counters *counters) > +{ > + unsigned long pcnt = counters->pcnt; > + > + if (nr_cpu_ids > 1 && (pcnt & (PAGE_SIZE - 1)) == 0) > + free_percpu((void

Re: netfilter question

2016-11-16 Thread Eric Dumazet
On Wed, 2016-11-16 at 16:02 +0100, Florian Westphal wrote: > Eric Dumazet <eduma...@google.com> wrote: > > On Wed, Nov 16, 2016 at 2:22 AM, Eric Desrochers <er...@miromedia.ca> wrote: > > > Hi Eric, > > > > > > My name is Eric and I'm reaching y

Re: [PATCH nf] netfilter: conntrack: refine gc worker heuristics

2016-11-01 Thread Eric Dumazet
On Tue, 2016-11-01 at 21:01 +0100, Florian Westphal wrote: > schedule_delayed_work(&gc_work->dwork, next_run); > @@ -993,6 +1029,7 @@ static void gc_worker(struct work_struct *work) > static void conntrack_gc_work_init(struct conntrack_gc_work *gc_work) > { >

Re: error: 'struct net_device' has no member named 'nf_hooks_ingress'

2016-10-05 Thread Eric Dumazet
On Wed, 2016-10-05 at 22:56 +0200, Michal Sojka wrote: > this commit is now in mainline as > e3b37f11e6e4e6b6f02cc762f182ce233d2c1c9d and it breaks my build: > > net/netfilter/core.c: In function 'nf_set_hooks_head': > net/netfilter/core.c:96:3: error: 'struct net_device' has no member

Re: [PATCH 3/3] netfilter: xt_hashlimit: uses div_u64 for division

2016-09-30 Thread Eric Dumazet
On Fri, 2016-09-30 at 18:05 +0200, Arnd Bergmann wrote: > The newly added support for high-resolution pps rates introduced multiple > 64-bit > division operations in one function, which fails on all 32-bit architectures: > > net/netfilter/xt_hashlimit.o: In function `user2credits': >

Re: [PATCH nf-next v3 1/2] netfilter: Fix potential null pointer dereference

2016-09-28 Thread Eric Dumazet
On Wed, 2016-09-28 at 10:56 -0400, Aaron Conole wrote: > Eric Dumazet <eric.duma...@gmail.com> writes: > > > On Wed, 2016-09-28 at 09:12 -0400, Aaron Conole wrote: > >> It's possible for nf_hook_entry_head to return NULL. If two > >> nf_unregister_

Re: [PATCH nf-next v3 1/2] netfilter: Fix potential null pointer dereference

2016-09-28 Thread Eric Dumazet
On Wed, 2016-09-28 at 09:12 -0400, Aaron Conole wrote: > It's possible for nf_hook_entry_head to return NULL. If two > nf_unregister_net_hook calls happen simultaneously with a single hook > entry in the list, both will enter the nf_hook_mutex critical section. > The first will successfully

Re: [PATCH] Fix link error in 32bit arch because of 64bit division

2016-09-27 Thread Eric Dumazet
On Tue, 2016-09-27 at 03:42 -0400, Vishwanath Pai wrote: > Fix link error in 32bit arch because of 64bit division > > Division of 64bit integers will cause linker error undefined reference > to `__udivdi3'. Fix this by replacing divisions with div64_64 > > Signed-off-by: Vishwanath Pai

Re: [PATCH nf] netfilter: nf_tables: Ensure u8 attributes are loaded from u32 within the bounds

2016-09-22 Thread Eric Dumazet
On Thu, 2016-09-22 at 16:58 +0200, Pablo Neira Ayuso wrote: > attributes") > > Always use 12 bytes commit-ids. 4da449a is too short, given the number > of changes we're getting in the kernel tree, this may become ambiguous > at some point so it won't be unique. > > You can achieve this via: git

Re: [PATCH nf v3] netfilter: seqadj: Fix the wrong ack adjust for the RST packet without ack

2016-09-21 Thread Eric Dumazet
On Thu, 2016-09-22 at 10:22 +0800, f...@ikuai8.com wrote: > From: Gao Feng > > It is valid that the TCP RST packet which does not set ack flag, and bytes > of ack number are zero. But current seqadj codes would adjust the "0" ack > to invalid ack number. Actually seqadj need to

Re: [PATCH] netfilter: xt_socket: fix transparent match for IPv6 request sockets

2016-09-20 Thread Eric Dumazet
On Tue, 2016-09-20 at 08:01 -0700, Eric Dumazet wrote: > > Something like : > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 3ebf45b38bc3..0fccfd6cc258 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -6264,6 +6264,7 @@ int

Re: [PATCH] netfilter: xt_socket: fix transparent match for IPv6 request sockets

2016-09-20 Thread Eric Dumazet
On Tue, 2016-09-20 at 15:26 +0200, KOVACS Krisztian wrote: > The introduction of TCP_NEW_SYN_RECV state, and the addition of request > sockets to the ehash table seems to have broken the --transparent option > of the socket match for IPv6 (around commit a9407000). > > Now that the socket lookup

Re: [PATCH 5/5] net/netfilter/nf_conntrack_core: update memory barriers.

2016-08-31 Thread Eric Dumazet
On Wed, 2016-08-31 at 15:42 +0200, Manfred Spraul wrote: > As explained in commit 51d7d5205d33 > ("powerpc: Add smp_mb() to arch_spin_is_locked()", for some architectures > the ACQUIRE during spin_lock only applies to loading the lock, not to > storing the lock state. > > nf_conntrack_lock() does

Re: [PATCH v3 nf-next 5/7] netfilter: conntrack: add gc worker to remove timed-out entries

2016-08-25 Thread Eric Dumazet
nge to speed up GC for the extreme > case where most entries are timed out on an otherwise idle system. > > v2: Use cond_resched_rcu_qs & add comment wrt. missing restart on > nulls value change in gc worker, suggested by Eric Dumazet. > > v3: don't call cancel_delayed_w

Re: [PATCH v2 nf-next 4/7] netfilter: conntrack: add gc worker to remove timed-out entries

2016-08-24 Thread Eric Dumazet
On Wed, 2016-08-24 at 22:11 +0200, Florian Westphal wrote: > Eric Dumazet <eric.duma...@gmail.com> wrote: > > On Wed, 2016-08-24 at 13:55 +0200, Florian Westphal wrote: > > > Conntrack gc worker to evict stale entries. > > > > > > > static struct nf_

Re: [PATCH v2 nf-next 4/7] netfilter: conntrack: add gc worker to remove timed-out entries

2016-08-24 Thread Eric Dumazet
On Wed, 2016-08-24 at 13:55 +0200, Florian Westphal wrote: > Conntrack gc worker to evict stale entries. > static struct nf_conn * > __nf_conntrack_alloc(struct net *net, >const struct nf_conntrack_zone *zone, > @@ -1527,6 +1597,7 @@ static int untrack_refs(void) > >

Re: [PATCH v2 nf-next 2/7] netfilter: conntrack: get rid of conntrack timer

2016-08-24 Thread Eric Dumazet
On Wed, 2016-08-24 at 13:55 +0200, Florian Westphal wrote: > With stats enabled this eats 80 bytes on x86_64 per nf_conn entry, as > Eric Dumazet pointed out during netfilter workshop 2016. Another reason was the fact that Thomas was about to change max timer range : https://git.kernel.or

Re: [PATCH v2 nf-next 1/7] netfilter: don't rely on DYING bit to detect when destroy event was sent

2016-08-24 Thread Eric Dumazet
do not have the DYING bit set. > > Once timer is gone, we can no longer use if (del_timer()) to detect > when we 'stole' the reference count owned by the timer/hash entry, so > we need some other way to avoid racing with other cpu. > > Pablo suggested to add a marker in the ecache ext

Re: [PATCH nf-next 7/7] netfilter: restart search if moved to other chain

2016-08-24 Thread Eric Dumazet
On Wed, 2016-08-24 at 13:55 +0200, Florian Westphal wrote: > In case nf_conntrack_tuple_taken did not find a conflicting entry > check that all entries in this hash slot were tested and restart > in case an entry was moved to another chain. > > Reported-by: Eric Dumazet <e

Re: [PATCH nf-next 2/6] netfilter: conntrack: get rid of conntrack timer

2016-08-21 Thread Eric Dumazet
On Fri, 2016-08-19 at 18:04 +0200, Florian Westphal wrote: > Eric Dumazet <eric.duma...@gmail.com> wrote: > > On Fri, 2016-08-19 at 17:16 +0200, Florian Westphal wrote: > > > > > Hmm, nf_conntrack_find caller needs to hold rcu_read_lock, > > > in ca

Re: [PATCH nf-next 2/6] netfilter: conntrack: get rid of conntrack timer

2016-08-19 Thread Eric Dumazet
On Fri, 2016-08-19 at 17:16 +0200, Florian Westphal wrote: > Hmm, nf_conntrack_find caller needs to hold rcu_read_lock, > in case object is free'd SLAB_DESTROY_BY_RCU should delay actual release > of the page. Well, point is that SLAB_DESTROY_BY_RCU means that we have no grace period, and

Re: [PATCH nf-next 4/6] netfilter: conntrack: add gc worker to remove timed-out entries

2016-08-19 Thread Eric Dumazet
On Fri, 2016-08-19 at 13:36 +0200, Florian Westphal wrote: > Conntrack gc worker to evict stale entries. ... > + > + hlist_nulls_for_each_entry_rcu(h, n, _hash[i], hnnode) { > + tmp = nf_ct_tuplehash_to_ctrack(h); > + > + if

Re: [PATCH nf-next 2/6] netfilter: conntrack: get rid of conntrack timer

2016-08-19 Thread Eric Dumazet
On Fri, 2016-08-19 at 13:36 +0200, Florian Westphal wrote: > With stats enabled this eats 80 bytes on x86_64 per nf_conn entry. > > Remove it and use a 32bit jiffies value containing timestamp until > entry is valid. Great work ! ... > +/* caller must hold rcu readlock and none of the
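
Replacing a per-entry timer with a 32-bit timestamp relies on wraparound-safe comparison. The following is a minimal userspace sketch of that technique, not the patch itself: `struct entry`, `entry_set_timeout()` and `entry_is_expired()` are hypothetical names, and `now` stands in for a free-running 32-bit tick counter such as the low bits of jiffies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a tracked entry: only the 32-bit deadline. */
struct entry {
    uint32_t timeout; /* absolute deadline, in ticks of a wrapping clock */
};

/* Arm the entry to expire 'delta' ticks after 'now'. */
static void entry_set_timeout(struct entry *e, uint32_t now, uint32_t delta)
{
    e->timeout = now + delta; /* unsigned wraparound is well defined */
}

/*
 * True once 'now' has reached or passed the deadline. The signed view of
 * the difference stays correct across counter wraparound as long as the
 * remaining time is below 2^31 ticks. The cast assumes the usual
 * two's-complement conversion, which mainstream compilers provide.
 */
static bool entry_is_expired(const struct entry *e, uint32_t now)
{
    return (int32_t)(e->timeout - now) <= 0;
}
```

An entry armed at tick 0xFFFFFFF0 with a delta of 0x20 gets the deadline 0x10; it still counts as live at tick 0x0F and as expired from tick 0x10 onward, even though the counter wrapped in between.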

[PATCH net] netfilter: tproxy: properly refcount tcp listeners

2016-08-17 Thread Eric Dumazet
From: Eric Dumazet <eduma...@google.com> inet_lookup_listener() and inet6_lookup_listener() no longer take a reference on the found listener. This minimal patch adds back the refcounting, but we might do this differently in net-next later. Fixes: 3b24d854cb35 ("tcp/dccp: do not tou

Re: kernel panic TPROXY , vanilla 4.7.1

2016-08-17 Thread Eric Dumazet
On Wed, 2016-08-17 at 19:44 +0300, Denys Fedoryshchenko wrote: > Yes, everything fine after patch! > Thanks a lot Perfect, thanks for testing, I am sending the official patch.

Re: kernel panic TPROXY , vanilla 4.7.1

2016-08-17 Thread Eric Dumazet
On Wed, 2016-08-17 at 08:42 -0700, Eric Dumazet wrote: > On Wed, 2016-08-17 at 17:31 +0300, Denys Fedoryshchenko wrote: > > Hi! > > > > Tried to run squid on latest kernel, and hit a panic > > Sometimes it just shows warning in dmesg (but doesn't work properly) >

Re: PROBLEM: TPROXY and DNAT broken (bisected to 079096f103fa)

2016-07-27 Thread Eric Dumazet
I performed a git bisect using a qemu image to test my example below, and the > bisect ended at this commit: > > > commit 079096f103faca2dd87342cca6f23d4b34da8871 > > Author: Eric Dumazet <eduma...@google.com> > > Date: Fri Oct 2 11:43:32 2015 -0700 > > > &

Re: [RFC 5/7] net: Add allocation flag to rtnl_unicast()

2016-07-07 Thread Eric Dumazet
On Fri, 2016-07-08 at 12:15 +0900, Masashi Honma wrote: > Thanks for comment. > > I have selected GFP flags based on existing code. > > I have selected GFP_ATOMIC in inet6_netconf_get_devconf() because > skb was allocated with GFP_ATOMIC. Point is : we should remove GFP_ATOMIC uses as much as
