On Mon, Jan 28, 2019 at 4:00 PM Pablo Neira Ayuso wrote:
>
> From: Phil Sutter
>
> To allow for a batch to contain rules in arbitrary ordering, introduce
> NFTA_RULE_POSITION_ID attribute which works just like NFTA_RULE_POSITION
> but contains the ID of another rule within the same batch. This he
("netfilter: nf_conntrack: provide modparam to always register conntrack hooks")
Fixes: b884fa461776 ("netfilter: conntrack: unify sysctl handling")
Reported-and-tested-by: syzbot+fcee88b2d87f0539d...@syzkaller.appspotmail.com
Cc: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik
Cc: Florian Westphal
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
net/netfilter/xt_hashlimit.c | 18 +-
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 9b16402f29af..3e7d259e5d8d 100644
--- a/net/netfilter/xt_hashl
On Tue, Jul 24, 2018 at 8:14 AM David Ahern wrote:
>
> On 7/19/18 11:12 AM, Cong Wang wrote:
> > On Thu, Jul 19, 2018 at 9:16 AM David Ahern wrote:
> >>
> >> Chatting with Nikolay about this and he brought up a good corollary - ip
> >> fragmentation. It rea
On Thu, Jul 19, 2018 at 9:16 AM David Ahern wrote:
>
> Chatting with Nikolay about this and he brought up a good corollary - ip
> fragmentation. It really is a similar problem in that memory is consumed
> as a result of packets received from an external entity. The ipfrag
> sysctls are per namespa
On Tue, Jul 17, 2018 at 12:02 PM David Ahern wrote:
> As for the per-namespace tables, it is 4 years later and over that time
> Linux supports a number of features: EVPN which is very mac heavy, VRR
> which doubles mac entries (one against the VRR device and one against
> the lower device) and NOS
On Tue, Jul 17, 2018 at 10:43 AM David Ahern wrote:
>
> On 7/17/18 11:40 AM, Cong Wang wrote:
> > On Tue, Jul 17, 2018 at 5:11 AM wrote:
> >>
> >> From: David Ahern
> >>
> >> Nikita Leshenko reported that neighbor entries in one namespace can
>
On Tue, Jul 17, 2018 at 5:11 AM wrote:
>
> From: David Ahern
>
> Nikita Leshenko reported that neighbor entries in one namespace can
> evict neighbor entries in another. The problem is that the neighbor
> tables have entries across all namespaces without separate accounting
> and with global limi
On Thu, Jun 28, 2018 at 7:22 PM David Miller wrote:
>
> From: Cong Wang
> Date: Thu, 28 Jun 2018 14:53:09 -0700
>
> > I will send a revert with quote of the above.
>
> And it will go to /dev/null as far as I am concerned. I read it the
> first time, so posting
On Mon, Jun 25, 2018 at 8:59 AM Flavio Leitner wrote:
> XPS breaks because the queue mapping stored in the socket is not
> available, so another random queue might be selected when the stack
> needs to transmit something like a TCP ACK, or TCP Retransmissions.
> That causes packet re-ordering and/
On Wed, Jun 27, 2018 at 12:39 PM Cong Wang wrote:
>
> Let me rephrase why I don't like this patchset:
>
> 1. Let's forget about TSQ for a moment, skb_orphan() before leaving
> the stack is not just reasonable but also aligning to network isolation
> design. You can
On Wed, Jun 27, 2018 at 1:19 PM Flavio Leitner wrote:
>
> On Wed, Jun 27, 2018 at 12:06:16PM -0700, Cong Wang wrote:
> > On Wed, Jun 27, 2018 at 5:32 AM Flavio Leitner wrote:
> > >
> > > On Tue, Jun 26, 2018 at 06:28:27PM -0700, Cong Wang wrote:
> > > >
On Thu, Jun 28, 2018 at 6:20 AM David Miller wrote:
>
> From: Cong Wang
> Date: Wed, 27 Jun 2018 12:39:01 -0700
>
> > Let me rephrase why I don't like this patchset:
>
> Cong, I don't think you are seeing the situation clearly and
> I am certainly going t
On Wed, Jun 27, 2018 at 12:33 PM Eric Dumazet wrote:
>
>
>
> On 06/27/2018 11:59 AM, Cong Wang wrote:
>
> >
> > IIRC, this skb_orphan() was introduced much earlier than TSQ, probably
> > from the beginning of veth.
>
> Sigh
>
> SO_SNDBUF was invente
Let me rephrase why I don't like this patchset:
1. Let's forget about TSQ for a moment, skb_orphan() before leaving
the stack is not just reasonable but also aligning to network isolation
design. You can't claim skb_orphan() is broken from beginning, it is
designed in this way and it is intentiona
On Wed, Jun 27, 2018 at 5:32 AM Flavio Leitner wrote:
>
> On Tue, Jun 26, 2018 at 06:28:27PM -0700, Cong Wang wrote:
> > On Tue, Jun 26, 2018 at 5:39 PM Flavio Leitner wrote:
> > >
> > > On Tue, Jun 26, 2018 at 05:29:51PM -0700, Cong Wang wrote:
> > > >
On Tue, Jun 26, 2018 at 7:35 PM Eric Dumazet wrote:
>
>
>
> On 06/26/2018 05:44 PM, Cong Wang wrote:
>
> > With this, a netns could totally throttle a TCP socket in a different
> > netns by holding the packets infinitely (e.g. putting them in a loop).
> > T
On Tue, Jun 26, 2018 at 5:39 PM Flavio Leitner wrote:
>
> On Tue, Jun 26, 2018 at 05:29:51PM -0700, Cong Wang wrote:
> > On Tue, Jun 26, 2018 at 4:33 PM Flavio Leitner wrote:
> > >
> > > It is still isolated, the sk carries the netns info and it is
> > >
On Tue, Jun 26, 2018 at 4:53 PM Eric Dumazet wrote:
>
>
>
> On 06/26/2018 03:47 PM, Cong Wang wrote:
> >
> > You need to justify why you want to break the TSQ's scope here,
> > which is obviously not compatible with netns design.
>
> You have to explain
On Tue, Jun 26, 2018 at 4:33 PM Flavio Leitner wrote:
>
> It is still isolated, the sk carries the netns info and it is
> orphaned when it re-enters the stack.
Then what difference does your patch make?
Before your patch:
veth orphans skb in its xmit
After your patch:
RX orphans it when re-ente
On Tue, Jun 26, 2018 at 3:03 PM Flavio Leitner wrote:
>
> On Tue, Jun 26, 2018 at 02:48:47PM -0700, Cong Wang wrote:
> > On Mon, Jun 25, 2018 at 11:41 PM Eric Dumazet
> > wrote:
> > > When a packet is attached to a socket, we should keep the association as
> >
On Mon, Jun 25, 2018 at 11:41 PM Eric Dumazet wrote:
>
>
>
> On 06/25/2018 09:15 PM, Cong Wang wrote:
> > On Mon, Jun 25, 2018 at 8:59 AM Flavio Leitner wrote:
> >>
> >> The sock reference is lost when scrubbing the packet and that breaks
> >> TS
On Mon, Jun 25, 2018 at 8:59 AM Flavio Leitner wrote:
>
> The sock reference is lost when scrubbing the packet and that breaks
> TSQ (TCP Small Queues) and XPS (Transmit Packet Steering) causing
> performance impacts of about 50% in a single TCP stream when crossing
> network namespaces.
>
> XPS b
On Fri, May 25, 2018 at 1:39 PM, Vlad Buslov wrote:
>
> On Thu 24 May 2018 at 23:34, Cong Wang wrote:
>> On Mon, May 14, 2018 at 7:27 AM, Vlad Buslov wrote:
>>> Currently, all netlink protocol handlers for updating rules, actions and
>>> qdiscs are protected with
On Mon, May 14, 2018 at 7:27 AM, Vlad Buslov wrote:
> Currently, all netlink protocol handlers for updating rules, actions and
> qdiscs are protected with single global rtnl lock which removes any
> possibility for parallelism. This patch set is a first step to remove
> rtnl lock dependency from T
Similarly, tbl->entries is not initialized after kmalloc(),
therefore causes an uninit-value warning in ip_vs_lblc_check_expire(),
as reported by syzbot.
Reported-by:
Cc: Simon Horman
Cc: Julian Anastasov
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
net/netfilter/ipvs/ip_vs_lblc.c
tbl->entries is not initialized after kmalloc(), therefore
causes an uninit-value warning in ip_vs_lblc_check_expire()
as reported by syzbot.
Reported-by:
Cc: Simon Horman
Cc: Julian Anastasov
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
net/netfilter/ipvs/ip_vs_lblcr.c | 1 +
1 f
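The two ipvs changelogs above describe the same class of bug: kmalloc() returns memory with indeterminate contents, so any field read later (here by the expiry check) has to be set explicitly. A minimal user-space sketch of the pattern, assuming malloc() as a stand-in for kmalloc(); `demo_table`, `demo_table_alloc()` and `demo_table_needs_expire()` are hypothetical names and not the actual ip_vs_lblc code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the scheduler table; the real struct in
 * ip_vs_lblc.c / ip_vs_lblcr.c has more fields and different types. */
struct demo_table {
	int entries;	/* number of entries currently in the table */
	int max_size;	/* hypothetical bound used by the expiry check */
};

/* Allocate a table and set every field that is read later.
 * malloc(), like kmalloc(), does not zero the memory it returns. */
struct demo_table *demo_table_alloc(int max_size)
{
	struct demo_table *tbl = malloc(sizeof(*tbl));

	if (!tbl)
		return NULL;
	tbl->entries = 0;	/* explicit init; assumed shape of the fix */
	tbl->max_size = max_size;
	return tbl;
}

/* Sketch of a check that reads tbl->entries, as an expiry timer would. */
int demo_table_needs_expire(const struct demo_table *tbl)
{
	assert(tbl != NULL);
	return tbl->entries > tbl->max_size;
}
```

Using a zeroing allocator (calloc() here, kzalloc() in the kernel) would be an alternative; which one the actual patches chose is not visible in the truncated diff.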
On Mon, Apr 16, 2018 at 4:28 PM, Stephen Rothwell wrote:
> Hi all,
>
> After merging the netfilter tree, today's linux-next build (powerpc
> ppc64_defconfig) failed like this:
>
> net/netfilter/nf_conntrack_extend.c: In function 'nf_ct_ext_add':
> net/netfilter/nf_conntrack_extend.c:74:2: error
irq_exit+0x53/0xa2
[<29ddee8f>] smp_apic_timer_interrupt+0x22a/0x235
because __krealloc() is not supposed to release the old
memory and it is released later via kfree_rcu(). Since this is
the only external user of __krealloc(), just mark it as not leak
here.
Cc: P
On Fri, Mar 9, 2018 at 3:21 PM, Eric Dumazet wrote:
>
>
> On 03/09/2018 03:05 PM, Cong Wang wrote:
>>
>>
>> BTW, the warning itself is all about empty names, so perhaps
>> it's better to fix them separately.
>
>
> Huh ? You want more syzbot report
On Fri, Mar 9, 2018 at 2:58 PM, Eric Dumazet wrote:
>
>
> On 03/09/2018 02:56 PM, Eric Dumazet wrote:
>
>>
>> I sent a patch a while back, but Pablo/Florian wanted more than that
>> simple fix.
>>
>> We also need to filter special characters like '/'
proc_create_data() itself accepts '/', so it m
On Fri, Mar 9, 2018 at 1:59 PM, syzbot
wrote:
> Hello,
>
> syzbot hit the following crash on net-next commit
> 617aebe6a97efa539cc4b8a52adccd89596e6be0 (Sun Feb 4 00:25:42 2018 +)
> Merge tag 'usercopy-v4.16-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
>
> So far this cra
As suggested by Eric, we need to make the xt_rateest
hash table and its lock per netns to reduce lock
contentions.
Cc: Florian Westphal
Cc: Eric Dumazet
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
include/net/netfilter/xt_rateest.h | 4 +-
net/netfilter/xt_RATEEST.c | 91
Fixes: d73f33b16883 ("netfilter: CLUSTERIP: RCU conversion")
Cc: Eric Dumazet
Cc: Pablo Neira Ayuso
Cc: Florian Westphal
Signed-off-by: Cong Wang
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.
On Thu, Feb 8, 2018 at 12:01 AM, Florian Westphal wrote:
> Cong Wang wrote:
>> In clusterip_config_find_get() we hold RCU read lock so it could
>> run concurrently with clusterip_config_entry_put(), as a result,
>> the refcnt could go back to 1 from 0, which leads to
Fixes: d73f33b16883 ("netfilter: CLUSTERIP: RCU conversion")
Cc: Eric Dumazet
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c
b/net/i
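The race discussed in this thread (a lookup under the RCU read lock bumping a reference count that has already dropped to 0) is commonly closed by taking the reference only if the count is still non-zero. A minimal user-space sketch of that idea, assuming C11 atomics in place of the kernel's refcount helpers; `demo_config`, `demo_get_unless_zero()` and `demo_put()` are hypothetical names, and this is not the ipt_CLUSTERIP code itself:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with a reference count. */
struct demo_config {
	atomic_int refcount;
};

/* Take a reference only while the object is still live (count > 0).
 * Returns false if the count has already reached zero. */
bool demo_get_unless_zero(struct demo_config *c)
{
	int old = atomic_load(&c->refcount);

	assert(old >= 0);
	while (old != 0) {
		/* On failure the CAS reloads 'old' with the current value. */
		if (atomic_compare_exchange_weak(&c->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one
 * and is therefore responsible for freeing the object. */
bool demo_put(struct demo_config *c)
{
	return atomic_fetch_sub(&c->refcount, 1) == 1;
}
```

With this shape a concurrent reader can never resurrect an object whose last reference is already gone; it simply sees the lookup fail.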
proc_remove() inside the spinlock.
Reported-by:
Fixes: 6c5d5cfbe3c5 ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing")
Tested-by: Paolo Abeni
Cc: Xin Long
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 12 ++-
On Tue, Feb 6, 2018 at 6:27 AM, syzbot
wrote:
> Hello,
>
> syzbot hit the following crash on net-next commit
> 617aebe6a97efa539cc4b8a52adccd89596e6be0 (Sun Feb 4 00:25:42 2018 +)
> Merge tag 'usercopy-v4.16-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
>
> So far this cra
Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target")
Cc: Pablo Neira Ayuso
Cc: Eric Dumazet
Signed-off-by: Cong Wang
---
net/netfilter/xt_RATEEST.c | 22 +-
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/xt_RATEEST.c b/net
On Wed, Jan 31, 2018 at 5:44 PM, Eric Dumazet wrote:
> On Wed, 2018-01-31 at 16:26 -0800, Cong Wang wrote:
>> rateest_hash is supposed to be protected by xt_rateest_mutex.
>>
>> Reported-by:
>> Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target"
rateest_hash is supposed to be protected by xt_rateest_mutex.
Reported-by:
Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target")
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
net/netfilter/xt_RATEEST.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net
xt_cgroup_info_v1->priv is an internal pointer only used for kernel,
we should not trust what user-space provides.
Reported-by:
Fixes: c38c4597e4bf ("netfilter: implement xt_cgroup cgroup2 path match")
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
net/netfilter/xt_cgroup.c
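The point of the xt_cgroup changelog above is that a struct copied from user space can carry arbitrary bytes in a field that only the kernel is meant to fill in. A minimal user-space sketch of the defensive pattern, assuming a hypothetical `demo_info` layout and a hypothetical `demo_checkentry()`; the real xt_cgroup_info_v1 has different fields:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical match info as it would arrive from user space. */
struct demo_info {
	char path[32];	/* supplied by user space */
	void *priv;	/* kernel-internal; never trusted from user space */
};

/* Validate an entry. Returns 0 on success, -1 on a malformed path. */
int demo_checkentry(struct demo_info *info)
{
	assert(info != NULL);

	/* Discard whatever user space stored in the internal pointer. */
	info->priv = NULL;

	/* Reject a path that is not NUL-terminated within the buffer. */
	if (memchr(info->path, '\0', sizeof(info->path)) == NULL)
		return -1;

	return 0;
}
```

Resetting the pointer before any other code can look at it means later teardown paths only ever see NULL or a value the kernel itself stored.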
On Wed, Sep 13, 2017 at 9:45 AM, Cong Wang wrote:
> On Wed, Sep 13, 2017 at 1:05 AM, Florian Westphal wrote:
>> Cong Wang wrote:
>>> While testing my TC filter patches (so not related to conntrack), the
>>> following memory leaks are shown up:
>>>
>>
On Wed, Sep 13, 2017 at 1:05 AM, Florian Westphal wrote:
> Cong Wang wrote:
>> While testing my TC filter patches (so not related to conntrack), the
>> following memory leaks are shown up:
>>
>> unreferenced object 0x9b19ba551228 (size 128):
>> comm "
Hello,
While testing my TC filter patches (so not related to conntrack), the
following memory leaks are shown up:
unreferenced object 0x9b19ba551228 (size 128):
comm "chronyd", pid 338, jiffies 4294910829 (age 53.188s)
hex dump (first 32 bytes):
6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
On Wed, Aug 16, 2017 at 1:39 AM, Xin Long wrote:
> On Wed, Aug 9, 2017 at 7:33 AM, Cong Wang wrote:
>> On Mon, Aug 7, 2017 at 7:33 PM, Xin Long wrote:
>>> On Tue, Aug 8, 2017 at 9:15 AM, Cong Wang wrote:
>>>> This looks like a completely API burden?
>>>
On Mon, Aug 7, 2017 at 7:33 PM, Xin Long wrote:
> On Tue, Aug 8, 2017 at 9:15 AM, Cong Wang wrote:
>> This looks like a completely API burden?
> netfilter xt targets are not really compatible with netsched action.
> I've got to say, the patch is just a way to make checkentr
(Cc'ing netfilter and Jamal)
On Sat, Aug 5, 2017 at 4:35 AM, Xin Long wrote:
> As we know in some target's checkentry it may dereference par.entryinfo
> to check entry stuff inside. But when sched action calls xt_check_target,
> par.entryinfo is set with NULL. It would cause kernel panic when cal
On Tue, Jul 11, 2017 at 7:24 AM, David Miller wrote:
>
> It has gotten to the point that even casually walking around
> Faro, Portugal last week, random German tourists would stop
> me in the street and ask if net-next was open or not.
>
> Therefore, in order to avoid any and all confusion I have
On Tue, Jun 13, 2017 at 11:07 AM, Florian Westphal wrote:
> Historically it wasn't needed because we just clear out the helper area
> in the affected conntracks (i.e, future packets are not inspected by
> the helper anymore).
>
> When conntracks were made per-netns this problem was added as we're
On Mon, Jun 12, 2017 at 11:16 PM, Florian Westphal wrote:
> Cong Wang wrote:
>> On Thu, Jun 1, 2017 at 1:52 AM, Florian Westphal wrote:
>> > Joe described it nicely, problem is that after unload we may have
>> > conntracks that still have a nf_conn_help extensi
On Thu, Jun 1, 2017 at 1:52 AM, Florian Westphal wrote:
> Joe described it nicely, problem is that after unload we may have
> conntracks that still have a nf_conn_help extension attached that
> has a pointer to a structure that resided in the (unloaded) module.
Why not hold a refcnt for its modul
On Mon, Feb 20, 2017 at 5:29 AM, Andrey Konovalov wrote:
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&(&pcpu->lock)->rlock);
>                                lock(&(&list->
On Wed, Feb 1, 2017 at 3:59 PM, Eric Dumazet wrote:
> On Wed, 2017-02-01 at 15:48 -0800, Eric Dumazet wrote:
>> On Wed, Feb 1, 2017 at 3:29 PM, Cong Wang wrote:
>>
>> > Not sure if it is better. The difference is caught up in
>> > net_enable_timestamp(),
>>
On Wed, Feb 1, 2017 at 1:16 PM, Eric Dumazet wrote:
> On Wed, 2017-02-01 at 12:51 -0800, Cong Wang wrote:
>> On Tue, Jan 31, 2017 at 7:44 AM, Eric Dumazet wrote:
>> > On Mon, 2017-01-30 at 22:19 -0800, Cong Wang wrote:
>> >
>> >>
>> >> The conte
On Tue, Jan 31, 2017 at 7:44 AM, Eric Dumazet wrote:
> On Mon, 2017-01-30 at 22:19 -0800, Cong Wang wrote:
>
>>
>> The context is process context (TX path before hitting qdisc), and
>> BH is not disabled, so in_interrupt() doesn't catch it. Hmm, this
>> makes
On Fri, Jan 27, 2017 at 5:31 PM, Eric Dumazet wrote:
> On Fri, 2017-01-27 at 17:00 -0800, Cong Wang wrote:
>> On Fri, Jan 27, 2017 at 3:35 PM, Eric Dumazet wrote:
>> > Oh well, I forgot to submit the official patch I think, Jan 9th.
>> >
>> > https://group
On Fri, Jan 27, 2017 at 3:35 PM, Eric Dumazet wrote:
> Oh well, I forgot to submit the official patch I think, Jan 9th.
>
> https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ
>
Hmm, but why only fragments need skb_orphan()? It seems like
any kfree_skb() inside a nf hook needs to have a
On Fri, Jan 27, 2017 at 3:22 PM, Cong Wang wrote:
> On Fri, Jan 27, 2017 at 1:15 PM, Dmitry Vyukov wrote:
>> stack backtrace:
>> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/
On Fri, Jan 27, 2017 at 1:15 PM, Dmitry Vyukov wrote:
> stack backtrace:
> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:15 [inline]
> dump_stack+0x2
family.maxattr is the max index for policy[], the size of
ops[] is determined with ARRAY_SIZE().
Reported-by: Andrey Konovalov
Tested-by: Andrey Konovalov
Cc: Pablo Neira Ayuso
Signed-off-by: Cong Wang
---
net/netfilter/ipvs/ip_vs_ctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion
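The changelog above turns on two different quantities: maxattr is the highest valid index into policy[], while the number of ops is the element count of ops[]. A minimal sketch of that distinction, assuming hypothetical attribute IDs and array contents; `DEMO_ATTR_*`, `demo_policy` and `demo_ops` are illustrative names and not the ip_vs_ctl.c definitions:

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical attribute IDs; __DEMO_ATTR_MAX trails the last real one. */
enum { DEMO_ATTR_UNSPEC, DEMO_ATTR_A, DEMO_ATTR_B, __DEMO_ATTR_MAX };
#define DEMO_ATTR_MAX (__DEMO_ATTR_MAX - 1)

struct demo_policy { int type; };
struct demo_op { int cmd; };

/* policy[] is indexed by attribute ID, so it has DEMO_ATTR_MAX + 1 slots. */
static const struct demo_policy demo_policy[DEMO_ATTR_MAX + 1] = {
	[DEMO_ATTR_A] = { 1 },
	[DEMO_ATTR_B] = { 2 },
};

/* ops[] is an ordinary list; its length is unrelated to the attribute IDs. */
static const struct demo_op demo_ops[] = { {10}, {11}, {12}, {13}, {14} };

/* Highest valid index into demo_policy[]. */
int demo_maxattr(void)
{
	assert(ARRAY_SIZE(demo_policy) == DEMO_ATTR_MAX + 1);
	return DEMO_ATTR_MAX;
}

/* Number of operations, always derived from the array itself. */
size_t demo_n_ops(void)
{
	return ARRAY_SIZE(demo_ops);
}
```

Here `demo_maxattr()` is 2 and `demo_n_ops()` is 5; swapping one for the other would either truncate the ops list or read past the end of an array.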
On Thu, Mar 10, 2016 at 11:55 AM, David Miller wrote:
> Indeed, good catch. Therefore:
>
> 1) Keep the masq netdev notifier. That will flush the conntrack table
>    for the inetdev_destroy event.
>
> 2) Make the inetdev notifier only do something if inetdev->dead is
>    false. (ie. we are flu
On Thu, Mar 10, 2016 at 10:01 AM, David Miller wrote:
> I'm tempted to say that we should provide these notifier handlers with
> the information they need, explicitly, to handle this case.
>
> Most inetdev notifiers actually want to know the individual addresses
> that get removed, one by one. Tha