Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
Le 31/01/2017 à 07:47, Elluru, Krishna Mohan a écrit : > HI Cong, > Have you posted any patch for the same? I am looking for single netlink > socket to handle multiple network namespace events using > NETLINK_LISTEN_ALL_NSID. I don't see detection of new namespaces by this > socket yet and causing updates to be missed. New namespaces are not automatically detected. You have to assign them a nsid (e.g. ip netns set). NETLINK_LISTEN_ALL_NSID enables notifications only for peer netns that have a nsid assign in the netns where the socket has been opened. Regards, Nicolas
RE: [Patch net-next 2/2] netns: avoid disabling irq for netns id
I am of 4.4.18. I see the commit is not present. I will recompile my kernel with below patch. Thanks for the quick sha reference. Thanks Krishna Mohan. -Original Message- From: Cong Wang [mailto:xiyou.wangc...@gmail.com] Sent: Tuesday, January 31, 2017 12:23 PM To: Elluru, Krishna Mohan <elluru.kri.mo...@hpe.com> Cc: Elad Raz <e...@eladraz.com>; Nicolas Dichtel <nicolas.dich...@6wind.com>; David Miller <da...@davemloft.net>; Linux Netdev List <netdev@vger.kernel.org>; Jiri Pirko <j...@resnulli.us>; Ido Schimmel <ido...@mellanox.com>; Yotam Gigi <yot...@mellanox.com> Subject: Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id On Mon, Jan 30, 2017 at 10:47 PM, Elluru, Krishna Mohan <elluru.kri.mo...@hpe.com> wrote: > HI Cong, > Have you posted any patch for the same? I am looking for single > netlink socket to handle multiple network namespace events using > NETLINK_LISTEN_ALL_NSID. I don't see detection of new namespaces by this > socket yet and causing updates to be missed. The patch in $subject was reposted by Paul after fixing the audit part: commit fba143c66abb81307a450679f38ab953fe96a413 Author: Paul Moore <p...@paul-moore.com> Date: Tue Nov 29 16:57:48 2016 -0500 netns: avoid disabling irq for netns id Which version of kernel are you testing? Does it have this commit? Thanks.
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
On Mon, Jan 30, 2017 at 10:47 PM, Elluru, Krishna Mohanwrote: > HI Cong, > Have you posted any patch for the same? I am looking for single > netlink socket to handle multiple network namespace events using > NETLINK_LISTEN_ALL_NSID. I don't see detection of new namespaces by this > socket yet and causing updates to be missed. The patch in $subject was reposted by Paul after fixing the audit part: commit fba143c66abb81307a450679f38ab953fe96a413 Author: Paul Moore Date: Tue Nov 29 16:57:48 2016 -0500 netns: avoid disabling irq for netns id Which version of kernel are you testing? Does it have this commit? Thanks.
RE: [Patch net-next 2/2] netns: avoid disabling irq for netns id
HI Cong, Have you posted any patch for the same? I am looking for single netlink socket to handle multiple network namespace events using NETLINK_LISTEN_ALL_NSID. I don't see detection of new namespaces by this socket yet and causing updates to be missed. Thanks Krishna Mohan. -Original Message- From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On Behalf Of Cong Wang Sent: Thursday, October 20, 2016 1:37 AM To: Elad Raz <e...@eladraz.com> Cc: Nicolas Dichtel <nicolas.dich...@6wind.com>; David Miller <da...@davemloft.net>; Linux Netdev List <netdev@vger.kernel.org>; Jiri Pirko <j...@resnulli.us>; Ido Schimmel <ido...@mellanox.com>; Yotam Gigi <yot...@mellanox.com> Subject: Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id On Wed, Oct 19, 2016 at 8:21 AM, Elad Raz <e...@eladraz.com> wrote: > On Fri, Sep 2, 2016 at 11:12 AM, Nicolas Dichtel > <nicolas.dich...@6wind.com> wrote: >> Le 02/09/2016 à 06:53, Cong Wang a écrit : >>> We never read or change netns id in hardirq context, the only place >>> we read netns id in softirq context is in vxlan_xmit(). So, it >>> should be enough to just disable BH. >> >> Are you sure? Did you audit all part of the code? >> peernet2id() is called from netlink core system (do_one_broadcast()). >> Are you sure that no driver call this function from an hard irq context? >> >> I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will >> be hard to detect a bug introduced in this feature. > > I'm seeing strange things on our systems on boot time when trying to > mount autofs. > I bisected and got this patch as the bad one. > I can see that only when I'm using "debug" config file. Yeah, I saw the same report from SELinux developers, I am working on a fix. Thanks.
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
On Wed, Oct 19, 2016 at 8:21 AM, Elad Razwrote: > On Fri, Sep 2, 2016 at 11:12 AM, Nicolas Dichtel > wrote: >> Le 02/09/2016 à 06:53, Cong Wang a écrit : >>> We never read or change netns id in hardirq context, >>> the only place we read netns id in softirq context >>> is in vxlan_xmit(). So, it should be enough to just >>> disable BH. >> >> Are you sure? Did you audit all part of the code? >> peernet2id() is called from netlink core system (do_one_broadcast()). Are you >> sure that no driver call this function from an hard irq context? >> >> I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard >> to >> detect a bug introduced in this feature. > > I'm seeing strange things on our systems on boot time when trying to > mount autofs. > I bisected and got this patch as the bad one. > I can see that only when I'm using "debug" config file. Yeah, I saw the same report from SELinux developers, I am working on a fix. Thanks.
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
On Fri, Sep 2, 2016 at 11:12 AM, Nicolas Dichtelwrote: > Le 02/09/2016 à 06:53, Cong Wang a écrit : >> We never read or change netns id in hardirq context, >> the only place we read netns id in softirq context >> is in vxlan_xmit(). So, it should be enough to just >> disable BH. > > Are you sure? Did you audit all part of the code? > peernet2id() is called from netlink core system (do_one_broadcast()). Are you > sure that no driver call this function from an hard irq context? > > I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard > to > detect a bug introduced in this feature. I'm seeing strange things on our systems on boot time when trying to mount autofs. I bisected and got this patch as the bad one. I can see that only when I'm using "debug" config file. CONFIG_LOCKDEP =y I can see on boot: [ OK ] Started OpenSSH server daemon. Starting OpenSSH server daemon... [ 23.498577] [ cut here ] [ 23.503247] WARNING: CPU: 0 PID: 994 at kernel/softirq.c:150 __local_bh_enable_ip+0x9d/0xc0 [ 23.511665] Modules linked in: xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw intel_rapl x86_pkg_temp_thermal iTCO _wdt coretemp iTCO_vendor_support kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr i2c_i801 i2c_smbus lpc_ich mfd_core mei_me mei video tpm_tis tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c crc32c_intel e1000e ptp pps_core [ 23.561125] CPU: 0 PID: 994 Comm: setsebool Not tainted 4.8.0-rc4eladr+ #51 [ 23.568152] Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015 [ 23.578066] 8801161d3bb0 81411b73 [ 23.585594] 8801161d3bf0 810aa69b 0096816dcda7 [ 23.593118] 0200 816dcdc5 81f0fb00 8801147a [ 23.600639] Call Trace: [ 23.603114] [] dump_stack+0x85/0xc2 [ 23.608302] [] __warn+0xcb/0xf0 [ 23.613138] [] ? peernet2id+0x45/0x60 [ 23.618502] [] warn_slowpath_null+0x1d/0x20 [ 23.624394] [] __local_bh_enable_ip+0x9d/0xc0 [ 23.630453] [] _raw_spin_unlock_bh+0x35/0x40 [ 23.636432] [] peernet2id+0x45/0x60 [ 23.641626] [] netlink_broadcast_filtered+0x265/0x3d0 [ 23.648385] [] netlink_broadcast+0x1d/0x20 [ 23.654188] [] audit_log_end+0x2b1/0x2c0 [ 23.659820] [] ? audit_log_end+0x30/0x2c0 [ 23.665532] [] audit_log+0x5a/0x70 [ 23.670631] [] security_set_bools+0xee/0x200 [ 23.676602] [] sel_commit_bools_write+0xb8/0x100 [ 23.682928] [] __vfs_write+0x28/0x120 [ 23.688291] [] ? update_fast_ctr+0x17/0x30 [ 23.694087] [] ? percpu_down_read+0x49/0x90 [ 23.699980] [] ? __sb_start_write+0xb7/0xf0 [ 23.705872] [] ? __sb_start_write+0xb7/0xf0 [ 23.711758] [] vfs_write+0xb8/0x1b0 [ 23.716949] [] ? trace_hardirqs_on_caller+0x129/0x1b0 [ 23.723717] [] SyS_write+0x49/0xa0 [ 23.728828] [] entry_SYSCALL_64_fastpath+0x1f/0xbd [ 23.735337] ---[ end trace 576457efb89c3ea5 ]--- and Starting NIS/YP (Network Informatio... Clients to NIS Domain Binder... [ 24.182895] FS-Cache: Netfs 'nfs' registered for caching [ OK ] Started rolekit - role server. [ 24.210512] Key type dns_resolver registered [ 24.243061] NFS: Registering the id_resolver key type [ 24.253951] Key type id_resolver registered [ 24.258215] Key type id_legacy registered [ 24.272025] [ 24.273545] = [ 24.277919] [ INFO: inconsistent lock state ] [ 24.282348] 4.8.0-rc4eladr+ #51 Tainted: GW [ 24.287806] - [ 24.292234] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-R} usage. [ 24.298301] dbus-daemon/673 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 24.303898] (policy_rwlock){+++?..}, at: [] security_netlbl_sid_to_secattr+0x32/0xc0 [ 24.313456] {SOFTIRQ-ON-W} state was registered at: [ 24.318388] [] mark_held_locks+0x6f/0xa0 [ 24.324134] [] trace_hardirqs_on_caller+0x129/0x1b0 [ 24.330900] [] trace_hardirqs_on+0xd/0x10 [ 24.336708] [] __local_bh_enable_ip+0x70/0xc0 [ 24.342912] [] _raw_spin_unlock_bh+0x35/0x40 [ 24.349005] [] peernet2id+0x45/0x60 [ 24.354361] [] netlink_broadcast_filtered+0x265/0x3d0 [ 24.361285] [] netlink_broadcast+0x1d/0x20 [ 24.367222] [] audit_log_end+0x2b1/0x2c0 [ 24.372957] [] audit_log+0x5a/0x70 [ 24.378174] [] security_set_bools+0xee/0x200 [ 24.384283] [] sel_commit_bools_write+0xb8/0x100 [ 24.390733] [] __vfs_write+0x28/0x120 [ 24.396208] [] vfs_write+0xb8/0x1b0 [ 24.401537] [] SyS_write+0x49/0xa0 [ 24.406746] [] entry_SYSCALL_64_fastpath+0x1f/0xbd [ 24.413375] irq event stamp: 73908 [ 24.416878] hardirqs last enabled at (73908): [] __slab_alloc+0x5e/0x90 [
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
Le 02/09/2016 à 19:24, Cong Wang a écrit : > On Fri, Sep 2, 2016 at 9:39 AM, Cong Wangwrote: >> On Fri, Sep 2, 2016 at 1:12 AM, Nicolas Dichtel >> wrote: >>> Le 02/09/2016 à 06:53, Cong Wang a écrit : We never read or change netns id in hardirq context, the only place we read netns id in softirq context is in vxlan_xmit(). So, it should be enough to just disable BH. >>> >>> Are you sure? Did you audit all part of the code? >> >> I did audit all the callers, and I didn't find any of them in IRQ context. >> >>> peernet2id() is called from netlink core system (do_one_broadcast()). Are >>> you >>> sure that no driver call this function from an hard irq context? >> >> I audit all callers of netlink_broadcast(), and I don't see any of >> them in hardirq context. > > Note, you can rule out most of them by checking GFP_KERNEL, > which indicates a process context. ;) For GFP_ATOMIC cases, > I don't see any of them in hardirq context either, but I am definitely > not familiar with drivers like infiniband. > Yes, I was thinking to that one. But I'm also not familiar with it ;-)
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
On Fri, Sep 2, 2016 at 9:39 AM, Cong Wangwrote: > On Fri, Sep 2, 2016 at 1:12 AM, Nicolas Dichtel > wrote: >> Le 02/09/2016 à 06:53, Cong Wang a écrit : >>> We never read or change netns id in hardirq context, >>> the only place we read netns id in softirq context >>> is in vxlan_xmit(). So, it should be enough to just >>> disable BH. >> >> Are you sure? Did you audit all part of the code? > > I did audit all the callers, and I didn't find any of them in IRQ context. > >> peernet2id() is called from netlink core system (do_one_broadcast()). Are you >> sure that no driver call this function from an hard irq context? > > I audit all callers of netlink_broadcast(), and I don't see any of > them in hardirq context. Note, you can rule out most of them by checking GFP_KERNEL, which indicates a process context. ;) For GFP_ATOMIC cases, I don't see any of them in hardirq context either, but I am definitely not familiar with drivers like infiniband.
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
On Fri, Sep 2, 2016 at 1:12 AM, Nicolas Dichtelwrote: > Le 02/09/2016 à 06:53, Cong Wang a écrit : >> We never read or change netns id in hardirq context, >> the only place we read netns id in softirq context >> is in vxlan_xmit(). So, it should be enough to just >> disable BH. > > Are you sure? Did you audit all part of the code? I did audit all the callers, and I didn't find any of them in IRQ context. > peernet2id() is called from netlink core system (do_one_broadcast()). Are you > sure that no driver call this function from an hard irq context? I audit all callers of netlink_broadcast(), and I don't see any of them in hardirq context. > > I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard > to > detect a bug introduced in this feature. This patch passed my netns stress tests, I have LOCKDEP enabled, and I don't get any warning or crash etc.
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
Le 02/09/2016 à 06:53, Cong Wang a écrit : > We never read or change netns id in hardirq context, > the only place we read netns id in softirq context > is in vxlan_xmit(). So, it should be enough to just > disable BH. Are you sure? Did you audit all part of the code? peernet2id() is called from netlink core system (do_one_broadcast()). Are you sure that no driver call this function from an hard irq context? I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard to detect a bug introduced in this feature.
[Patch net-next 2/2] netns: avoid disabling irq for netns id
We never read or change netns id in hardirq context, the only place we read netns id in softirq context is in vxlan_xmit(). So, it should be enough to just disable BH. Cc: Nicolas DichtelSigned-off-by: Cong Wang --- net/core/net_namespace.c | 35 +++ 1 file changed, 15 insertions(+), 20 deletions(-) diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c index f3fa435..42bdda0 100644 --- a/net/core/net_namespace.c +++ b/net/core/net_namespace.c @@ -215,14 +215,13 @@ static void rtnl_net_notifyid(struct net *net, int cmd, int id); */ int peernet2id_alloc(struct net *net, struct net *peer) { - unsigned long flags; bool alloc; int id; - spin_lock_irqsave(>nsid_lock, flags); + spin_lock_bh(>nsid_lock); alloc = atomic_read(>count) == 0 ? false : true; id = __peernet2id_alloc(net, peer, ); - spin_unlock_irqrestore(>nsid_lock, flags); + spin_unlock_bh(>nsid_lock); if (alloc && id >= 0) rtnl_net_notifyid(net, RTM_NEWNSID, id); return id; @@ -231,12 +230,11 @@ int peernet2id_alloc(struct net *net, struct net *peer) /* This function returns, if assigned, the id of a peer netns. */ int peernet2id(struct net *net, struct net *peer) { - unsigned long flags; int id; - spin_lock_irqsave(>nsid_lock, flags); + spin_lock_bh(>nsid_lock); id = __peernet2id(net, peer); - spin_unlock_irqrestore(>nsid_lock, flags); + spin_unlock_bh(>nsid_lock); return id; } EXPORT_SYMBOL(peernet2id); @@ -251,18 +249,17 @@ bool peernet_has_id(struct net *net, struct net *peer) struct net *get_net_ns_by_id(struct net *net, int id) { - unsigned long flags; struct net *peer; if (id < 0) return NULL; rcu_read_lock(); - spin_lock_irqsave(>nsid_lock, flags); + spin_lock_bh(>nsid_lock); peer = idr_find(>netns_ids, id); if (peer) get_net(peer); - spin_unlock_irqrestore(>nsid_lock, flags); + spin_unlock_bh(>nsid_lock); rcu_read_unlock(); return peer; @@ -406,17 +403,17 @@ static void cleanup_net(struct work_struct *work) for_each_net(tmp) { int id; - spin_lock_irq(>nsid_lock); + spin_lock_bh(>nsid_lock); id = __peernet2id(tmp, net); if (id >= 0) idr_remove(>netns_ids, id); - spin_unlock_irq(>nsid_lock); + spin_unlock_bh(>nsid_lock); if (id >= 0) rtnl_net_notifyid(tmp, RTM_DELNSID, id); } - spin_lock_irq(>nsid_lock); + spin_lock_bh(>nsid_lock); idr_destroy(>netns_ids); - spin_unlock_irq(>nsid_lock); + spin_unlock_bh(>nsid_lock); } rtnl_unlock(); @@ -544,7 +541,6 @@ static int rtnl_net_newid(struct sk_buff *skb, struct nlmsghdr *nlh) { struct net *net = sock_net(skb->sk); struct nlattr *tb[NETNSA_MAX + 1]; - unsigned long flags; struct net *peer; int nsid, err; @@ -565,15 +561,15 @@ static int rtnl_net_newid(struct sk_buff *skb, struct nlmsghdr *nlh) if (IS_ERR(peer)) return PTR_ERR(peer); - spin_lock_irqsave(>nsid_lock, flags); + spin_lock_bh(>nsid_lock); if (__peernet2id(net, peer) >= 0) { - spin_unlock_irqrestore(>nsid_lock, flags); + spin_unlock_bh(>nsid_lock); err = -EEXIST; goto out; } err = alloc_netid(net, peer, nsid); - spin_unlock_irqrestore(>nsid_lock, flags); + spin_unlock_bh(>nsid_lock); if (err >= 0) { rtnl_net_notifyid(net, RTM_NEWNSID, err); err = 0; @@ -695,11 +691,10 @@ static int rtnl_net_dumpid(struct sk_buff *skb, struct netlink_callback *cb) .idx = 0, .s_idx = cb->args[0], }; - unsigned long flags; - spin_lock_irqsave(>nsid_lock, flags); + spin_lock_bh(>nsid_lock); idr_for_each(>netns_ids, rtnl_net_dumpid_one, _cb); - spin_unlock_irqrestore(>nsid_lock, flags); + spin_unlock_bh(>nsid_lock); cb->args[0] = net_cb.idx; return skb->len; -- 2.1.0