Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2017-01-31 Thread Nicolas Dichtel
Le 31/01/2017 à 07:47, Elluru, Krishna Mohan a écrit :
> HI Cong,
>   Have you posted any patch for the same? I am looking for single netlink 
> socket to handle multiple network namespace events using 
> NETLINK_LISTEN_ALL_NSID. I don't see detection of new namespaces by this 
> socket yet and causing updates to be missed.
New namespaces are not automatically detected. You have to assign them a nsid
(e.g. ip netns set).
NETLINK_LISTEN_ALL_NSID enables notifications only for peer netns that have a
nsid assign in the netns where the socket has been opened.


Regards,
Nicolas


RE: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2017-01-30 Thread Elluru, Krishna Mohan
I am of 4.4.18. I see the commit is not present. I will recompile my kernel 
with below patch.

Thanks for the quick sha reference.

Thanks
Krishna Mohan.

-Original Message-
From: Cong Wang [mailto:xiyou.wangc...@gmail.com] 
Sent: Tuesday, January 31, 2017 12:23 PM
To: Elluru, Krishna Mohan <elluru.kri.mo...@hpe.com>
Cc: Elad Raz <e...@eladraz.com>; Nicolas Dichtel <nicolas.dich...@6wind.com>; 
David Miller <da...@davemloft.net>; Linux Netdev List <netdev@vger.kernel.org>; 
Jiri Pirko <j...@resnulli.us>; Ido Schimmel <ido...@mellanox.com>; Yotam Gigi 
<yot...@mellanox.com>
Subject: Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

On Mon, Jan 30, 2017 at 10:47 PM, Elluru, Krishna Mohan 
<elluru.kri.mo...@hpe.com> wrote:
> HI Cong,
> Have you posted any patch for the same? I am looking for single 
> netlink socket to handle multiple network namespace events using 
> NETLINK_LISTEN_ALL_NSID. I don't see detection of new namespaces by this 
> socket yet and causing updates to be missed.

The patch in $subject was reposted by Paul after fixing the audit part:

commit fba143c66abb81307a450679f38ab953fe96a413
Author: Paul Moore <p...@paul-moore.com>
Date:   Tue Nov 29 16:57:48 2016 -0500

netns: avoid disabling irq for netns id

Which version of kernel are you testing? Does it have this commit?

Thanks.


Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2017-01-30 Thread Cong Wang
On Mon, Jan 30, 2017 at 10:47 PM, Elluru, Krishna Mohan
 wrote:
> HI Cong,
> Have you posted any patch for the same? I am looking for single 
> netlink socket to handle multiple network namespace events using 
> NETLINK_LISTEN_ALL_NSID. I don't see detection of new namespaces by this 
> socket yet and causing updates to be missed.

The patch in $subject was reposted by Paul after fixing the audit part:

commit fba143c66abb81307a450679f38ab953fe96a413
Author: Paul Moore 
Date:   Tue Nov 29 16:57:48 2016 -0500

netns: avoid disabling irq for netns id

Which version of kernel are you testing? Does it have this commit?

Thanks.


RE: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2017-01-30 Thread Elluru, Krishna Mohan
HI Cong,
Have you posted any patch for the same? I am looking for single netlink 
socket to handle multiple network namespace events using 
NETLINK_LISTEN_ALL_NSID. I don't see detection of new namespaces by this socket 
yet and causing updates to be missed.


Thanks
Krishna Mohan.

-Original Message-
From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On 
Behalf Of Cong Wang
Sent: Thursday, October 20, 2016 1:37 AM
To: Elad Raz <e...@eladraz.com>
Cc: Nicolas Dichtel <nicolas.dich...@6wind.com>; David Miller 
<da...@davemloft.net>; Linux Netdev List <netdev@vger.kernel.org>; Jiri Pirko 
<j...@resnulli.us>; Ido Schimmel <ido...@mellanox.com>; Yotam Gigi 
<yot...@mellanox.com>
Subject: Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

On Wed, Oct 19, 2016 at 8:21 AM, Elad Raz <e...@eladraz.com> wrote:
> On Fri, Sep 2, 2016 at 11:12 AM, Nicolas Dichtel 
> <nicolas.dich...@6wind.com> wrote:
>> Le 02/09/2016 à 06:53, Cong Wang a écrit :
>>> We never read or change netns id in hardirq context, the only place 
>>> we read netns id in softirq context is in vxlan_xmit(). So, it 
>>> should be enough to just disable BH.
>>
>> Are you sure? Did you audit all part of the code?
>> peernet2id() is called from netlink core system (do_one_broadcast()). 
>> Are you sure that no driver call this function from an hard irq context?
>>
>> I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will 
>> be hard to detect a bug introduced in this feature.
>
> I'm seeing strange things on our systems on boot time when trying to 
> mount autofs.
> I bisected and got this patch as the bad one.
> I can see that only when I'm using "debug" config file.

Yeah, I saw the same report from SELinux developers, I am working on a fix.

Thanks.


Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2016-10-19 Thread Cong Wang
On Wed, Oct 19, 2016 at 8:21 AM, Elad Raz  wrote:
> On Fri, Sep 2, 2016 at 11:12 AM, Nicolas Dichtel
>  wrote:
>> Le 02/09/2016 à 06:53, Cong Wang a écrit :
>>> We never read or change netns id in hardirq context,
>>> the only place we read netns id in softirq context
>>> is in vxlan_xmit(). So, it should be enough to just
>>> disable BH.
>>
>> Are you sure? Did you audit all part of the code?
>> peernet2id() is called from netlink core system (do_one_broadcast()). Are you
>> sure that no driver call this function from an hard irq context?
>>
>> I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard 
>> to
>> detect a bug introduced in this feature.
>
> I'm seeing strange things on our systems on boot time when trying to
> mount autofs.
> I bisected and got this patch as the bad one.
> I can see that only when I'm using "debug" config file.

Yeah, I saw the same report from SELinux developers, I am working
on a fix.

Thanks.


Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2016-10-19 Thread Elad Raz
On Fri, Sep 2, 2016 at 11:12 AM, Nicolas Dichtel
 wrote:
> Le 02/09/2016 à 06:53, Cong Wang a écrit :
>> We never read or change netns id in hardirq context,
>> the only place we read netns id in softirq context
>> is in vxlan_xmit(). So, it should be enough to just
>> disable BH.
>
> Are you sure? Did you audit all part of the code?
> peernet2id() is called from netlink core system (do_one_broadcast()). Are you
> sure that no driver call this function from an hard irq context?
>
> I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard 
> to
> detect a bug introduced in this feature.

I'm seeing strange things on our systems on boot time when trying to
mount autofs.
I bisected and got this patch as the bad one.
I can see that only when I'm using "debug" config file.

CONFIG_LOCKDEP =y I can see on boot:

[  OK  ] Started OpenSSH server daemon.
Starting OpenSSH server daemon...
[   23.498577] [ cut here ]
[   23.503247] WARNING: CPU: 0 PID: 994 at kernel/softirq.c:150
__local_bh_enable_ip+0x9d/0xc0
[   23.511665] Modules linked in: xt_conntrack ebtable_nat
ebtable_broute bridge stp llc ebtable_filter ebtables iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_mangle iptable_security iptable_raw intel_rapl
x86_pkg_temp_thermal iTCO
_wdt coretemp iTCO_vendor_support kvm_intel kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr i2c_i801
i2c_smbus lpc_ich mfd_core mei_me mei video tpm_tis tpm_tis_core tpm
nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c crc32c_intel
e1000e
ptp pps_core
[   23.561125] CPU: 0 PID: 994 Comm: setsebool Not tainted 4.8.0-rc4eladr+ #51
[   23.568152] Hardware name: Mellanox Technologies Ltd. Mellanox
switch/Mellanox switch, BIOS 4.6.5 05/21/2015
[   23.578066]   8801161d3bb0 81411b73

[   23.585594]   8801161d3bf0 810aa69b
0096816dcda7
[   23.593118]  0200 816dcdc5 81f0fb00
8801147a
[   23.600639] Call Trace:
[   23.603114]  [] dump_stack+0x85/0xc2
[   23.608302]  [] __warn+0xcb/0xf0
[   23.613138]  [] ? peernet2id+0x45/0x60
[   23.618502]  [] warn_slowpath_null+0x1d/0x20
[   23.624394]  [] __local_bh_enable_ip+0x9d/0xc0
[   23.630453]  [] _raw_spin_unlock_bh+0x35/0x40
[   23.636432]  [] peernet2id+0x45/0x60
[   23.641626]  [] netlink_broadcast_filtered+0x265/0x3d0
[   23.648385]  [] netlink_broadcast+0x1d/0x20
[   23.654188]  [] audit_log_end+0x2b1/0x2c0
[   23.659820]  [] ? audit_log_end+0x30/0x2c0
[   23.665532]  [] audit_log+0x5a/0x70
[   23.670631]  [] security_set_bools+0xee/0x200
[   23.676602]  [] sel_commit_bools_write+0xb8/0x100
[   23.682928]  [] __vfs_write+0x28/0x120
[   23.688291]  [] ? update_fast_ctr+0x17/0x30
[   23.694087]  [] ? percpu_down_read+0x49/0x90
[   23.699980]  [] ? __sb_start_write+0xb7/0xf0
[   23.705872]  [] ? __sb_start_write+0xb7/0xf0
[   23.711758]  [] vfs_write+0xb8/0x1b0
[   23.716949]  [] ? trace_hardirqs_on_caller+0x129/0x1b0
[   23.723717]  [] SyS_write+0x49/0xa0
[   23.728828]  [] entry_SYSCALL_64_fastpath+0x1f/0xbd
[   23.735337] ---[ end trace 576457efb89c3ea5 ]---

and

Starting NIS/YP (Network Informatio... Clients to NIS Domain Binder...
[   24.182895] FS-Cache: Netfs 'nfs' registered for caching
[  OK  ] Started rolekit - role server.
[   24.210512] Key type dns_resolver registered
[   24.243061] NFS: Registering the id_resolver key type
[   24.253951] Key type id_resolver registered
[   24.258215] Key type id_legacy registered
[   24.272025]
[   24.273545] =
[   24.277919] [ INFO: inconsistent lock state ]
[   24.282348] 4.8.0-rc4eladr+ #51 Tainted: GW
[   24.287806] -
[   24.292234] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-R} usage.
[   24.298301] dbus-daemon/673 [HC0[0]:SC1[1]:HE1:SE0] takes:
[   24.303898]  (policy_rwlock){+++?..}, at: []
security_netlbl_sid_to_secattr+0x32/0xc0
[   24.313456] {SOFTIRQ-ON-W} state was registered at:
[   24.318388]   [] mark_held_locks+0x6f/0xa0
[   24.324134]   [] trace_hardirqs_on_caller+0x129/0x1b0
[   24.330900]   [] trace_hardirqs_on+0xd/0x10
[   24.336708]   [] __local_bh_enable_ip+0x70/0xc0
[   24.342912]   [] _raw_spin_unlock_bh+0x35/0x40
[   24.349005]   [] peernet2id+0x45/0x60
[   24.354361]   [] netlink_broadcast_filtered+0x265/0x3d0
[   24.361285]   [] netlink_broadcast+0x1d/0x20
[   24.367222]   [] audit_log_end+0x2b1/0x2c0
[   24.372957]   [] audit_log+0x5a/0x70
[   24.378174]   [] security_set_bools+0xee/0x200
[   24.384283]   [] sel_commit_bools_write+0xb8/0x100
[   24.390733]   [] __vfs_write+0x28/0x120
[   24.396208]   [] vfs_write+0xb8/0x1b0
[   24.401537]   [] SyS_write+0x49/0xa0
[   24.406746]   [] entry_SYSCALL_64_fastpath+0x1f/0xbd
[   24.413375] irq event stamp: 73908
[   24.416878] hardirqs last  enabled at (73908): []
__slab_alloc+0x5e/0x90
[   

Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2016-09-04 Thread Nicolas Dichtel
Le 02/09/2016 à 19:24, Cong Wang a écrit :
> On Fri, Sep 2, 2016 at 9:39 AM, Cong Wang  wrote:
>> On Fri, Sep 2, 2016 at 1:12 AM, Nicolas Dichtel
>>  wrote:
>>> Le 02/09/2016 à 06:53, Cong Wang a écrit :
 We never read or change netns id in hardirq context,
 the only place we read netns id in softirq context
 is in vxlan_xmit(). So, it should be enough to just
 disable BH.
>>>
>>> Are you sure? Did you audit all part of the code?
>>
>> I did audit all the callers, and I didn't find any of them in IRQ context.
>>
>>> peernet2id() is called from netlink core system (do_one_broadcast()). Are 
>>> you
>>> sure that no driver call this function from an hard irq context?
>>
>> I audit all callers of netlink_broadcast(), and I don't see any of
>> them in hardirq context.
> 
> Note, you can rule out most of them by checking GFP_KERNEL,
> which indicates a process context. ;) For GFP_ATOMIC cases,
> I don't see any of them in hardirq context either, but I am definitely
> not familiar with drivers like infiniband.
> 
Yes, I was thinking to that one. But I'm also not familiar with it ;-)


Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2016-09-02 Thread Cong Wang
On Fri, Sep 2, 2016 at 9:39 AM, Cong Wang  wrote:
> On Fri, Sep 2, 2016 at 1:12 AM, Nicolas Dichtel
>  wrote:
>> Le 02/09/2016 à 06:53, Cong Wang a écrit :
>>> We never read or change netns id in hardirq context,
>>> the only place we read netns id in softirq context
>>> is in vxlan_xmit(). So, it should be enough to just
>>> disable BH.
>>
>> Are you sure? Did you audit all part of the code?
>
> I did audit all the callers, and I didn't find any of them in IRQ context.
>
>> peernet2id() is called from netlink core system (do_one_broadcast()). Are you
>> sure that no driver call this function from an hard irq context?
>
> I audit all callers of netlink_broadcast(), and I don't see any of
> them in hardirq context.

Note, you can rule out most of them by checking GFP_KERNEL,
which indicates a process context. ;) For GFP_ATOMIC cases,
I don't see any of them in hardirq context either, but I am definitely
not familiar with drivers like infiniband.


Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2016-09-02 Thread Cong Wang
On Fri, Sep 2, 2016 at 1:12 AM, Nicolas Dichtel
 wrote:
> Le 02/09/2016 à 06:53, Cong Wang a écrit :
>> We never read or change netns id in hardirq context,
>> the only place we read netns id in softirq context
>> is in vxlan_xmit(). So, it should be enough to just
>> disable BH.
>
> Are you sure? Did you audit all part of the code?

I did audit all the callers, and I didn't find any of them in IRQ context.

> peernet2id() is called from netlink core system (do_one_broadcast()). Are you
> sure that no driver call this function from an hard irq context?

I audit all callers of netlink_broadcast(), and I don't see any of
them in hardirq context.

>
> I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard 
> to
> detect a bug introduced in this feature.

This patch passed my netns stress tests, I have LOCKDEP enabled,
and I don't get any warning or crash etc.


Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id

2016-09-02 Thread Nicolas Dichtel
Le 02/09/2016 à 06:53, Cong Wang a écrit :
> We never read or change netns id in hardirq context,
> the only place we read netns id in softirq context
> is in vxlan_xmit(). So, it should be enough to just
> disable BH.

Are you sure? Did you audit all part of the code?
peernet2id() is called from netlink core system (do_one_broadcast()). Are you
sure that no driver call this function from an hard irq context?

I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard to
detect a bug introduced in this feature.


[Patch net-next 2/2] netns: avoid disabling irq for netns id

2016-09-01 Thread Cong Wang
We never read or change netns id in hardirq context,
the only place we read netns id in softirq context
is in vxlan_xmit(). So, it should be enough to just
disable BH.

Cc: Nicolas Dichtel 
Signed-off-by: Cong Wang 
---
 net/core/net_namespace.c | 35 +++
 1 file changed, 15 insertions(+), 20 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index f3fa435..42bdda0 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -215,14 +215,13 @@ static void rtnl_net_notifyid(struct net *net, int cmd, 
int id);
  */
 int peernet2id_alloc(struct net *net, struct net *peer)
 {
-   unsigned long flags;
bool alloc;
int id;
 
-   spin_lock_irqsave(>nsid_lock, flags);
+   spin_lock_bh(>nsid_lock);
alloc = atomic_read(>count) == 0 ? false : true;
id = __peernet2id_alloc(net, peer, );
-   spin_unlock_irqrestore(>nsid_lock, flags);
+   spin_unlock_bh(>nsid_lock);
if (alloc && id >= 0)
rtnl_net_notifyid(net, RTM_NEWNSID, id);
return id;
@@ -231,12 +230,11 @@ int peernet2id_alloc(struct net *net, struct net *peer)
 /* This function returns, if assigned, the id of a peer netns. */
 int peernet2id(struct net *net, struct net *peer)
 {
-   unsigned long flags;
int id;
 
-   spin_lock_irqsave(>nsid_lock, flags);
+   spin_lock_bh(>nsid_lock);
id = __peernet2id(net, peer);
-   spin_unlock_irqrestore(>nsid_lock, flags);
+   spin_unlock_bh(>nsid_lock);
return id;
 }
 EXPORT_SYMBOL(peernet2id);
@@ -251,18 +249,17 @@ bool peernet_has_id(struct net *net, struct net *peer)
 
 struct net *get_net_ns_by_id(struct net *net, int id)
 {
-   unsigned long flags;
struct net *peer;
 
if (id < 0)
return NULL;
 
rcu_read_lock();
-   spin_lock_irqsave(>nsid_lock, flags);
+   spin_lock_bh(>nsid_lock);
peer = idr_find(>netns_ids, id);
if (peer)
get_net(peer);
-   spin_unlock_irqrestore(>nsid_lock, flags);
+   spin_unlock_bh(>nsid_lock);
rcu_read_unlock();
 
return peer;
@@ -406,17 +403,17 @@ static void cleanup_net(struct work_struct *work)
for_each_net(tmp) {
int id;
 
-   spin_lock_irq(>nsid_lock);
+   spin_lock_bh(>nsid_lock);
id = __peernet2id(tmp, net);
if (id >= 0)
idr_remove(>netns_ids, id);
-   spin_unlock_irq(>nsid_lock);
+   spin_unlock_bh(>nsid_lock);
if (id >= 0)
rtnl_net_notifyid(tmp, RTM_DELNSID, id);
}
-   spin_lock_irq(>nsid_lock);
+   spin_lock_bh(>nsid_lock);
idr_destroy(>netns_ids);
-   spin_unlock_irq(>nsid_lock);
+   spin_unlock_bh(>nsid_lock);
 
}
rtnl_unlock();
@@ -544,7 +541,6 @@ static int rtnl_net_newid(struct sk_buff *skb, struct 
nlmsghdr *nlh)
 {
struct net *net = sock_net(skb->sk);
struct nlattr *tb[NETNSA_MAX + 1];
-   unsigned long flags;
struct net *peer;
int nsid, err;
 
@@ -565,15 +561,15 @@ static int rtnl_net_newid(struct sk_buff *skb, struct 
nlmsghdr *nlh)
if (IS_ERR(peer))
return PTR_ERR(peer);
 
-   spin_lock_irqsave(>nsid_lock, flags);
+   spin_lock_bh(>nsid_lock);
if (__peernet2id(net, peer) >= 0) {
-   spin_unlock_irqrestore(>nsid_lock, flags);
+   spin_unlock_bh(>nsid_lock);
err = -EEXIST;
goto out;
}
 
err = alloc_netid(net, peer, nsid);
-   spin_unlock_irqrestore(>nsid_lock, flags);
+   spin_unlock_bh(>nsid_lock);
if (err >= 0) {
rtnl_net_notifyid(net, RTM_NEWNSID, err);
err = 0;
@@ -695,11 +691,10 @@ static int rtnl_net_dumpid(struct sk_buff *skb, struct 
netlink_callback *cb)
.idx = 0,
.s_idx = cb->args[0],
};
-   unsigned long flags;
 
-   spin_lock_irqsave(>nsid_lock, flags);
+   spin_lock_bh(>nsid_lock);
idr_for_each(>netns_ids, rtnl_net_dumpid_one, _cb);
-   spin_unlock_irqrestore(>nsid_lock, flags);
+   spin_unlock_bh(>nsid_lock);
 
cb->args[0] = net_cb.idx;
return skb->len;
-- 
2.1.0