On 20.06.2018 20:15, David Ahern wrote:
> On 6/20/18 2:57 AM, Kirill Tkhai wrote:
>> From: Kirill Tkhai <ktk...@virtuozzo.com>
>>
>> The following script makes kernel to crash since it can't obtain
>> a name for a device, when the name is occupied by another device:
>>
>> #!/bin/bash
>> ifconfig eth0 down
>> ifconfig eth1 down
>> index=`cat /sys/class/net/eth1/ifindex`
>> ip link set eth1 name dev$index
>> unshare -n sleep 1h &
>> pid=$!
>> while [[ "`readlink /proc/self/ns/net`" == "`readlink /proc/$pid/ns/net`" 
>> ]]; do continue; done
>> ip link set dev$index netns $pid
>> ip link set eth0 name dev$index
>> kill -9 $pid
>>
>> Kernel messages:
>>
>> virtio_net virtio1 dev3: renamed from eth1
>> virtio_net virtio0 dev3: renamed from eth0
>> default_device_exit: failed to move dev3 to init_net: -17
>> ------------[ cut here ]------------
>> kernel BUG at net/core/dev.c:8978!
>> invalid opcode: 0000 [#1] PREEMPT SMP
>> CPU: 1 PID: 276 Comm: kworker/u8:3 Not tainted 4.17.0+ #292
>> Workqueue: netns cleanup_net
>> RIP: 0010:default_device_exit+0x9c/0xb0
>> [stack trace snipped]
>>
>> This patch gives more variability during choosing new name
>> of device and fixes the problem.
>>
>> Signed-off-by: Kirill Tkhai <ktk...@virtuozzo.com>
>> ---
>>
>> Since there is no suggestions how to fix this in another way, I'm resending 
>> the patch.
> 
> This patch does not remove the BUG, so does not really solve the
> problem. ie., it is fairly trivial to write a script (32k dev%d named
> devices in init_net) that triggers it again, so your commit subject and
> commit log are not correct with the references to 'fixing the problem'.

1)I'm not agree with you and I don't think removing the BUG() is a good idea.
This function is called from the place, where it must not fail. But it can
fail, and the problem with name is not the only reason of this happens.
We can't continue further pernet_operations in case of a problem happened
in default_device_exit(), and we can't remove the BUG() before this function
becomes of void type. But we are not going to make it of void type. So
we can't remove the BUG().

2)In case of the script is trivial, can't you just post it here to show
what type of devices you mean? Is there real problem or this is
a theoretical thinking?

All virtual devices I see have rtnl_link_ops, so that they just destroyed
in default_device_exit_batch(). According to physical devices, it's difficult
to imagine a node with 32k physical devices, and if someone tried to deploy
them it may meet problems not only in this place.

> The change does provide more variability in naming and reduces the
> likelihood of not being able to push a device back to init_net.

No, it provides. With the patch one may move real device to a container,
and allow to do with the device anything including changing of device
index. Then, the destruction of the container does not resilt a kernel
panic just because of two devices have the same index.

Kirill

Reply via email to