The result of the reverse bisect between v3.17..v3.18 with the reproducer was: # first bad commit: [34666d467cbf1e2e3c7bb15a63eccfb582cdd71f] netfilter: bridge: move br_netfilter out of the core If I backport this patch plus 7276ca3f on top of v3.17 I no longer get the hang with the simple reproducer (although I suspect a more elaborate reproducer would still trigger the issue).
This isn't a fix because we obviously have issues in later kernels. The 'unregister_netdevice' message could occur for different code paths since there could potentially be many paths that modify the refcnt for the net_device. I'm going to track who is calling 'dev_put' and 'dev_hold' and figure out which code patch is not freeing before we start unregistering the device. I'll focus in on the reproducer I have for now, since this seems similar to the 'shutdown containers with network connections' case. In addition I'm doing a bisect between earlier versions to see where the 'regression' occured. 3.11 seems to pass without regression for 5 threads / 50 iterations. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1403152 Title: unregister_netdevice: waiting for lo to become free. Usage count Status in The Linux Kernel: Unknown Status in linux package in Ubuntu: In Progress Status in linux source package in Trusty: Confirmed Status in linux source package in Utopic: Confirmed Bug description: I currently running trusty latest patches and i get on these hardware and software: Ubuntu 3.13.0-43.72-generic 3.13.11.11 processor : 7 vendor_id : GenuineIntel cpu family : 6 model : 77 model name : Intel(R) Atom(TM) CPU C2758 @ 2.40GHz stepping : 8 microcode : 0x11d cpu MHz : 2400.000 cache size : 1024 KB physical id : 0 siblings : 8 core id : 7 cpu cores : 8 apicid : 14 initial apicid : 14 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch arat epb dtherm tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms bogomips : 4799.48 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: somehow reproducable the subjected error, and lxc is working still but not more managable until a reboot. managable means every command hangs. I saw there are alot of bugs but they seams to relate to older version and are closed, so i decided to file a new one? I run alot of machine with trusty an lxc containers but only these kind of machines produces these errors, all other don't show these odd behavior. thx in advance meno To manage notifications about this bug go to: https://bugs.launchpad.net/linux/+bug/1403152/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : [email protected] Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp

