On Thu, 2017-04-27 at 14:47 -0700, William Tu wrote:
> On Thu, Apr 27, 2017 at 11:17 AM, Ben Pfaff <b...@ovn.org> wrote:
> > On Thu, Apr 27, 2017 at 10:17:20AM -0700, William Tu wrote:
> >> On Thu, Apr 27, 2017 at 9:57 AM, Ben Pfaff <b...@ovn.org> wrote:
> >> > On Thu, Apr 27, 2017 at 04:02:10AM -0700, William Tu wrote:
> >> >> Before the patch, when users create a bridge named "default",
> >> >> ovs-vsctl fails but vswitchd in the background keeps retrying,
> >> >> causing systemd-udev to reach 100% cpu utilization.  The reason is
> >> >> frequent calls into the kernel's register_netdevice function,
> >> >> which invokes several kernel elements that have registered on the
> >> >> netdevice notifier chain.  One of the notifiers, inetdev_event,
> >> >> rejects this devname and register_netdevice fails.  The patch
> >> >> prohibits creating a bridge named "default".
> >> >>
> >> >> VMWare-BZ: #1842388
> >> >> Signed-off-by: William Tu <u9012...@gmail.com>
> >> >
> >> > It all seems very arbitrary.  Do we understand why this is an invalid
> >> > device name?
> >>
> >> Yes, when the kernel is configured with CONFIG_SYSCTL, creating a new
> >> netdev creates a dir in /proc/sys/net/ipv4/conf/<device name>
> >>
> >> The <device name> entries "default" and "all" already exist when
> >> SYSCTL starts (which means we should also prohibit "all"), for the
> >> default properties of the system's netdevs.  So sysctl rejects a
> >> dev->name of "default" or "all".  A call stack is below if interested:
> >> sysctl_dev_name_is_allowed
> >> devinet_sysctl_register
> >> inetdev_event
> >> notifier_call_chain
> >> raw_notifier_call_chain
> >> call_netdevice_notifiers_info
> >> register_netdevice
> >
> > Do we get the same behavior (100% CPU) if creating a bridge fails for
> > other reasons?  For example, it might fail because a network device
> > already exists with the given name, or because the name is too long for
> > a network device name.  If we do get 100% CPU from such a failure, then
> > we should fix the root of the problem instead of blacklisting particular
> > names.
>
> There are two scenarios:
> 1) if the bridge name already exists, e.g. eth0,
>    then "ovs-vsctl add-br eth0" fails due to EEXIST
> 2) if the bridge name does not exist, but is "default" or "all",
>    then "ovs-vsctl add-br default" fails due to EINVAL
>
> From OVS's point of view it's the same: ovs-vsctl fails creating the
> bridge, and keeps retrying.  From the whole system's point of view, (2)
> has an impact on other Linux subsystems, due to the kernel netdev
> notifier chain mechanism informing other subsystems when trying to add a
> new device, while (1) fails pretty early in register_netdevice() and has
> no impact.
>
> Or instead of blacklisting, maybe add a max retry count?

Can you see if using a retry count still ensures this bug is fixed?
VMWare-BZ: #1842388

If so that's probably a better approach.  Like Ben I'm a little queasy
about saying we can't have a 'default' bridge under any circumstance.

Thanks,

- Greg

>
> Regards,
> William
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
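
For reference, the two options discussed in this thread (rejecting the
reserved names up front, or capping the number of retries) could look
roughly like the C sketch below.  This is not the OVS patch and not
existing OVS code: bridge_name_is_sysctl_safe, create_fn,
create_with_retry_limit and fake_create are hypothetical names invented
for illustration.  Only the rule that "default" and "all" are rejected
comes from the thread; the retry policy is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: applies the same rule the thread attributes to the
 * kernel's sysctl check, i.e. "default" and "all" are not usable names. */
static bool
bridge_name_is_sysctl_safe(const char *name)
{
    return strcmp(name, "default") != 0 && strcmp(name, "all") != 0;
}

/* Hypothetical callback type: returns 0 on success, a positive errno value
 * on failure. */
typedef int create_fn(const char *name, void *aux);

/* Calls 'create' until it succeeds or 'max_tries' attempts have been made.
 * Stores the number of attempts in '*n_tries' and returns the last result
 * (EINVAL if 'max_tries' is 0 and nothing was attempted). */
static int
create_with_retry_limit(const char *name, create_fn *create, void *aux,
                        unsigned int max_tries, unsigned int *n_tries)
{
    int error = EINVAL;

    *n_tries = 0;
    while (*n_tries < max_tries) {
        (*n_tries)++;
        error = create(name, aux);
        if (!error) {
            break;
        }
    }
    return error;
}

/* Stand-in for real device creation, for illustration only: counts its
 * calls in '*aux' and fails with EINVAL for the reserved names. */
static int
fake_create(const char *name, void *aux)
{
    unsigned int *n_calls = aux;

    (*n_calls)++;
    return bridge_name_is_sysctl_safe(name) ? 0 : EINVAL;
}
```

With a cap of 3, a request for "default" would stop after three failed
attempts rather than looping forever, while checking the name first
would avoid calling into the kernel at all.  Whether a cap alone is
enough to avoid the udev load is exactly the open question above.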