On Thu, 2017-04-27 at 14:47 -0700, William Tu wrote:
> On Thu, Apr 27, 2017 at 11:17 AM, Ben Pfaff <b...@ovn.org> wrote:
> > On Thu, Apr 27, 2017 at 10:17:20AM -0700, William Tu wrote:
> >> On Thu, Apr 27, 2017 at 9:57 AM, Ben Pfaff <b...@ovn.org> wrote:
> >> > On Thu, Apr 27, 2017 at 04:02:10AM -0700, William Tu wrote:
> >> >> Before the patch, when a user creates a bridge named "default",
> >> >> ovs-vsctl fails but vswitchd keeps retrying in the background,
> >> >> driving systemd-udev to 100% CPU utilization.  The cause is the
> >> >> frequent calls into the kernel's register_netdevice() function,
> >> >> which invokes every kernel component registered on the netdevice
> >> >> notifier chain.  One of the notifiers, inetdev_event, rejects this
> >> >> devname, so register_netdevice() fails.  The patch prohibits
> >> >> creating a bridge named "default".
> >> >>
> >> >> VMWare-BZ: #1842388
> >> >> Signed-off-by: William Tu <u9012...@gmail.com>
> >> >
> >> > It all seems very arbitrary. Do we understand why this is an invalid
> >> > device name?
> >>
> >> Yes.  When the kernel is configured with CONFIG_SYSCTL, creating a new
> >> netdev creates a directory /proc/sys/net/ipv4/conf/<device name>.
> >>
> >> The <device name> entries "default" and "all" already exist when
> >> sysctl starts (which means we should also prohibit "all"), for the
> >> default properties of the system's netdevs.  So sysctl refuses to
> >> register a device whose dev->name is "default" or "all".  The call
> >> stack, innermost first, is below if you are interested:
> >> sysctl_dev_name_is_allowed
> >>    devinet_sysctl_register
> >>      inetdev_event
> >>        notifier_call_chain
> >>          raw_notifier_call_chain
> >>            call_netdevice_notifiers_info
> >>              register_netdevice
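
The name check at the top of that call stack can be sketched in userspace C roughly as follows. This is an illustrative approximation based only on the description above, not a copy of the kernel source; the function name dev_name_is_allowed is made up for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Sketch only: refuse device names that would collide with the
 * pre-existing conf/default and conf/all sysctl directories.
 * dev_name_is_allowed is a hypothetical name for this example. */
bool dev_name_is_allowed(const char *name)
{
    return strcmp(name, "default") != 0 && strcmp(name, "all") != 0;
}
```

With a check like this, "default" and "all" are rejected while an ordinary name such as "br0" is accepted.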
> >
> > Do we get the same behavior (100% CPU) if creating a bridge fails for
> > other reasons?  For example, it might fail because a network device
> > already exists with the given name, or because the name is too long for
> > a network device name.  If we do get 100% CPU from such a failure, then
> > we should fix the root of the problem instead of blacklisting particular
> > names.
> 
> There are two scenarios:
> 1) The bridge name already exists, e.g. eth0:
>    "ovs-vsctl add-br eth0" fails with EEXIST.
> 2) The bridge name does not exist but is "default" or "all":
>    "ovs-vsctl add-br default" fails with EINVAL.
> 
> From OVS's point of view the two are the same: ovs-vsctl fails to
> create the bridge and the creation keeps being retried.  From the
> whole system's point of view, (2) affects other Linux subsystems,
> because the kernel's netdev notifier chain informs them each time a
> new device is being added, while (1) fails early in
> register_netdevice() and has no such impact.
> 
> Or, instead of blacklisting, maybe add a maximum retry count?

Can you see if using a retry count still ensures this bug is fixed?

VMWare-BZ: #1842388

If so that's probably a better approach. Like Ben I'm a little queasy
about saying we can't have a 'default' bridge under any circumstance.
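
For illustration, a retry cap along these lines might look like the sketch below. Everything in it is hypothetical: struct bridge_attempt, MAX_CREATE_RETRIES and both helper functions are invented for this example and are not existing OVS code or settings.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_CREATE_RETRIES 3   /* hypothetical limit, not an OVS setting */

/* Hypothetical per-bridge bookkeeping for failed create attempts. */
struct bridge_attempt {
    int failures;     /* consecutive failed create attempts */
    int last_error;   /* errno value from the most recent failure */
};

/* Returns true while another create attempt is still permitted. */
bool should_try_create(const struct bridge_attempt *a)
{
    return a->failures < MAX_CREATE_RETRIES;
}

/* Records the outcome of one attempt; 'error' is 0 on success. */
void record_create_result(struct bridge_attempt *a, int error)
{
    if (error) {
        a->failures++;
        a->last_error = error;
    } else {
        a->failures = 0;
        a->last_error = 0;
    }
}
```

The idea is that after MAX_CREATE_RETRIES consecutive failures (for example EINVAL for "default") the caller stops calling into register_netdevice() for that name until the configuration changes, which would bound the notifier-chain traffic without banning any particular name.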

Thanks,

- Greg

> 
> Regards,
> William
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev



