Flavio Leitner <[email protected]> writes:

> On Thu, Apr 18, 2019 at 01:46:22PM -0600, Alex Williamson wrote:
>> On Thu, 18 Apr 2019 15:50:43 -0300
>> Flavio Leitner <[email protected]> wrote:
>> 
>> > On Thu, Apr 18, 2019 at 12:06:57PM -0600, Alex Williamson wrote:
>> > > On Thu, 18 Apr 2019 13:56:23 -0300
>> > > Flavio Leitner <[email protected]> wrote:
>> > >   
>> > > > On Thu, Apr 18, 2019 at 10:43:11AM -0600, Alex Williamson wrote:  
>> > > > > On Thu, 18 Apr 2019 13:23:54 -0300
>> > > > > Flavio Leitner <[email protected]> wrote:
>> > > > Another thing is that when the module is ready and the event is sent
>> > > > out, what stops OVS from trying to open the device and getting EACCES
>> > > > before udev is triggered to fix the device permissions?  
>> > > 
>> > > If there were a race, could ovs ever run before udev on system
>> > > startup?  Probably not.  
>> > 
>> > It does wait, but only for udev to settle, which means that if the
>> > module has not triggered an event by that time, OVS will not wait
>> > and we still have a race.
>> 
>> But udev isn't waiting on the module to trigger an event, the module
>> contains a MODULE_ALIAS, so I believe it's just the static processing
>> of the modules.alias that triggers the event.
>
> What I am saying is that driverctl will load the module and bind the
> device, and later on systemd will start the OVS service, which waits
> for udev to settle, but none of that guarantees that the permissions
> are updated by the time OVS is initializing, see below.
>
>> > >  Ideally perhaps a cleaner solution might be an
>> > > explicit dependency on the vfio module specific to ovs startup rather
>> > > than changing a system policy, but it really depends on the context and
>> > > use cases.  Thanks,  
>> > 
>> > It does have one. driverctl will bind the devices to vfio-pci, but
>> > the problem is which signal we should rely on to know whether
>> > the vfio module is still initializing, has failed, or has finished.
>> 
>> What signal/mechanism is being used currently?  If driverctl is asked
>> to set a driver override it does:
>> 
>>  1) if module is not loaded, modprobe
>>  2) unbinds device from existing driver, if any
>>  3) sets driver_override
>>  4) triggers drivers_probe
>>  5) tests if device is bound to a driver, any driver
>> 
>> There are certainly some deficiencies here, unbinding the device before
>> setting the driver_override leaves the device open to getting bound by
>> the wrong driver, and the verification in the last step could be more
>> specific in testing for binding to the correct driver, but step #1 is
>> the modprobe of the driver, which should be a synchronous operation.
>> We shouldn't be able to complete a 'driverctl set-override $DEV
>> vfio-pci' without vfio being initialized, afaict.  Thanks,
>
> Right, sounds like systemd is starting openvswitch service before
> the driverctl is done with the devices.

I'm not sure.  The ordering could be a problem.

Perhaps we could try adding:

  After=basic.target

to ovs-vswitchd.service, if we have a machine that exhibits this
behavior, but I don't know whether it will resolve the race.  The
ordering looks somewhat strange to me, judging by:

https://www.freedesktop.org/software/systemd/man/systemd.special.html
and
https://www.freedesktop.org/software/systemd/man/bootup.html#

I can't find how the network.target dependency actually works with
respect to ordering against the driverctl and basic.target services.
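
To find out which half of the race we are hitting (device not yet bound
vs. permissions not yet fixed by udev), something like the untested sketch
below could be run just before ovs-vswitchd starts.  It only reads the
standard sysfs "driver" and "iommu_group" symlinks and checks access to the
matching /dev/vfio/<group> node.  The function names, the sysfs_root and
dev_root parameters, and the example PCI address are all made up for
illustration; none of this is part of OVS or driverctl.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys"):
    """Return the name of the driver bound to pci_addr, or None if unbound."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "driver")
    try:
        # The "driver" symlink points at .../bus/pci/drivers/<name>.
        return os.path.basename(os.readlink(link))
    except OSError:
        # No symlink: the device does not exist or has no driver bound.
        return None


def vfio_group_accessible(pci_addr, sysfs_root="/sys", dev_root="/dev"):
    """Return True if the /dev/vfio/<group> node for pci_addr is read/write."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr,
                        "iommu_group")
    try:
        # The "iommu_group" symlink ends in the numeric group id.
        group = os.path.basename(os.readlink(link))
    except OSError:
        return False
    node = os.path.join(dev_root, "vfio", group)
    # os.access() returns False both for a missing node and for the
    # EACCES case, i.e. before udev has adjusted the permissions.
    return os.access(node, os.R_OK | os.W_OK)


def ready_for_ovs(pci_addr, driver="vfio-pci", sysfs_root="/sys",
                  dev_root="/dev"):
    """Bound to the expected driver and the vfio group node is usable."""
    return (bound_driver(pci_addr, sysfs_root) == driver
            and vfio_group_accessible(pci_addr, sysfs_root, dev_root))


if __name__ == "__main__":
    addr = "0000:01:00.0"  # hypothetical PCI address
    print(addr, "driver:", bound_driver(addr),
          "ready:", ready_for_ovs(addr))
```

If bound_driver() already reports vfio-pci but the group node is not yet
accessible, that would point at the udev permission update rather than at
driverctl ordering.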

> fbl
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
