Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume

Alex Williamson Wed, 22 Jun 2016 08:52:24 -0700

On Wed, 22 Jun 2016 15:49:41 +0800
Zhou Jie <zhoujie2...@cn.fujitsu.com> wrote:

> Hi Alex,
> 
> On 2016/6/22 13:45, Zhou Jie wrote:
> > Hi Alex,
> >  
> >>>
> >>> In vfio I have some questions.
> >>> 1. How can I disable the access by mmap?
> >>>     We can disable all access to the vfio fd by returning an EAGAIN
> >>>     error if the user tries to access it during the reset period,
> >>>     until the host reset has finished.
> >>>     But what about the BAR region that is mapped by vfio_pci_mmap?
> >>>     How can I disable that in the vfio driver?
> >>>     Even if there is a way to do it,
> >>>     how complex would it be to recover the mmap afterwards?
> >>
> >> That's exactly the "sticky point" I refer to above, you'd need to
> >> solve that problem.  MST would probably still argue that we don't need
> >> to disable all those interfaces, a userspace driver can already do
> >> things like disable mmio space and then attempt to read from the mmio
> >> space of the device.  
> > You said we should not depend on the user to protect the device
> >  from being accessed during the reset period.
> >  
> >> So maybe the problem can be simplified to
> >> non-device specific interfaces, like config space access plus ioctls.  
> > I don't understand what you mean.
> 
> When a fatal AER error occurs, the process is as follows.
> For the host:
>     AER driver detects the AER error
> -> vfio driver sends the AER error
> -> AER driver resets the bus
> -> QEMU reports the AER error
> For the guest:
> -> AER driver detects the AER error
> -> AER driver resets the bus
> -> device driver may disable the device
> 
> I am not sure whether every device driver disables the device
> when a fatal AER error occurs.
> Should we depend on the guest device driver to protect the device
> from being accessed during the reset period?

We should never depend on the guest driver to behave in a certain way,
but we need to prioritize what that actually means.  vfio in the kernel
has a responsibility first and foremost to the host kernel.  User owned
devices cannot be allowed to exploit or interfere with the host
regardless of user behavior.  The next priority is correct operation
for the user.  When the host kernel is handling the AER event between
the error and resume notifications, it doesn't have device-specific drivers;
it's manipulating the device as a generic PCI device.  That makes me
think that vfio should not allow the user to interact (interfere) with
the device during that process and that such interference can be
limited to standard PCI level interactions.  That means config space,
and things that operate on config space (like interrupt ioctls and
resets).  On the QEMU side, we've sent a notification that an error
occurred; how the user and the guest respond to that is beyond the
concern of vfio in the kernel.  If the user/guest driver continues to
interact with resources on the device, that's fine, but I think vfio in
the kernel does need to prevent the user from interfering with the PCI
state of the device for that brief window when we know the host kernel
is operating on the device.  Otherwise the results are unpredictable
and therefore unsupportable.  Does that make sense?  Thanks,

Alex
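
The gating described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the vfio
implementation: struct fake_vdev, enum region_kind, on_aer_error(),
on_aer_resume() and check_access() are all hypothetical names invented
for this example.  It assumes a single flag that is set on the error
notification and cleared on the resume notification, with config space
(standing in for "standard PCI level" interfaces) returning -EAGAIN in
between, while BAR accesses are left alone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region kinds; these are NOT the real vfio region indexes. */
enum region_kind {
    REGION_CONFIG,  /* PCI config space and things that operate on it */
    REGION_BAR      /* device-specific resources, e.g. mmapped BARs */
};

/* Hypothetical per-device state for this sketch. */
struct fake_vdev {
    bool aer_in_progress;  /* true between error and resume notifications */
};

/* Called when the host reports a fatal AER error for the device. */
static void on_aer_error(struct fake_vdev *v)
{
    assert(v != NULL);
    v->aer_in_progress = true;
}

/* Called when the host has finished recovery and signals resume. */
static void on_aer_resume(struct fake_vdev *v)
{
    assert(v != NULL);
    v->aer_in_progress = false;
}

/*
 * Returns 0 if the access may proceed, or -EAGAIN if it targets a
 * PCI-level interface while the host is handling the AER event.
 * Device-specific regions are deliberately not blocked.
 */
static int check_access(const struct fake_vdev *v, enum region_kind kind)
{
    assert(v != NULL);
    if (v->aer_in_progress && kind == REGION_CONFIG)
        return -EAGAIN;
    return 0;
}
```

In this model a caller would invoke check_access() at the top of each
config-space read/write or config-affecting ioctl path and retry on
-EAGAIN.  Blocking only REGION_CONFIG sidesteps the mmap question
raised earlier in the thread, at the cost of leaving BAR behaviour
during the window up to the user.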
