RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-02-05 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-
> ow...@vger.kernel.org] On Behalf Of Gleb Natapov
> Sent: Tuesday, February 05, 2013 3:37 AM
> To: Pandarathil, Vijaymohan R
> Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E;
> k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-
> PCI devices
> 
> On Tue, Feb 05, 2013 at 10:59:41AM +, Pandarathil, Vijaymohan R wrote:
> >
> >
> > > -Original Message-
> > > From: Gleb Natapov [mailto:g...@redhat.com]
> > > Sent: Tuesday, February 05, 2013 1:21 AM
> > > To: Pandarathil, Vijaymohan R
> > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E;
> > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for
> VFIO-
> > > PCI devices
> > >
> > > On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R
> wrote:
> > > >
> > > >
> > > > > -----Original Message-
> > > > > From: Gleb Natapov [mailto:g...@redhat.com]
> > > > > Sent: Tuesday, February 05, 2013 12:05 AM
> > > > > To: Blue Swirl
> > > > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas;
> Ortiz,
> > > Lance
> > > > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux-
> > > p...@vger.kernel.org;
> > > > > linux-kernel@vger.kernel.org
> > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER
> for
> > > VFIO-
> > > > > PCI devices
> > > > >
> > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote:
> > > > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R
> > > > > >  wrote:
> > > > > > > - Create eventfd per vfio device assigned to a guest
> and
> > > > > register an
> > > > > > >   event handler
> > > > > > >
> > > > > > > - This fd is passed to the vfio_pci driver through the
> > > SET_IRQ
> > > > > ioctl
> > > > > > >
> > > > > > > - When the device encounters an error, the eventfd is
> > > signalled
> > > > > > >   and the qemu eventfd handler gets invoked.
> > > > > > >
> > > > > > > - In the handler decide what action to take. Current
> action
> > > > > taken
> > > > > > >   is to terminate the guest.
> > > > > >
> > > > > > Usually this is not OK, but I guess this is not guest
> triggerable.
> > > > > >
> > > > > Still not OK. Why not stop a guest with appropriate stop reason?
> > > >
> > > > The thinking was that since this is a hardware error, we would want
> to
> > > stop the guest at the earliest. The hw_error() routine which aborts the
> > > qemu process was suggested by Alex and that seemed appropriate. Earlier
> I
> > > was using qemu_system_shutdown_request().  Any suggestions ?
> > > >
> > > I am thinking vm_stop(). Stopping SMP guest (and UP too in fact)
> > > involves sending IPIs to other cpus running guest's vcpus. Both exit()
> > > and vm_stop() will do it, but former is implicitly in the kernel and
> > > later is explicitly in QEMU.
> >
> > I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while
> testing, guest ended up in a hang rather than exiting. There seems to some
> cleanup work which is being done as part of vm_stop. In our case, we wanted
> the guest to exit immediately. So use of hw_error() seemed appropriate.
> >
> What makes you think it hang? It stopped, precisely what it should do if
> you call vm_stop(). Now it is possible for vm user to investigate what
> happened and even salvage some data from guest memory.

That was ignorance on my part on the expected behavior of vm_stop(). 
So what you are suggesting is to stop the guest displaying an appropriate 
error/next-steps message and have the users do any 
data-collection/investigation 
and then manually kill the guest, if they so desire. Right ?

Sounds reasonable. As long as the guest is not touching the device, it should 
be okay.
Alex, Any comments ?

Thanks

Vijay

 
> 
> --
>   Gleb.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-02-05 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: Gleb Natapov [mailto:g...@redhat.com]
> Sent: Tuesday, February 05, 2013 1:21 AM
> To: Pandarathil, Vijaymohan R
> Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E;
> k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-
> PCI devices
> 
> On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R wrote:
> >
> >
> > > -Original Message-
> > > From: Gleb Natapov [mailto:g...@redhat.com]
> > > Sent: Tuesday, February 05, 2013 12:05 AM
> > > To: Blue Swirl
> > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; Ortiz,
> Lance
> > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux-
> p...@vger.kernel.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for
> VFIO-
> > > PCI devices
> > >
> > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote:
> > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R
> > > >  wrote:
> > > > > - Create eventfd per vfio device assigned to a guest and
> > > register an
> > > > >   event handler
> > > > >
> > > > > - This fd is passed to the vfio_pci driver through the
> SET_IRQ
> > > ioctl
> > > > >
> > > > > - When the device encounters an error, the eventfd is
> signalled
> > > > >   and the qemu eventfd handler gets invoked.
> > > > >
> > > > > - In the handler decide what action to take. Current action
> > > taken
> > > > >   is to terminate the guest.
> > > >
> > > > Usually this is not OK, but I guess this is not guest triggerable.
> > > >
> > > Still not OK. Why not stop a guest with appropriate stop reason?
> >
> > The thinking was that since this is a hardware error, we would want to
> stop the guest at the earliest. The hw_error() routine which aborts the
> qemu process was suggested by Alex and that seemed appropriate. Earlier I
> was using qemu_system_shutdown_request().  Any suggestions ?
> >
> I am thinking vm_stop(). Stopping SMP guest (and UP too in fact)
> involves sending IPIs to other cpus running guest's vcpus. Both exit()
> and vm_stop() will do it, but former is implicitly in the kernel and
> later is explicitly in QEMU.

I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while testing, 
guest ended up in a hang rather than exiting. There seems to some cleanup work 
which is being done as part of vm_stop. In our case, we wanted the guest to 
exit immediately. So use of hw_error() seemed appropriate.

Thoughts ?

Vijay

> 
> --
>   Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-02-05 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: Gleb Natapov [mailto:g...@redhat.com]
> Sent: Tuesday, February 05, 2013 12:05 AM
> To: Blue Swirl
> Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; Ortiz, Lance
> E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-
> PCI devices
> 
> On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote:
> > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R
> >  wrote:
> > > - Create eventfd per vfio device assigned to a guest and
> register an
> > >   event handler
> > >
> > > - This fd is passed to the vfio_pci driver through the SET_IRQ
> ioctl
> > >
> > > - When the device encounters an error, the eventfd is signalled
> > >   and the qemu eventfd handler gets invoked.
> > >
> > > - In the handler decide what action to take. Current action
> taken
> > >   is to terminate the guest.
> >
> > Usually this is not OK, but I guess this is not guest triggerable.
> >
> Still not OK. Why not stop a guest with appropriate stop reason?

The thinking was that since this is a hardware error, we would want to stop the 
guest at the earliest. The hw_error() routine which aborts the qemu process was 
suggested by Alex and that seemed appropriate. Earlier I was using 
qemu_system_shutdown_request().  Any suggestions ?

Thanks

Vijay
> 
> --
>   Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-02-05 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Tuesday, February 05, 2013 12:05 AM
 To: Blue Swirl
 Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; Ortiz, Lance
 E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
 linux-kernel@vger.kernel.org
 Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-
 PCI devices
 
 On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote:
  On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R
  vijaymohan.pandarat...@hp.com wrote:
   - Create eventfd per vfio device assigned to a guest and
 register an
 event handler
  
   - This fd is passed to the vfio_pci driver through the SET_IRQ
 ioctl
  
   - When the device encounters an error, the eventfd is signalled
 and the qemu eventfd handler gets invoked.
  
   - In the handler decide what action to take. Current action
 taken
 is to terminate the guest.
 
  Usually this is not OK, but I guess this is not guest triggerable.
 
 Still not OK. Why not stop a guest with appropriate stop reason?

The thinking was that since this is a hardware error, we would want to stop the 
guest at the earliest. The hw_error() routine which aborts the qemu process was 
suggested by Alex and that seemed appropriate. Earlier I was using 
qemu_system_shutdown_request().  Any suggestions ?

Thanks

Vijay
 
 --
   Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-02-05 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Tuesday, February 05, 2013 1:21 AM
 To: Pandarathil, Vijaymohan R
 Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E;
 k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
 linux-kernel@vger.kernel.org
 Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-
 PCI devices
 
 On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R wrote:
 
 
   -Original Message-
   From: Gleb Natapov [mailto:g...@redhat.com]
   Sent: Tuesday, February 05, 2013 12:05 AM
   To: Blue Swirl
   Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; Ortiz,
 Lance
   E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux-
 p...@vger.kernel.org;
   linux-kernel@vger.kernel.org
   Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for
 VFIO-
   PCI devices
  
   On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote:
On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R
vijaymohan.pandarat...@hp.com wrote:
 - Create eventfd per vfio device assigned to a guest and
   register an
   event handler

 - This fd is passed to the vfio_pci driver through the
 SET_IRQ
   ioctl

 - When the device encounters an error, the eventfd is
 signalled
   and the qemu eventfd handler gets invoked.

 - In the handler decide what action to take. Current action
   taken
   is to terminate the guest.
   
Usually this is not OK, but I guess this is not guest triggerable.
   
   Still not OK. Why not stop a guest with appropriate stop reason?
 
  The thinking was that since this is a hardware error, we would want to
 stop the guest at the earliest. The hw_error() routine which aborts the
 qemu process was suggested by Alex and that seemed appropriate. Earlier I
 was using qemu_system_shutdown_request().  Any suggestions ?
 
 I am thinking vm_stop(). Stopping SMP guest (and UP too in fact)
 involves sending IPIs to other cpus running guest's vcpus. Both exit()
 and vm_stop() will do it, but former is implicitly in the kernel and
 later is explicitly in QEMU.

I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while testing, 
guest ended up in a hang rather than exiting. There seems to some cleanup work 
which is being done as part of vm_stop. In our case, we wanted the guest to 
exit immediately. So use of hw_error() seemed appropriate.

Thoughts ?

Vijay

 
 --
   Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-02-05 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-
 ow...@vger.kernel.org] On Behalf Of Gleb Natapov
 Sent: Tuesday, February 05, 2013 3:37 AM
 To: Pandarathil, Vijaymohan R
 Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E;
 k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
 linux-kernel@vger.kernel.org
 Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-
 PCI devices
 
 On Tue, Feb 05, 2013 at 10:59:41AM +, Pandarathil, Vijaymohan R wrote:
 
 
   -Original Message-
   From: Gleb Natapov [mailto:g...@redhat.com]
   Sent: Tuesday, February 05, 2013 1:21 AM
   To: Pandarathil, Vijaymohan R
   Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E;
   k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
   linux-kernel@vger.kernel.org
   Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for
 VFIO-
   PCI devices
  
   On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R
 wrote:
   
   
 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Tuesday, February 05, 2013 12:05 AM
 To: Blue Swirl
 Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas;
 Ortiz,
   Lance
 E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux-
   p...@vger.kernel.org;
 linux-kernel@vger.kernel.org
 Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER
 for
   VFIO-
 PCI devices

 On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote:
  On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R
  vijaymohan.pandarat...@hp.com wrote:
   - Create eventfd per vfio device assigned to a guest
 and
 register an
 event handler
  
   - This fd is passed to the vfio_pci driver through the
   SET_IRQ
 ioctl
  
   - When the device encounters an error, the eventfd is
   signalled
 and the qemu eventfd handler gets invoked.
  
   - In the handler decide what action to take. Current
 action
 taken
 is to terminate the guest.
 
  Usually this is not OK, but I guess this is not guest
 triggerable.
 
 Still not OK. Why not stop a guest with appropriate stop reason?
   
The thinking was that since this is a hardware error, we would want
 to
   stop the guest at the earliest. The hw_error() routine which aborts the
   qemu process was suggested by Alex and that seemed appropriate. Earlier
 I
   was using qemu_system_shutdown_request().  Any suggestions ?
   
   I am thinking vm_stop(). Stopping SMP guest (and UP too in fact)
   involves sending IPIs to other cpus running guest's vcpus. Both exit()
   and vm_stop() will do it, but former is implicitly in the kernel and
   later is explicitly in QEMU.
 
  I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while
 testing, guest ended up in a hang rather than exiting. There seems to some
 cleanup work which is being done as part of vm_stop. In our case, we wanted
 the guest to exit immediately. So use of hw_error() seemed appropriate.
 
 What makes you think it hang? It stopped, precisely what it should do if
 you call vm_stop(). Now it is possible for vm user to investigate what
 happened and even salvage some data from guest memory.

That was ignorance on my part on the expected behavior of vm_stop(). 
So what you are suggesting is to stop the guest displaying an appropriate 
error/next-steps message and have the users do any 
data-collection/investigation 
and then manually kill the guest, if they so desire. Right ?

Sounds reasonable. As long as the guest is not touching the device, it should 
be okay.
Alex, Any comments ?

Thanks

Vijay

 
 
 --
   Gleb.
 --
 To unsubscribe from this list: send the line unsubscribe linux-pci in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 0/3] AER-KVM: Error containment of VFIO devices assigned to KVM guests

2013-02-03 Thread Pandarathil, Vijaymohan R

Add support for error containment when a VFIO device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the host OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, the error handler registered by the vfio-pci driver gets
invoked. The qemu process is signaled through an eventfd registered per
VFIO device by the qemu process. In the eventfd handler, qemu decides on
what action to take. In this implementation, guest is brought down to
contain the error.


v3:
 - Removed PCI_AER* flags from device info ioctl.
 - Incorporated feedback
v2:
 - Rebased to latest upstream stable bits
 - Changed the new ioctl to be part of VFIO_SET_IRQs ioctl
 - Added a new patch to get/put reference to a vfio device from struct device
 - Incorporated all other feedback.

---

Vijay Mohan Pandarathil(3):

[PATCH 1/3] VFIO: Wrapper for getting reference to vfio_device from device 
[PATCH 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER
[PATCH 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

Kernel files changed

 drivers/vfio/vfio.c  | 41 -
 include/linux/vfio.h |  3 +++
 2 files changed, 35 insertions(+), 9 deletions(-)

 drivers/vfio/pci/vfio_pci.c | 43 -
 drivers/vfio/pci/vfio_pci_intrs.c   | 30 ++
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 include/uapi/linux/vfio.h   |  1 +
 4 files changed, 74 insertions(+), 1 deletion(-)

Qemu files changed

 hw/vfio_pci.c  | 105 +
 linux-headers/linux/vfio.h |   1 +
 2 files changed, 106 insertions(+)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-02-03 Thread Pandarathil, Vijaymohan R
- Create eventfd per vfio device assigned to a guest and register an
  event handler

- This fd is passed to the vfio_pci driver through the SET_IRQ ioctl

- When the device encounters an error, the eventfd is signalled
  and the qemu eventfd handler gets invoked.

- In the handler decide what action to take. Current action taken
  is to terminate the guest.

Signed-off-by: Vijay Mohan Pandarathil 
---
 hw/vfio_pci.c  | 105 +
 linux-headers/linux/vfio.h |   1 +
 2 files changed, 106 insertions(+)

diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index c51ae67..4e2f768 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -130,6 +130,8 @@ typedef struct VFIODevice {
 QLIST_ENTRY(VFIODevice) next;
 struct VFIOGroup *group;
 bool reset_works;
+EventNotifier err_notifier;
+bool pci_aer;
 } VFIODevice;
 
 typedef struct VFIOGroup {
@@ -1922,6 +1924,106 @@ static void vfio_put_device(VFIODevice *vdev)
 }
 }
 
+static void vfio_err_notifier_handler(void *opaque)
+{
+VFIODevice *vdev = opaque;
+
+if (!event_notifier_test_and_clear(>err_notifier)) {
+return;
+}
+
+/*
+ * TBD. Retrieve the error details and decide what action
+ * needs to be taken. One of the actions could be to pass
+ * the error to the guest and have the guest driver recover
+ * from the error. This requires that PCIe capabilities be
+ * exposed to the guest. At present, we just terminate the
+ * guest to contain the error.
+ */
+
+error_report("%s (%04x:%02x:%02x.%x)"
+"Unrecoverable error detected... Terminating guest\n",
+__func__, vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
+
+hw_error("(%04x:%02x:%02x.%x) Unrecoverable device error\n",
+vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
+
+return;
+}
+
+static void vfio_register_err_notifier(VFIODevice *vdev)
+{
+int ret;
+int argsz;
+struct vfio_irq_set *irq_set;
+int32_t *pfd;
+
+if (event_notifier_init(>err_notifier, 0)) {
+error_report("vfio: Warning: Unable to init event notifier for error 
detection\n");
+return;
+}
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set->index = VFIO_PCI_ERR_IRQ_INDEX;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)_set->data;
+
+*pfd = event_notifier_get_fd(>err_notifier);
+qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev);
+
+ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret) {
+DPRINTF("vfio: Error notification not supported for the device\n");
+qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
+event_notifier_cleanup(>err_notifier);
+g_free(irq_set);
+return;
+}
+g_free(irq_set);
+vdev->pci_aer = 1;
+return;
+}
+static void vfio_unregister_err_notifier(VFIODevice *vdev)
+{
+int argsz;
+struct vfio_irq_set *irq_set;
+int32_t *pfd;
+int ret;
+
+if (!vdev->pci_aer) {
+return;
+}
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set->index = VFIO_PCI_ERR_IRQ_INDEX;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)_set->data;
+*pfd = -1;
+
+ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret) {
+DPRINTF("vfio: Failed to de-assign error fd: %d\n", ret);
+}
+g_free(irq_set);
+qemu_set_fd_handler(event_notifier_get_fd(>err_notifier),
+NULL, NULL, vdev);
+event_notifier_cleanup(>err_notifier);
+return;
+}
 static int vfio_initfn(PCIDevice *pdev)
 {
 VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
@@ -2032,6 +2134,8 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
+vfio_register_err_notifier(vdev);
+
 return 0;
 
 out_teardown:
@@ -2049,6 +2153,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
 VFIOGroup *group = vdev->group;
 
+vfio_unregister_err_notifier(vdev);
 pci_device_set_intx_routing_notifier(>pdev, NULL);
 vfio_disable_interrupts(vdev);
 if (vdev->intx.mmap_timer) {
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index f787b72..6b20849 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -310,6 +310,7 @@ enum {
VFIO_PCI_INTX_IRQ_INDEX,
VFIO_PCI_MSI_IRQ_INDEX,
VFIO_PCI_MSIX_IRQ_INDEX,
+   VFIO_PCI_ERR_IRQ_INDEX,
VFIO_PCI_NUM_IRQS
 };
 
-- 
1.7.11.3

--
To 

[PATCH v3 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-02-03 Thread Pandarathil, Vijaymohan R
- New VFIO_SET_IRQ ioctl option to pass the eventfd that is signaled 
when
  an error occurs in the vfio_pci_device

- Register pci_error_handler for the vfio_pci driver

- When the device encounters an error, the error handler registered by
  the vfio_pci driver gets invoked by the AER infrastructure

- In the error handler, signal the eventfd registered for the device.

- This results in the qemu eventfd handler getting invoked and
  appropriate action taken for the guest.

Signed-off-by: Vijay Mohan Pandarathil 
---
 drivers/vfio/pci/vfio_pci.c | 43 -
 drivers/vfio/pci/vfio_pci_intrs.c   | 30 ++
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 include/uapi/linux/vfio.h   |  1 +
 4 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index b28e66c..818b1ed 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -196,7 +196,9 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device 
*vdev, int irq_type)
 
return (flags & PCI_MSIX_FLAGS_QSIZE) + 1;
}
-   }
+   } else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX)
+   if (pci_is_pcie(vdev->pdev))
+   return 1;
 
return 0;
 }
@@ -302,6 +304,16 @@ static long vfio_pci_ioctl(void *device_data,
if (info.argsz < minsz || info.index >= VFIO_PCI_NUM_IRQS)
return -EINVAL;
 
+   switch (info.index) {
+   case VFIO_PCI_INTX_IRQ_INDEX ... VFIO_PCI_MSIX_IRQ_INDEX:
+   break;
+   case VFIO_PCI_ERR_IRQ_INDEX:
+   if (pci_is_pcie(vdev->pdev))
+   break;
+   default:
+   return -EINVAL;
+   }
+
info.flags = VFIO_IRQ_INFO_EVENTFD;
 
info.count = vfio_pci_get_irq_count(vdev, info.index);
@@ -538,11 +550,40 @@ static void vfio_pci_remove(struct pci_dev *pdev)
kfree(vdev);
 }
 
+static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+   struct vfio_pci_device *vpdev;
+   void *vdev;
+
+   vdev = vfio_device_get_from_dev(>dev);
+   if (vdev == NULL)
+   return PCI_ERS_RESULT_DISCONNECT;
+
+   vpdev = vfio_device_data(vdev);
+   if (vpdev == NULL) {
+   vfio_device_put(vdev);
+   return PCI_ERS_RESULT_DISCONNECT;
+   }
+
+   if (vpdev->err_trigger)
+   eventfd_signal(vpdev->err_trigger, 1);
+
+   vfio_device_put(vdev);
+
+   return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+static struct pci_error_handlers vfio_err_handlers = {
+   .error_detected = vfio_pci_aer_err_detected,
+};
+
 static struct pci_driver vfio_pci_driver = {
.name   = "vfio-pci",
.id_table   = NULL, /* only dynamic ids */
.probe  = vfio_pci_probe,
.remove = vfio_pci_remove,
+   .err_handler= _err_handlers,
 };
 
 static void __exit vfio_pci_cleanup(void)
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c 
b/drivers/vfio/pci/vfio_pci_intrs.c
index 3639371..83035b1 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -745,6 +745,29 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_device 
*vdev,
return 0;
 }
 
+static int vfio_pci_set_err_trigger(struct vfio_pci_device *vdev,
+   unsigned index, unsigned start,
+   unsigned count, uint32_t flags, void *data)
+{
+   int32_t fd = *(int32_t *)data;
+
+   if ((index != VFIO_PCI_ERR_IRQ_INDEX) ||
+   !(flags & VFIO_IRQ_SET_DATA_EVENTFD))
+   return -EINVAL;
+
+   if (fd == -1) {
+   if (vdev->err_trigger)
+   eventfd_ctx_put(vdev->err_trigger);
+   vdev->err_trigger = NULL;
+   return 0;
+   } else if (fd >= 0) {
+   vdev->err_trigger = eventfd_ctx_fdget(fd);
+   if (IS_ERR(vdev->err_trigger))
+   return PTR_ERR(vdev->err_trigger);
+   return 0;
+   } else
+   return -EINVAL;
+}
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_device *vdev, uint32_t flags,
unsigned index, unsigned start, unsigned count,
void *data)
@@ -779,6 +802,13 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_device *vdev, 
uint32_t flags,
break;
}
break;
+   case VFIO_PCI_ERR_IRQ_INDEX:
+   switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+   case VFIO_IRQ_SET_ACTION_TRIGGER:
+   if 

[PATCH v3 1/3] VFIO: Wrapper for getting reference to vfio_device from device

2013-02-03 Thread Pandarathil, Vijaymohan R
- Added vfio_device_get_from_dev() as wrapper to get
  reference to vfio_device from struct device.

- Added vfio_device_data() as a wrapper to get device_data from
  vfio_device.

Signed-off-by: Vijay Mohan Pandarathil 
---
 drivers/vfio/vfio.c  | 41 -
 include/linux/vfio.h |  3 +++
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 12c264d..f0a78a2 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -407,12 +407,13 @@ static void vfio_device_release(struct kref *kref)
 }
 
 /* Device reference always implies a group reference */
-static void vfio_device_put(struct vfio_device *device)
+void vfio_device_put(struct vfio_device *device)
 {
struct vfio_group *group = device->group;
kref_put_mutex(>kref, vfio_device_release, >device_lock);
vfio_group_put(group);
 }
+EXPORT_SYMBOL_GPL(vfio_device_put);
 
 static void vfio_device_get(struct vfio_device *device)
 {
@@ -642,8 +643,12 @@ int vfio_add_group_dev(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(vfio_add_group_dev);
 
-/* Test whether a struct device is present in our tracking */
-static bool vfio_dev_present(struct device *dev)
+/**
+ * This does a get on the vfio_device from device.
+ * Callers of this function will have to call vfio_put_device() to
+ * remove the reference.
+ */
+struct vfio_device *vfio_device_get_from_dev(struct device *dev)
 {
struct iommu_group *iommu_group;
struct vfio_group *group;
@@ -651,25 +656,43 @@ static bool vfio_dev_present(struct device *dev)
 
iommu_group = iommu_group_get(dev);
if (!iommu_group)
-   return false;
+   return NULL;
 
group = vfio_group_get_from_iommu(iommu_group);
if (!group) {
iommu_group_put(iommu_group);
-   return false;
+   return NULL;
}
 
device = vfio_group_get_device(group, dev);
if (!device) {
vfio_group_put(group);
iommu_group_put(iommu_group);
-   return false;
+   return NULL;
}
-
-   vfio_device_put(device);
vfio_group_put(group);
iommu_group_put(iommu_group);
-   return true;
+   return device;
+}
+EXPORT_SYMBOL_GPL(vfio_device_get_from_dev);
+
+void *vfio_device_data(struct vfio_device *device)
+{
+   return device->device_data;
+}
+EXPORT_SYMBOL_GPL(vfio_device_data);
+
+/* Test whether a struct device is present in our tracking */
+static bool vfio_dev_present(struct device *dev)
+{
+   struct vfio_device *device;
+
+   device = vfio_device_get_from_dev(dev);
+   if (device) {
+   vfio_device_put(device);
+   return true;
+   } else
+   return false;
 }
 
 /*
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ab9e862..ac8d488 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -45,6 +45,9 @@ extern int vfio_add_group_dev(struct device *dev,
  void *device_data);
 
 extern void *vfio_del_group_dev(struct device *dev);
+extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
+extern void vfio_device_put(struct vfio_device *device);
+extern void *vfio_device_data(struct vfio_device *device);
 
 /**
  * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 1/3] VFIO: Wrapper for getting reference to vfio_device from device

2013-02-03 Thread Pandarathil, Vijaymohan R
- Added vfio_device_get_from_dev() as wrapper to get
  reference to vfio_device from struct device.

- Added vfio_device_data() as a wrapper to get device_data from
  vfio_device.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 drivers/vfio/vfio.c  | 41 -
 include/linux/vfio.h |  3 +++
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 12c264d..f0a78a2 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -407,12 +407,13 @@ static void vfio_device_release(struct kref *kref)
 }
 
 /* Device reference always implies a group reference */
-static void vfio_device_put(struct vfio_device *device)
+void vfio_device_put(struct vfio_device *device)
 {
struct vfio_group *group = device-group;
kref_put_mutex(device-kref, vfio_device_release, group-device_lock);
vfio_group_put(group);
 }
+EXPORT_SYMBOL_GPL(vfio_device_put);
 
 static void vfio_device_get(struct vfio_device *device)
 {
@@ -642,8 +643,12 @@ int vfio_add_group_dev(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(vfio_add_group_dev);
 
-/* Test whether a struct device is present in our tracking */
-static bool vfio_dev_present(struct device *dev)
+/**
+ * This does a get on the vfio_device from device.
+ * Callers of this function will have to call vfio_put_device() to
+ * remove the reference.
+ */
+struct vfio_device *vfio_device_get_from_dev(struct device *dev)
 {
struct iommu_group *iommu_group;
struct vfio_group *group;
@@ -651,25 +656,43 @@ static bool vfio_dev_present(struct device *dev)
 
iommu_group = iommu_group_get(dev);
if (!iommu_group)
-   return false;
+   return NULL;
 
group = vfio_group_get_from_iommu(iommu_group);
if (!group) {
iommu_group_put(iommu_group);
-   return false;
+   return NULL;
}
 
device = vfio_group_get_device(group, dev);
if (!device) {
vfio_group_put(group);
iommu_group_put(iommu_group);
-   return false;
+   return NULL;
}
-
-   vfio_device_put(device);
vfio_group_put(group);
iommu_group_put(iommu_group);
-   return true;
+   return device;
+}
+EXPORT_SYMBOL_GPL(vfio_device_get_from_dev);
+
+void *vfio_device_data(struct vfio_device *device)
+{
+   return device-device_data;
+}
+EXPORT_SYMBOL_GPL(vfio_device_data);
+
+/* Test whether a struct device is present in our tracking */
+static bool vfio_dev_present(struct device *dev)
+{
+   struct vfio_device *device;
+
+   device = vfio_device_get_from_dev(dev);
+   if (device) {
+   vfio_device_put(device);
+   return true;
+   } else
+   return false;
 }
 
 /*
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ab9e862..ac8d488 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -45,6 +45,9 @@ extern int vfio_add_group_dev(struct device *dev,
  void *device_data);
 
 extern void *vfio_del_group_dev(struct device *dev);
+extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
+extern void vfio_device_put(struct vfio_device *device);
+extern void *vfio_device_data(struct vfio_device *device);
 
 /**
  * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-02-03 Thread Pandarathil, Vijaymohan R
- New VFIO_SET_IRQ ioctl option to pass the eventfd that is signaled 
when
  an error occurs in the vfio_pci_device

- Register pci_error_handler for the vfio_pci driver

- When the device encounters an error, the error handler registered by
  the vfio_pci driver gets invoked by the AER infrastructure

- In the error handler, signal the eventfd registered for the device.

- This results in the qemu eventfd handler getting invoked and
  appropriate action taken for the guest.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 drivers/vfio/pci/vfio_pci.c | 43 -
 drivers/vfio/pci/vfio_pci_intrs.c   | 30 ++
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 include/uapi/linux/vfio.h   |  1 +
 4 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index b28e66c..818b1ed 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -196,7 +196,9 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device 
*vdev, int irq_type)
 
return (flags  PCI_MSIX_FLAGS_QSIZE) + 1;
}
-   }
+   } else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX)
+   if (pci_is_pcie(vdev-pdev))
+   return 1;
 
return 0;
 }
@@ -302,6 +304,16 @@ static long vfio_pci_ioctl(void *device_data,
if (info.argsz  minsz || info.index = VFIO_PCI_NUM_IRQS)
return -EINVAL;
 
+   switch (info.index) {
+   case VFIO_PCI_INTX_IRQ_INDEX ... VFIO_PCI_MSIX_IRQ_INDEX:
+   break;
+   case VFIO_PCI_ERR_IRQ_INDEX:
+   if (pci_is_pcie(vdev-pdev))
+   break;
+   default:
+   return -EINVAL;
+   }
+
info.flags = VFIO_IRQ_INFO_EVENTFD;
 
info.count = vfio_pci_get_irq_count(vdev, info.index);
@@ -538,11 +550,40 @@ static void vfio_pci_remove(struct pci_dev *pdev)
kfree(vdev);
 }
 
+static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+   struct vfio_pci_device *vpdev;
+   void *vdev;
+
+   vdev = vfio_device_get_from_dev(pdev-dev);
+   if (vdev == NULL)
+   return PCI_ERS_RESULT_DISCONNECT;
+
+   vpdev = vfio_device_data(vdev);
+   if (vpdev == NULL) {
+   vfio_device_put(vdev);
+   return PCI_ERS_RESULT_DISCONNECT;
+   }
+
+   if (vpdev-err_trigger)
+   eventfd_signal(vpdev-err_trigger, 1);
+
+   vfio_device_put(vdev);
+
+   return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+static struct pci_error_handlers vfio_err_handlers = {
+   .error_detected = vfio_pci_aer_err_detected,
+};
+
 static struct pci_driver vfio_pci_driver = {
.name   = vfio-pci,
.id_table   = NULL, /* only dynamic ids */
.probe  = vfio_pci_probe,
.remove = vfio_pci_remove,
+   .err_handler= vfio_err_handlers,
 };
 
 static void __exit vfio_pci_cleanup(void)
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c 
b/drivers/vfio/pci/vfio_pci_intrs.c
index 3639371..83035b1 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -745,6 +745,29 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_device 
*vdev,
return 0;
 }
 
+static int vfio_pci_set_err_trigger(struct vfio_pci_device *vdev,
+   unsigned index, unsigned start,
+   unsigned count, uint32_t flags, void *data)
+{
+   int32_t fd = *(int32_t *)data;
+
+   if ((index != VFIO_PCI_ERR_IRQ_INDEX) ||
+   !(flags  VFIO_IRQ_SET_DATA_EVENTFD))
+   return -EINVAL;
+
+   if (fd == -1) {
+   if (vdev-err_trigger)
+   eventfd_ctx_put(vdev-err_trigger);
+   vdev-err_trigger = NULL;
+   return 0;
+   } else if (fd = 0) {
+   vdev-err_trigger = eventfd_ctx_fdget(fd);
+   if (IS_ERR(vdev-err_trigger))
+   return PTR_ERR(vdev-err_trigger);
+   return 0;
+   } else
+   return -EINVAL;
+}
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_device *vdev, uint32_t flags,
unsigned index, unsigned start, unsigned count,
void *data)
@@ -779,6 +802,13 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_device *vdev, 
uint32_t flags,
break;
}
break;
+   case VFIO_PCI_ERR_IRQ_INDEX:
+   switch (flags  VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+   case VFIO_IRQ_SET_ACTION_TRIGGER:
+   if 

[PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-02-03 Thread Pandarathil, Vijaymohan R
- Create eventfd per vfio device assigned to a guest and register an
  event handler

- This fd is passed to the vfio_pci driver through the SET_IRQ ioctl

- When the device encounters an error, the eventfd is signalled
  and the qemu eventfd handler gets invoked.

- In the handler decide what action to take. Current action taken
  is to terminate the guest.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 hw/vfio_pci.c  | 105 +
 linux-headers/linux/vfio.h |   1 +
 2 files changed, 106 insertions(+)

diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index c51ae67..4e2f768 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -130,6 +130,8 @@ typedef struct VFIODevice {
 QLIST_ENTRY(VFIODevice) next;
 struct VFIOGroup *group;
 bool reset_works;
+EventNotifier err_notifier;
+bool pci_aer;
 } VFIODevice;
 
 typedef struct VFIOGroup {
@@ -1922,6 +1924,106 @@ static void vfio_put_device(VFIODevice *vdev)
 }
 }
 
+static void vfio_err_notifier_handler(void *opaque)
+{
+VFIODevice *vdev = opaque;
+
+if (!event_notifier_test_and_clear(vdev-err_notifier)) {
+return;
+}
+
+/*
+ * TBD. Retrieve the error details and decide what action
+ * needs to be taken. One of the actions could be to pass
+ * the error to the guest and have the guest driver recover
+ * from the error. This requires that PCIe capabilities be
+ * exposed to the guest. At present, we just terminate the
+ * guest to contain the error.
+ */
+
+error_report(%s (%04x:%02x:%02x.%x)
+Unrecoverable error detected... Terminating guest\n,
+__func__, vdev-host.domain, vdev-host.bus,
+vdev-host.slot, vdev-host.function);
+
+hw_error((%04x:%02x:%02x.%x) Unrecoverable device error\n,
+vdev-host.domain, vdev-host.bus,
+vdev-host.slot, vdev-host.function);
+
+return;
+}
+
+static void vfio_register_err_notifier(VFIODevice *vdev)
+{
+int ret;
+int argsz;
+struct vfio_irq_set *irq_set;
+int32_t *pfd;
+
+if (event_notifier_init(vdev-err_notifier, 0)) {
+error_report(vfio: Warning: Unable to init event notifier for error 
detection\n);
+return;
+}
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set-argsz = argsz;
+irq_set-flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set-index = VFIO_PCI_ERR_IRQ_INDEX;
+irq_set-start = 0;
+irq_set-count = 1;
+pfd = (int32_t *)irq_set-data;
+
+*pfd = event_notifier_get_fd(vdev-err_notifier);
+qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev);
+
+ret = ioctl(vdev-fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret) {
+DPRINTF(vfio: Error notification not supported for the device\n);
+qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
+event_notifier_cleanup(vdev-err_notifier);
+g_free(irq_set);
+return;
+}
+g_free(irq_set);
+vdev-pci_aer = 1;
+return;
+}
+static void vfio_unregister_err_notifier(VFIODevice *vdev)
+{
+int argsz;
+struct vfio_irq_set *irq_set;
+int32_t *pfd;
+int ret;
+
+if (!vdev-pci_aer) {
+return;
+}
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set-argsz = argsz;
+irq_set-flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set-index = VFIO_PCI_ERR_IRQ_INDEX;
+irq_set-start = 0;
+irq_set-count = 1;
+pfd = (int32_t *)irq_set-data;
+*pfd = -1;
+
+ret = ioctl(vdev-fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret) {
+DPRINTF(vfio: Failed to de-assign error fd: %d\n, ret);
+}
+g_free(irq_set);
+qemu_set_fd_handler(event_notifier_get_fd(vdev-err_notifier),
+NULL, NULL, vdev);
+event_notifier_cleanup(vdev-err_notifier);
+return;
+}
 static int vfio_initfn(PCIDevice *pdev)
 {
 VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
@@ -2032,6 +2134,8 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
+vfio_register_err_notifier(vdev);
+
 return 0;
 
 out_teardown:
@@ -2049,6 +2153,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
 VFIOGroup *group = vdev-group;
 
+vfio_unregister_err_notifier(vdev);
 pci_device_set_intx_routing_notifier(vdev-pdev, NULL);
 vfio_disable_interrupts(vdev);
 if (vdev-intx.mmap_timer) {
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index f787b72..6b20849 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -310,6 +310,7 @@ enum {
VFIO_PCI_INTX_IRQ_INDEX,
VFIO_PCI_MSI_IRQ_INDEX,
VFIO_PCI_MSIX_IRQ_INDEX,
+   VFIO_PCI_ERR_IRQ_INDEX,
VFIO_PCI_NUM_IRQS
 };

[PATCH v3 0/3] AER-KVM: Error containment of VFIO devices assigned to KVM guests

2013-02-03 Thread Pandarathil, Vijaymohan R

Add support for error containment when a VFIO device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the host OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, the error handler registered by the vfio-pci driver gets
invoked. The qemu process is signaled through an eventfd registered per
VFIO device by the qemu process. In the eventfd handler, qemu decides on
what action to take. In this implementation, guest is brought down to
contain the error.


v3:
 - Removed PCI_AER* flags from device info ioctl.
 - Incorporated feedback
v2:
 - Rebased to latest upstream stable bits
 - Changed the new ioctl to be part of VFIO_SET_IRQs ioctl
 - Added a new patch to get/put reference to a vfio device from struct device
 - Incorporated all other feedback.

---

Vijay Mohan Pandarathil(3):

[PATCH 1/3] VFIO: Wrapper for getting reference to vfio_device from device 
[PATCH 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER
[PATCH 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

Kernel files changed

 drivers/vfio/vfio.c  | 41 -
 include/linux/vfio.h |  3 +++
 2 files changed, 35 insertions(+), 9 deletions(-)

 drivers/vfio/pci/vfio_pci.c | 43 -
 drivers/vfio/pci/vfio_pci_intrs.c   | 30 ++
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 include/uapi/linux/vfio.h   |  1 +
 4 files changed, 74 insertions(+), 1 deletion(-)

Qemu files changed

 hw/vfio_pci.c  | 105 +
 linux-headers/linux/vfio.h |   1 +
 2 files changed, 106 insertions(+)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v2 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-02-01 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Tuesday, January 29, 2013 5:25 AM
> To: Pandarathil, Vijaymohan R
> Cc: Gleb Natapov; Bjorn Helgaas; Blue Swirl; Ortiz, Lance E;
> k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2 2/3] VFIO-AER: Vfio-pci driver changes for
> supporting AER
> 
> On Mon, 2013-01-28 at 12:31 -0700, Alex Williamson wrote:
> > On Mon, 2013-01-28 at 09:54 +, Pandarathil, Vijaymohan R wrote:
> > >   - New VFIO_SET_IRQ ioctl option to pass the eventfd that is signalled
> when
> > >   an error occurs in the vfio_pci_device
> > >
> > >   - Register pci_error_handler for the vfio_pci driver
> > >
> > >   - When the device encounters an error, the error handler registered
> by
> > >   the vfio_pci driver gets invoked by the AER infrastructure
> > >
> > >   - In the error handler, signal the eventfd registered for the device.
> > >
> > >   - This results in the qemu eventfd handler getting invoked and
> > >   appropriate action taken for the guest.
> > >
> > > Signed-off-by: Vijay Mohan Pandarathil 
> > > ---
> > >  drivers/vfio/pci/vfio_pci.c | 44
> -
> > >  drivers/vfio/pci/vfio_pci_intrs.c   | 32 +++
> > >  drivers/vfio/pci/vfio_pci_private.h |  1 +
> > >  include/uapi/linux/vfio.h   |  3 +++
> > >  4 files changed, 79 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> > > index b28e66c..ff2a078 100644
> > > --- a/drivers/vfio/pci/vfio_pci.c
> > > +++ b/drivers/vfio/pci/vfio_pci.c
> > > @@ -196,7 +196,9 @@ static int vfio_pci_get_irq_count(struct
> vfio_pci_device *vdev, int irq_type)
> > >
> > >   return (flags & PCI_MSIX_FLAGS_QSIZE) + 1;
> > >   }
> > > - }
> > > + } else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX)
> > > + if (pci_is_pcie(vdev->pdev))
> > > + return 1;
> > >
> > >   return 0;
> > >  }
> > > @@ -223,9 +225,18 @@ static long vfio_pci_ioctl(void *device_data,
> > >   if (vdev->reset_works)
> > >   info.flags |= VFIO_DEVICE_FLAGS_RESET;
> > >
> > > + if (pci_is_pcie(vdev->pdev)) {
> > > + info.flags |= VFIO_DEVICE_FLAGS_PCI_AER;
> > > + info.flags |= VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY;
> >
> > Not sure this second flag should be AER specific or if it's even needed,
> > see below for more comments on this.
> >
> > > + }
> > > +
> > >   info.num_regions = VFIO_PCI_NUM_REGIONS;
> > >   info.num_irqs = VFIO_PCI_NUM_IRQS;
> > >
> > > + /* Expose only implemented IRQs */
> > > + if (!(info.flags & VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY))
> > > + info.num_irqs--;
> >
> > I'm having second thoughts on this, see further below.
> >
> > > +
> > >   return copy_to_user((void __user *)arg, , minsz);
> > >
> > >   } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
> > > @@ -302,6 +313,10 @@ static long vfio_pci_ioctl(void *device_data,
> > >   if (info.argsz < minsz || info.index >= VFIO_PCI_NUM_IRQS)
> > >   return -EINVAL;
> > >
> > > + if ((info.index == VFIO_PCI_ERR_IRQ_INDEX) &&
> > > +  !pci_is_pcie(vdev->pdev))
> > > + return -EINVAL;
> > > +
> >
> > Perhaps we could incorporate the index test above this too?
> >
> > switch (info.index) {
> > case VFIO_PCI_INTX_IRQ_INDEX: ... VFIO_PCI_MSIX_IRQ_INDEX:
> > break;
> > case VFIO_PCI_ERR_IRQ_INDEX:
> > if (pci_is_pcie(vdev->pdev))
> > break;
> > default:
> > return -EINVAL;
> > }
> >
> > This is more similar to how I've re-written the same for the proposed
> > VGA/legacy I/O support.
> >
> > >   info.flags = VFIO_IRQ_INFO_EVENTFD;
> > >
> > >   info.count = vfio_pci_get_irq_count(vdev, info.index);
> > > @@ -538,11 +553,38 @@ static void vfio_pci_remove(struct pci_dev *pdev)
> > >   kfree(vdev);
> > >  }
> > >
> > > +static pci_ers_

RE: [PATCH v2 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-02-01 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Tuesday, January 29, 2013 5:25 AM
 To: Pandarathil, Vijaymohan R
 Cc: Gleb Natapov; Bjorn Helgaas; Blue Swirl; Ortiz, Lance E;
 k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org;
 linux-kernel@vger.kernel.org
 Subject: Re: [PATCH v2 2/3] VFIO-AER: Vfio-pci driver changes for
 supporting AER
 
 On Mon, 2013-01-28 at 12:31 -0700, Alex Williamson wrote:
  On Mon, 2013-01-28 at 09:54 +, Pandarathil, Vijaymohan R wrote:
 - New VFIO_SET_IRQ ioctl option to pass the eventfd that is signalled
 when
 an error occurs in the vfio_pci_device
  
 - Register pci_error_handler for the vfio_pci driver
  
 - When the device encounters an error, the error handler registered
 by
 the vfio_pci driver gets invoked by the AER infrastructure
  
 - In the error handler, signal the eventfd registered for the device.
  
 - This results in the qemu eventfd handler getting invoked and
 appropriate action taken for the guest.
  
   Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
   ---
drivers/vfio/pci/vfio_pci.c | 44
 -
drivers/vfio/pci/vfio_pci_intrs.c   | 32 +++
drivers/vfio/pci/vfio_pci_private.h |  1 +
include/uapi/linux/vfio.h   |  3 +++
4 files changed, 79 insertions(+), 1 deletion(-)
  
   diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
   index b28e66c..ff2a078 100644
   --- a/drivers/vfio/pci/vfio_pci.c
   +++ b/drivers/vfio/pci/vfio_pci.c
   @@ -196,7 +196,9 @@ static int vfio_pci_get_irq_count(struct
 vfio_pci_device *vdev, int irq_type)
  
 return (flags  PCI_MSIX_FLAGS_QSIZE) + 1;
 }
   - }
   + } else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX)
   + if (pci_is_pcie(vdev-pdev))
   + return 1;
  
 return 0;
}
   @@ -223,9 +225,18 @@ static long vfio_pci_ioctl(void *device_data,
 if (vdev-reset_works)
 info.flags |= VFIO_DEVICE_FLAGS_RESET;
  
   + if (pci_is_pcie(vdev-pdev)) {
   + info.flags |= VFIO_DEVICE_FLAGS_PCI_AER;
   + info.flags |= VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY;
 
  Not sure this second flag should be AER specific or if it's even needed,
  see below for more comments on this.
 
   + }
   +
 info.num_regions = VFIO_PCI_NUM_REGIONS;
 info.num_irqs = VFIO_PCI_NUM_IRQS;
  
   + /* Expose only implemented IRQs */
   + if (!(info.flags  VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY))
   + info.num_irqs--;
 
  I'm having second thoughts on this, see further below.
 
   +
 return copy_to_user((void __user *)arg, info, minsz);
  
 } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
   @@ -302,6 +313,10 @@ static long vfio_pci_ioctl(void *device_data,
 if (info.argsz  minsz || info.index = VFIO_PCI_NUM_IRQS)
 return -EINVAL;
  
   + if ((info.index == VFIO_PCI_ERR_IRQ_INDEX) 
   +  !pci_is_pcie(vdev-pdev))
   + return -EINVAL;
   +
 
  Perhaps we could incorporate the index test above this too?
 
  switch (info.index) {
  case VFIO_PCI_INTX_IRQ_INDEX: ... VFIO_PCI_MSIX_IRQ_INDEX:
  break;
  case VFIO_PCI_ERR_IRQ_INDEX:
  if (pci_is_pcie(vdev-pdev))
  break;
  default:
  return -EINVAL;
  }
 
  This is more similar to how I've re-written the same for the proposed
  VGA/legacy I/O support.
 
 info.flags = VFIO_IRQ_INFO_EVENTFD;
  
 info.count = vfio_pci_get_irq_count(vdev, info.index);
   @@ -538,11 +553,38 @@ static void vfio_pci_remove(struct pci_dev *pdev)
 kfree(vdev);
}
  
   +static pci_ers_result_t vfio_err_detected(struct pci_dev *pdev,
   + pci_channel_state_t state)
 
  This is actually AER specific, right?  So perhaps it should be
  vfio_pci_aer_err_detected?
 
  Also, please follow existing whitespace usage throughout, tabs followed
  by spaces to align function parameter wrap.
 
   +{
   + struct vfio_pci_device *vpdev;
   + void *vdev;
 
  struct vfio_device *vdev;
 
   +
   + vdev = vfio_device_get_from_dev(pdev-dev);
   + if (vdev == NULL)
   + return PCI_ERS_RESULT_DISCONNECT;
   +
   + vpdev = vfio_device_data(vdev);
   + if (vpdev == NULL)
   + return PCI_ERS_RESULT_DISCONNECT;
   +
   + if (vpdev-err_trigger)
   + eventfd_signal(vpdev-err_trigger, 1);
   +
   + vfio_device_put_vdev(vdev);
   +
   + return PCI_ERS_RESULT_CAN_RECOVER;
   +}
   +
   +static const struct pci_error_handlers vfio_err_handlers = {
   + .error_detected = vfio_err_detected,
   +};
   +
static struct pci_driver vfio_pci_driver = {
 .name   = vfio-pci,
 .id_table   = NULL, /* only dynamic ids */
 .probe  = vfio_pci_probe

[PATCH v2 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-28 Thread Pandarathil, Vijaymohan R

- Create eventfd per vfio device assigned to a guest and register an
  event handler

- This fd is passed to the vfio_pci driver through the SET_IRQ ioctl

- When the device encounters an error, the eventfd is signalled
  and the qemu eventfd handler gets invoked.

- In the handler decide what action to take. Current action taken
  is to terminate the guest.

Signed-off-by: Vijay Mohan Pandarathil 
---
 hw/vfio_pci.c  | 106 +
 linux-headers/linux/vfio.h |   3 ++
 2 files changed, 109 insertions(+)

diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index c51ae67..c4a6aec 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -130,6 +130,8 @@ typedef struct VFIODevice {
 QLIST_ENTRY(VFIODevice) next;
 struct VFIOGroup *group;
 bool reset_works;
+bool pci_aer_notify_cap;
+EventNotifier err_notifier;
 } VFIODevice;
 
 typedef struct VFIOGroup {
@@ -1837,6 +1839,10 @@ static int vfio_get_device(VFIOGroup *group, const char 
*name, VFIODevice *vdev)
 error_report("Warning, device %s does not support reset\n", name);
 }
 
+vdev->pci_aer_notify_cap = !!(dev_info.flags &
+(VFIO_DEVICE_FLAGS_PCI_AER |
+VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY));
+
 if (dev_info.num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
 error_report("vfio: unexpected number of io regions %u\n",
  dev_info.num_regions);
@@ -1922,6 +1928,103 @@ static void vfio_put_device(VFIODevice *vdev)
 }
 }
 
+static void vfio_err_notifier_handler(void *opaque)
+{
+VFIODevice *vdev = opaque;
+
+if (!event_notifier_test_and_clear(>err_notifier)) {
+return;
+}
+
+/*
+ * TBD. Retrieve the error details and decide what action
+ * needs to be taken. One of the actions could be to pass
+ * the error to the guest and have the guest driver recover
+ * from the error. This requires that PCIe capabilities be
+ * exposed to the guest. At present, we just terminate the
+ * guest to contain the error.
+ */
+
+error_report("%s (%04x:%02x:%02x.%x)"
+"Unrecoverable error detected... Terminating guest\n",
+__func__, vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
+
+hw_error("(%04x:%02x:%02x.%x) Unrecoverable device error\n",
+vdev->host.domain, vdev->host.bus,
+vdev->host.slot, vdev->host.function);
+
+return;
+}
+
+static void vfio_register_err_notifier(VFIODevice *vdev)
+{
+int ret;
+int argsz;
+struct vfio_irq_set *irq_set;
+int32_t *pfd;
+
+if (!(vdev->pci_aer_notify_cap)) {
+DPRINTF("vfio: Error notification not supported for the device\n");
+return;
+}
+if (event_notifier_init(>err_notifier, 0)) {
+error_report("vfio: Warning: Unable to init event notifier for error 
detection\n");
+return;
+}
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set->index = VFIO_PCI_ERR_IRQ_INDEX;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)_set->data;
+
+*pfd = event_notifier_get_fd(>err_notifier);
+qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev);
+
+ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret) {
+error_report("vfio: Warning: Failed to setup error fd: %d\n", ret);
+qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
+event_notifier_cleanup(>err_notifier);
+}
+g_free(irq_set);
+return;
+}
+static void vfio_unregister_err_notifier(VFIODevice *vdev)
+{
+int argsz;
+struct vfio_irq_set *irq_set;
+int32_t *pfd;
+int ret;
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set->index = VFIO_PCI_ERR_IRQ_INDEX;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)_set->data;
+*pfd = -1;
+
+ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret) {
+DPRINTF("vfio: Failed to de-assign error fd: %d\n", ret);
+}
+g_free(irq_set);
+qemu_set_fd_handler(event_notifier_get_fd(>err_notifier),
+NULL, NULL, vdev);
+event_notifier_cleanup(>err_notifier);
+return;
+}
 static int vfio_initfn(PCIDevice *pdev)
 {
 VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
@@ -2032,6 +2135,8 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
+vfio_register_err_notifier(vdev);
+
 return 0;
 
 out_teardown:
@@ -2049,6 +2154,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 VFIODevice *vdev = 

[PATCH v2 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-01-28 Thread Pandarathil, Vijaymohan R
 
- New VFIO_SET_IRQ ioctl option to pass the eventfd that is signalled 
when
  an error occurs in the vfio_pci_device

- Register pci_error_handler for the vfio_pci driver

- When the device encounters an error, the error handler registered by
  the vfio_pci driver gets invoked by the AER infrastructure

- In the error handler, signal the eventfd registered for the device.

- This results in the qemu eventfd handler getting invoked and
  appropriate action taken for the guest.

Signed-off-by: Vijay Mohan Pandarathil 
---
 drivers/vfio/pci/vfio_pci.c | 44 -
 drivers/vfio/pci/vfio_pci_intrs.c   | 32 +++
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 include/uapi/linux/vfio.h   |  3 +++
 4 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index b28e66c..ff2a078 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -196,7 +196,9 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device 
*vdev, int irq_type)
 
return (flags & PCI_MSIX_FLAGS_QSIZE) + 1;
}
-   }
+   } else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX)
+   if (pci_is_pcie(vdev->pdev))
+   return 1;
 
return 0;
 }
@@ -223,9 +225,18 @@ static long vfio_pci_ioctl(void *device_data,
if (vdev->reset_works)
info.flags |= VFIO_DEVICE_FLAGS_RESET;
 
+   if (pci_is_pcie(vdev->pdev)) {
+   info.flags |= VFIO_DEVICE_FLAGS_PCI_AER;
+   info.flags |= VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY;
+   }
+
info.num_regions = VFIO_PCI_NUM_REGIONS;
info.num_irqs = VFIO_PCI_NUM_IRQS;
 
+   /* Expose only implemented IRQs */
+   if (!(info.flags & VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY))
+   info.num_irqs--;
+
return copy_to_user((void __user *)arg, , minsz);
 
} else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
@@ -302,6 +313,10 @@ static long vfio_pci_ioctl(void *device_data,
if (info.argsz < minsz || info.index >= VFIO_PCI_NUM_IRQS)
return -EINVAL;
 
+   if ((info.index == VFIO_PCI_ERR_IRQ_INDEX) &&
+!pci_is_pcie(vdev->pdev))
+   return -EINVAL;
+
info.flags = VFIO_IRQ_INFO_EVENTFD;
 
info.count = vfio_pci_get_irq_count(vdev, info.index);
@@ -538,11 +553,38 @@ static void vfio_pci_remove(struct pci_dev *pdev)
kfree(vdev);
 }
 
+static pci_ers_result_t vfio_err_detected(struct pci_dev *pdev,
+   pci_channel_state_t state)
+{
+   struct vfio_pci_device *vpdev;
+   void *vdev;
+
+   vdev = vfio_device_get_from_dev(>dev);
+   if (vdev == NULL)
+   return PCI_ERS_RESULT_DISCONNECT;
+
+   vpdev = vfio_device_data(vdev);
+   if (vpdev == NULL)
+   return PCI_ERS_RESULT_DISCONNECT;
+
+   if (vpdev->err_trigger)
+   eventfd_signal(vpdev->err_trigger, 1);
+
+   vfio_device_put_vdev(vdev);
+
+   return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+static const struct pci_error_handlers vfio_err_handlers = {
+   .error_detected = vfio_err_detected,
+};
+
 static struct pci_driver vfio_pci_driver = {
.name   = "vfio-pci",
.id_table   = NULL, /* only dynamic ids */
.probe  = vfio_pci_probe,
.remove = vfio_pci_remove,
+   .err_handler= _err_handlers,
 };
 
 static void __exit vfio_pci_cleanup(void)
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c 
b/drivers/vfio/pci/vfio_pci_intrs.c
index 3639371..f003e08 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -745,6 +745,31 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_device 
*vdev,
return 0;
 }
 
+static int vfio_pci_set_err_eventfd(struct vfio_pci_device *vdev,
+   unsigned index, unsigned start,
+   unsigned count, uint32_t flags, void *data)
+{
+   if ((index == VFIO_PCI_ERR_IRQ_INDEX) &&
+   (flags & VFIO_IRQ_SET_DATA_EVENTFD)   &&
+   pci_is_pcie(vdev->pdev)) {
+
+   int32_t fd = *(int32_t *)data;
+
+   if (fd == -1) {
+   if (vdev->err_trigger)
+   eventfd_ctx_put(vdev->err_trigger);
+   vdev->err_trigger = NULL;
+   return 0;
+   } else if (fd >= 0) {
+   vdev->err_trigger = eventfd_ctx_fdget(fd);
+   if (IS_ERR(vdev->err_trigger))
+   return PTR_ERR(vdev->err_trigger);
+   return 0;
+   } 

[PATCH v2 1/3] VFIO: Wrappers for getting/putting reference to vfio_device

2013-01-28 Thread Pandarathil, Vijaymohan R

- Added vfio_device_get_from_vdev(), vfio_device_put_vdev()
  as wrappers to get/put reference to vfio_device from struct device.

- Added vfio_device_data() as a wrapper to get device_data from
  vfio_device.

Signed-off-by: Vijay Mohan Pandarathil 
---
 drivers/vfio/vfio.c  | 47 +--
 include/linux/vfio.h |  3 +++
 2 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 12c264d..c2ff1b2 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -642,8 +642,13 @@ int vfio_add_group_dev(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(vfio_add_group_dev);
 
-/* Test whether a struct device is present in our tracking */
-static bool vfio_dev_present(struct device *dev)
+/**
+ * This does a get on the corresponding iommu_group,
+ * vfio_group and the vfio_device. Callers of this
+ * function will hae to call vfio_put_vdev() to
+ * remove the reference to all objects.
+ */
+void *vfio_device_get_from_dev(struct device *dev)
 {
struct iommu_group *iommu_group;
struct vfio_group *group;
@@ -651,25 +656,55 @@ static bool vfio_dev_present(struct device *dev)
 
iommu_group = iommu_group_get(dev);
if (!iommu_group)
-   return false;
+   return NULL;
 
group = vfio_group_get_from_iommu(iommu_group);
if (!group) {
iommu_group_put(iommu_group);
-   return false;
+   return NULL;
}
 
device = vfio_group_get_device(group, dev);
if (!device) {
vfio_group_put(group);
iommu_group_put(iommu_group);
-   return false;
+   return NULL;
}
+   return device;
+}
+EXPORT_SYMBOL_GPL(vfio_device_get_from_dev);
+
+void *vfio_device_data(void *data)
+{
+   struct vfio_device *device = data;
+   return device->device_data;
+}
+EXPORT_SYMBOL_GPL(vfio_device_data);
+
+void vfio_device_put_vdev(void *data)
+{
+   struct vfio_device *device = data;
+   struct vfio_group *group = device->group;
+   struct iommu_group *iommu_group = group->iommu_group;
 
vfio_device_put(device);
vfio_group_put(group);
iommu_group_put(iommu_group);
-   return true;
+   return;
+}
+EXPORT_SYMBOL_GPL(vfio_device_put_vdev);
+
+/* Test whether a struct device is present in our tracking */
+static bool vfio_dev_present(struct device *dev)
+{
+   struct vfio_device *device;
+
+   device = vfio_device_get_from_dev(dev);
+   if (device) {
+   vfio_device_put_vdev(device);
+   return true;
+   } else
+   return false;
 }
 
 /*
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ab9e862..e550c09 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -45,6 +45,9 @@ extern int vfio_add_group_dev(struct device *dev,
  void *device_data);
 
 extern void *vfio_del_group_dev(struct device *dev);
+extern void *vfio_device_get_from_dev(struct device *dev);
+extern void vfio_device_put_vdev(void *device);
+extern void *vfio_device_data(void *device);
 
 /**
  * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 0/3] AER-KVM: Error containment of VFIO devices assigned to KVM guests

2013-01-28 Thread Pandarathil, Vijaymohan R
Add support for error containment when a VFIO device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the host OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, the error handler registered by the vfio-pci driver gets
invoked. The qemu process is signaled through an eventfd registered per
VFIO device by the qemu process. In the eventfd handler, qemu decides on
what action to take. In this implementation, guest is brought down to
contain the error.

v2:
 - Rebased to latest upstream stable bits
 - Changed the new ioctl to be part of VFIO_SET_IRQs ioctl
 - Added a new patch to get/put reference to a vfio device from struct device
 - Incorporated all other feedback.

---
Vijay Mohan Pandarathil(3):

[PATCH 1/3] VFIO: Wrappers for getting/putting reference to vfio_device
[PATCH 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER
[PATCH 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

Kernel files changed

 drivers/vfio/vfio.c  | 47 +--
 include/linux/vfio.h |  3 +++
 2 files changed, 44 insertions(+), 6 deletions(-)

 drivers/vfio/pci/vfio_pci.c | 44 -
 drivers/vfio/pci/vfio_pci_intrs.c   | 32 +++
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 include/uapi/linux/vfio.h   |  3 +++
 4 files changed, 79 insertions(+), 1 deletion(-)


Qemu files changed

 hw/vfio_pci.c  | 106 +
 linux-headers/linux/vfio.h |   3 ++
 2 files changed, 109 insertions(+)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 0/3] AER-KVM: Error containment of VFIO devices assigned to KVM guests

2013-01-28 Thread Pandarathil, Vijaymohan R
Add support for error containment when a VFIO device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the host OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, the error handler registered by the vfio-pci driver gets
invoked. The qemu process is signaled through an eventfd registered per
VFIO device by the qemu process. In the eventfd handler, qemu decides on
what action to take. In this implementation, guest is brought down to
contain the error.

v2:
 - Rebased to latest upstream stable bits
 - Changed the new ioctl to be part of VFIO_SET_IRQs ioctl
 - Added a new patch to get/put reference to a vfio device from struct device
 - Incorporated all other feedback.

---
Vijay Mohan Pandarathil(3):

[PATCH 1/3] VFIO: Wrappers for getting/putting reference to vfio_device
[PATCH 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER
[PATCH 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

Kernel files changed

 drivers/vfio/vfio.c  | 47 +--
 include/linux/vfio.h |  3 +++
 2 files changed, 44 insertions(+), 6 deletions(-)

 drivers/vfio/pci/vfio_pci.c | 44 -
 drivers/vfio/pci/vfio_pci_intrs.c   | 32 +++
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 include/uapi/linux/vfio.h   |  3 +++
 4 files changed, 79 insertions(+), 1 deletion(-)


Qemu files changed

 hw/vfio_pci.c  | 106 +
 linux-headers/linux/vfio.h |   3 ++
 2 files changed, 109 insertions(+)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/3] VFIO: Wrappers for getting/putting reference to vfio_device

2013-01-28 Thread Pandarathil, Vijaymohan R

- Added vfio_device_get_from_vdev(), vfio_device_put_vdev()
  as wrappers to get/put reference to vfio_device from struct device.

- Added vfio_device_data() as a wrapper to get device_data from
  vfio_device.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 drivers/vfio/vfio.c  | 47 +--
 include/linux/vfio.h |  3 +++
 2 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 12c264d..c2ff1b2 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -642,8 +642,13 @@ int vfio_add_group_dev(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(vfio_add_group_dev);
 
-/* Test whether a struct device is present in our tracking */
-static bool vfio_dev_present(struct device *dev)
+/**
+ * This does a get on the corresponding iommu_group,
+ * vfio_group and the vfio_device. Callers of this
+ * function will hae to call vfio_put_vdev() to
+ * remove the reference to all objects.
+ */
+void *vfio_device_get_from_dev(struct device *dev)
 {
struct iommu_group *iommu_group;
struct vfio_group *group;
@@ -651,25 +656,55 @@ static bool vfio_dev_present(struct device *dev)
 
iommu_group = iommu_group_get(dev);
if (!iommu_group)
-   return false;
+   return NULL;
 
group = vfio_group_get_from_iommu(iommu_group);
if (!group) {
iommu_group_put(iommu_group);
-   return false;
+   return NULL;
}
 
device = vfio_group_get_device(group, dev);
if (!device) {
vfio_group_put(group);
iommu_group_put(iommu_group);
-   return false;
+   return NULL;
}
+   return device;
+}
+EXPORT_SYMBOL_GPL(vfio_device_get_from_dev);
+
+void *vfio_device_data(void *data)
+{
+   struct vfio_device *device = data;
+   return device-device_data;
+}
+EXPORT_SYMBOL_GPL(vfio_device_data);
+
+void vfio_device_put_vdev(void *data)
+{
+   struct vfio_device *device = data;
+   struct vfio_group *group = device-group;
+   struct iommu_group *iommu_group = group-iommu_group;
 
vfio_device_put(device);
vfio_group_put(group);
iommu_group_put(iommu_group);
-   return true;
+   return;
+}
+EXPORT_SYMBOL_GPL(vfio_device_put_vdev);
+
+/* Test whether a struct device is present in our tracking */
+static bool vfio_dev_present(struct device *dev)
+{
+   struct vfio_device *device;
+
+   device = vfio_device_get_from_dev(dev);
+   if (device) {
+   vfio_device_put_vdev(device);
+   return true;
+   } else
+   return false;
 }
 
 /*
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ab9e862..e550c09 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -45,6 +45,9 @@ extern int vfio_add_group_dev(struct device *dev,
  void *device_data);
 
 extern void *vfio_del_group_dev(struct device *dev);
+extern void *vfio_device_get_from_dev(struct device *dev);
+extern void vfio_device_put_vdev(void *device);
+extern void *vfio_device_data(void *device);
 
 /**
  * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/3] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-01-28 Thread Pandarathil, Vijaymohan R
 
- New VFIO_SET_IRQ ioctl option to pass the eventfd that is signalled 
when
  an error occurs in the vfio_pci_device

- Register pci_error_handler for the vfio_pci driver

- When the device encounters an error, the error handler registered by
  the vfio_pci driver gets invoked by the AER infrastructure

- In the error handler, signal the eventfd registered for the device.

- This results in the qemu eventfd handler getting invoked and
  appropriate action taken for the guest.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 drivers/vfio/pci/vfio_pci.c | 44 -
 drivers/vfio/pci/vfio_pci_intrs.c   | 32 +++
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 include/uapi/linux/vfio.h   |  3 +++
 4 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index b28e66c..ff2a078 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -196,7 +196,9 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device 
*vdev, int irq_type)
 
return (flags  PCI_MSIX_FLAGS_QSIZE) + 1;
}
-   }
+   } else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX)
+   if (pci_is_pcie(vdev-pdev))
+   return 1;
 
return 0;
 }
@@ -223,9 +225,18 @@ static long vfio_pci_ioctl(void *device_data,
if (vdev-reset_works)
info.flags |= VFIO_DEVICE_FLAGS_RESET;
 
+   if (pci_is_pcie(vdev-pdev)) {
+   info.flags |= VFIO_DEVICE_FLAGS_PCI_AER;
+   info.flags |= VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY;
+   }
+
info.num_regions = VFIO_PCI_NUM_REGIONS;
info.num_irqs = VFIO_PCI_NUM_IRQS;
 
+   /* Expose only implemented IRQs */
+   if (!(info.flags  VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY))
+   info.num_irqs--;
+
return copy_to_user((void __user *)arg, info, minsz);
 
} else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
@@ -302,6 +313,10 @@ static long vfio_pci_ioctl(void *device_data,
if (info.argsz  minsz || info.index = VFIO_PCI_NUM_IRQS)
return -EINVAL;
 
+   if ((info.index == VFIO_PCI_ERR_IRQ_INDEX) 
+!pci_is_pcie(vdev-pdev))
+   return -EINVAL;
+
info.flags = VFIO_IRQ_INFO_EVENTFD;
 
info.count = vfio_pci_get_irq_count(vdev, info.index);
@@ -538,11 +553,38 @@ static void vfio_pci_remove(struct pci_dev *pdev)
kfree(vdev);
 }
 
+static pci_ers_result_t vfio_err_detected(struct pci_dev *pdev,
+   pci_channel_state_t state)
+{
+   struct vfio_pci_device *vpdev;
+   void *vdev;
+
+   vdev = vfio_device_get_from_dev(pdev-dev);
+   if (vdev == NULL)
+   return PCI_ERS_RESULT_DISCONNECT;
+
+   vpdev = vfio_device_data(vdev);
+   if (vpdev == NULL)
+   return PCI_ERS_RESULT_DISCONNECT;
+
+   if (vpdev-err_trigger)
+   eventfd_signal(vpdev-err_trigger, 1);
+
+   vfio_device_put_vdev(vdev);
+
+   return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+static const struct pci_error_handlers vfio_err_handlers = {
+   .error_detected = vfio_err_detected,
+};
+
 static struct pci_driver vfio_pci_driver = {
.name   = vfio-pci,
.id_table   = NULL, /* only dynamic ids */
.probe  = vfio_pci_probe,
.remove = vfio_pci_remove,
+   .err_handler= vfio_err_handlers,
 };
 
 static void __exit vfio_pci_cleanup(void)
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c 
b/drivers/vfio/pci/vfio_pci_intrs.c
index 3639371..f003e08 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -745,6 +745,31 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_device 
*vdev,
return 0;
 }
 
+static int vfio_pci_set_err_eventfd(struct vfio_pci_device *vdev,
+   unsigned index, unsigned start,
+   unsigned count, uint32_t flags, void *data)
+{
+   if ((index == VFIO_PCI_ERR_IRQ_INDEX) 
+   (flags  VFIO_IRQ_SET_DATA_EVENTFD)   
+   pci_is_pcie(vdev-pdev)) {
+
+   int32_t fd = *(int32_t *)data;
+
+   if (fd == -1) {
+   if (vdev-err_trigger)
+   eventfd_ctx_put(vdev-err_trigger);
+   vdev-err_trigger = NULL;
+   return 0;
+   } else if (fd = 0) {
+   vdev-err_trigger = eventfd_ctx_fdget(fd);
+   if (IS_ERR(vdev-err_trigger))
+   return PTR_ERR(vdev-err_trigger);
+   return 0;
+  

[PATCH v2 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-28 Thread Pandarathil, Vijaymohan R

- Create eventfd per vfio device assigned to a guest and register an
  event handler

- This fd is passed to the vfio_pci driver through the SET_IRQ ioctl

- When the device encounters an error, the eventfd is signalled
  and the qemu eventfd handler gets invoked.

- In the handler decide what action to take. Current action taken
  is to terminate the guest.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 hw/vfio_pci.c  | 106 +
 linux-headers/linux/vfio.h |   3 ++
 2 files changed, 109 insertions(+)

diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index c51ae67..c4a6aec 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -130,6 +130,8 @@ typedef struct VFIODevice {
 QLIST_ENTRY(VFIODevice) next;
 struct VFIOGroup *group;
 bool reset_works;
+bool pci_aer_notify_cap;
+EventNotifier err_notifier;
 } VFIODevice;
 
 typedef struct VFIOGroup {
@@ -1837,6 +1839,10 @@ static int vfio_get_device(VFIOGroup *group, const char 
*name, VFIODevice *vdev)
 error_report(Warning, device %s does not support reset\n, name);
 }
 
+vdev-pci_aer_notify_cap = !!(dev_info.flags 
+(VFIO_DEVICE_FLAGS_PCI_AER |
+VFIO_DEVICE_FLAGS_PCI_AER_NOTIFY));
+
 if (dev_info.num_regions  VFIO_PCI_CONFIG_REGION_INDEX + 1) {
 error_report(vfio: unexpected number of io regions %u\n,
  dev_info.num_regions);
@@ -1922,6 +1928,103 @@ static void vfio_put_device(VFIODevice *vdev)
 }
 }
 
+static void vfio_err_notifier_handler(void *opaque)
+{
+VFIODevice *vdev = opaque;
+
+if (!event_notifier_test_and_clear(vdev-err_notifier)) {
+return;
+}
+
+/*
+ * TBD. Retrieve the error details and decide what action
+ * needs to be taken. One of the actions could be to pass
+ * the error to the guest and have the guest driver recover
+ * from the error. This requires that PCIe capabilities be
+ * exposed to the guest. At present, we just terminate the
+ * guest to contain the error.
+ */
+
+error_report(%s (%04x:%02x:%02x.%x)
+Unrecoverable error detected... Terminating guest\n,
+__func__, vdev-host.domain, vdev-host.bus,
+vdev-host.slot, vdev-host.function);
+
+hw_error((%04x:%02x:%02x.%x) Unrecoverable device error\n,
+vdev-host.domain, vdev-host.bus,
+vdev-host.slot, vdev-host.function);
+
+return;
+}
+
+static void vfio_register_err_notifier(VFIODevice *vdev)
+{
+int ret;
+int argsz;
+struct vfio_irq_set *irq_set;
+int32_t *pfd;
+
+if (!(vdev-pci_aer_notify_cap)) {
+DPRINTF(vfio: Error notification not supported for the device\n);
+return;
+}
+if (event_notifier_init(vdev-err_notifier, 0)) {
+error_report(vfio: Warning: Unable to init event notifier for error 
detection\n);
+return;
+}
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set-argsz = argsz;
+irq_set-flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set-index = VFIO_PCI_ERR_IRQ_INDEX;
+irq_set-start = 0;
+irq_set-count = 1;
+pfd = (int32_t *)irq_set-data;
+
+*pfd = event_notifier_get_fd(vdev-err_notifier);
+qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev);
+
+ret = ioctl(vdev-fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret) {
+error_report(vfio: Warning: Failed to setup error fd: %d\n, ret);
+qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
+event_notifier_cleanup(vdev-err_notifier);
+}
+g_free(irq_set);
+return;
+}
+static void vfio_unregister_err_notifier(VFIODevice *vdev)
+{
+int argsz;
+struct vfio_irq_set *irq_set;
+int32_t *pfd;
+int ret;
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+irq_set = g_malloc0(argsz);
+irq_set-argsz = argsz;
+irq_set-flags = VFIO_IRQ_SET_DATA_EVENTFD |
+ VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set-index = VFIO_PCI_ERR_IRQ_INDEX;
+irq_set-start = 0;
+irq_set-count = 1;
+pfd = (int32_t *)irq_set-data;
+*pfd = -1;
+
+ret = ioctl(vdev-fd, VFIO_DEVICE_SET_IRQS, irq_set);
+if (ret) {
+DPRINTF(vfio: Failed to de-assign error fd: %d\n, ret);
+}
+g_free(irq_set);
+qemu_set_fd_handler(event_notifier_get_fd(vdev-err_notifier),
+NULL, NULL, vdev);
+event_notifier_cleanup(vdev-err_notifier);
+return;
+}
 static int vfio_initfn(PCIDevice *pdev)
 {
 VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
@@ -2032,6 +2135,8 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
+vfio_register_err_notifier(vdev);
+
 return 0;
 
 out_teardown:
@@ -2049,6 +2154,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 VFIODevice *vdev = 

RE: [Qemu-devel] [PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-11 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: Blue Swirl [mailto:blauwir...@gmail.com]
> Sent: Wednesday, January 09, 2013 1:23 PM
> To: Pandarathil, Vijaymohan R
> Cc: Alex Williamson; Bjorn Helgaas; Gleb Natapov; linux-
> p...@vger.kernel.org; qemu-de...@nongnu.org; k...@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: Re: [Qemu-devel] [PATCH 2/2] QEMU-AER: Qemu changes to support AER
> for VFIO-PCI devices
> 
> On Wed, Jan 9, 2013 at 6:26 AM, Pandarathil, Vijaymohan R
>  wrote:
> > - Create eventfd per vfio device assigned to a guest and register
> an
> >   event handler
> >
> > - This fd is passed to the vfio_pci driver through a new ioctl
> >
> > - When the device encounters an error, the eventfd is signaled
> >   and the qemu eventfd handler gets invoked.
> >
> > - In the handler decide what action to take. Current action taken
> >   is to terminate the guest.
> >
> > Signed-off-by: Vijay Mohan Pandarathil 
> > ---
> >  hw/vfio_pci.c  | 56
> ++
> >  linux-headers/linux/vfio.h |  9 
> >  2 files changed, 65 insertions(+)
> >
> > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > index 28c8303..9c3c28b 100644
> > --- a/hw/vfio_pci.c
> > +++ b/hw/vfio_pci.c
> > @@ -38,6 +38,7 @@
> >  #include "qemu/error-report.h"
> >  #include "qemu/queue.h"
> >  #include "qemu/range.h"
> > +#include "sysemu/sysemu.h"
> >
> >  /* #define DEBUG_VFIO */
> >  #ifdef DEBUG_VFIO
> > @@ -130,6 +131,8 @@ typedef struct VFIODevice {
> >  QLIST_ENTRY(VFIODevice) next;
> >  struct VFIOGroup *group;
> >  bool reset_works;
> > +EventNotifier errfd;
> > +__u32 dev_info_flags;
> 
> QEMU is not kernel code, please use uint32_t.

As you saw later in the code, I was using the same type in the structure 
copied from the kernel. Will change this to uint32_t.

Thanks

Vijay

> 
> >  } VFIODevice;
> >
> >  typedef struct VFIOGroup {
> > @@ -1805,6 +1808,8 @@ static int vfio_get_device(VFIOGroup *group, const
> char *name, VFIODevice *vdev)
> >  DPRINTF("Device %s flags: %u, regions: %u, irgs: %u\n", name,
> >  dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
> >
> > +vdev->dev_info_flags = dev_info.flags;
> > +
> >  if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
> >  error_report("vfio: Um, this isn't a PCI device\n");
> >  goto error;
> > @@ -1900,6 +1905,55 @@ static void vfio_put_device(VFIODevice *vdev)
> >  }
> >  }
> >
> > +static void vfio_errfd_handler(void *opaque)
> > +{
> > +VFIODevice *vdev = opaque;
> > +
> > +if (!event_notifier_test_and_clear(>errfd)) {
> > +return;
> > +}
> > +
> > +/*
> > + * TBD. Retrieve the error details and decide what action
> > + * needs to be taken. One of the actions could be to pass
> > + * the error to the guest and have the guest driver recover
> > + * the error. This requires that PCIe capabilities be
> > + * exposed to the guest. At present, we just terminate the
> > + * guest to contain the error.
> > + */
> > +error_report("%s(%04x:%02x:%02x.%x) "
> > +"Unrecoverable error detected... Terminating guest\n",
> > +__func__, vdev->host.domain, vdev->host.bus, vdev->host.slot,
> > +vdev->host.function);
> > +
> > +qemu_system_shutdown_request();
> > +return;
> > +}
> > +
> > +static void vfio_register_errfd(VFIODevice *vdev)
> > +{
> > +int32_t pfd;
> > +int ret;
> > +
> > +if (!(vdev->dev_info_flags & VFIO_DEVICE_FLAGS_AER_NOTIFY)) {
> > +error_report("vfio: Warning: Error notification not supported
> for the device\n");
> > +return;
> > +}
> > +if (event_notifier_init(>errfd, 0)) {
> > +error_report("vfio: Warning: Unable to init event notifier for
> error detection\n");
> > +return;
> > +}
> > +pfd = event_notifier_get_fd(>errfd);
> > +qemu_set_fd_handler(pfd, vfio_errfd_handler, NULL, vdev);
> > +
> > +ret = ioctl(vdev->fd, VFIO_DEVICE_SET_ERRFD, pfd);
> > +if (ret) {
> > +error_report("vfio: Warning: Failed to setup error fd: %d\n",
> ret);
> > +

RE: [PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-11 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Wednesday, January 09, 2013 10:08 AM
> To: Pandarathil, Vijaymohan R
> Cc: Bjorn Helgaas; Gleb Natapov; k...@vger.kernel.org; qemu-
> de...@nongnu.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI
> devices
> 
> On Wed, 2013-01-09 at 06:26 +0000, Pandarathil, Vijaymohan R wrote:
> > - Create eventfd per vfio device assigned to a guest and register an
> >   event handler
> >
> > - This fd is passed to the vfio_pci driver through a new ioctl
> >
> > - When the device encounters an error, the eventfd is signaled
> >   and the qemu eventfd handler gets invoked.
> >
> > - In the handler decide what action to take. Current action taken
> >   is to terminate the guest.
> >
> > Signed-off-by: Vijay Mohan Pandarathil 
> > ---
> >  hw/vfio_pci.c  | 56
> ++
> >  linux-headers/linux/vfio.h |  9 
> >  2 files changed, 65 insertions(+)
> >
> > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > index 28c8303..9c3c28b 100644
> > --- a/hw/vfio_pci.c
> > +++ b/hw/vfio_pci.c
> > @@ -38,6 +38,7 @@
> >  #include "qemu/error-report.h"
> >  #include "qemu/queue.h"
> >  #include "qemu/range.h"
> > +#include "sysemu/sysemu.h"
> >
> >  /* #define DEBUG_VFIO */
> >  #ifdef DEBUG_VFIO
> > @@ -130,6 +131,8 @@ typedef struct VFIODevice {
> >  QLIST_ENTRY(VFIODevice) next;
> >  struct VFIOGroup *group;
> >  bool reset_works;
> > +EventNotifier errfd;
> 
> Misleading name, it's an EventNotifier not an fd.

Will make the change.

> 
> > +__u32 dev_info_flags;
> >  } VFIODevice;
> >
> >  typedef struct VFIOGroup {
> > @@ -1805,6 +1808,8 @@ static int vfio_get_device(VFIOGroup *group, const
> char *name, VFIODevice *vdev)
> >  DPRINTF("Device %s flags: %u, regions: %u, irgs: %u\n", name,
> >  dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
> >
> > +vdev->dev_info_flags = dev_info.flags;
> > +
> 
> We test the VFIO_DEVICE_FLAGS_RESET elsewhere by caching the bit into a
> variable, why not do that here too?

I was wondering if there was any specific reason to cache the bit into a 
separate variable. Wouldn't a flags field which can contain the specific bit be 
more apt ?

> 
> >  if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
> >  error_report("vfio: Um, this isn't a PCI device\n");
> >  goto error;
> > @@ -1900,6 +1905,55 @@ static void vfio_put_device(VFIODevice *vdev)
> >  }
> >  }
> >
> > +static void vfio_errfd_handler(void *opaque)
> > +{
> > +VFIODevice *vdev = opaque;
> > +
> > +if (!event_notifier_test_and_clear(>errfd)) {
> > +return;
> > +}
> > +
> > +/*
> > + * TBD. Retrieve the error details and decide what action
> > + * needs to be taken. One of the actions could be to pass
> > + * the error to the guest and have the guest driver recover
> > + * the error. This requires that PCIe capabilities be
> > + * exposed to the guest. At present, we just terminate the
> > + * guest to contain the error.
> > + */
> > +error_report("%s(%04x:%02x:%02x.%x) "
> > +"Unrecoverable error detected... Terminating guest\n",
> > +__func__, vdev->host.domain, vdev->host.bus, vdev->host.slot,
> > +vdev->host.function);
> > +
> > +qemu_system_shutdown_request();
> 
> I would have figured hw_error

Hw_error() is probably more appropriate. Will make the change.

> 
> > +return;
> > +}
> > +
> > +static void vfio_register_errfd(VFIODevice *vdev)
> > +{
> > +int32_t pfd;
> 
> "pfd" is used elsewhere in vfio as "Pointer to FD", this is just a fd.

Will change.

> 
> > +int ret;
> > +
> > +if (!(vdev->dev_info_flags & VFIO_DEVICE_FLAGS_AER_NOTIFY)) {
> > +error_report("vfio: Warning: Error notification not supported
> for the device\n");
> 
> This should probably just be a debug print.

Okay.

> 
> > +return;
> > +}
> > +if (event_notifier_init(>errfd, 0)) {
> > +error_report("vfio: Warning: Unable to init event notifier for
>

RE: [PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-01-11 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Wednesday, January 09, 2013 11:05 AM
> To: Pandarathil, Vijaymohan R
> Cc: Bjorn Helgaas; Gleb Natapov; k...@vger.kernel.org; qemu-
> de...@nongnu.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting
> AER
> 
> On Wed, 2013-01-09 at 10:52 -0700, Alex Williamson wrote:
> > On Wed, 2013-01-09 at 06:26 +, Pandarathil, Vijaymohan R wrote:
> > >   - New ioctl which is used to pass the eventfd that is signaled when
> > >   an error occurs in the vfio_pci_device
> > >
> > >   - Register pci_error_handler for the vfio_pci driver
> > >
> > >   - When the device encounters an error, the error handler registered
> by
> > >   the vfio_pci driver gets invoked by the AER infrastructure
> > >
> > >   - In the error handler, signal the eventfd registered for the device.
> > >
> > >   - This results in the qemu eventfd handler getting invoked and
> > >   appropriate action taken for the guest.
> > >
> > > Signed-off-by: Vijay Mohan Pandarathil 
> > > ---
> > >  drivers/vfio/pci/vfio_pci.c | 29 +
> > >  drivers/vfio/pci/vfio_pci_private.h |  1 +
> > >  drivers/vfio/vfio.c |  8 
> > >  include/linux/vfio.h|  1 +
> > >  include/uapi/linux/vfio.h   |  9 +
> > >  5 files changed, 48 insertions(+)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> > > index 6c11994..4ae9526 100644
> > > --- a/drivers/vfio/pci/vfio_pci.c
> > > +++ b/drivers/vfio/pci/vfio_pci.c
> > > @@ -207,6 +207,8 @@ static long vfio_pci_ioctl(void *device_data,
> > >   if (vdev->reset_works)
> > >   info.flags |= VFIO_DEVICE_FLAGS_RESET;
> > >
> > > + info.flags |= VFIO_DEVICE_FLAGS_AER_NOTIFY;
> > > +
> >
> > This appears to be a PCI specific flag, so the name should include
> > _PCI_.  We also support non-PCIe devices and it seems like it would be
> > possible to not have AER support available, so shouldn't this be
> > conditional?

Will do that.

> >
> > >   info.num_regions = VFIO_PCI_NUM_REGIONS;
> > >   info.num_irqs = VFIO_PCI_NUM_IRQS;
> > >
> > > @@ -348,6 +350,19 @@ static long vfio_pci_ioctl(void *device_data,
> > >
> > >   return ret;
> > >
> > > + } else if (cmd == VFIO_DEVICE_SET_ERRFD) {
> > > + int32_t fd = (int32_t)arg;
> > > +
> > > + if (fd < 0)
> > > + return -EINVAL;
> > > +
> > > + vdev->err_trigger = eventfd_ctx_fdget(fd);
> > > +
> > > + if (IS_ERR(vdev->err_trigger))
> > > + return PTR_ERR(vdev->err_trigger);
> > > +
> > > + return 0;
> > > +
> >
> > I'm not sure why we wouldn't describe this as just another interrupt
> > from the device and configure it via SET_IRQ.  This ioctl has very
> > limited use and doesn't follow any of the conventions of all the other
> > vfio ioctls.

I thought this was a fairly simple ioctl to implement this way. Moreover, 
there is no device interrupt involved. Let me know if you really want to 
model it as a SET_IRQ ioctl.

> >
> > >   } else if (cmd == VFIO_DEVICE_RESET)
> > >   return vdev->reset_works ?
> > >   pci_reset_function(vdev->pdev) : -EINVAL;
> > > @@ -527,11 +542,25 @@ static void vfio_pci_remove(struct pci_dev *pdev)
> > >   kfree(vdev);
> > >  }
> > >
> > > +static pci_ers_result_t vfio_err_detected(struct pci_dev *pdev,
> > > + pci_channel_state_t state)
> > > +{
> > > + struct vfio_pci_device *vdev = vfio_get_vdev(>dev);
> > > +
> > > + eventfd_signal(vdev->err_trigger, 1);
> > > + return PCI_ERS_RESULT_CAN_RECOVER;
> > > +}
> >
> > What if err_trigger hasn't been set?

Will fix that.

> >
> > > +
> > > +static const struct pci_error_handlers vfio_err_handlers = {
> > > + .error_detected = vfio_err_detected,
> > > +};
> > > +
> > >  static struct pci_driver vfio_pci_driver = {
> > >   .name   = "vfio-pci",
> > >   .id_table   = NULL, /* only dyna

RE: [PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-01-11 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Wednesday, January 09, 2013 11:05 AM
 To: Pandarathil, Vijaymohan R
 Cc: Bjorn Helgaas; Gleb Natapov; k...@vger.kernel.org; qemu-
 de...@nongnu.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
 Subject: Re: [PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting
 AER
 
 On Wed, 2013-01-09 at 10:52 -0700, Alex Williamson wrote:
  On Wed, 2013-01-09 at 06:26 +, Pandarathil, Vijaymohan R wrote:
 - New ioctl which is used to pass the eventfd that is signaled when
 an error occurs in the vfio_pci_device
  
 - Register pci_error_handler for the vfio_pci driver
  
 - When the device encounters an error, the error handler registered
 by
 the vfio_pci driver gets invoked by the AER infrastructure
  
 - In the error handler, signal the eventfd registered for the device.
  
 - This results in the qemu eventfd handler getting invoked and
 appropriate action taken for the guest.
  
   Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
   ---
drivers/vfio/pci/vfio_pci.c | 29 +
drivers/vfio/pci/vfio_pci_private.h |  1 +
drivers/vfio/vfio.c |  8 
include/linux/vfio.h|  1 +
include/uapi/linux/vfio.h   |  9 +
5 files changed, 48 insertions(+)
  
   diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
   index 6c11994..4ae9526 100644
   --- a/drivers/vfio/pci/vfio_pci.c
   +++ b/drivers/vfio/pci/vfio_pci.c
   @@ -207,6 +207,8 @@ static long vfio_pci_ioctl(void *device_data,
 if (vdev-reset_works)
 info.flags |= VFIO_DEVICE_FLAGS_RESET;
  
   + info.flags |= VFIO_DEVICE_FLAGS_AER_NOTIFY;
   +
 
  This appears to be a PCI specific flag, so the name should include
  _PCI_.  We also support non-PCIe devices and it seems like it would be
  possible to not have AER support available, so shouldn't this be
  conditional?

Will do that.

 
 info.num_regions = VFIO_PCI_NUM_REGIONS;
 info.num_irqs = VFIO_PCI_NUM_IRQS;
  
   @@ -348,6 +350,19 @@ static long vfio_pci_ioctl(void *device_data,
  
 return ret;
  
   + } else if (cmd == VFIO_DEVICE_SET_ERRFD) {
   + int32_t fd = (int32_t)arg;
   +
   + if (fd  0)
   + return -EINVAL;
   +
   + vdev-err_trigger = eventfd_ctx_fdget(fd);
   +
   + if (IS_ERR(vdev-err_trigger))
   + return PTR_ERR(vdev-err_trigger);
   +
   + return 0;
   +
 
  I'm not sure why we wouldn't describe this as just another interrupt
  from the device and configure it via SET_IRQ.  This ioctl has very
  limited use and doesn't follow any of the conventions of all the other
  vfio ioctls.

I thought this was a fairly simple ioctl to implement this way. Moreover, 
there is no device interrupt involved. Let me know if you really want to 
model it as a SET_IRQ ioctl.

 
 } else if (cmd == VFIO_DEVICE_RESET)
 return vdev-reset_works ?
 pci_reset_function(vdev-pdev) : -EINVAL;
   @@ -527,11 +542,25 @@ static void vfio_pci_remove(struct pci_dev *pdev)
 kfree(vdev);
}
  
   +static pci_ers_result_t vfio_err_detected(struct pci_dev *pdev,
   + pci_channel_state_t state)
   +{
   + struct vfio_pci_device *vdev = vfio_get_vdev(pdev-dev);
   +
   + eventfd_signal(vdev-err_trigger, 1);
   + return PCI_ERS_RESULT_CAN_RECOVER;
   +}
 
  What if err_trigger hasn't been set?

Will fix that.

 
   +
   +static const struct pci_error_handlers vfio_err_handlers = {
   + .error_detected = vfio_err_detected,
   +};
   +
static struct pci_driver vfio_pci_driver = {
 .name   = vfio-pci,
 .id_table   = NULL, /* only dynamic ids */
 .probe  = vfio_pci_probe,
 .remove = vfio_pci_remove,
   + .err_handler= vfio_err_handlers,
};
  
static void __exit vfio_pci_cleanup(void)
   diff --git a/drivers/vfio/pci/vfio_pci_private.h
 b/drivers/vfio/pci/vfio_pci_private.h
   index 611827c..daee62f 100644
   --- a/drivers/vfio/pci/vfio_pci_private.h
   +++ b/drivers/vfio/pci/vfio_pci_private.h
   @@ -55,6 +55,7 @@ struct vfio_pci_device {
 boolbardirty;
 struct pci_saved_state  *pci_saved_state;
 atomic_trefcnt;
   + struct eventfd_ctx  *err_trigger;
};
  
#define is_intx(vdev) (vdev-irq_type == VFIO_PCI_INTX_IRQ_INDEX)
   diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
   index 56097c6..5ed5a54 100644
   --- a/drivers/vfio/vfio.c
   +++ b/drivers/vfio/vfio.c
   @@ -693,6 +693,14 @@ void *vfio_del_group_dev(struct device *dev)
}
EXPORT_SYMBOL_GPL(vfio_del_group_dev);
  
   +void *vfio_get_vdev(struct device *dev)
   +{
   + struct vfio_device *device = dev_get_drvdata(dev);
   +
   + return

RE: [PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-11 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Wednesday, January 09, 2013 10:08 AM
 To: Pandarathil, Vijaymohan R
 Cc: Bjorn Helgaas; Gleb Natapov; k...@vger.kernel.org; qemu-
 de...@nongnu.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
 Subject: Re: [PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI
 devices
 
 On Wed, 2013-01-09 at 06:26 +, Pandarathil, Vijaymohan R wrote:
  - Create eventfd per vfio device assigned to a guest and register an
event handler
 
  - This fd is passed to the vfio_pci driver through a new ioctl
 
  - When the device encounters an error, the eventfd is signaled
and the qemu eventfd handler gets invoked.
 
  - In the handler decide what action to take. Current action taken
is to terminate the guest.
 
  Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
  ---
   hw/vfio_pci.c  | 56
 ++
   linux-headers/linux/vfio.h |  9 
   2 files changed, 65 insertions(+)
 
  diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
  index 28c8303..9c3c28b 100644
  --- a/hw/vfio_pci.c
  +++ b/hw/vfio_pci.c
  @@ -38,6 +38,7 @@
   #include qemu/error-report.h
   #include qemu/queue.h
   #include qemu/range.h
  +#include sysemu/sysemu.h
 
   /* #define DEBUG_VFIO */
   #ifdef DEBUG_VFIO
  @@ -130,6 +131,8 @@ typedef struct VFIODevice {
   QLIST_ENTRY(VFIODevice) next;
   struct VFIOGroup *group;
   bool reset_works;
  +EventNotifier errfd;
 
 Misleading name, it's an EventNotifier not an fd.

Will make the change.

 
  +__u32 dev_info_flags;
   } VFIODevice;
 
   typedef struct VFIOGroup {
  @@ -1805,6 +1808,8 @@ static int vfio_get_device(VFIOGroup *group, const
 char *name, VFIODevice *vdev)
   DPRINTF(Device %s flags: %u, regions: %u, irgs: %u\n, name,
   dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
 
  +vdev-dev_info_flags = dev_info.flags;
  +
 
 We test the VFIO_DEVICE_FLAGS_RESET elsewhere by caching the bit into a
 variable, why not do that here too?

I was wondering if there was any specific reason to cache the bit into a 
separate variable. Wouldn't a flags field which can contain the specific bit be 
more apt ?

 
   if (!(dev_info.flags  VFIO_DEVICE_FLAGS_PCI)) {
   error_report(vfio: Um, this isn't a PCI device\n);
   goto error;
  @@ -1900,6 +1905,55 @@ static void vfio_put_device(VFIODevice *vdev)
   }
   }
 
  +static void vfio_errfd_handler(void *opaque)
  +{
  +VFIODevice *vdev = opaque;
  +
  +if (!event_notifier_test_and_clear(vdev-errfd)) {
  +return;
  +}
  +
  +/*
  + * TBD. Retrieve the error details and decide what action
  + * needs to be taken. One of the actions could be to pass
  + * the error to the guest and have the guest driver recover
  + * the error. This requires that PCIe capabilities be
  + * exposed to the guest. At present, we just terminate the
  + * guest to contain the error.
  + */
  +error_report(%s(%04x:%02x:%02x.%x) 
  +Unrecoverable error detected... Terminating guest\n,
  +__func__, vdev-host.domain, vdev-host.bus, vdev-host.slot,
  +vdev-host.function);
  +
  +qemu_system_shutdown_request();
 
 I would have figured hw_error

Hw_error() is probably more appropriate. Will make the change.

 
  +return;
  +}
  +
  +static void vfio_register_errfd(VFIODevice *vdev)
  +{
  +int32_t pfd;
 
 pfd is used elsewhere in vfio as Pointer to FD, this is just a fd.

Will change.

 
  +int ret;
  +
  +if (!(vdev-dev_info_flags  VFIO_DEVICE_FLAGS_AER_NOTIFY)) {
  +error_report(vfio: Warning: Error notification not supported
 for the device\n);
 
 This should probably just be a debug print.

Okay.

 
  +return;
  +}
  +if (event_notifier_init(vdev-errfd, 0)) {
  +error_report(vfio: Warning: Unable to init event notifier for
 error detection\n);
  +return;
  +}
  +pfd = event_notifier_get_fd(vdev-errfd);
  +qemu_set_fd_handler(pfd, vfio_errfd_handler, NULL, vdev);
  +
  +ret = ioctl(vdev-fd, VFIO_DEVICE_SET_ERRFD, pfd);
  +if (ret) {
  +error_report(vfio: Warning: Failed to setup error fd: %d\n,
 ret);
  +qemu_set_fd_handler(pfd, NULL, NULL, vdev);
  +event_notifier_cleanup(vdev-errfd);
  +}
  +return;
  +}
   static int vfio_initfn(PCIDevice *pdev)
   {
   VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
  @@ -2010,6 +2064,8 @@ static int vfio_initfn(PCIDevice *pdev)
   }
   }
 
  +vfio_register_errfd(vdev);
  +
 
 No cleanup in exitfn?!  Thanks,

Missed that. Will fix.

Vijay

 
 Alex
 
   return 0;
 
   out_teardown:
  diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
  index 4758d1b..0ca4eeb 100644
  --- a/linux-headers/linux/vfio.h
  +++ b/linux

RE: [Qemu-devel] [PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-11 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: Blue Swirl [mailto:blauwir...@gmail.com]
 Sent: Wednesday, January 09, 2013 1:23 PM
 To: Pandarathil, Vijaymohan R
 Cc: Alex Williamson; Bjorn Helgaas; Gleb Natapov; linux-
 p...@vger.kernel.org; qemu-de...@nongnu.org; k...@vger.kernel.org; linux-
 ker...@vger.kernel.org
 Subject: Re: [Qemu-devel] [PATCH 2/2] QEMU-AER: Qemu changes to support AER
 for VFIO-PCI devices
 
 On Wed, Jan 9, 2013 at 6:26 AM, Pandarathil, Vijaymohan R
 vijaymohan.pandarat...@hp.com wrote:
  - Create eventfd per vfio device assigned to a guest and register
 an
event handler
 
  - This fd is passed to the vfio_pci driver through a new ioctl
 
  - When the device encounters an error, the eventfd is signaled
and the qemu eventfd handler gets invoked.
 
  - In the handler decide what action to take. Current action taken
is to terminate the guest.
 
  Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
  ---
   hw/vfio_pci.c  | 56
 ++
   linux-headers/linux/vfio.h |  9 
   2 files changed, 65 insertions(+)
 
  diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
  index 28c8303..9c3c28b 100644
  --- a/hw/vfio_pci.c
  +++ b/hw/vfio_pci.c
  @@ -38,6 +38,7 @@
   #include qemu/error-report.h
   #include qemu/queue.h
   #include qemu/range.h
  +#include sysemu/sysemu.h
 
   /* #define DEBUG_VFIO */
   #ifdef DEBUG_VFIO
  @@ -130,6 +131,8 @@ typedef struct VFIODevice {
   QLIST_ENTRY(VFIODevice) next;
   struct VFIOGroup *group;
   bool reset_works;
  +EventNotifier errfd;
  +__u32 dev_info_flags;
 
 QEMU is not kernel code, please use uint32_t.

As you saw later in the code, I was using the same type in the structure 
copied from the kernel. Will change this to uint32_t.

Thanks

Vijay

 
   } VFIODevice;
 
   typedef struct VFIOGroup {
  @@ -1805,6 +1808,8 @@ static int vfio_get_device(VFIOGroup *group, const
 char *name, VFIODevice *vdev)
   DPRINTF(Device %s flags: %u, regions: %u, irgs: %u\n, name,
   dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
 
  +vdev-dev_info_flags = dev_info.flags;
  +
   if (!(dev_info.flags  VFIO_DEVICE_FLAGS_PCI)) {
   error_report(vfio: Um, this isn't a PCI device\n);
   goto error;
  @@ -1900,6 +1905,55 @@ static void vfio_put_device(VFIODevice *vdev)
   }
   }
 
  +static void vfio_errfd_handler(void *opaque)
  +{
  +VFIODevice *vdev = opaque;
  +
  +if (!event_notifier_test_and_clear(vdev-errfd)) {
  +return;
  +}
  +
  +/*
  + * TBD. Retrieve the error details and decide what action
  + * needs to be taken. One of the actions could be to pass
  + * the error to the guest and have the guest driver recover
  + * the error. This requires that PCIe capabilities be
  + * exposed to the guest. At present, we just terminate the
  + * guest to contain the error.
  + */
  +error_report(%s(%04x:%02x:%02x.%x) 
  +Unrecoverable error detected... Terminating guest\n,
  +__func__, vdev-host.domain, vdev-host.bus, vdev-host.slot,
  +vdev-host.function);
  +
  +qemu_system_shutdown_request();
  +return;
  +}
  +
  +static void vfio_register_errfd(VFIODevice *vdev)
  +{
  +int32_t pfd;
  +int ret;
  +
  +if (!(vdev-dev_info_flags  VFIO_DEVICE_FLAGS_AER_NOTIFY)) {
  +error_report(vfio: Warning: Error notification not supported
 for the device\n);
  +return;
  +}
  +if (event_notifier_init(vdev-errfd, 0)) {
  +error_report(vfio: Warning: Unable to init event notifier for
 error detection\n);
  +return;
  +}
  +pfd = event_notifier_get_fd(vdev-errfd);
  +qemu_set_fd_handler(pfd, vfio_errfd_handler, NULL, vdev);
  +
  +ret = ioctl(vdev-fd, VFIO_DEVICE_SET_ERRFD, pfd);
  +if (ret) {
  +error_report(vfio: Warning: Failed to setup error fd: %d\n,
 ret);
  +qemu_set_fd_handler(pfd, NULL, NULL, vdev);
  +event_notifier_cleanup(vdev-errfd);
  +}
  +return;
  +}
   static int vfio_initfn(PCIDevice *pdev)
   {
   VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
  @@ -2010,6 +2064,8 @@ static int vfio_initfn(PCIDevice *pdev)
   }
   }
 
  +vfio_register_errfd(vdev);
  +
   return 0;
 
   out_teardown:
  diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
  index 4758d1b..0ca4eeb 100644
  --- a/linux-headers/linux/vfio.h
  +++ b/linux-headers/linux/vfio.h
  @@ -147,6 +147,7 @@ struct vfio_device_info {
  __u32   flags;
   #define VFIO_DEVICE_FLAGS_RESET(1  0)/* Device
 supports reset */
   #define VFIO_DEVICE_FLAGS_PCI  (1  1)/* vfio-pci device */
  +#define VFIO_DEVICE_FLAGS_AER_NOTIFY (1  2)   /* Supports aer
 notification */
  __u32   num_regions;/* Max region index + 1 */
  __u32   num_irqs

[PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-08 Thread Pandarathil, Vijaymohan R
- Create eventfd per vfio device assigned to a guest and register an
  event handler

- This fd is passed to the vfio_pci driver through a new ioctl

- When the device encounters an error, the eventfd is signaled
  and the qemu eventfd handler gets invoked.

- In the handler decide what action to take. Current action taken
  is to terminate the guest.

Signed-off-by: Vijay Mohan Pandarathil 
---
 hw/vfio_pci.c  | 56 ++
 linux-headers/linux/vfio.h |  9 
 2 files changed, 65 insertions(+)

diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index 28c8303..9c3c28b 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -38,6 +38,7 @@
 #include "qemu/error-report.h"
 #include "qemu/queue.h"
 #include "qemu/range.h"
+#include "sysemu/sysemu.h"
 
 /* #define DEBUG_VFIO */
 #ifdef DEBUG_VFIO
@@ -130,6 +131,8 @@ typedef struct VFIODevice {
 QLIST_ENTRY(VFIODevice) next;
 struct VFIOGroup *group;
 bool reset_works;
+EventNotifier errfd;
+__u32 dev_info_flags;
 } VFIODevice;
 
 typedef struct VFIOGroup {
@@ -1805,6 +1808,8 @@ static int vfio_get_device(VFIOGroup *group, const char 
*name, VFIODevice *vdev)
 DPRINTF("Device %s flags: %u, regions: %u, irgs: %u\n", name,
 dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
 
+vdev->dev_info_flags = dev_info.flags;
+
 if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
 error_report("vfio: Um, this isn't a PCI device\n");
 goto error;
@@ -1900,6 +1905,55 @@ static void vfio_put_device(VFIODevice *vdev)
 }
 }
 
+static void vfio_errfd_handler(void *opaque)
+{
+VFIODevice *vdev = opaque;
+
+if (!event_notifier_test_and_clear(>errfd)) {
+return;
+}
+
+/*
+ * TBD. Retrieve the error details and decide what action
+ * needs to be taken. One of the actions could be to pass
+ * the error to the guest and have the guest driver recover
+ * the error. This requires that PCIe capabilities be
+ * exposed to the guest. At present, we just terminate the
+ * guest to contain the error.
+ */
+error_report("%s(%04x:%02x:%02x.%x) "
+"Unrecoverable error detected... Terminating guest\n",
+__func__, vdev->host.domain, vdev->host.bus, vdev->host.slot,
+vdev->host.function);
+
+qemu_system_shutdown_request();
+return;
+}
+
+static void vfio_register_errfd(VFIODevice *vdev)
+{
+int32_t pfd;
+int ret;
+
+if (!(vdev->dev_info_flags & VFIO_DEVICE_FLAGS_AER_NOTIFY)) {
+error_report("vfio: Warning: Error notification not supported for the 
device\n");
+return;
+}
+if (event_notifier_init(>errfd, 0)) {
+error_report("vfio: Warning: Unable to init event notifier for error 
detection\n");
+return;
+}
+pfd = event_notifier_get_fd(>errfd);
+qemu_set_fd_handler(pfd, vfio_errfd_handler, NULL, vdev);
+
+ret = ioctl(vdev->fd, VFIO_DEVICE_SET_ERRFD, pfd);
+if (ret) {
+error_report("vfio: Warning: Failed to setup error fd: %d\n", ret);
+qemu_set_fd_handler(pfd, NULL, NULL, vdev);
+event_notifier_cleanup(>errfd);
+}
+return;
+}
 static int vfio_initfn(PCIDevice *pdev)
 {
 VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
@@ -2010,6 +2064,8 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
+vfio_register_errfd(vdev);
+
 return 0;
 
 out_teardown:
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 4758d1b..0ca4eeb 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -147,6 +147,7 @@ struct vfio_device_info {
__u32   flags;
 #define VFIO_DEVICE_FLAGS_RESET(1 << 0)/* Device supports 
reset */
 #define VFIO_DEVICE_FLAGS_PCI  (1 << 1)/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_AER_NOTIFY (1 << 2)   /* Supports aer notification */
__u32   num_regions;/* Max region index + 1 */
__u32   num_irqs;   /* Max IRQ index + 1 */
 };
@@ -288,6 +289,14 @@ struct vfio_irq_set {
  */
 #define VFIO_DEVICE_RESET  _IO(VFIO_TYPE, VFIO_BASE + 11)
 
+/**
+ * VFIO_DEVICE_SET_ERRFD - _IO(VFIO_TYPE, VFIO_BASE + 12)
+ *
+ * Pass the eventfd to the vfio-pci driver for signalling any device
+ * error notifications
+ */
+#define VFIO_DEVICE_SET_ERRFD   _IO(VFIO_TYPE, VFIO_BASE + 12)
+
 /*
  * The VFIO-PCI bus driver makes use of the following fixed region and
  * IRQ index mapping.  Unimplemented regions return a size of zero.
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/2] AER-KVM: Error containment of VFIO devices assigned to KVM guests

2013-01-08 Thread Pandarathil, Vijaymohan R
Add support for error containment when a VFIO device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the host OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, the error handler registered by the vfio-pci driver gets
invoked. The qemu process is signaled through an eventfd registered per
VFIO device by the qemu process. In the eventfd handler, qemu decides on
what action to take. In this implementation, guest is brought down to
contain the error.

---
Vijay Mohan Pandarathil(2):

[PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting AER
[PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

Kernel files changed

 drivers/vfio/pci/vfio_pci.c | 29 +
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 drivers/vfio/vfio.c |  8 
 include/linux/vfio.h|  1 +
 include/uapi/linux/vfio.h   |  9 +
 5 files changed, 48 insertions(+)

Qemu files changed

 hw/vfio_pci.c  | 56 ++
 linux-headers/linux/vfio.h |  9 
 2 files changed, 65 insertions(+)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-01-08 Thread Pandarathil, Vijaymohan R

- New ioctl which is used to pass the eventfd that is signaled when
  an error occurs in the vfio_pci_device

- Register pci_error_handler for the vfio_pci driver

- When the device encounters an error, the error handler registered by
  the vfio_pci driver gets invoked by the AER infrastructure

- In the error handler, signal the eventfd registered for the device.

- This results in the qemu eventfd handler getting invoked and
  appropriate action taken for the guest.

Signed-off-by: Vijay Mohan Pandarathil 
---
 drivers/vfio/pci/vfio_pci.c | 29 +
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 drivers/vfio/vfio.c |  8 
 include/linux/vfio.h|  1 +
 include/uapi/linux/vfio.h   |  9 +
 5 files changed, 48 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 6c11994..4ae9526 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -207,6 +207,8 @@ static long vfio_pci_ioctl(void *device_data,
if (vdev->reset_works)
info.flags |= VFIO_DEVICE_FLAGS_RESET;
 
+   info.flags |= VFIO_DEVICE_FLAGS_AER_NOTIFY;
+
info.num_regions = VFIO_PCI_NUM_REGIONS;
info.num_irqs = VFIO_PCI_NUM_IRQS;
 
@@ -348,6 +350,19 @@ static long vfio_pci_ioctl(void *device_data,
 
return ret;
 
+   } else if (cmd == VFIO_DEVICE_SET_ERRFD) {
+   int32_t fd = (int32_t)arg;
+
+   if (fd < 0)
+   return -EINVAL;
+
+   vdev->err_trigger = eventfd_ctx_fdget(fd);
+
+   if (IS_ERR(vdev->err_trigger))
+   return PTR_ERR(vdev->err_trigger);
+
+   return 0;
+
} else if (cmd == VFIO_DEVICE_RESET)
return vdev->reset_works ?
pci_reset_function(vdev->pdev) : -EINVAL;
@@ -527,11 +542,25 @@ static void vfio_pci_remove(struct pci_dev *pdev)
kfree(vdev);
 }
 
+static pci_ers_result_t vfio_err_detected(struct pci_dev *pdev,
+   pci_channel_state_t state)
+{
+   struct vfio_pci_device *vdev = vfio_get_vdev(>dev);
+
+   eventfd_signal(vdev->err_trigger, 1);
+   return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+static const struct pci_error_handlers vfio_err_handlers = {
+   .error_detected = vfio_err_detected,
+};
+
 static struct pci_driver vfio_pci_driver = {
.name   = "vfio-pci",
.id_table   = NULL, /* only dynamic ids */
.probe  = vfio_pci_probe,
.remove = vfio_pci_remove,
+   .err_handler= _err_handlers,
 };
 
 static void __exit vfio_pci_cleanup(void)
diff --git a/drivers/vfio/pci/vfio_pci_private.h 
b/drivers/vfio/pci/vfio_pci_private.h
index 611827c..daee62f 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -55,6 +55,7 @@ struct vfio_pci_device {
boolbardirty;
struct pci_saved_state  *pci_saved_state;
atomic_trefcnt;
+   struct eventfd_ctx  *err_trigger;
 };
 
 #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 56097c6..5ed5a54 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -693,6 +693,14 @@ void *vfio_del_group_dev(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(vfio_del_group_dev);
 
+void *vfio_get_vdev(struct device *dev)
+{
+   struct vfio_device *device = dev_get_drvdata(dev);
+
+   return device->device_data;
+}
+EXPORT_SYMBOL_GPL(vfio_get_vdev);
+
 /**
  * VFIO base fd, /dev/vfio/vfio
  */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ab9e862..3c97b03 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -45,6 +45,7 @@ extern int vfio_add_group_dev(struct device *dev,
  void *device_data);
 
 extern void *vfio_del_group_dev(struct device *dev);
+extern void *vfio_get_vdev(struct device *dev);
 
 /**
  * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 4758d1b..fa67213 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -147,6 +147,7 @@ struct vfio_device_info {
__u32   flags;
 #define VFIO_DEVICE_FLAGS_RESET(1 << 0)/* Device supports 
reset */
 #define VFIO_DEVICE_FLAGS_PCI  (1 << 1)/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_AER_NOTIFY (1 << 2)  /* Supports aer notify */
__u32   num_regions;/* Max region index + 1 */
__u32   num_irqs;   /* Max IRQ index + 1 */
 };
@@ -288,6 +289,14 @@ struct vfio_irq_set {
  */
 #define VFIO_DEVICE_RESET  _IO(VFIO_TYPE, VFIO_BASE + 11)
 
+/**
+ * VFIO_DEVICE_SET_ERRFD - _IO(VFIO_TYPE, VFIO_BASE + 12)
+ *
+ * Pass the 

[PATCH 0/2] AER-KVM: Error containment of VFIO devices assigned to KVM guests

2013-01-08 Thread Pandarathil, Vijaymohan R
Add support for error containment when a VFIO device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the host OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, the error handler registered by the vfio-pci driver gets
invoked. The qemu process is signaled through an eventfd registered per
VFIO device by the qemu process. In the eventfd handler, qemu decides on
what action to take. In this implementation, guest is brought down to
contain the error.

---
Vijay Mohan Pandarathil(2):

[PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting AER
[PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

Kernel files changed

 drivers/vfio/pci/vfio_pci.c | 29 +
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 drivers/vfio/vfio.c |  8 
 include/linux/vfio.h|  1 +
 include/uapi/linux/vfio.h   |  9 +
 5 files changed, 48 insertions(+)

Qemu files changed

 hw/vfio_pci.c  | 56 ++
 linux-headers/linux/vfio.h |  9 
 2 files changed, 65 insertions(+)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-01-08 Thread Pandarathil, Vijaymohan R

- New ioctl which is used to pass the eventfd that is signaled when
  an error occurs in the vfio_pci_device

- Register pci_error_handler for the vfio_pci driver

- When the device encounters an error, the error handler registered by
  the vfio_pci driver gets invoked by the AER infrastructure

- In the error handler, signal the eventfd registered for the device.

- This results in the qemu eventfd handler getting invoked and
  appropriate action taken for the guest.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 drivers/vfio/pci/vfio_pci.c | 29 +
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 drivers/vfio/vfio.c |  8 
 include/linux/vfio.h|  1 +
 include/uapi/linux/vfio.h   |  9 +
 5 files changed, 48 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 6c11994..4ae9526 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -207,6 +207,8 @@ static long vfio_pci_ioctl(void *device_data,
if (vdev-reset_works)
info.flags |= VFIO_DEVICE_FLAGS_RESET;
 
+   info.flags |= VFIO_DEVICE_FLAGS_AER_NOTIFY;
+
info.num_regions = VFIO_PCI_NUM_REGIONS;
info.num_irqs = VFIO_PCI_NUM_IRQS;
 
@@ -348,6 +350,19 @@ static long vfio_pci_ioctl(void *device_data,
 
return ret;
 
+   } else if (cmd == VFIO_DEVICE_SET_ERRFD) {
+   int32_t fd = (int32_t)arg;
+
+   if (fd  0)
+   return -EINVAL;
+
+   vdev-err_trigger = eventfd_ctx_fdget(fd);
+
+   if (IS_ERR(vdev-err_trigger))
+   return PTR_ERR(vdev-err_trigger);
+
+   return 0;
+
} else if (cmd == VFIO_DEVICE_RESET)
return vdev-reset_works ?
pci_reset_function(vdev-pdev) : -EINVAL;
@@ -527,11 +542,25 @@ static void vfio_pci_remove(struct pci_dev *pdev)
kfree(vdev);
 }
 
+static pci_ers_result_t vfio_err_detected(struct pci_dev *pdev,
+   pci_channel_state_t state)
+{
+   struct vfio_pci_device *vdev = vfio_get_vdev(pdev-dev);
+
+   eventfd_signal(vdev-err_trigger, 1);
+   return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+static const struct pci_error_handlers vfio_err_handlers = {
+   .error_detected = vfio_err_detected,
+};
+
 static struct pci_driver vfio_pci_driver = {
.name   = vfio-pci,
.id_table   = NULL, /* only dynamic ids */
.probe  = vfio_pci_probe,
.remove = vfio_pci_remove,
+   .err_handler= vfio_err_handlers,
 };
 
 static void __exit vfio_pci_cleanup(void)
diff --git a/drivers/vfio/pci/vfio_pci_private.h 
b/drivers/vfio/pci/vfio_pci_private.h
index 611827c..daee62f 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -55,6 +55,7 @@ struct vfio_pci_device {
boolbardirty;
struct pci_saved_state  *pci_saved_state;
atomic_trefcnt;
+   struct eventfd_ctx  *err_trigger;
 };
 
 #define is_intx(vdev) (vdev-irq_type == VFIO_PCI_INTX_IRQ_INDEX)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 56097c6..5ed5a54 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -693,6 +693,14 @@ void *vfio_del_group_dev(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(vfio_del_group_dev);
 
+void *vfio_get_vdev(struct device *dev)
+{
+   struct vfio_device *device = dev_get_drvdata(dev);
+
+   return device-device_data;
+}
+EXPORT_SYMBOL_GPL(vfio_get_vdev);
+
 /**
  * VFIO base fd, /dev/vfio/vfio
  */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ab9e862..3c97b03 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -45,6 +45,7 @@ extern int vfio_add_group_dev(struct device *dev,
  void *device_data);
 
 extern void *vfio_del_group_dev(struct device *dev);
+extern void *vfio_get_vdev(struct device *dev);
 
 /**
  * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 4758d1b..fa67213 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -147,6 +147,7 @@ struct vfio_device_info {
__u32   flags;
 #define VFIO_DEVICE_FLAGS_RESET(1  0)/* Device supports 
reset */
 #define VFIO_DEVICE_FLAGS_PCI  (1  1)/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_AER_NOTIFY (1  2)  /* Supports aer notify */
__u32   num_regions;/* Max region index + 1 */
__u32   num_irqs;   /* Max IRQ index + 1 */
 };
@@ -288,6 +289,14 @@ struct vfio_irq_set {
  */
 #define VFIO_DEVICE_RESET  _IO(VFIO_TYPE, VFIO_BASE + 11)
 
+/**
+ * VFIO_DEVICE_SET_ERRFD - _IO(VFIO_TYPE, VFIO_BASE + 

[PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-08 Thread Pandarathil, Vijaymohan R
- Create eventfd per vfio device assigned to a guest and register an
  event handler

- This fd is passed to the vfio_pci driver through a new ioctl

- When the device encounters an error, the eventfd is signaled
  and the qemu eventfd handler gets invoked.

- In the handler decide what action to take. Current action taken
  is to terminate the guest.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 hw/vfio_pci.c  | 56 ++
 linux-headers/linux/vfio.h |  9 
 2 files changed, 65 insertions(+)

diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index 28c8303..9c3c28b 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -38,6 +38,7 @@
 #include qemu/error-report.h
 #include qemu/queue.h
 #include qemu/range.h
+#include sysemu/sysemu.h
 
 /* #define DEBUG_VFIO */
 #ifdef DEBUG_VFIO
@@ -130,6 +131,8 @@ typedef struct VFIODevice {
 QLIST_ENTRY(VFIODevice) next;
 struct VFIOGroup *group;
 bool reset_works;
+EventNotifier errfd;
+__u32 dev_info_flags;
 } VFIODevice;
 
 typedef struct VFIOGroup {
@@ -1805,6 +1808,8 @@ static int vfio_get_device(VFIOGroup *group, const char 
*name, VFIODevice *vdev)
 DPRINTF(Device %s flags: %u, regions: %u, irgs: %u\n, name,
 dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
 
+vdev-dev_info_flags = dev_info.flags;
+
 if (!(dev_info.flags  VFIO_DEVICE_FLAGS_PCI)) {
 error_report(vfio: Um, this isn't a PCI device\n);
 goto error;
@@ -1900,6 +1905,55 @@ static void vfio_put_device(VFIODevice *vdev)
 }
 }
 
+static void vfio_errfd_handler(void *opaque)
+{
+VFIODevice *vdev = opaque;
+
+if (!event_notifier_test_and_clear(vdev-errfd)) {
+return;
+}
+
+/*
+ * TBD. Retrieve the error details and decide what action
+ * needs to be taken. One of the actions could be to pass
+ * the error to the guest and have the guest driver recover
+ * the error. This requires that PCIe capabilities be
+ * exposed to the guest. At present, we just terminate the
+ * guest to contain the error.
+ */
+error_report(%s(%04x:%02x:%02x.%x) 
+Unrecoverable error detected... Terminating guest\n,
+__func__, vdev-host.domain, vdev-host.bus, vdev-host.slot,
+vdev-host.function);
+
+qemu_system_shutdown_request();
+return;
+}
+
+static void vfio_register_errfd(VFIODevice *vdev)
+{
+int32_t pfd;
+int ret;
+
+if (!(vdev-dev_info_flags  VFIO_DEVICE_FLAGS_AER_NOTIFY)) {
+error_report(vfio: Warning: Error notification not supported for the 
device\n);
+return;
+}
+if (event_notifier_init(vdev-errfd, 0)) {
+error_report(vfio: Warning: Unable to init event notifier for error 
detection\n);
+return;
+}
+pfd = event_notifier_get_fd(vdev-errfd);
+qemu_set_fd_handler(pfd, vfio_errfd_handler, NULL, vdev);
+
+ret = ioctl(vdev-fd, VFIO_DEVICE_SET_ERRFD, pfd);
+if (ret) {
+error_report(vfio: Warning: Failed to setup error fd: %d\n, ret);
+qemu_set_fd_handler(pfd, NULL, NULL, vdev);
+event_notifier_cleanup(vdev-errfd);
+}
+return;
+}
 static int vfio_initfn(PCIDevice *pdev)
 {
 VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
@@ -2010,6 +2064,8 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
+vfio_register_errfd(vdev);
+
 return 0;
 
 out_teardown:
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 4758d1b..0ca4eeb 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -147,6 +147,7 @@ struct vfio_device_info {
__u32   flags;
 #define VFIO_DEVICE_FLAGS_RESET(1  0)/* Device supports 
reset */
 #define VFIO_DEVICE_FLAGS_PCI  (1  1)/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_AER_NOTIFY (1  2)   /* Supports aer notification */
__u32   num_regions;/* Max region index + 1 */
__u32   num_irqs;   /* Max IRQ index + 1 */
 };
@@ -288,6 +289,14 @@ struct vfio_irq_set {
  */
 #define VFIO_DEVICE_RESET  _IO(VFIO_TYPE, VFIO_BASE + 11)
 
+/**
+ * VFIO_DEVICE_SET_ERRFD - _IO(VFIO_TYPE, VFIO_BASE + 12)
+ *
+ * Pass the eventfd to the vfio-pci driver for signalling any device
+ * error notifications
+ */
+#define VFIO_DEVICE_SET_ERRFD   _IO(VFIO_TYPE, VFIO_BASE + 12)
+
 /*
  * The VFIO-PCI bus driver makes use of the following fixed region and
  * IRQ index mapping.  Unimplemented regions return a size of zero.
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v3] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-28 Thread Pandarathil, Vijaymohan R
Looks correct.

Vijay

> -Original Message-
> From: Bjorn Helgaas [mailto:bhelg...@google.com]
> Sent: Wednesday, November 28, 2012 12:15 PM
> To: Pandarathil, Vijaymohan R
> Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
> linasveps...@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
> Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
> Subject: Re: [PATCH v3] PCI-AER: Do not report successful error recovery
> for devices with AER-unaware drivers
> 
> On Sat, Nov 17, 2012 at 4:47 AM, Pandarathil, Vijaymohan R
>  wrote:
> > When an error is detected on a PCIe device which does not have an
> > AER-aware driver, prevent AER infrastructure from reporting
> > successful error recovery.
> >
> > This is because the report_error_detected() function that gets
> > called in the first phase of recovery process allows forward
> > progress even when the driver for the device does not have AER
> > capabilities. It seems that all callbacks (in pci_error_handlers
> > structure) registered by drivers that gets called during error
> > recovery are not mandatory. So the intention of the infrastructure
> > design seems to be to allow forward progress even when a specific
> > callback has not been registered by a driver. However, if error
> > handler structure itself has not been registered, it doesn't make
> > sense to allow forward progress.
> >
> > As a result of the current design, in the case of a single device
> > having an AER-unaware driver or in the case of any function in a
> > multi-function card having an AER-unaware driver, a successful
> > recovery is reported.
> >
> > Typical scenario this happens is when a PCI device is detached
> > from a KVM host and the pci-stub driver on the host claims the
> > device. The pci-stub driver does not have error handling capabilities
> > but the AER infrastructure still reports that the device recovered
> > successfully.
> >
> > The changes proposed here leaves the device(s)in an unrecovered state
> > if the driver for the device or for any device in the subtree
> > does not have error handler structure registered. This reflects
> > the true state of the device and prevents any partial recovery (or no
> > recovery at all) reported as successful.
> >
> >
> > v3:
> >   - Fixed a checkpatch/build issue
> >
> > v2:
> >   - Made changes so that all devices in the subtree have the error
> > state set correctly.
> >
> >
> > Reviewed-by: Linas Vepstas  gmail.com>
> > Reviewed-by: Myron Stowe  redhat.com>
> > Reviewed-by: Bjorn Helgaas  google.com>
> > Signed-off-by: Vijay Mohan Pandarathil 
> 
> I added this to my -next branch, which will be merged for v3.8.
> 
> Please double-check that I resolved the merge conflict correctly.  It
> should show up here in the next few minutes:
> 
> http://git.kernel.org/?p=linux/kernel/git/helgaas/pci.git;a=shortlog;h=refs
> /heads/next
> 
> >  drivers/pci/pcie/aer/aerdrv.h  |  5 -
> >  drivers/pci/pcie/aer/aerdrv_core.c | 21 ++---
> >  include/linux/pci.h|  3 +++
> >  3 files changed, 25 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aer/aerdrv.h
> b/drivers/pci/pcie/aer/aerdrv.h
> > index 94a7598..22f840f 100644
> > --- a/drivers/pci/pcie/aer/aerdrv.h
> > +++ b/drivers/pci/pcie/aer/aerdrv.h
> > @@ -87,6 +87,9 @@ struct aer_broadcast_data {
> >  static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
> > enum pci_ers_result new)
> >  {
> > +   if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
> > +   return PCI_ERS_RESULT_NO_AER_DRIVER;
> > +
> > if (new == PCI_ERS_RESULT_NONE)
> > return orig;
> >
> > @@ -97,7 +100,7 @@ static inline pci_ers_result_t merge_result(enum
> pci_ers_result orig,
> > break;
> > case PCI_ERS_RESULT_DISCONNECT:
> > if (new == PCI_ERS_RESULT_NEED_RESET)
> > -   orig = new;
> > +   orig = PCI_ERS_RESULT_NEED_RESET;
> > break;
> > default:
> > break;
> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> b/drivers/pci/pcie/aer/aerdrv_core.c
> > index 9f80cb0..b67f91a 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> > @@ -232,11 +232,26 @@ static int report_error_detected(struct pci_dev
> *dev, void *data)
> >  

RE: [PATCH v3] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-28 Thread Pandarathil, Vijaymohan R
Looks correct.

Vijay

 -Original Message-
 From: Bjorn Helgaas [mailto:bhelg...@google.com]
 Sent: Wednesday, November 28, 2012 12:15 PM
 To: Pandarathil, Vijaymohan R
 Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
 linasveps...@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
 Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
 Subject: Re: [PATCH v3] PCI-AER: Do not report successful error recovery
 for devices with AER-unaware drivers
 
 On Sat, Nov 17, 2012 at 4:47 AM, Pandarathil, Vijaymohan R
 vijaymohan.pandarat...@hp.com wrote:
  When an error is detected on a PCIe device which does not have an
  AER-aware driver, prevent AER infrastructure from reporting
  successful error recovery.
 
  This is because the report_error_detected() function that gets
  called in the first phase of recovery process allows forward
  progress even when the driver for the device does not have AER
  capabilities. It seems that all callbacks (in pci_error_handlers
  structure) registered by drivers that gets called during error
  recovery are not mandatory. So the intention of the infrastructure
  design seems to be to allow forward progress even when a specific
  callback has not been registered by a driver. However, if error
  handler structure itself has not been registered, it doesn't make
  sense to allow forward progress.
 
  As a result of the current design, in the case of a single device
  having an AER-unaware driver or in the case of any function in a
  multi-function card having an AER-unaware driver, a successful
  recovery is reported.
 
  Typical scenario this happens is when a PCI device is detached
  from a KVM host and the pci-stub driver on the host claims the
  device. The pci-stub driver does not have error handling capabilities
  but the AER infrastructure still reports that the device recovered
  successfully.
 
  The changes proposed here leaves the device(s)in an unrecovered state
  if the driver for the device or for any device in the subtree
  does not have error handler structure registered. This reflects
  the true state of the device and prevents any partial recovery (or no
  recovery at all) reported as successful.
 
 
  v3:
- Fixed a checkpatch/build issue
 
  v2:
- Made changes so that all devices in the subtree have the error
  state set correctly.
 
 
  Reviewed-by: Linas Vepstas linasvepstas at gmail.com
  Reviewed-by: Myron Stowe mstowe at redhat.com
  Reviewed-by: Bjorn Helgaas bhelgaas at google.com
  Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
 
 I added this to my -next branch, which will be merged for v3.8.
 
 Please double-check that I resolved the merge conflict correctly.  It
 should show up here in the next few minutes:
 
 http://git.kernel.org/?p=linux/kernel/git/helgaas/pci.git;a=shortlog;h=refs
 /heads/next
 
   drivers/pci/pcie/aer/aerdrv.h  |  5 -
   drivers/pci/pcie/aer/aerdrv_core.c | 21 ++---
   include/linux/pci.h|  3 +++
   3 files changed, 25 insertions(+), 4 deletions(-)
 
  diff --git a/drivers/pci/pcie/aer/aerdrv.h
 b/drivers/pci/pcie/aer/aerdrv.h
  index 94a7598..22f840f 100644
  --- a/drivers/pci/pcie/aer/aerdrv.h
  +++ b/drivers/pci/pcie/aer/aerdrv.h
  @@ -87,6 +87,9 @@ struct aer_broadcast_data {
   static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
  enum pci_ers_result new)
   {
  +   if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
  +   return PCI_ERS_RESULT_NO_AER_DRIVER;
  +
  if (new == PCI_ERS_RESULT_NONE)
  return orig;
 
  @@ -97,7 +100,7 @@ static inline pci_ers_result_t merge_result(enum
 pci_ers_result orig,
  break;
  case PCI_ERS_RESULT_DISCONNECT:
  if (new == PCI_ERS_RESULT_NEED_RESET)
  -   orig = new;
  +   orig = PCI_ERS_RESULT_NEED_RESET;
  break;
  default:
  break;
  diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
 b/drivers/pci/pcie/aer/aerdrv_core.c
  index 9f80cb0..b67f91a 100644
  --- a/drivers/pci/pcie/aer/aerdrv_core.c
  +++ b/drivers/pci/pcie/aer/aerdrv_core.c
  @@ -232,11 +232,26 @@ static int report_error_detected(struct pci_dev
 *dev, void *data)
 dev-driver ?
 no AER-aware driver : no driver);
  }
  -   return 0;
  +
  +   /*
  +* If there's any device in the subtree that does not
  +* have an error_detected callback, returning
  +* PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of
  +* the subsequent mmio_enabled/slot_reset/resume
  +* callbacks of any device in the subtree. All the
  +* devices in the subtree are left in the error state
  +* without recovery

RE: [PATCH 0/4] AER-KVM: Error containment of PCI pass-thru devices assigned to KVM guests

2012-11-20 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: Tuesday, November 20, 2012 5:41 AM
> To: Pandarathil, Vijaymohan R
> Cc: k...@vger.kernel.org; linux-...@vger.kernel.org; qemu-de...@nongnu.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 0/4] AER-KVM: Error containment of PCI pass-thru
> devices assigned to KVM guests
> 
> On Tue, Nov 20, 2012 at 06:31:48AM +0000, Pandarathil, Vijaymohan R wrote:
> > Add support for error containment when a PCI pass-thru device assigned to
> a KVM
> > guest encounters an error. This is for PCIe devices/drivers that support
> AER
> > functionality. When the OS is notified of an error in a device either
> > through the firmware first approach or through an interrupt handled by
> the AER
> > root port driver, concerned subsystems are notified by invoking callbacks
> > registered by these subsystems. The device is also marked as tainted till
> the
> > corresponding driver recovery routines are successful.
> >
> > KVM module registers for a notification of such errors. In the KVM
> callback
> > routine, a global counter is incremented to keep track of the error
> > notification. Before each CPU enters guest mode to execute guest code,
> > appropriate checks are done to see if the impacted device belongs to the
> guest
> > or not. If the device belongs to the guest, qemu hypervisor for the guest
> is
> > informed and the guest is immediately brought down, thus preventing or
> > minimizing chances of any bad data being written out by the guest driver
> > after the device has encountered an error.
> 
> I'm surprised that the hypervisor would shut down the guest when PCIe
> AER kicks in for a pass-through device.  Shouldn't we pass the AER event
> into the guest and deal with it there?

Agreed. That would be the ideal behavior and is planned in a future patch.
Lack of control over the capabilities/type of the OS/drivers running in 
the guest is also a concern in passing along the event to the guest.

My understanding is that in the current implementation of Linux/KVM, these 
errors are not handled at all and can potentially cause a guest hang or 
crash or even data corruption depending on the implementation of the guest
driver for the device. As a first step, these patches make the behavior 
better by doing error containment with a predictable behavior when such
errors occur. 

> 
> The equivalent to this policy on physical hardware would be that the CPU
> is reset or the machine is powered down on AER.  That doesn't sound
> right.
> 
> Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 0/4] AER-KVM: Error containment of PCI pass-thru devices assigned to KVM guests

2012-11-20 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
 Sent: Tuesday, November 20, 2012 5:41 AM
 To: Pandarathil, Vijaymohan R
 Cc: k...@vger.kernel.org; linux-...@vger.kernel.org; qemu-de...@nongnu.org;
 linux-kernel@vger.kernel.org
 Subject: Re: [PATCH 0/4] AER-KVM: Error containment of PCI pass-thru
 devices assigned to KVM guests
 
 On Tue, Nov 20, 2012 at 06:31:48AM +, Pandarathil, Vijaymohan R wrote:
  Add support for error containment when a PCI pass-thru device assigned to
 a KVM
  guest encounters an error. This is for PCIe devices/drivers that support
 AER
  functionality. When the OS is notified of an error in a device either
  through the firmware first approach or through an interrupt handled by
 the AER
  root port driver, concerned subsystems are notified by invoking callbacks
  registered by these subsystems. The device is also marked as tainted till
 the
  corresponding driver recovery routines are successful.
 
  KVM module registers for a notification of such errors. In the KVM
 callback
  routine, a global counter is incremented to keep track of the error
  notification. Before each CPU enters guest mode to execute guest code,
  appropriate checks are done to see if the impacted device belongs to the
 guest
  or not. If the device belongs to the guest, qemu hypervisor for the guest
 is
  informed and the guest is immediately brought down, thus preventing or
  minimizing chances of any bad data being written out by the guest driver
  after the device has encountered an error.
 
 I'm surprised that the hypervisor would shut down the guest when PCIe
 AER kicks in for a pass-through device.  Shouldn't we pass the AER event
 into the guest and deal with it there?

Agreed. That would be the ideal behavior and is planned in a future patch.
Lack of control over the capabilities/type of the OS/drivers running in 
the guest is also a concern in passing along the event to the guest.

My understanding is that in the current implementation of Linux/KVM, these 
errors are not handled at all and can potentially cause a guest hang or 
crash or even data corruption depending on the implementation of the guest
driver for the device. As a first step, these patches make the behavior 
better by doing error containment with a predictable behavior when such
errors occur. 

 
 The equivalent to this policy on physical hardware would be that the CPU
 is reset or the machine is powered down on AER.  That doesn't sound
 right.
 
 Stefan
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/4] AER-KVM: Integration of KVM with AER for PCI pass-thru devices

2012-11-19 Thread Pandarathil, Vijaymohan R

- Register a notifier function to be called whenever a PCIe error is
detected by the AER subsystem.

- The notifier function bumps up a global count to keep track of the
error notifications.

- Before guest entry, each vcpu checks if there has been any new
notifications since last check. If any, check if the device impacted
is assigned to the guest. If impacted, return to qemu requesting that
the guest be brought down. If no device assigned to the guest is impacted,
sync up the per guest notified count to the global value.

- At guest start time, check if any of the PCI devices assigned to the
guest is faulty and if so, fail the guest startup.

Signed-off-by: Vijay Mohan Pandarathil 
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c  | 44 +
 include/linux/kvm_host.h|  4 
 include/uapi/linux/kvm.h|  1 +
 virt/kvm/assigned-dev.c | 34 +++
 virt/kvm/kvm_main.c | 34 +++
 6 files changed, 118 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b2e11f4..481ad94 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -951,6 +951,7 @@ enum {
  */
 asmlinkage void kvm_spurious_fault(void);
 extern bool kvm_rebooting;
+extern unsigned long kvm_aer_notified_cnt;
 
 #define kvm_handle_fault_on_reboot(insn, cleanup_insn) \
"666: " insn "\n\t" \
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4f76417..87e3c3e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5235,6 +5235,32 @@ static void process_nmi(struct kvm_vcpu *vcpu)
kvm_make_request(KVM_REQ_EVENT, vcpu);
 }
 
+/*
+ * This function checks if KVM has been notified of any PCI error since last
+ * checked by this guest. If so, it checks if any PCI device assigned to this
+ * guest has got the error. If not, adjust the per guest notified_cnt to match
+ * the global kvm notified_cnt
+ */
+static inline int kvm_aer_exit(struct kvm *kvm)
+{
+   if (kvm_aer_notified_cnt == kvm->aer_notified_cnt)
+   return 0;
+
+   /*
+* These errors are expected to be very rare. In the case
+* of an error notification, multiple vcpu threads could reach
+* here and do the device check below. However, functionally
+* it shouldn't cause a problem.
+*/
+   if (kvm_find_assigned_dev_err(kvm)) {
+   return 1;
+   } else {
+   spin_lock(>aer_lock);
+   kvm->aer_notified_cnt = kvm_aer_notified_cnt;
+   spin_unlock(>aer_lock);
+   return 0;
+   }
+}
 static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 {
int r;
@@ -5334,6 +5360,24 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
goto cancel_injection;
}
 
+   /*
+* If any of the PCI devices assigned to a guest is reported to have
+* uncorrected error, do not allow guest code to execute, instead
+* bring down the guest to contain the error. Note that there is a
+* small window here where a new error notification could come in while
+* while the check is being done or right after the check before the cpu
+* enters the guest mode. Not sure if this check needs to be after
+* kvm_guest_enter() ?
+*/
+   if (kvm_aer_exit(vcpu->kvm)) {
+   vcpu->mode = OUTSIDE_GUEST_MODE;
+   smp_wmb();
+   local_irq_enable();
+   preempt_enable();
+   r = 0;
+   vcpu->run->exit_reason = KVM_EXIT_AER_SHUTDOWN;
+   goto cancel_injection;
+   }
srcu_read_unlock(>kvm->srcu, vcpu->srcu_idx);
 
if (req_immediate_exit)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ecc5543..b3c2730 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -364,6 +364,8 @@ struct kvm {
long mmu_notifier_count;
 #endif
long tlbs_dirty;
+   spinlock_t aer_lock;
+   unsigned long aer_notified_cnt;
 };
 
 #define kvm_err(fmt, ...) \
@@ -933,6 +935,8 @@ static inline long kvm_vm_ioctl_assigned_device(struct kvm 
*kvm, unsigned ioctl,
 
 #endif
 
+int kvm_find_assigned_dev_err(struct kvm *kvm);
+
 static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu)
 {
set_bit(req, >requests);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 0a6d6ba..6263c21 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -167,6 +167,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_OSI  18
 #define KVM_EXIT_PAPR_HCALL  19
 #define KVM_EXIT_S390_UCONTROL   20
+#define KVM_EXIT_AER_SHUTDOWN 21
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
diff --git a/virt/kvm/assigned-dev.c 

[PATCH 4/4] AER-QEMU: Bring down the guest when KVM detects a PCI device error

2012-11-19 Thread Pandarathil, Vijaymohan R
- When KVM_RUN ioctl returns with an exit reason requesting a shutdown
of the guest due to a PCI device error detected by AER, shutdown the
guest immediately.

Signed-off-by: Vijay Mohan Pandarathil 
---
 kvm-all.c | 6 ++
 linux-headers/linux/kvm.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/kvm-all.c b/kvm-all.c
index b6d0483..aaff44c 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1592,6 +1592,12 @@ int kvm_cpu_exec(CPUArchState *env)
 qemu_system_reset_request();
 ret = EXCP_INTERRUPT;
 break;
+case KVM_EXIT_AER_SHUTDOWN:
+fprintf(stderr, "KVM: PCI device assigned to guest encountered "
+   "an uncorrectable error. Stopping guest\n");
+qemu_system_shutdown_request();
+ret = EXCP_INTERRUPT;
+break;
 case KVM_EXIT_UNKNOWN:
 fprintf(stderr, "KVM: unknown exit, hardware reason %" PRIx64 "\n",
 (uint64_t)run->hw.hardware_exit_reason);
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 81d2feb..64906ef 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -167,6 +167,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_OSI  18
 #define KVM_EXIT_PAPR_HCALL  19
 #define KVM_EXIT_S390_UCONTROL   20
+#define KVM_EXIT_AER_SHUTDOWN 21
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/4] AER-KVM: Error containment of PCI pass-thru devices assigned to KVM guests

2012-11-19 Thread Pandarathil, Vijaymohan R
Add support for error containment when a PCI pass-thru device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, concerned subsystems are notified by invoking callbacks
registered by these subsystems. The device is also marked as tainted till the
corresponding driver recovery routines are successful. 

KVM module registers for a notification of such errors. In the KVM callback
routine, a global counter is incremented to keep track of the error
notification. Before each CPU enters guest mode to execute guest code,
appropriate checks are done to see if the impacted device belongs to the guest
or not. If the device belongs to the guest, qemu hypervisor for the guest is
informed and the guest is immediately brought down, thus preventing or
minimizing chances of any bad data being written out by the guest driver
after the device has encountered an error.

Note that the changes here are specific to  PCI pass-thru devices and is
confined to error containment. Error recovery is not included in these set
of changes. A future set of patches is planned to address SR-IOV devices and
VFIO devices assigned to guests as well as recovery without bringing down
the guest.

---
Vijay Mohan Pandarathil(4):

 AER-PCI: Add infrastructure for notification of errors to other subsystems
 AER-GHES: Add support for error notification in firmware first approach of AER
 AER-KVM: Integration of KVM with AER for PCI pass-thru devices
 AER-QEMU: Bring down the guest when KVM detects a PCI device error

 arch/x86/include/asm/kvm_host.h|  1 +
 arch/x86/kvm/x86.c | 44 ++
 drivers/acpi/apei/ghes.c   | 41 +++
 drivers/pci/pcie/aer/aerdrv.c  | 20 +
 drivers/pci/pcie/aer/aerdrv_core.c |  9 +++-
 include/linux/aer.h|  4 
 include/linux/kvm_host.h   |  4 
 include/linux/pci.h|  2 ++
 include/uapi/linux/kvm.h   |  1 +
 virt/kvm/assigned-dev.c| 34 +
 virt/kvm/kvm_main.c| 34 +
 11 files changed, 193 insertions(+), 1 deletion(-)

 
Qemu files changed

 kvm-all.c |6 ++
 linux-headers/linux/kvm.h |1 +
 2 files changed, 7 insertions(+)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/4] AER-GHES: Add support for error notification in firmware first approach of AER

2012-11-19 Thread Pandarathil, Vijaymohan R
- When an uncorrected recoverable error is reported via an NMI in the
firmware first model of AER, invoke the callbacks registered for
notifications as well as mark the device as tainted.

- Add a new function ghes_mark_dev_err() leveraged from ghes_do_proc()
to identify the device from the error record and mark it as tainted.

Signed-off-by: Vijay Mohan Pandarathil 
---
 drivers/acpi/apei/ghes.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 1599566..7b077a7 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -502,6 +502,45 @@ static void ghes_do_proc(const struct 
acpi_hest_generic_status *estatus)
}
 }
 
+/*
+ * If an uncorrected recoverable PCIe error is reported, the corresponding
+ * PCI device is marked as tainted. The device remains tainted until the
+ * claiming driver does a recovery. The PCI device is identified from the
+ * error record.
+ */
+static void ghes_mark_dev_err(const struct acpi_hest_generic_status *estatus)
+{
+   int sev, sec_sev;
+   struct acpi_hest_generic_data *gdata;
+
+   sev = ghes_severity(estatus->error_severity);
+   apei_estatus_for_each_section(estatus, gdata) {
+   sec_sev = ghes_severity(gdata->error_severity);
+   if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
+ CPER_SEC_PCIE)) {
+   struct cper_sec_pcie *pcie_err;
+   pcie_err = (struct cper_sec_pcie *)(gdata+1);
+   if (sev == GHES_SEV_RECOVERABLE &&
+   sec_sev == GHES_SEV_RECOVERABLE &&
+   pcie_err->validation_bits &
+   CPER_PCIE_VALID_DEVICE_ID &&
+   pcie_err->validation_bits &
+   CPER_PCIE_VALID_AER_INFO) {
+   unsigned int devfn;
+   struct pci_dev *pdev;
+   devfn = PCI_DEVFN(pcie_err->device_id.device,
+ pcie_err->device_id.function);
+   pdev = pci_get_domain_bus_and_slot(
+   pcie_err->device_id.segment,
+   pcie_err->device_id.bus,
+   devfn);
+   if (!pdev)
+   continue;
+   pdev->dev_flags |= PCI_DEV_FLAGS_ERR_DETECTED;
+   }
+   }
+   }
+}
 static void __ghes_print_estatus(const char *pfx,
 const struct acpi_hest_generic *generic,
 const struct acpi_hest_generic_status *estatus)
@@ -868,6 +907,8 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs 
*regs)
estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
memcpy(estatus, ghes->estatus, len);
llist_add(_node->llnode, _estatus_llist);
+   ghes_mark_dev_err(estatus);
+   aer_notify(0, NULL);
}
 next:
 #endif
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/4] AER-PCI: Add infrastructure for notification of errors to other subsystems

2012-11-19 Thread Pandarathil, Vijaymohan R

- Adds infrastructure support for error notifications from AER subsystem
to other subsystems. The generic notifier_chain functionality is used.

- When the AER rootport driver detects an uncorrected error, invoke the
callbacks registered for notifications as well as mark the device as
tainted.

- After the recovery is successful, clear the tainted flag on the device.

Signed-off-by: Vijay Mohan Pandarathil 

---
 drivers/pci/pcie/aer/aerdrv.c  | 20 
 drivers/pci/pcie/aer/aerdrv_core.c |  9 -
 include/linux/aer.h|  4 
 include/linux/pci.h|  2 ++
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer/aerdrv.c b/drivers/pci/pcie/aer/aerdrv.c
index 030cf12..92dc54c 100644
--- a/drivers/pci/pcie/aer/aerdrv.c
+++ b/drivers/pci/pcie/aer/aerdrv.c
@@ -68,6 +68,26 @@ static struct pcie_port_service_driver aerdriver = {
 
 static int pcie_aer_disable;
 
+ATOMIC_NOTIFIER_HEAD(aer_notifier_list);
+
+void aer_notifier_register(struct notifier_block *nb)
+{
+   atomic_notifier_chain_register(_notifier_list, nb);
+}
+EXPORT_SYMBOL_GPL(aer_notifier_register);
+
+void aer_notifier_unregister(struct notifier_block *nb)
+{
+   atomic_notifier_chain_unregister(_notifier_list, nb);
+}
+EXPORT_SYMBOL_GPL(aer_notifier_unregister);
+
+void aer_notify(unsigned long val, void *v)
+{
+   atomic_notifier_call_chain(_notifier_list, val, v);
+}
+EXPORT_SYMBOL_GPL(aer_notify);
+
 void pci_no_aer(void)
 {
pcie_aer_disable = 1;   /* has priority over 'forceload' */
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c 
b/drivers/pci/pcie/aer/aerdrv_core.c
index af4e31c..be6c3ee 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -215,6 +215,8 @@ static int report_error_detected(struct pci_dev *dev, void 
*data)
 
device_lock(>dev);
dev->error_state = result_data->state;
+   dev->dev_flags |= PCI_DEV_FLAGS_ERR_DETECTED;
+   aer_notify(0, NULL);
 
if (!dev->driver ||
!dev->driver->err_handler ||
@@ -291,6 +293,7 @@ static int report_resume(struct pci_dev *dev, void *data)
 
device_lock(>dev);
dev->error_state = pci_channel_io_normal;
+   dev->dev_flags &= ~PCI_DEV_FLAGS_ERR_DETECTED;
 
if (!dev->driver ||
!dev->driver->err_handler ||
@@ -521,6 +524,7 @@ static void do_recovery(struct pci_dev *dev, int severity)
"resume",
report_resume);
 
+   dev->dev_flags &= ~PCI_DEV_FLAGS_ERR_DETECTED;
dev_info(>dev, "AER: Device recovery successful\n");
return;
 
@@ -552,8 +556,11 @@ static void handle_error_source(struct pcie_device *aerdev,
if (pos)
pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
info->status);
-   } else
+   } else {
+   dev->dev_flags |= PCI_DEV_FLAGS_ERR_DETECTED;
+   aer_notify(0, NULL);
do_recovery(dev, info->severity);
+   }
 }
 
 #ifdef CONFIG_ACPI_APEI_PCIEAER
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 544abdb..f8df468 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -54,5 +54,9 @@ extern void cper_print_aer(const char *prefix, int 
cper_severity,
 extern int cper_severity_to_aer(int cper_severity);
 extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
  int severity);
+extern void aer_notifier_register(struct notifier_block *nb);
+extern void aer_notifier_unregister(struct notifier_block *nb);
+extern void aer_notify(unsigned long val, void *v);
+
 #endif //_AER_H_
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ee21795..ab17a08 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -155,6 +155,8 @@ enum pci_dev_flags {
PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2,
/* Provide indication device is assigned by a Virtual Machine Manager */
PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
+   /* Indicates that hw has reported an uncorrected error for the device */
+   PCI_DEV_FLAGS_ERR_DETECTED = (__force pci_dev_flags_t) 8,
 };
 
 enum pci_irq_reroute_variant {
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/4] AER-PCI: Add infrastructure for notification of errors to other subsystems

2012-11-19 Thread Pandarathil, Vijaymohan R

- Adds infrastructure support for error notifications from AER subsystem
to other subsystems. The generic notifier_chain functionality is used.

- When the AER rootport driver detects an uncorrected error, invoke the
callbacks registered for notifications as well as mark the device as
tainted.

- After the recovery is successful, clear the tainted flag on the device.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com

---
 drivers/pci/pcie/aer/aerdrv.c  | 20 
 drivers/pci/pcie/aer/aerdrv_core.c |  9 -
 include/linux/aer.h|  4 
 include/linux/pci.h|  2 ++
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer/aerdrv.c b/drivers/pci/pcie/aer/aerdrv.c
index 030cf12..92dc54c 100644
--- a/drivers/pci/pcie/aer/aerdrv.c
+++ b/drivers/pci/pcie/aer/aerdrv.c
@@ -68,6 +68,26 @@ static struct pcie_port_service_driver aerdriver = {
 
 static int pcie_aer_disable;
 
+ATOMIC_NOTIFIER_HEAD(aer_notifier_list);
+
+void aer_notifier_register(struct notifier_block *nb)
+{
+   atomic_notifier_chain_register(aer_notifier_list, nb);
+}
+EXPORT_SYMBOL_GPL(aer_notifier_register);
+
+void aer_notifier_unregister(struct notifier_block *nb)
+{
+   atomic_notifier_chain_unregister(aer_notifier_list, nb);
+}
+EXPORT_SYMBOL_GPL(aer_notifier_unregister);
+
+void aer_notify(unsigned long val, void *v)
+{
+   atomic_notifier_call_chain(aer_notifier_list, val, v);
+}
+EXPORT_SYMBOL_GPL(aer_notify);
+
 void pci_no_aer(void)
 {
pcie_aer_disable = 1;   /* has priority over 'forceload' */
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c 
b/drivers/pci/pcie/aer/aerdrv_core.c
index af4e31c..be6c3ee 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -215,6 +215,8 @@ static int report_error_detected(struct pci_dev *dev, void 
*data)
 
device_lock(dev-dev);
dev-error_state = result_data-state;
+   dev-dev_flags |= PCI_DEV_FLAGS_ERR_DETECTED;
+   aer_notify(0, NULL);
 
if (!dev-driver ||
!dev-driver-err_handler ||
@@ -291,6 +293,7 @@ static int report_resume(struct pci_dev *dev, void *data)
 
device_lock(dev-dev);
dev-error_state = pci_channel_io_normal;
+   dev-dev_flags = ~PCI_DEV_FLAGS_ERR_DETECTED;
 
if (!dev-driver ||
!dev-driver-err_handler ||
@@ -521,6 +524,7 @@ static void do_recovery(struct pci_dev *dev, int severity)
resume,
report_resume);
 
+   dev-dev_flags = ~PCI_DEV_FLAGS_ERR_DETECTED;
dev_info(dev-dev, AER: Device recovery successful\n);
return;
 
@@ -552,8 +556,11 @@ static void handle_error_source(struct pcie_device *aerdev,
if (pos)
pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
info-status);
-   } else
+   } else {
+   dev-dev_flags |= PCI_DEV_FLAGS_ERR_DETECTED;
+   aer_notify(0, NULL);
do_recovery(dev, info-severity);
+   }
 }
 
 #ifdef CONFIG_ACPI_APEI_PCIEAER
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 544abdb..f8df468 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -54,5 +54,9 @@ extern void cper_print_aer(const char *prefix, int 
cper_severity,
 extern int cper_severity_to_aer(int cper_severity);
 extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
  int severity);
+extern void aer_notifier_register(struct notifier_block *nb);
+extern void aer_notifier_unregister(struct notifier_block *nb);
+extern void aer_notify(unsigned long val, void *v);
+
 #endif //_AER_H_
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ee21795..ab17a08 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -155,6 +155,8 @@ enum pci_dev_flags {
PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2,
/* Provide indication device is assigned by a Virtual Machine Manager */
PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
+   /* Indicates that hw has reported an uncorrected error for the device */
+   PCI_DEV_FLAGS_ERR_DETECTED = (__force pci_dev_flags_t) 8,
 };
 
 enum pci_irq_reroute_variant {
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/4] AER-GHES: Add support for error notification in firmware first approach of AER

2012-11-19 Thread Pandarathil, Vijaymohan R
- When an uncorrected recoverable error is reported via an NMI in the
firmware first model of AER, invoke the callbacks registered for
notifications as well as mark the device as tainted.

- Add a new function ghes_mark_dev_err() leveraged from ghes_do_proc()
to identify the device from the error record and mark it as tainted.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 drivers/acpi/apei/ghes.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 1599566..7b077a7 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -502,6 +502,45 @@ static void ghes_do_proc(const struct 
acpi_hest_generic_status *estatus)
}
 }
 
+/*
+ * If an uncorrected recoverable PCIe error is reported, the corresponding
+ * PCI device is marked as tainted. The device remains tainted until the
+ * claiming driver does a recovery. The PCI device is identified from the
+ * error record.
+ */
+static void ghes_mark_dev_err(const struct acpi_hest_generic_status *estatus)
+{
+   int sev, sec_sev;
+   struct acpi_hest_generic_data *gdata;
+
+   sev = ghes_severity(estatus-error_severity);
+   apei_estatus_for_each_section(estatus, gdata) {
+   sec_sev = ghes_severity(gdata-error_severity);
+   if (!uuid_le_cmp(*(uuid_le *)gdata-section_type,
+ CPER_SEC_PCIE)) {
+   struct cper_sec_pcie *pcie_err;
+   pcie_err = (struct cper_sec_pcie *)(gdata+1);
+   if (sev == GHES_SEV_RECOVERABLE 
+   sec_sev == GHES_SEV_RECOVERABLE 
+   pcie_err-validation_bits 
+   CPER_PCIE_VALID_DEVICE_ID 
+   pcie_err-validation_bits 
+   CPER_PCIE_VALID_AER_INFO) {
+   unsigned int devfn;
+   struct pci_dev *pdev;
+   devfn = PCI_DEVFN(pcie_err-device_id.device,
+ pcie_err-device_id.function);
+   pdev = pci_get_domain_bus_and_slot(
+   pcie_err-device_id.segment,
+   pcie_err-device_id.bus,
+   devfn);
+   if (!pdev)
+   continue;
+   pdev-dev_flags |= PCI_DEV_FLAGS_ERR_DETECTED;
+   }
+   }
+   }
+}
 static void __ghes_print_estatus(const char *pfx,
 const struct acpi_hest_generic *generic,
 const struct acpi_hest_generic_status *estatus)
@@ -868,6 +907,8 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs 
*regs)
estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
memcpy(estatus, ghes-estatus, len);
llist_add(estatus_node-llnode, ghes_estatus_llist);
+   ghes_mark_dev_err(estatus);
+   aer_notify(0, NULL);
}
 next:
 #endif
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/4] AER-KVM: Error containment of PCI pass-thru devices assigned to KVM guests

2012-11-19 Thread Pandarathil, Vijaymohan R
Add support for error containment when a PCI pass-thru device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, concerned subsystems are notified by invoking callbacks
registered by these subsystems. The device is also marked as tainted till the
corresponding driver recovery routines are successful. 

KVM module registers for a notification of such errors. In the KVM callback
routine, a global counter is incremented to keep track of the error
notification. Before each CPU enters guest mode to execute guest code,
appropriate checks are done to see if the impacted device belongs to the guest
or not. If the device belongs to the guest, qemu hypervisor for the guest is
informed and the guest is immediately brought down, thus preventing or
minimizing chances of any bad data being written out by the guest driver
after the device has encountered an error.

Note that the changes here are specific to  PCI pass-thru devices and is
confined to error containment. Error recovery is not included in these set
of changes. A future set of patches is planned to address SR-IOV devices and
VFIO devices assigned to guests as well as recovery without bringing down
the guest.

---
Vijay Mohan Pandarathil(4):

 AER-PCI: Add infrastructure for notification of errors to other subsystems
 AER-GHES: Add support for error notification in firmware first approach of AER
 AER-KVM: Integration of KVM with AER for PCI pass-thru devices
 AER-QEMU: Bring down the guest when KVM detects a PCI device error

 arch/x86/include/asm/kvm_host.h|  1 +
 arch/x86/kvm/x86.c | 44 ++
 drivers/acpi/apei/ghes.c   | 41 +++
 drivers/pci/pcie/aer/aerdrv.c  | 20 +
 drivers/pci/pcie/aer/aerdrv_core.c |  9 +++-
 include/linux/aer.h|  4 
 include/linux/kvm_host.h   |  4 
 include/linux/pci.h|  2 ++
 include/uapi/linux/kvm.h   |  1 +
 virt/kvm/assigned-dev.c| 34 +
 virt/kvm/kvm_main.c| 34 +
 11 files changed, 193 insertions(+), 1 deletion(-)

 
Qemu files changed

 kvm-all.c |6 ++
 linux-headers/linux/kvm.h |1 +
 2 files changed, 7 insertions(+)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/4] AER-KVM: Integration of KVM with AER for PCI pass-thru devices

2012-11-19 Thread Pandarathil, Vijaymohan R

- Register a notifier function to be called whenever a PCIe error is
detected by the AER subsystem.

- The notifier function bumps up a global count to keep track of the
error notifications.

- Before guest entry, each vcpu checks if there has been any new
notifications since last check. If any, check if the device impacted
is assigned to the guest. If impacted, return to qemu requesting that
the guest be brought down. If no device assigned to the guest is impacted,
sync up the per guest notified count to the global value.

- At guest start time, check if any of the PCI devices assigned to the
guest is faulty and if so, fail the guest startup.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c  | 44 +
 include/linux/kvm_host.h|  4 
 include/uapi/linux/kvm.h|  1 +
 virt/kvm/assigned-dev.c | 34 +++
 virt/kvm/kvm_main.c | 34 +++
 6 files changed, 118 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b2e11f4..481ad94 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -951,6 +951,7 @@ enum {
  */
 asmlinkage void kvm_spurious_fault(void);
 extern bool kvm_rebooting;
+extern unsigned long kvm_aer_notified_cnt;
 
 #define kvm_handle_fault_on_reboot(insn, cleanup_insn) \
666:  insn \n\t \
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4f76417..87e3c3e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5235,6 +5235,32 @@ static void process_nmi(struct kvm_vcpu *vcpu)
kvm_make_request(KVM_REQ_EVENT, vcpu);
 }
 
+/*
+ * This function checks if KVM has been notified of any PCI error since last
+ * checked by this guest. If so, it checks if any PCI device assigned to this
+ * guest has got the error. If not, adjust the per guest notified_cnt to match
+ * the global kvm notified_cnt
+ */
+static inline int kvm_aer_exit(struct kvm *kvm)
+{
+   if (kvm_aer_notified_cnt == kvm-aer_notified_cnt)
+   return 0;
+
+   /*
+* These errors are expected to be very rare. In the case
+* of an error notification, multiple vcpu threads could reach
+* here and do the device check below. However, functionally
+* it shouldn't cause a problem.
+*/
+   if (kvm_find_assigned_dev_err(kvm)) {
+   return 1;
+   } else {
+   spin_lock(kvm-aer_lock);
+   kvm-aer_notified_cnt = kvm_aer_notified_cnt;
+   spin_unlock(kvm-aer_lock);
+   return 0;
+   }
+}
 static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 {
int r;
@@ -5334,6 +5360,24 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
goto cancel_injection;
}
 
+   /*
+* If any of the PCI devices assigned to a guest is reported to have
+* uncorrected error, do not allow guest code to execute, instead
+* bring down the guest to contain the error. Note that there is a
+* small window here where a new error notification could come in while
+* while the check is being done or right after the check before the cpu
+* enters the guest mode. Not sure if this check needs to be after
+* kvm_guest_enter() ?
+*/
+   if (kvm_aer_exit(vcpu-kvm)) {
+   vcpu-mode = OUTSIDE_GUEST_MODE;
+   smp_wmb();
+   local_irq_enable();
+   preempt_enable();
+   r = 0;
+   vcpu-run-exit_reason = KVM_EXIT_AER_SHUTDOWN;
+   goto cancel_injection;
+   }
srcu_read_unlock(vcpu-kvm-srcu, vcpu-srcu_idx);
 
if (req_immediate_exit)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ecc5543..b3c2730 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -364,6 +364,8 @@ struct kvm {
long mmu_notifier_count;
 #endif
long tlbs_dirty;
+   spinlock_t aer_lock;
+   unsigned long aer_notified_cnt;
 };
 
 #define kvm_err(fmt, ...) \
@@ -933,6 +935,8 @@ static inline long kvm_vm_ioctl_assigned_device(struct kvm 
*kvm, unsigned ioctl,
 
 #endif
 
+int kvm_find_assigned_dev_err(struct kvm *kvm);
+
 static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu)
 {
set_bit(req, vcpu-requests);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 0a6d6ba..6263c21 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -167,6 +167,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_OSI  18
 #define KVM_EXIT_PAPR_HCALL  19
 #define KVM_EXIT_S390_UCONTROL   20
+#define KVM_EXIT_AER_SHUTDOWN 21
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
diff --git 

[PATCH 4/4] AER-QEMU: Bring down the guest when KVM detects a PCI device error

2012-11-19 Thread Pandarathil, Vijaymohan R
- When KVM_RUN ioctl returns with an exit reason requesting a shutdown
of the guest due to a PCI device error detected by AER, shutdown the
guest immediately.

Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 kvm-all.c | 6 ++
 linux-headers/linux/kvm.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/kvm-all.c b/kvm-all.c
index b6d0483..aaff44c 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1592,6 +1592,12 @@ int kvm_cpu_exec(CPUArchState *env)
 qemu_system_reset_request();
 ret = EXCP_INTERRUPT;
 break;
+case KVM_EXIT_AER_SHUTDOWN:
+fprintf(stderr, KVM: PCI device assigned to guest encountered 
+   an uncorrectable error. Stopping guest\n);
+qemu_system_shutdown_request();
+ret = EXCP_INTERRUPT;
+break;
 case KVM_EXIT_UNKNOWN:
 fprintf(stderr, KVM: unknown exit, hardware reason % PRIx64 \n,
 (uint64_t)run-hw.hardware_exit_reason);
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 81d2feb..64906ef 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -167,6 +167,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_OSI  18
 #define KVM_EXIT_PAPR_HCALL  19
 #define KVM_EXIT_S390_UCONTROL   20
+#define KVM_EXIT_AER_SHUTDOWN 21
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-17 Thread Pandarathil, Vijaymohan R
When an error is detected on a PCIe device which does not have an
AER-aware driver, prevent AER infrastructure from reporting
successful error recovery.

This is because the report_error_detected() function that gets
called in the first phase of recovery process allows forward
progress even when the driver for the device does not have AER
capabilities. It seems that all callbacks (in pci_error_handlers
structure) registered by drivers that gets called during error
recovery are not mandatory. So the intention of the infrastructure
design seems to be to allow forward progress even when a specific
callback has not been registered by a driver. However, if error
handler structure itself has not been registered, it doesn't make
sense to allow forward progress.

As a result of the current design, in the case of a single device
having an AER-unaware driver or in the case of any function in a
multi-function card having an AER-unaware driver, a successful
recovery is reported.

Typical scenario this happens is when a PCI device is detached
from a KVM host and the pci-stub driver on the host claims the
device. The pci-stub driver does not have error handling capabilities
but the AER infrastructure still reports that the device recovered
successfully.

The changes proposed here leaves the device(s)in an unrecovered state
if the driver for the device or for any device in the subtree
does not have error handler structure registered. This reflects
the true state of the device and prevents any partial recovery (or no
recovery at all) reported as successful.


v3:
  - Fixed a checkpatch/build issue

v2:
  - Made changes so that all devices in the subtree have the error
state set correctly.


Reviewed-by: Linas Vepstas  gmail.com>
Reviewed-by: Myron Stowe  redhat.com>
Reviewed-by: Bjorn Helgaas  google.com>
Signed-off-by: Vijay Mohan Pandarathil 
---
 drivers/pci/pcie/aer/aerdrv.h  |  5 -
 drivers/pci/pcie/aer/aerdrv_core.c | 21 ++---
 include/linux/pci.h|  3 +++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
index 94a7598..22f840f 100644
--- a/drivers/pci/pcie/aer/aerdrv.h
+++ b/drivers/pci/pcie/aer/aerdrv.h
@@ -87,6 +87,9 @@ struct aer_broadcast_data {
 static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
enum pci_ers_result new)
 {
+   if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
+   return PCI_ERS_RESULT_NO_AER_DRIVER;
+
if (new == PCI_ERS_RESULT_NONE)
return orig;
 
@@ -97,7 +100,7 @@ static inline pci_ers_result_t merge_result(enum 
pci_ers_result orig,
break;
case PCI_ERS_RESULT_DISCONNECT:
if (new == PCI_ERS_RESULT_NEED_RESET)
-   orig = new;
+   orig = PCI_ERS_RESULT_NEED_RESET;
break;
default:
break;
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c 
b/drivers/pci/pcie/aer/aerdrv_core.c
index 9f80cb0..b67f91a 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -232,11 +232,26 @@ static int report_error_detected(struct pci_dev *dev, 
void *data)
   dev->driver ?
   "no AER-aware driver" : "no driver");
}
-   return 0;
+
+   /*
+* If there's any device in the subtree that does not
+* have an error_detected callback, returning
+* PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of
+* the subsequent mmio_enabled/slot_reset/resume
+* callbacks of "any" device in the subtree. All the
+* devices in the subtree are left in the error state
+* without recovery.
+*/
+
+   if (!(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE))
+   vote = PCI_ERS_RESULT_NO_AER_DRIVER;
+   else
+   vote = PCI_ERS_RESULT_NONE;
+   } else {
+   err_handler = dev->driver->err_handler;
+   vote = err_handler->error_detected(dev, result_data->state);
}
 
-   err_handler = dev->driver->err_handler;
-   vote = err_handler->error_detected(dev, result_data->state);
result_data->result = merge_result(result_data->result, vote);
return 0;
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ab17a08..c458602 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -540,6 +540,9 @@ enum pci_ers_result {
 
/* Device driver is fully recovered and operational */
PCI_ERS_RESULT_RECOVERED = (__force pci_ers_result_t) 5,
+
+   /* No AER capabilities registered for the driver */
+   PCI_ERS_RESULT_NO_AER_DRIVER = (__force pci_ers_result_t) 6,
 };
 
 /* PCI bus error event callbacks */
-- 
1.7.11.3

--
To unsubscribe from this list: send the 

[PATCH v3] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-17 Thread Pandarathil, Vijaymohan R
When an error is detected on a PCIe device which does not have an
AER-aware driver, prevent AER infrastructure from reporting
successful error recovery.

This is because the report_error_detected() function that gets
called in the first phase of recovery process allows forward
progress even when the driver for the device does not have AER
capabilities. It seems that all callbacks (in pci_error_handlers
structure) registered by drivers that gets called during error
recovery are not mandatory. So the intention of the infrastructure
design seems to be to allow forward progress even when a specific
callback has not been registered by a driver. However, if error
handler structure itself has not been registered, it doesn't make
sense to allow forward progress.

As a result of the current design, in the case of a single device
having an AER-unaware driver or in the case of any function in a
multi-function card having an AER-unaware driver, a successful
recovery is reported.

Typical scenario this happens is when a PCI device is detached
from a KVM host and the pci-stub driver on the host claims the
device. The pci-stub driver does not have error handling capabilities
but the AER infrastructure still reports that the device recovered
successfully.

The changes proposed here leaves the device(s)in an unrecovered state
if the driver for the device or for any device in the subtree
does not have error handler structure registered. This reflects
the true state of the device and prevents any partial recovery (or no
recovery at all) reported as successful.


v3:
  - Fixed a checkpatch/build issue

v2:
  - Made changes so that all devices in the subtree have the error
state set correctly.


Reviewed-by: Linas Vepstas linasvepstas at gmail.com
Reviewed-by: Myron Stowe mstowe at redhat.com
Reviewed-by: Bjorn Helgaas bhelgaas at google.com
Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarat...@hp.com
---
 drivers/pci/pcie/aer/aerdrv.h  |  5 -
 drivers/pci/pcie/aer/aerdrv_core.c | 21 ++---
 include/linux/pci.h|  3 +++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
index 94a7598..22f840f 100644
--- a/drivers/pci/pcie/aer/aerdrv.h
+++ b/drivers/pci/pcie/aer/aerdrv.h
@@ -87,6 +87,9 @@ struct aer_broadcast_data {
 static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
enum pci_ers_result new)
 {
+   if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
+   return PCI_ERS_RESULT_NO_AER_DRIVER;
+
if (new == PCI_ERS_RESULT_NONE)
return orig;
 
@@ -97,7 +100,7 @@ static inline pci_ers_result_t merge_result(enum 
pci_ers_result orig,
break;
case PCI_ERS_RESULT_DISCONNECT:
if (new == PCI_ERS_RESULT_NEED_RESET)
-   orig = new;
+   orig = PCI_ERS_RESULT_NEED_RESET;
break;
default:
break;
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c 
b/drivers/pci/pcie/aer/aerdrv_core.c
index 9f80cb0..b67f91a 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -232,11 +232,26 @@ static int report_error_detected(struct pci_dev *dev, 
void *data)
   dev-driver ?
   no AER-aware driver : no driver);
}
-   return 0;
+
+   /*
+* If there's any device in the subtree that does not
+* have an error_detected callback, returning
+* PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of
+* the subsequent mmio_enabled/slot_reset/resume
+* callbacks of any device in the subtree. All the
+* devices in the subtree are left in the error state
+* without recovery.
+*/
+
+   if (!(dev-hdr_type  PCI_HEADER_TYPE_BRIDGE))
+   vote = PCI_ERS_RESULT_NO_AER_DRIVER;
+   else
+   vote = PCI_ERS_RESULT_NONE;
+   } else {
+   err_handler = dev-driver-err_handler;
+   vote = err_handler-error_detected(dev, result_data-state);
}
 
-   err_handler = dev-driver-err_handler;
-   vote = err_handler-error_detected(dev, result_data-state);
result_data-result = merge_result(result_data-result, vote);
return 0;
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ab17a08..c458602 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -540,6 +540,9 @@ enum pci_ers_result {
 
/* Device driver is fully recovered and operational */
PCI_ERS_RESULT_RECOVERED = (__force pci_ers_result_t) 5,
+
+   /* No AER capabilities registered for the driver */
+   PCI_ERS_RESULT_NO_AER_DRIVER = (__force pci_ers_result_t) 6,
 };
 
 /* PCI bus error event callbacks */
-- 
1.7.11.3


[PATCH v2] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-16 Thread Pandarathil, Vijaymohan R
When an error is detected on a PCIe device which does not have an
AER-aware driver, prevent AER infrastructure from reporting
successful error recovery.

This is because the report_error_detected() function that gets
called in the first phase of recovery process allows forward
progress even when the driver for the device does not have AER
capabilities. It seems that all callbacks (in pci_error_handlers
structure) registered by drivers that gets called during error
recovery are not mandatory. So the intention of the infrastructure
design seems to be to allow forward progress even when a specific
callback has not been registered by a driver. However, if error
handler structure itself has not been registered, it doesn't make
sense to allow forward progress.

As a result of the current design, in the case of a single device
having an AER-unaware driver or in the case of any function in a
multi-function card having an AER-unaware driver, a successful
recovery is reported.

Typical scenario this happens is when a PCI device is detached
from a KVM host and the pci-stub driver on the host claims the
device. The pci-stub driver does not have error handling capabilities
but the AER infrastructure still reports that the device recovered
successfully.

The changes proposed here leaves the device(s)in an unrecovered state
if the driver for the device or for any device in the subtree
does not have error handler structure registered. This reflects
the true state of the device and prevents any partial recovery (or no
recovery at all) reported as successful.

v2:
  - Made changes so that all devices in the subtree have the error
state set correctly.

Reviewed-by: Linas Vepstas  gmail.com>
Reviewed-by: Myron Stowe  redhat.com>
Reviewed-by: Bjorn Helgaas  google.com>
Signed-off-by: Vijay Mohan Pandarathil  hp.com>

---
 drivers/pci/pcie/aer/aerdrv.h  |  5 -
 drivers/pci/pcie/aer/aerdrv_core.c | 21 ++---
 include/linux/pci.h|  3 +++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
index 94a7598..22f840f 100644
--- a/drivers/pci/pcie/aer/aerdrv.h
+++ b/drivers/pci/pcie/aer/aerdrv.h
@@ -87,6 +87,9 @@ struct aer_broadcast_data {
 static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
enum pci_ers_result new)
 {
+   if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
+   return PCI_ERS_RESULT_NO_AER_DRIVER;
+
if (new == PCI_ERS_RESULT_NONE)
return orig;
 
@@ -97,7 +100,7 @@ static inline pci_ers_result_t merge_result(enum 
pci_ers_result orig,
break;
case PCI_ERS_RESULT_DISCONNECT:
if (new == PCI_ERS_RESULT_NEED_RESET)
-   orig = new;
+   orig = PCI_ERS_RESULT_NEED_RESET;
break;
default:
break;
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c 
b/drivers/pci/pcie/aer/aerdrv_core.c
index 06bad96..c1b8fdd 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -231,11 +231,26 @@ static int report_error_detected(struct pci_dev *dev, 
void *data)
   dev->driver ?
   "no AER-aware driver" : "no driver");
}
-   return 0;
+
+   /*
+* If there's any device in the subtree that does not
+* have an error_detected callback, returning
+* PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of
+* the subsequent mmio_enabled/slot_reset/resume
+* callbacks of "any" device in the subtree. All the
+* devices in the subtree are left in the error state
+* without recovery.
+*/
+
+   if !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)
+   vote = PCI_ERS_RESULT_NO_AER_DRIVER;
+   else
+   vote = PCI_ERS_RESULT_NONE;
+   } else {
+   err_handler = dev->driver->err_handler;
+   vote = err_handler->error_detected(dev, result_data->state);
}
 
-   err_handler = dev->driver->err_handler;
-   vote = err_handler->error_detected(dev, result_data->state);
result_data->result = merge_result(result_data->result, vote);
return 0;
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ee21795..fb7e869 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -538,6 +538,9 @@ enum pci_ers_result {
 
/* Device driver is fully recovered and operational */
PCI_ERS_RESULT_RECOVERED = (__force pci_ers_result_t) 5,
+
+   /* No AER capabilities registered for the driver */
+   PCI_ERS_RESULT_NO_AER_DRIVER = (__force pci_ers_result_t) 6,
 };
 
 /* PCI bus error event callbacks */
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

[PATCH v2] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-16 Thread Pandarathil, Vijaymohan R
When an error is detected on a PCIe device which does not have an
AER-aware driver, prevent AER infrastructure from reporting
successful error recovery.

This is because the report_error_detected() function that gets
called in the first phase of recovery process allows forward
progress even when the driver for the device does not have AER
capabilities. It seems that all callbacks (in pci_error_handlers
structure) registered by drivers that gets called during error
recovery are not mandatory. So the intention of the infrastructure
design seems to be to allow forward progress even when a specific
callback has not been registered by a driver. However, if error
handler structure itself has not been registered, it doesn't make
sense to allow forward progress.

As a result of the current design, in the case of a single device
having an AER-unaware driver or in the case of any function in a
multi-function card having an AER-unaware driver, a successful
recovery is reported.

Typical scenario this happens is when a PCI device is detached
from a KVM host and the pci-stub driver on the host claims the
device. The pci-stub driver does not have error handling capabilities
but the AER infrastructure still reports that the device recovered
successfully.

The changes proposed here leaves the device(s)in an unrecovered state
if the driver for the device or for any device in the subtree
does not have error handler structure registered. This reflects
the true state of the device and prevents any partial recovery (or no
recovery at all) reported as successful.

v2:
  - Made changes so that all devices in the subtree have the error
state set correctly.

Reviewed-by: Linas Vepstas linasvepstas at gmail.com
Reviewed-by: Myron Stowe mstowe at redhat.com
Reviewed-by: Bjorn Helgaas bhelgaas at google.com
Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarathil at hp.com

---
 drivers/pci/pcie/aer/aerdrv.h  |  5 -
 drivers/pci/pcie/aer/aerdrv_core.c | 21 ++---
 include/linux/pci.h|  3 +++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
index 94a7598..22f840f 100644
--- a/drivers/pci/pcie/aer/aerdrv.h
+++ b/drivers/pci/pcie/aer/aerdrv.h
@@ -87,6 +87,9 @@ struct aer_broadcast_data {
 static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
enum pci_ers_result new)
 {
+   if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
+   return PCI_ERS_RESULT_NO_AER_DRIVER;
+
if (new == PCI_ERS_RESULT_NONE)
return orig;
 
@@ -97,7 +100,7 @@ static inline pci_ers_result_t merge_result(enum 
pci_ers_result orig,
break;
case PCI_ERS_RESULT_DISCONNECT:
if (new == PCI_ERS_RESULT_NEED_RESET)
-   orig = new;
+   orig = PCI_ERS_RESULT_NEED_RESET;
break;
default:
break;
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c 
b/drivers/pci/pcie/aer/aerdrv_core.c
index 06bad96..c1b8fdd 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -231,11 +231,26 @@ static int report_error_detected(struct pci_dev *dev, 
void *data)
   dev-driver ?
   no AER-aware driver : no driver);
}
-   return 0;
+
+   /*
+* If there's any device in the subtree that does not
+* have an error_detected callback, returning
+* PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of
+* the subsequent mmio_enabled/slot_reset/resume
+* callbacks of any device in the subtree. All the
+* devices in the subtree are left in the error state
+* without recovery.
+*/
+
+   if !(dev-hdr_type  PCI_HEADER_TYPE_BRIDGE)
+   vote = PCI_ERS_RESULT_NO_AER_DRIVER;
+   else
+   vote = PCI_ERS_RESULT_NONE;
+   } else {
+   err_handler = dev-driver-err_handler;
+   vote = err_handler-error_detected(dev, result_data-state);
}
 
-   err_handler = dev-driver-err_handler;
-   vote = err_handler-error_detected(dev, result_data-state);
result_data-result = merge_result(result_data-result, vote);
return 0;
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ee21795..fb7e869 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -538,6 +538,9 @@ enum pci_ers_result {
 
/* Device driver is fully recovered and operational */
PCI_ERS_RESULT_RECOVERED = (__force pci_ers_result_t) 5,
+
+   /* No AER capabilities registered for the driver */
+   PCI_ERS_RESULT_NO_AER_DRIVER = (__force pci_ers_result_t) 6,
 };
 
 /* PCI bus error event callbacks */
-- 
1.7.11.3

--
To unsubscribe from this list: send 

RE: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-15 Thread Pandarathil, Vijaymohan R


> -Original Message-
> From: Bjorn Helgaas [mailto:bhelg...@google.com]
> Sent: Thursday, November 15, 2012 5:09 PM
> To: Pandarathil, Vijaymohan R
> Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
> linasveps...@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
> Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
> Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
> recovery for devices with AER-unaware drivers
> 
> On Thu, Nov 15, 2012 at 12:09 AM, Pandarathil, Vijaymohan R
>  wrote:
> > Thanks for the comments. See my response below.
> >
> >> -Original Message-
> >> From: Bjorn Helgaas [mailto:bhelg...@google.com]
> >> Sent: Wednesday, November 14, 2012 4:51 PM
> >> To: Pandarathil, Vijaymohan R
> >> Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
> >> linasveps...@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying;
> Hidetoshi
> >> Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
> >> Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
> >> recovery for devices with AER-unaware drivers
> >>
> >> [+cc Lance, Huang, Hidetoshi, Andrew, Zhang]
> >>
> >> On Sat, Nov 10, 2012 at 07:41:04AM +, Pandarathil, Vijaymohan R
> wrote:
> >> > When an error is detected on a PCIe device which does not have an
> >> > AER-aware driver, prevent AER infrastructure from reporting
> >> > successful error recovery.
> >> >
> >> > This is because the report_error_detected() function that gets
> >> > called in the first phase of recovery process allows forward
> >> > progress even when the driver for the device does not have AER
> >> > capabilities. It seems that all callbacks (in pci_error_handlers
> >> > structure) registered by drivers that gets called during error
> >> > recovery are not mandatory. So the intention of the infrastructure
> >> > design seems to be to allow forward progress even when a specific
> >> > callback has not been registered by a driver. However, if error
> >> > handler structure itself has not been registered, it doesn't make
> >> > sense to allow forward progress.
> >> >
> >> > As a result of the current design, in the case of a single device
> >> > having an AER-unaware driver or in the case of any function in a
> >> > multi-function card having an AER-unaware driver, a successful
> >> > recovery is reported.
> >> >
> >> > Typical scenario this happens is when a PCI device is detached
> >> > from a KVM host and the pci-stub driver on the host claims the
> >> > device. The pci-stub driver does not have error handling capabilities
> >> > but the AER infrastructure still reports that the device recovered
> >> > successfully.
> >> >
> >> > The changes proposed here leaves the device in an unrecovered state
> >> > if the driver for the device or for any function in a multi-function
> >> > card does not have error handler structure registered. This reflects
> >> > the true state of the device and prevents any partial recovery (or no
> >> > recovery at all) reported as successful.
> >> >
> >> > Please also see comments from Linas Vepstas at the following link
> >> > http://www.spinics.net/lists/linux-pci/msg18572.html
> >> >
> >> > Reviewed-by: Linas Vepstas  gmail.com>
> >> > Reviewed-by: Myron Stowe  redhat.com>
> >> > Signed-off-by: Vijay Mohan Pandarathil 
> >> hp.com>
> >> >
> >> > ---
> >> >
> >> > drivers/pci/pcie/aer/aerdrv_core.c | 6 ++
> >> >  1 file changed, 6 insertions(+)
> >> >
> >> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> >> b/drivers/pci/pcie/aer/aerdrv_core.c
> >> > index 06bad96..030b229 100644
> >> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> >> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> >> > @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev
> >> *dev, void *data)
> >> >
> >> > dev->error_state = result_data->state;
> >> >
> >> > +   if ((!dev->driver || !dev->driver->err_handler) &&
> >> > +   !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
> >> > +   dev_info(>dev, "AER: Error detected but no driver 

RE: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-15 Thread Pandarathil, Vijaymohan R


 -Original Message-
 From: Bjorn Helgaas [mailto:bhelg...@google.com]
 Sent: Thursday, November 15, 2012 5:09 PM
 To: Pandarathil, Vijaymohan R
 Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
 linasveps...@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
 Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
 Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
 recovery for devices with AER-unaware drivers
 
 On Thu, Nov 15, 2012 at 12:09 AM, Pandarathil, Vijaymohan R
 vijaymohan.pandarat...@hp.com wrote:
  Thanks for the comments. See my response below.
 
  -Original Message-
  From: Bjorn Helgaas [mailto:bhelg...@google.com]
  Sent: Wednesday, November 14, 2012 4:51 PM
  To: Pandarathil, Vijaymohan R
  Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
  linasveps...@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying;
 Hidetoshi
  Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
  Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
  recovery for devices with AER-unaware drivers
 
  [+cc Lance, Huang, Hidetoshi, Andrew, Zhang]
 
  On Sat, Nov 10, 2012 at 07:41:04AM +, Pandarathil, Vijaymohan R
 wrote:
   When an error is detected on a PCIe device which does not have an
   AER-aware driver, prevent AER infrastructure from reporting
   successful error recovery.
  
   This is because the report_error_detected() function that gets
   called in the first phase of recovery process allows forward
   progress even when the driver for the device does not have AER
   capabilities. It seems that all callbacks (in pci_error_handlers
   structure) registered by drivers that gets called during error
   recovery are not mandatory. So the intention of the infrastructure
   design seems to be to allow forward progress even when a specific
   callback has not been registered by a driver. However, if error
   handler structure itself has not been registered, it doesn't make
   sense to allow forward progress.
  
   As a result of the current design, in the case of a single device
   having an AER-unaware driver or in the case of any function in a
   multi-function card having an AER-unaware driver, a successful
   recovery is reported.
  
   Typical scenario this happens is when a PCI device is detached
   from a KVM host and the pci-stub driver on the host claims the
   device. The pci-stub driver does not have error handling capabilities
   but the AER infrastructure still reports that the device recovered
   successfully.
  
   The changes proposed here leaves the device in an unrecovered state
   if the driver for the device or for any function in a multi-function
   card does not have error handler structure registered. This reflects
   the true state of the device and prevents any partial recovery (or no
   recovery at all) reported as successful.
  
   Please also see comments from Linas Vepstas at the following link
   http://www.spinics.net/lists/linux-pci/msg18572.html
  
   Reviewed-by: Linas Vepstas linasvepstas at gmail.com
   Reviewed-by: Myron Stowe mstowe at redhat.com
   Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarathil at
  hp.com
  
   ---
  
   drivers/pci/pcie/aer/aerdrv_core.c | 6 ++
1 file changed, 6 insertions(+)
  
   diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
  b/drivers/pci/pcie/aer/aerdrv_core.c
   index 06bad96..030b229 100644
   --- a/drivers/pci/pcie/aer/aerdrv_core.c
   +++ b/drivers/pci/pcie/aer/aerdrv_core.c
   @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev
  *dev, void *data)
  
   dev-error_state = result_data-state;
  
   +   if ((!dev-driver || !dev-driver-err_handler) 
   +   !(dev-hdr_type  PCI_HEADER_TYPE_BRIDGE)) {
   +   dev_info(dev-dev, AER: Error detected but no driver has
  claimed this device or the driver is AER-unaware\n);
   +   result_data-result = PCI_ERS_RESULT_NONE;
   +   return 1;
 
  This doesn't seem right because returning 1 here causes pci_walk_bus()
  to terminate, which means we won't set dev-error_state or notify
  drivers for any devices we haven't visited yet.
 
   +   }
   if (!dev-driver ||
   !dev-driver-err_handler ||
   !dev-driver-err_handler-error_detected) {
 
  If none of the drivers in the affected hierarchy supports error
 handling,
  I think the call tree looks like this:
 
  do_recovery # uncorrectable only
  broadcast_error_message(dev, ..., report_error_detected)
  result_data.result = CAN_RECOVER
  pci_walk_bus(..., report_error_detected)
  report_error_detected   # (each dev in subtree)
  return 0# no change to result
  return result_data.result
  broadcast_error_message(dev, ..., report_mmio_enabled)
  result_data.result = PCI_ERS_RESULT_RECOVERED

RE: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-14 Thread Pandarathil, Vijaymohan R
Thanks for the comments. See my response below.

> -Original Message-
> From: Bjorn Helgaas [mailto:bhelg...@google.com]
> Sent: Wednesday, November 14, 2012 4:51 PM
> To: Pandarathil, Vijaymohan R
> Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
> linasveps...@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
> Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
> Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
> recovery for devices with AER-unaware drivers
> 
> [+cc Lance, Huang, Hidetoshi, Andrew, Zhang]
> 
> On Sat, Nov 10, 2012 at 07:41:04AM +, Pandarathil, Vijaymohan R wrote:
> > When an error is detected on a PCIe device which does not have an
> > AER-aware driver, prevent AER infrastructure from reporting
> > successful error recovery.
> >
> > This is because the report_error_detected() function that gets
> > called in the first phase of recovery process allows forward
> > progress even when the driver for the device does not have AER
> > capabilities. It seems that all callbacks (in pci_error_handlers
> > structure) registered by drivers that gets called during error
> > recovery are not mandatory. So the intention of the infrastructure
> > design seems to be to allow forward progress even when a specific
> > callback has not been registered by a driver. However, if error
> > handler structure itself has not been registered, it doesn't make
> > sense to allow forward progress.
> >
> > As a result of the current design, in the case of a single device
> > having an AER-unaware driver or in the case of any function in a
> > multi-function card having an AER-unaware driver, a successful
> > recovery is reported.
> >
> > Typical scenario this happens is when a PCI device is detached
> > from a KVM host and the pci-stub driver on the host claims the
> > device. The pci-stub driver does not have error handling capabilities
> > but the AER infrastructure still reports that the device recovered
> > successfully.
> >
> > The changes proposed here leaves the device in an unrecovered state
> > if the driver for the device or for any function in a multi-function
> > card does not have error handler structure registered. This reflects
> > the true state of the device and prevents any partial recovery (or no
> > recovery at all) reported as successful.
> >
> > Please also see comments from Linas Vepstas at the following link
> > http://www.spinics.net/lists/linux-pci/msg18572.html
> >
> > Reviewed-by: Linas Vepstas  gmail.com>
> > Reviewed-by: Myron Stowe  redhat.com>
> > Signed-off-by: Vijay Mohan Pandarathil 
> hp.com>
> >
> > ---
> >
> > drivers/pci/pcie/aer/aerdrv_core.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> b/drivers/pci/pcie/aer/aerdrv_core.c
> > index 06bad96..030b229 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> > @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev
> *dev, void *data)
> >
> > dev->error_state = result_data->state;
> >
> > +   if ((!dev->driver || !dev->driver->err_handler) &&
> > +   !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
> > +   dev_info(>dev, "AER: Error detected but no driver has
> claimed this device or the driver is AER-unaware\n");
> > +   result_data->result = PCI_ERS_RESULT_NONE;
> > +   return 1;
> 
> This doesn't seem right because returning 1 here causes pci_walk_bus()
> to terminate, which means we won't set dev->error_state or notify
> drivers for any devices we haven't visited yet.
> 
> > +   }
> > if (!dev->driver ||
> > !dev->driver->err_handler ||
> > !dev->driver->err_handler->error_detected) {
> 
> If none of the drivers in the affected hierarchy supports error handling,
> I think the call tree looks like this:
> 
> do_recovery # uncorrectable only
> broadcast_error_message(dev, ..., report_error_detected)
> result_data.result = CAN_RECOVER
> pci_walk_bus(..., report_error_detected)
> report_error_detected   # (each dev in subtree)
> return 0# no change to result
> return result_data.result
> broadcast_error_message(dev, ..., report_mmio_enabled)
> result_data.result = PCI_ERS_RESULT_RECOVERED
> 

RE: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-14 Thread Pandarathil, Vijaymohan R
Thanks for the comments. See my response below.

 -Original Message-
 From: Bjorn Helgaas [mailto:bhelg...@google.com]
 Sent: Wednesday, November 14, 2012 4:51 PM
 To: Pandarathil, Vijaymohan R
 Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
 linasveps...@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
 Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
 Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
 recovery for devices with AER-unaware drivers
 
 [+cc Lance, Huang, Hidetoshi, Andrew, Zhang]
 
 On Sat, Nov 10, 2012 at 07:41:04AM +, Pandarathil, Vijaymohan R wrote:
  When an error is detected on a PCIe device which does not have an
  AER-aware driver, prevent AER infrastructure from reporting
  successful error recovery.
 
  This is because the report_error_detected() function that gets
  called in the first phase of recovery process allows forward
  progress even when the driver for the device does not have AER
  capabilities. It seems that all callbacks (in pci_error_handlers
  structure) registered by drivers that gets called during error
  recovery are not mandatory. So the intention of the infrastructure
  design seems to be to allow forward progress even when a specific
  callback has not been registered by a driver. However, if error
  handler structure itself has not been registered, it doesn't make
  sense to allow forward progress.
 
  As a result of the current design, in the case of a single device
  having an AER-unaware driver or in the case of any function in a
  multi-function card having an AER-unaware driver, a successful
  recovery is reported.
 
  Typical scenario this happens is when a PCI device is detached
  from a KVM host and the pci-stub driver on the host claims the
  device. The pci-stub driver does not have error handling capabilities
  but the AER infrastructure still reports that the device recovered
  successfully.
 
  The changes proposed here leaves the device in an unrecovered state
  if the driver for the device or for any function in a multi-function
  card does not have error handler structure registered. This reflects
  the true state of the device and prevents any partial recovery (or no
  recovery at all) reported as successful.
 
  Please also see comments from Linas Vepstas at the following link
  http://www.spinics.net/lists/linux-pci/msg18572.html
 
  Reviewed-by: Linas Vepstas linasvepstas at gmail.com
  Reviewed-by: Myron Stowe mstowe at redhat.com
  Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarathil at
 hp.com
 
  ---
 
  drivers/pci/pcie/aer/aerdrv_core.c | 6 ++
   1 file changed, 6 insertions(+)
 
  diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
 b/drivers/pci/pcie/aer/aerdrv_core.c
  index 06bad96..030b229 100644
  --- a/drivers/pci/pcie/aer/aerdrv_core.c
  +++ b/drivers/pci/pcie/aer/aerdrv_core.c
  @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev
 *dev, void *data)
 
  dev-error_state = result_data-state;
 
  +   if ((!dev-driver || !dev-driver-err_handler) 
  +   !(dev-hdr_type  PCI_HEADER_TYPE_BRIDGE)) {
  +   dev_info(dev-dev, AER: Error detected but no driver has
 claimed this device or the driver is AER-unaware\n);
  +   result_data-result = PCI_ERS_RESULT_NONE;
  +   return 1;
 
 This doesn't seem right because returning 1 here causes pci_walk_bus()
 to terminate, which means we won't set dev-error_state or notify
 drivers for any devices we haven't visited yet.
 
  +   }
  if (!dev-driver ||
  !dev-driver-err_handler ||
  !dev-driver-err_handler-error_detected) {
 
 If none of the drivers in the affected hierarchy supports error handling,
 I think the call tree looks like this:
 
 do_recovery # uncorrectable only
 broadcast_error_message(dev, ..., report_error_detected)
 result_data.result = CAN_RECOVER
 pci_walk_bus(..., report_error_detected)
 report_error_detected   # (each dev in subtree)
 return 0# no change to result
 return result_data.result
 broadcast_error_message(dev, ..., report_mmio_enabled)
 result_data.result = PCI_ERS_RESULT_RECOVERED
 pci_walk_bus(..., report_mmio_enabled)
 report_mmio_enabled # (each dev in subtree)
 return 0# no change to result
 dev_info(recovery successful)
 
 Specifically, there are no error_handler functions, so we never change
 result_data.result, and the default is that we treat the error as
 recovered successfully.  That seems broken.  An uncorrectable error
 is by definition recoverable only by device-specific software, i.e.,
 the driver.  We didn't call any drivers, so we can't have recovered
 anything.
 
 What do you think of the following alternative?  I don't know why you
 checked

[ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-09 Thread Pandarathil, Vijaymohan R
When an error is detected on a PCIe device which does not have an
AER-aware driver, prevent AER infrastructure from reporting
successful error recovery.

This is because the report_error_detected() function that gets
called in the first phase of recovery process allows forward
progress even when the driver for the device does not have AER
capabilities. It seems that all callbacks (in pci_error_handlers
structure) registered by drivers that gets called during error
recovery are not mandatory. So the intention of the infrastructure
design seems to be to allow forward progress even when a specific
callback has not been registered by a driver. However, if error
handler structure itself has not been registered, it doesn't make
sense to allow forward progress.

As a result of the current design, in the case of a single device
having an AER-unaware driver or in the case of any function in a
multi-function card having an AER-unaware driver, a successful
recovery is reported.

Typical scenario this happens is when a PCI device is detached
from a KVM host and the pci-stub driver on the host claims the
device. The pci-stub driver does not have error handling capabilities
but the AER infrastructure still reports that the device recovered
successfully.

The changes proposed here leaves the device in an unrecovered state
if the driver for the device or for any function in a multi-function
card does not have error handler structure registered. This reflects
the true state of the device and prevents any partial recovery (or no
recovery at all) reported as successful.

Please also see comments from Linas Vepstas at the following link
http://www.spinics.net/lists/linux-pci/msg18572.html

Reviewed-by: Linas Vepstas  gmail.com>
Reviewed-by: Myron Stowe  redhat.com>
Signed-off-by: Vijay Mohan Pandarathil  hp.com>

---

drivers/pci/pcie/aer/aerdrv_core.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/pci/pcie/aer/aerdrv_core.c 
b/drivers/pci/pcie/aer/aerdrv_core.c
index 06bad96..030b229 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev *dev, void 
*data)
 
dev->error_state = result_data->state;
 
+   if ((!dev->driver || !dev->driver->err_handler) &&
+   !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
+   dev_info(>dev, "AER: Error detected but no driver has 
claimed this device or the driver is AER-unaware\n");
+   result_data->result = PCI_ERS_RESULT_NONE;
+   return 1;
+   }
if (!dev->driver ||
!dev->driver->err_handler ||
!dev->driver->err_handler->error_detected) {
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

2012-11-09 Thread Pandarathil, Vijaymohan R
When an error is detected on a PCIe device which does not have an
AER-aware driver, prevent AER infrastructure from reporting
successful error recovery.

This is because the report_error_detected() function that gets
called in the first phase of recovery process allows forward
progress even when the driver for the device does not have AER
capabilities. It seems that all callbacks (in pci_error_handlers
structure) registered by drivers that gets called during error
recovery are not mandatory. So the intention of the infrastructure
design seems to be to allow forward progress even when a specific
callback has not been registered by a driver. However, if error
handler structure itself has not been registered, it doesn't make
sense to allow forward progress.

As a result of the current design, in the case of a single device
having an AER-unaware driver or in the case of any function in a
multi-function card having an AER-unaware driver, a successful
recovery is reported.

Typical scenario this happens is when a PCI device is detached
from a KVM host and the pci-stub driver on the host claims the
device. The pci-stub driver does not have error handling capabilities
but the AER infrastructure still reports that the device recovered
successfully.

The changes proposed here leaves the device in an unrecovered state
if the driver for the device or for any function in a multi-function
card does not have error handler structure registered. This reflects
the true state of the device and prevents any partial recovery (or no
recovery at all) reported as successful.

Please also see comments from Linas Vepstas at the following link
http://www.spinics.net/lists/linux-pci/msg18572.html

Reviewed-by: Linas Vepstas linasvepstas at gmail.com
Reviewed-by: Myron Stowe mstowe at redhat.com
Signed-off-by: Vijay Mohan Pandarathil vijaymohan.pandarathil at hp.com

---

drivers/pci/pcie/aer/aerdrv_core.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/pci/pcie/aer/aerdrv_core.c 
b/drivers/pci/pcie/aer/aerdrv_core.c
index 06bad96..030b229 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev *dev, void 
*data)
 
dev-error_state = result_data-state;
 
+   if ((!dev-driver || !dev-driver-err_handler) 
+   !(dev-hdr_type  PCI_HEADER_TYPE_BRIDGE)) {
+   dev_info(dev-dev, AER: Error detected but no driver has 
claimed this device or the driver is AER-unaware\n);
+   result_data-result = PCI_ERS_RESULT_NONE;
+   return 1;
+   }
if (!dev-driver ||
!dev-driver-err_handler ||
!dev-driver-err_handler-error_detected) {
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v2 0/4] ACPI: hotplug messages improvement

2012-07-25 Thread Pandarathil, Vijaymohan R
Hi Toshi,

Tested your patches on a KVM setup. Since all your acpi_pr* macros are in the 
error path, I didn't see an easy way to trigger them. Instead added an 
acpi_pr_err() message in the success path and tested out vcpu addition/deletion 
sequence. No regressions seen in the functional tests and the ACPI err message 
comes out on the console and message buffer with a valid ACPI device path.

Vijay

Tested-by: Vijay Mohan Pandarathil


-Original Message-
From: Kani, Toshimitsu 
Sent: Friday, July 20, 2012 9:54 AM
To: l...@kernel.org; linux-a...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org; j...@perches.com; bhelg...@google.com; 
isimatu.yasu...@jp.fujitsu.com; liu...@gmail.com; 
srivatsa.b...@linux.vnet.ibm.com; pra...@redhat.com; imamm...@redhat.com; 
Pandarathil, Vijaymohan R; Kani, Toshimitsu
Subject: [PATCH v2 0/4] ACPI: hotplug messages improvement

This patchset improves logging messages for ACPI CPU, Memory, and
Container hotplug notify handlers.  The patchset introduces a set of
new macro interfaces, acpi_pr_(), and updates the notify
handlers to use them.  acpi_pr_() appends "ACPI" prefix and
ACPI object path to the messages.  This improves diagnostics in
hotplug operations since it identifies an object that caused an
issue in a log file.

v2:
 - Set buffer.pointer to NULL in acpi_printk().
 - Added acpi_pr_debug().

---
This patchset applies on top of the patch below.

[PATCH] ACPI: Add ACPI CPU hot-remove support
http://marc.info/?l=linux-acpi=134098193327362=2

---
Toshi Kani (4):
 ACPI: Add acpi_pr_() interfaces
 ACPI: Update CPU hotplug messages
 ACPI: Update Memory hotplug messages
 ACPI: Update Container hotplug messages

---
 drivers/acpi/acpi_memhotplug.c  |   24 
 drivers/acpi/container.c|6 +++---
 drivers/acpi/processor_driver.c |   36 +---
 drivers/acpi/utils.c|   34 ++
 include/acpi/acpi_bus.h |   20 
 5 files changed, 90 insertions(+), 30 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v2 0/4] ACPI: hotplug messages improvement

2012-07-25 Thread Pandarathil, Vijaymohan R
Hi Toshi,

Tested your patches on a KVM setup. Since all your acpi_pr* macros are in the 
error path, I didn't see an easy way to trigger them. Instead added an 
acpi_pr_err() message in the success path and tested out vcpu addition/deletion 
sequence. No regressions seen in the functional tests and the ACPI err message 
comes out on the console and message buffer with a valid ACPI device path.

Vijay

Tested-by: Vijay Mohan Pandarathilvijaymohan.pandarat...@hp.com


-Original Message-
From: Kani, Toshimitsu 
Sent: Friday, July 20, 2012 9:54 AM
To: l...@kernel.org; linux-a...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org; j...@perches.com; bhelg...@google.com; 
isimatu.yasu...@jp.fujitsu.com; liu...@gmail.com; 
srivatsa.b...@linux.vnet.ibm.com; pra...@redhat.com; imamm...@redhat.com; 
Pandarathil, Vijaymohan R; Kani, Toshimitsu
Subject: [PATCH v2 0/4] ACPI: hotplug messages improvement

This patchset improves logging messages for ACPI CPU, Memory, and
Container hotplug notify handlers.  The patchset introduces a set of
new macro interfaces, acpi_pr_level(), and updates the notify
handlers to use them.  acpi_pr_level() appends ACPI prefix and
ACPI object path to the messages.  This improves diagnostics in
hotplug operations since it identifies an object that caused an
issue in a log file.

v2:
 - Set buffer.pointer to NULL in acpi_printk().
 - Added acpi_pr_debug().

---
This patchset applies on top of the patch below.

[PATCH] ACPI: Add ACPI CPU hot-remove support
http://marc.info/?l=linux-acpim=134098193327362w=2

---
Toshi Kani (4):
 ACPI: Add acpi_pr_level() interfaces
 ACPI: Update CPU hotplug messages
 ACPI: Update Memory hotplug messages
 ACPI: Update Container hotplug messages

---
 drivers/acpi/acpi_memhotplug.c  |   24 
 drivers/acpi/container.c|6 +++---
 drivers/acpi/processor_driver.c |   36 +---
 drivers/acpi/utils.c|   34 ++
 include/acpi/acpi_bus.h |   20 
 5 files changed, 90 insertions(+), 30 deletions(-)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] ACPI: Add ACPI CPU hot-remove support

2012-07-15 Thread Pandarathil, Vijaymohan R
Hi Toshi,

Did some basic KVM guest cpu hotplug testing of this patch over your OST 
patchset along with fixes in qemu-kvm for guest cpu hotplug. 

Vijay

virsh # vcpucount vmb91g1
maximum  config 4
maximum  live   4
current  config 4
current  live   4

virsh # setvcpus vmb91g1 3

virsh # vcpucount vmb91g1
maximum  config 4
maximum  live   4
current  config 4
current  live   3

virsh # setvcpus vmb91g1 4

virsh # vcpucount vmb91g1
maximum  config 4
maximum  live   4
current  config 4
current  live   4

virsh #  setvcpus vmb91g1 2

virsh # vcpucount vmb91g1
maximum  config 4
maximum  live   4
current  config 4
current  live   2

The guest cpu counts (as shown by lscpu in the guest) also changes.


Tested-by: Vijay Mohan Pandarathil


-Original Message-
From: Kani, Toshimitsu 
Sent: Friday, June 29, 2012 7:51 AM
To: l...@kernel.org; linux-a...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org; Kani, Toshimitsu
Subject: [PATCH] ACPI: Add ACPI CPU hot-remove support

Added CPU hot-remove support through an ACPI eject notification.
It calls acpi_bus_hot_remove_device(), which shares the same code
path with the sysfs eject operation.  acpi_os_hotplug_execute()
serializes hot-remove operations between ACPI hot-remove and sysfs
eject requests.

Signed-off-by: Toshi Kani 

---
This patch applies on top of the patchset below.

[PATCH v6 0/6] ACPI: Add _OST support for ACPI hotplug
http://marc.info/?l=linux-acpi=134074381322973=2

---
 drivers/acpi/processor_driver.c |   27 +--
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index f9fa1b2..a6f6bde 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -699,8 +699,8 @@ int acpi_processor_device_add(acpi_handle handle, struct 
acpi_device **device)
 static void acpi_processor_hotplug_notify(acpi_handle handle,
  u32 event, void *data)
 {
-   struct acpi_processor *pr;
struct acpi_device *device = NULL;
+   struct acpi_eject_event *ej_event = NULL;
u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
int result;
 
@@ -732,20 +732,27 @@ static void acpi_processor_hotplug_notify(acpi_handle 
handle,
  "received ACPI_NOTIFY_EJECT_REQUEST\n"));
 
if (acpi_bus_get_device(handle, )) {
-   printk(KERN_ERR PREFIX
-   "Device don't exist, dropping EJECT\n");
+   pr_err(PREFIX "Device don't exist, dropping EJECT\n");
break;
}
-   pr = acpi_driver_data(device);
-   if (!pr) {
-   printk(KERN_ERR PREFIX
-   "Driver data is NULL, dropping EJECT\n");
+   if (!acpi_driver_data(device)) {
+   pr_err(PREFIX "Driver data is NULL, dropping EJECT\n");
break;
}
 
-   /* REVISIT: update when eject is supported */
-   ost_code = ACPI_OST_SC_EJECT_NOT_SUPPORTED;
-   break;
+   ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
+   if (!ej_event) {
+   pr_err(PREFIX "No memory, dropping EJECT\n");
+   break;
+   }
+
+   ej_event->handle = handle;
+   ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
+   acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
+   (void *)ej_event);
+
+   /* eject is performed asynchronously */
+   return;
 
default:
ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-- 
1.7.7.6



RE: [PATCH] ACPI: Add ACPI CPU hot-remove support

2012-07-15 Thread Pandarathil, Vijaymohan R
Hi Toshi,

Did some basic KVM guest cpu hotplug testing of this patch over your OST 
patchset along with fixes in qemu-kvm for guest cpu hotplug. 

Vijay

virsh # vcpucount vmb91g1
maximum  config 4
maximum  live   4
current  config 4
current  live   4

virsh # setvcpus vmb91g1 3

virsh # vcpucount vmb91g1
maximum  config 4
maximum  live   4
current  config 4
current  live   3

virsh # setvcpus vmb91g1 4

virsh # vcpucount vmb91g1
maximum  config 4
maximum  live   4
current  config 4
current  live   4

virsh #  setvcpus vmb91g1 2

virsh # vcpucount vmb91g1
maximum  config 4
maximum  live   4
current  config 4
current  live   2

The guest cpu counts (as shown by lscpu in the guest) also changes.


Tested-by: Vijay Mohan Pandarathilvijaymohan.pandarat...@hp.com


-Original Message-
From: Kani, Toshimitsu 
Sent: Friday, June 29, 2012 7:51 AM
To: l...@kernel.org; linux-a...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org; Kani, Toshimitsu
Subject: [PATCH] ACPI: Add ACPI CPU hot-remove support

Added CPU hot-remove support through an ACPI eject notification.
It calls acpi_bus_hot_remove_device(), which shares the same code
path with the sysfs eject operation.  acpi_os_hotplug_execute()
serializes hot-remove operations between ACPI hot-remove and sysfs
eject requests.

Signed-off-by: Toshi Kani toshi.k...@hp.com

---
This patch applies on top of the patchset below.

[PATCH v6 0/6] ACPI: Add _OST support for ACPI hotplug
http://marc.info/?l=linux-acpim=134074381322973w=2

---
 drivers/acpi/processor_driver.c |   27 +--
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index f9fa1b2..a6f6bde 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -699,8 +699,8 @@ int acpi_processor_device_add(acpi_handle handle, struct 
acpi_device **device)
 static void acpi_processor_hotplug_notify(acpi_handle handle,
  u32 event, void *data)
 {
-   struct acpi_processor *pr;
struct acpi_device *device = NULL;
+   struct acpi_eject_event *ej_event = NULL;
u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
int result;
 
@@ -732,20 +732,27 @@ static void acpi_processor_hotplug_notify(acpi_handle 
handle,
  received ACPI_NOTIFY_EJECT_REQUEST\n));
 
if (acpi_bus_get_device(handle, device)) {
-   printk(KERN_ERR PREFIX
-   Device don't exist, dropping EJECT\n);
+   pr_err(PREFIX Device don't exist, dropping EJECT\n);
break;
}
-   pr = acpi_driver_data(device);
-   if (!pr) {
-   printk(KERN_ERR PREFIX
-   Driver data is NULL, dropping EJECT\n);
+   if (!acpi_driver_data(device)) {
+   pr_err(PREFIX Driver data is NULL, dropping EJECT\n);
break;
}
 
-   /* REVISIT: update when eject is supported */
-   ost_code = ACPI_OST_SC_EJECT_NOT_SUPPORTED;
-   break;
+   ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
+   if (!ej_event) {
+   pr_err(PREFIX No memory, dropping EJECT\n);
+   break;
+   }
+
+   ej_event-handle = handle;
+   ej_event-event = ACPI_NOTIFY_EJECT_REQUEST;
+   acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
+   (void *)ej_event);
+
+   /* eject is performed asynchronously */
+   return;
 
default:
ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-- 
1.7.7.6