On 01/19/2017 06:31 AM, Alex Williamson wrote: > On Sat, 31 Dec 2016 17:13:07 +0800 > Cao jin <caoj.f...@cn.fujitsu.com> wrote: > >> From: Chen Fan <chen.fan.f...@cn.fujitsu.com> >>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >> index 76a8ac3..9861f72 100644 >> --- a/hw/vfio/pci.c >> +++ b/hw/vfio/pci.c >> @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev) >> static void vfio_err_notifier_handler(void *opaque) >> { >> VFIOPCIDevice *vdev = opaque; >> + PCIDevice *dev = &vdev->pdev; >> + PCIEAERMsg msg = { >> + .severity = 0, >> + .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn, >> + }; >> + int len; >> + uint64_t uncor_status; >> + >> + /* Read uncorrectable error status from driver */ >> + len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status)); > > Whoa this seems bogus. In the kernel eventfd_signal() adds the value > to the internal counter. We can't guarantee that the user is going to > immediately respond to every signal, multiple signals might occur, each > incrementing the counter. I don't think we can use the eventfd like > this. > Ah, got your concern, make sense. Michael had the same comments as you, I didn't got him at that time... I explained the reason in v10 cover letter(5 of changelog): I find that error status register reading in error handler sometime returns ALL F's. I may want to take a look at v10, incorporate your comments, and test to see if the issue still exists. Currently if we only handle non-fatal error, we can still use eventfd like before. -- Sincerely, Cao jin >> + if (len != sizeof(uncor_status)) { >> + error_report("vfio-pci: uncor error status reading returns" >> + " invalid number of bytes: %d", len); >> + return; //Or goto stop? > > It's bogus use of the eventfd anyway afaict, but > event_notifier_test_and_clear() certainly handles at least EINTR. > >> + } >> + >> + if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) { >> + goto stop; >> + } > > I'm not sure this should be user selected anymore. > >> + >> + /* Populate the aer msg and send it to root port */ >> + if (dev->exp.aer_cap) { >> + uint8_t *aer_cap = dev->config + dev->exp.aer_cap; >> + bool isfatal = uncor_status & >> + pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER); >> + >> + if (isfatal) { >> + goto stop; >> + } > > QEMU uses spaces not tabs. > >> + >> + msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN : >> + PCI_ERR_ROOT_CMD_NONFATAL_EN; > > We don't get here if isfatal. > >> >> - if (!event_notifier_test_and_clear(&vdev->err_notifier)) { >> + error_report("vfio-pci device %d sending AER to root port. uncor" >> + " status = 0x%"PRIx64, dev->devfn, uncor_status); >> + pcie_aer_msg(dev, &msg); >> return; >> } >> >> +stop: >> /* >> - * TBD. Retrieve the error details and decide what action >> - * needs to be taken. One of the actions could be to pass >> - * the error to the guest and have the guest driver recover >> - * from the error. This requires that PCIe capabilities be >> - * exposed to the guest. For now, we just terminate the >> - * guest to contain the error. >> + * Terminate the guest in case of >> + * 1. AER capability is not exposed to guest. >> + * 2. AER capability is exposed, but error is fatal, only non-fatal >> + * error is handled now. > > You're also currently requiring the user to enable aer per device. > >> */ >> >> - error_report("%s(%s) Unrecoverable error detected. Please collect any >> data possible and then kill the guest", __func__, vdev->vbasedev.name); >> + error_report("%s(%s) fatal error detected. Please collect any data" >> + " possible and then kill the guest", __func__, >> vdev->vbasedev.name); >> >> vm_stop(RUN_STATE_INTERNAL_ERROR); >> } > > > > . >