On Tue, Mar 24, 2015 at 06:53:29AM -0600, Alex Williamson wrote:
>On Tue, 2015-03-24 at 17:54 +1100, David Gibson wrote:
>> On Tue, Mar 24, 2015 at 05:24:55PM +1100, Gavin Shan wrote:
>> > On Tue, Mar 24, 2015 at 04:41:21PM +1100, David Gibson wrote:
>> > >On Mon, Mar 23, 2015 at 04:25:10PM +1100, Gavin Shan wrote:
>> > >> On Mon, Mar 23, 2015 at 04:06:56PM +1100, David Gibson wrote:
>> > >> >On Fri, Mar 20, 2015 at 05:27:29PM +1100, Gavin Shan wrote:
>> > >> >> On Fri, Mar 20, 2015 at 05:04:01PM +1100, David Gibson wrote:
>> > >> >> >On Tue, Mar 17, 2015 at 03:31:24AM +1100, Gavin Shan wrote:
>> > >> >> >> The PCI device MSIx table is cleaned out in hardware after EEH PE
>> > >> >> >> reset. However, we still hold the stale MSIx entries in QEMU, which
>> > >> >> >> should be cleared accordingly. Otherwise, we will run into another
>> > >> >> >> (recursive) EEH error and the PCI devices contained in the PE have
>> > >> >> >> to be offlined exceptionally.
>> > >> >> >>
>> > >> >> >> The patch clears stale MSIx table before EEH PE reset so that MSIx
>> > >> >> >> table could be restored properly after EEH PE reset.
>> > >> >> >>
>> > >> >> >> Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com>
>> > >> >> >> ---
>> > >> >> >> v2: vfio_container_eeh_event() stub for !CONFIG_PCI and separate
>> > >> >> >> error message for this function.
>> > >> >> >> Dropped vfio_put_group() on NULL group
>> > >> >> >> ---
>> > >> >> >>  hw/vfio/Makefile.objs  |  6 +++++-
>> > >> >> >>  hw/vfio/common.c       |  7 +++++++
>> > >> >> >>  hw/vfio/pci-stub.c     | 17 +++++++++++++++++
>> > >> >> >>  hw/vfio/pci.c          | 38 ++++++++++++++++++++++++++++++++++++++
>> > >> >> >>  include/hw/vfio/vfio.h |  2 ++
>> > >> >> >>  5 files changed, 69 insertions(+), 1 deletion(-)
>> > >> >> >>  create mode 100644 hw/vfio/pci-stub.c
>> > >> >> >>
>> > >> >> >> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
>> > >> >> >> index e31f30e..1b8a065 100644
>> > >> >> >> --- a/hw/vfio/Makefile.objs
>> > >> >> >> +++ b/hw/vfio/Makefile.objs
>> > >> >> >> @@ -1,4 +1,8 @@
>> > >> >> >>  ifeq ($(CONFIG_LINUX), y)
>> > >> >> >>  obj-$(CONFIG_SOFTMMU) += common.o
>> > >> >> >> -obj-$(CONFIG_PCI) += pci.o
>> > >> >> >> +ifeq ($(CONFIG_PCI), y)
>> > >> >> >> +obj-y += pci.o
>> > >> >> >> +else
>> > >> >> >> +obj-y += pci-stub.o
>> > >> >> >> +endif
>> > >> >> >>  endif
>> > >> >> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> > >> >> >> index 148eb53..ed07814 100644
>> > >> >> >> --- a/hw/vfio/common.c
>> > >> >> >> +++ b/hw/vfio/common.c
>> > >> >> >> @@ -949,7 +949,14 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
>> > >> >> >>      switch (req) {
>> > >> >> >>      case VFIO_CHECK_EXTENSION:
>> > >> >> >>      case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
>> > >> >> >> +        break;
>> > >> >> >>      case VFIO_EEH_PE_OP:
>> > >> >> >> +        if (vfio_container_eeh_event(as, groupid, param) != 0) {
>> > >> >> >
>> > >> >> >I really dislike the idea of having an arbitrarily complex side effect
>> > >> >> >from a function whose name suggests it's just a trivial wrapper
>> > >> >> >around the ioctl().
>> > >> >> >
>> > >> >>
>> > >> >> Ok. I guess you would like putting the complex in the callers of
>> > >> >> vfio_container_ioctl().
>> > >> >
>> > >> >Well.. maybe.
>> > >> >I'd also be happy if helper functions were implemented
>> > >> >which both called the ioctl() and did the other necessary pieces.
>> > >> >They should just be called something that indicates their full
>> > >> >function, not a name which suggests they're just an ioctl wrapper.
>> > >> >
>> > >>
>> > >> Indeed, vfio_container_ioctl() isn't indicating what the function is
>> > >> doing. How about renaming it to vfio_container_event_and_ioctl()? I'm
>> > >> always bad at giving a good function name :)
>> > >
>> > >Well, I don't think your wrapper should be multiplexed. The multiplex
>> > >works for the simple ioctl() wrapper, because there really is nothing
>> > >that varies apart from the exact ioctl number called.
>> > >
>> > >But now that you have different operations here, I think you want
>> > >wrappers for each one - each one will call the ioctl(), then do the
>> > >specific extra steps necessary for that operation. So
>> > >vfio_container_event() will go away as well, split into various other
>> > >functions.
>> > >
>> >
>> > It wouldn't be a good idea, if I understand your proposal correctly.
>> > Currently, the global function vfio_container_ioctl() can be called from
>> > the sPAPR platform for any ioctl command handled in the kernel source file
>> > vfio_iommu_spapr_tce.c, which means the function isn't called for EEH
>> > only. Other sPAPR TCE container ioctl commands are also routed by this
>> > function. There will be lots of functions if we have one global function
>> > for each ioctl command, which just increases the cost of maintaining the
>> > code.
>>
>> I don't really follow your objection. I'm only suggesting separate
>> wrappers for things which require extra actions currently implemented
>> in vfio_container_event(). Things which only need the plain ioctl()
>> can still use the simple vfio_container_ioctl() wrapper.
>
>vfio_container_ioctl() also filters to a limited set of ioctls, it
>clearly does not allow any ioctl.
>
Ok. I think you guys expect something like the following. Note that
vfio_container_eeh_ioctl() below accepts only a limited set of EEH
operations, similar to what vfio_container_ioctl() does for the ioctl
commands. If you agree to the changes, I'll put another patch on top of
this one to replace vfio_container_ioctl() in spapr_pci_vfio.c with
vfio_container_eeh_ioctl() for the EEH cases.

int vfio_container_eeh_ioctl(AddressSpace *as, int32_t groupid,
                             struct vfio_eeh_pe_op *op)
{
    switch (op->op) {
    case VFIO_EEH_PE_RESET_HOT:
    case VFIO_EEH_PE_RESET_FUNDAMENTAL: {
        VFIOGroup *group;
        VFIODevice *vbasedev;
        VFIOPCIDevice *vdev;

        /*
         * The MSIx table will be cleaned out by reset. We need to
         * disable it so that it can be reenabled properly. Also,
         * the cached MSIx table should be cleared as it's not
         * reflecting the contents in hardware.
         */
        group = vfio_get_group(groupid, as);
        if (!group) {
            error_report("vfio: group %d not found\n", groupid);
            return -1;
        }

        QLIST_FOREACH(vbasedev, &group->device_list, next) {
            vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
            if (msix_enabled(&vdev->pdev)) {
                vfio_disable_msix(vdev);
            }

            msix_reset(&vdev->pdev);
        }

        vfio_put_group(group);
        break;
    }
    case VFIO_EEH_PE_DISABLE:
    case VFIO_EEH_PE_ENABLE:
    case VFIO_EEH_PE_UNFREEZE_IO:
    case VFIO_EEH_PE_UNFREEZE_DMA:
    case VFIO_EEH_PE_GET_STATE:
    case VFIO_EEH_PE_RESET_DEACTIVATE:
    case VFIO_EEH_PE_CONFIGURE:
        break;
    default:
        error_report("vfio: unsupported EEH operation %X\n", op->op);
        return -1;
    }

    return vfio_container_ioctl(as, groupid, VFIO_EEH_PE_OP, op);
}

Thanks,
Gavin

>> > Alternatively, we might expose another function vfio_container_eeh_ioctl(),
>> > which calls vfio_container_ioctl() after doing what we did in
>> > vfio_container_event() if necessary.
>> >
>> > Thanks,
>> > Gavin