On Mon, 2015-10-05 at 07:20 +0000, Bhushan Bharat wrote:
>
>
> > -----Original Message-----
> > From: Alex Williamson [mailto:[email protected]]
> > Sent: Saturday, October 03, 2015 4:17 AM
> > To: Bhushan Bharat-R65777 <[email protected]>
> > Cc: [email protected]; [email protected];
> > [email protected]; [email protected]; [email protected];
> > [email protected]; [email protected]
> > Subject: Re: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi
> > interrupt
> >
> > On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > > An MSI-address is allocated and programmed in pcie device during
> > > interrupt configuration. Now for a pass-through device, try to create
> > > the iommu mapping for this allocated/programmed msi-address. If the
> > > iommu mapping is created and the msi address programmed in the pcie
> > > device is different from msi-iova as per iommu programming then
> > > reconfigure the pci device to use msi-iova as msi address.
> > >
> > > Signed-off-by: Bharat Bhushan <[email protected]>
> > > ---
> > > drivers/vfio/pci/vfio_pci_intrs.c | 36
> > > ++++++++++++++++++++++++++++++++++--
> > > 1 file changed, 34 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c
> > > b/drivers/vfio/pci/vfio_pci_intrs.c
> > > index 1f577b4..c9690af 100644
> > > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> > > +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> > > @@ -312,13 +312,23 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > > int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector;
> > > char *name = msix ? "vfio-msix" : "vfio-msi";
> > > struct eventfd_ctx *trigger;
> > > + struct msi_msg msg;
> > > + struct vfio_device *device;
> > > + uint64_t msi_addr, msi_iova;
> > > int ret;
> > >
> > > if (vector >= vdev->num_ctx)
> > > return -EINVAL;
> > >
> > > + device = vfio_device_get_from_dev(&pdev->dev);
> >
> > Have you looked at this function? I don't think we want to be doing that
> > every time we want to poke the interrupt configuration.
>
> I am trying to describe what I understood: a device can have many interrupts,
> and we should set up the iommu only once, when called for the first time to
> enable/set up an interrupt.
> Similarly, when disabling an interrupt we should iommu-unmap when called for
> the last enabled interrupt for that device. With this understanding,
> should I move this map-unmap to separate functions and call them from
> vfio_msi_set_block() rather than in vfio_msi_set_vector_signal()?
Interrupts can be set up and torn down at any time, and I don't see how
one function or the other makes much difference.
vfio_device_get_from_dev() is enough overhead that the data we need
should be cached if we're going to call it with some regularity. Maybe
vfio_iommu_driver_ops.open() should be called with a pointer to the
vfio_device... or the vfio_group.
> > Also note that
> > IOMMU mappings don't operate on devices, but groups, so maybe we want
> > to pass the group.
>
> Yes, it operates on the group. I hesitated to add an API to get the group.
> Are you suggesting that it is OK to add an API to get the group from the device?
No, the above suggestion is probably better.
> >
> > > + if (device == NULL)
> > > + return -EINVAL;
> >
> > This would be a legitimate BUG_ON(!device)
> >
> > > +
> > > if (vdev->ctx[vector].trigger) {
> > > free_irq(irq, vdev->ctx[vector].trigger);
> > > + get_cached_msi_msg(irq, &msg);
> > > + msi_iova = ((u64)msg.address_hi << 32) | msg.address_lo;
> > > + vfio_device_unmap_msi(device, msi_iova, PAGE_SIZE);
> > > kfree(vdev->ctx[vector].name);
> > > eventfd_ctx_put(vdev->ctx[vector].trigger);
> > > vdev->ctx[vector].trigger = NULL;
> > > @@ -346,12 +356,11 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > > * cached value of the message prior to enabling.
> > > */
> > > if (msix) {
> > > - struct msi_msg msg;
> > > -
> > > get_cached_msi_msg(irq, &msg);
> > > pci_write_msi_msg(irq, &msg);
> > > }
> > >
> > > +
> >
> > gratuitous newline
> >
> > > ret = request_irq(irq, vfio_msihandler, 0,
> > > vdev->ctx[vector].name, trigger);
> > > if (ret) {
> > > @@ -360,6 +369,29 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > > return ret;
> > > }
> > >
> > > + /* Re-program the new-iova in pci-device in case there is
> > > + * different iommu-mapping created for programmed msi-address.
> > > + */
> > > + get_cached_msi_msg(irq, &msg);
> > > + msi_iova = 0;
> > > + msi_addr = (u64)(msg.address_hi) << 32 | (u64)(msg.address_lo);
> > > + ret = vfio_device_map_msi(device, msi_addr, PAGE_SIZE, &msi_iova);
> > > + if (ret) {
> > > + free_irq(irq, vdev->ctx[vector].trigger);
> > > + kfree(vdev->ctx[vector].name);
> > > + eventfd_ctx_put(trigger);
> > > + return ret;
> > > + }
> > > +
> > > + /* Reprogram only if iommu-mapped iova is different from msi-address */
> > > + if (msi_iova && (msi_iova != msi_addr)) {
> > > + msg.address_hi = (u32)(msi_iova >> 32);
> > > + /* Keep Lower bits from original msi message address */
> > > + msg.address_lo &= PAGE_MASK;
> > > + msg.address_lo |= (u32)(msi_iova & 0x00000000ffffffff);
> >
> > Seems like you're making some assumptions here that are dependent on the
> > architecture and maybe the platform.
>
> What I tried is to map the msi page with different iova, which is page size
> aligned. But the offset within the page will remain same.
> For example, original msi address was 0x0603_0040 and we have a reserved
> region at 0xf000_0000. So iommu mapping is created for 0xf000_0000
> =>0x0600_3000 of size 0x1000.
>
> So the new address to be programmed in device is 0xf000_0040, offset 0x40
> added to base address in iommu mapping.
Don't you need ~PAGE_MASK for it to work like that? The & with
0x00000000ffffffff shouldn't be needed either, certainly not with all
the leading zeros.
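
To illustrate the masking I mean, here is a minimal userspace sketch, not
kernel code. It assumes 4K pages and a page-aligned iova; EX_PAGE_SHIFT,
EX_PAGE_MASK and the msi_remap_*() helpers are made-up names for
illustration only. The page frame comes from the iova and the offset
within the page comes from the original MSI address.

```c
#include <stdint.h>

/* Assumption: 4K pages. The kernel's PAGE_MASK is arch-dependent. */
#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE  ((uint64_t)1 << EX_PAGE_SHIFT)
#define EX_PAGE_MASK  (~(EX_PAGE_SIZE - 1))

/*
 * Hypothetical helper: combine the page frame of the iova with the
 * in-page offset of the original MSI address. Note that the offset is
 * selected with ~EX_PAGE_MASK, not EX_PAGE_MASK.
 */
static uint64_t msi_remap_addr(uint64_t msi_addr, uint64_t msi_iova)
{
	return (msi_iova & EX_PAGE_MASK) | (msi_addr & ~EX_PAGE_MASK);
}

/* Split the 64-bit result the way struct msi_msg stores it. */
static uint32_t msi_remap_hi(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

static uint32_t msi_remap_lo(uint64_t addr)
{
	/* The cast truncates to the low 32 bits; no explicit mask needed. */
	return (uint32_t)addr;
}
```

With the numbers from your example, msi_remap_addr(0x06030040, 0xf0000000)
gives 0xf0000040: offset 0x40 preserved, page frame taken from the iova.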
> > > + pci_write_msi_msg(irq, &msg);
> > > + }
> > > +
> > > vdev->ctx[vector].trigger = trigger;
> > >
> > > return 0;
> >
> >
>