Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-13 Thread Alex Williamson
On Fri, 13 May 2016 06:50:25 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Friday, May 13, 2016 1:33 PM  
> > > >
> > > > As argued previously in this thread, there's nothing special about a
> > > > DMA write to memory versus a DMA write to a special address that
> > > > triggers an MSI vector.  If the device is DMA capable, which we assume
> > > > all are, it can be fooled into generating those DMA writes regardless
> > > > of whether we actively block access to the MSI-X vector table itself.  
> > >
> > > But with DMA remapping above can be blocked.  
> > 
> > How?  VT-d explicitly ignores DMA writes to 0xFEEx_, section 3.13:
> > 
> >   Write requests without PASID of DWORD length are treated as interrupt
> >   requests. Interrupt requests are not subjected to DMA remapping[...]
> >   Instead, remapping hardware can be enabled to subject such interrupt
> >   requests to interrupt remapping.  
> 
> Thanks for catching this!
> 
> >   
> > > > MSI-X vector table access w/o interrupt remapping is to avoid obvious
> > > > collisions if it were to be programmed directly, it doesn't actually
> > > > prevent an identical DMA transaction from being generated by other  
> > >
> > > Kernel can enable DMA remapping but disable IRQ remapping. In such
> > > case identical DMA transaction can be prevented.  
> > 
> > Not according to the VT-d spec as quoted above.  If so, how?  
> 
> So my argument on this is wrong. sorry.
> 
> >   
> > > Anyway my point is simple. Let's ignore how Linux kernel implements
> > > IRQ remapping on x86 (which may change time to time), and just
> > > focus on architectural possibility. Non-x86 platform may implement
> > > IRQ remapping completely separate from device side, then checking
> > > availability of IRQ remapping is enough to decide whether mmap
> > > MSI-X table is safe. x86 with VT-d can be configured to a mode
> > > requiring host control of both MSI-X entry and IRQ remapping hardware
> > > (without source id check). In such case it's insufficient to make
> > > decision simply based on IRQ remapping availability. We need a way
> > > to query from IRQ remapping module whether it's actually safe to
> > > mmap MSI-X.  
> > 
> > We're going in circles here.  This patch is attempting to remove
> > protection from the MSI-X vector table that is really nothing more than
> > security theater already.  That "protection" only actually prevents
> > casual misuse of the API which is really only a problem when the
> > platform offers no form of interrupt isolation, such as VT-d with DMA
> > remapping but not interrupt remapping.  Disabling source-id checking in
> > VT-d should be handled as the equivalent of disabling interrupt
> > remapping altogether as far as the IOMMU API is concerned.  That's
> > a trivial gap that should be fixed.  There is no such thing as a secure  
> 
> That is the main change I'm asking against original patch, which has:
> 
> +static void pci_check_msi_remapping(struct pci_dev *pdev,
> + const struct iommu_ops *ops)
> +{
> + struct pci_bus *bus = pdev->bus;
> +
> + if (ops->capable(IOMMU_CAP_INTR_REMAP) &&
> + !(bus->bus_flags & PCI_BUS_FLAGS_MSI_REMAP))
> + bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
> +}
> +
> 
> Above flag should be cleared when source-id checking is disabled on x86. 
> Yes, VFIO is part of OS but any assumption we made about other parts
> needs to be reflected accurately in the code.

I would say this is an independent bug which should be fixed simply as:

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index e1852e8..60d55c0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4948,7 +4948,7 @@ static bool intel_iommu_capable(enum iommu_cap cap)
if (cap == IOMMU_CAP_CACHE_COHERENCY)
return domain_update_iommu_snooping(NULL) == 1;
if (cap == IOMMU_CAP_INTR_REMAP)
-   return irq_remapping_enabled == 1;
+   return irq_remapping_enabled == 1 && !disable_sourceid_checking;
 
return false;
 }

I believe the intent of the IOMMU_CAP_INTR_REMAP flag is simply to
indicate interrupt isolation is provided through the IOMMU.  Nobody
cares about the interrupt remapping support beyond that.  If source-id
checking is disabled, the remainder of interrupt remapping is
irrelevant as far as this capability is concerned imho.  Thanks,

Alex
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-13 Thread David Laight
From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: 13 May 2016 06:33
...
> Simply denying direct writes to the vector table or preventing mapping
> of the vector table into the user address space does not provide any
> tangible form of protection.  Many devices make use of window registers
> that allow backdoors to arbitrary device registers.  Some drivers even
> use this as the primary means for configuring MSI-X, which makes them
> incompatible with device assignment without device specific quirks to
> enable virtualization of these paths.

We have one fgpa based PCIe slave where the device driver has to read
the MSI-X table and then write the value to other fpga registers so
that the logic can generate the correct PCIe write cycle when an
interrupt is requested.
The MSI-X table itself is only as a PCIe slave.

We also have host accessible DMA controllers that the device driver
uses to copy data to kernel memory.
These could easily be used to generate arbitrary MSI-X requests.
As I've said earlier it is almost certainly possible to get any
ethernet hardware to perform something similar.

So without hardware that is able to limit the memory and MSI-X
that each PCIe endpoint can access I believe that if a virtualisation
system gives a guest kernel direct access to a PCIe devices it gives
the guest kernel the ability to raise and MSI-X interrupt and read/write
any physical memory.
(I've not looked at the cpu virtualisation support, but do know what
the PCIe devices can do.)

More interestingly, probably the 'worst' thing (from a security point of view)
that changing the MSI-X table lets you do is a write to an arbitrary
physical memory address.

David


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-13 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Friday, May 13, 2016 1:33 PM
> > >
> > > As argued previously in this thread, there's nothing special about a
> > > DMA write to memory versus a DMA write to a special address that
> > > triggers an MSI vector.  If the device is DMA capable, which we assume
> > > all are, it can be fooled into generating those DMA writes regardless
> > > of whether we actively block access to the MSI-X vector table itself.
> >
> > But with DMA remapping above can be blocked.
> 
> How?  VT-d explicitly ignores DMA writes to 0xFEEx_, section 3.13:
> 
>   Write requests without PASID of DWORD length are treated as interrupt
>   requests. Interrupt requests are not subjected to DMA remapping[...]
>   Instead, remapping hardware can be enabled to subject such interrupt
>   requests to interrupt remapping.

Thanks for catching this!

> 
> > > MSI-X vector table access w/o interrupt remapping is to avoid obvious
> > > collisions if it were to be programmed directly, it doesn't actually
> > > prevent an identical DMA transaction from being generated by other
> >
> > Kernel can enable DMA remapping but disable IRQ remapping. In such
> > case identical DMA transaction can be prevented.
> 
> Not according to the VT-d spec as quoted above.  If so, how?

So my argument on this is wrong. sorry.

> 
> > Anyway my point is simple. Let's ignore how Linux kernel implements
> > IRQ remapping on x86 (which may change time to time), and just
> > focus on architectural possibility. Non-x86 platform may implement
> > IRQ remapping completely separate from device side, then checking
> > availability of IRQ remapping is enough to decide whether mmap
> > MSI-X table is safe. x86 with VT-d can be configured to a mode
> > requiring host control of both MSI-X entry and IRQ remapping hardware
> > (without source id check). In such case it's insufficient to make
> > decision simply based on IRQ remapping availability. We need a way
> > to query from IRQ remapping module whether it's actually safe to
> > mmap MSI-X.
> 
> We're going in circles here.  This patch is attempting to remove
> protection from the MSI-X vector table that is really nothing more than
> security theater already.  That "protection" only actually prevents
> casual misuse of the API which is really only a problem when the
> platform offers no form of interrupt isolation, such as VT-d with DMA
> remapping but not interrupt remapping.  Disabling source-id checking in
> VT-d should be handled as the equivalent of disabling interrupt
> remapping altogether as far as the IOMMU API is concerned.  That's
> a trivial gap that should be fixed.  There is no such thing as a secure

That is the main change I'm asking against original patch, which has:

+static void pci_check_msi_remapping(struct pci_dev *pdev,
+   const struct iommu_ops *ops)
+{
+   struct pci_bus *bus = pdev->bus;
+
+   if (ops->capable(IOMMU_CAP_INTR_REMAP) &&
+   !(bus->bus_flags & PCI_BUS_FLAGS_MSI_REMAP))
+   bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
+}
+

Above flag should be cleared when source-id checking is disabled on x86. 
Yes, VFIO is part of OS but any assumption we made about other parts
needs to be reflected accurately in the code.

> MSI-X vector table when untrusted userspace drivers are involved, we
> must always assume that a device can generate DMA writes that are
> indistinguishable from actual interrupt requests and if the platform
> does not provide interrupt isolation we should require the user to
> opt-in to an unsafe mode.
> 
> Simply denying direct writes to the vector table or preventing mapping
> of the vector table into the user address space does not provide any
> tangible form of protection.  Many devices make use of window registers
> that allow backdoors to arbitrary device registers.  Some drivers even
> use this as the primary means for configuring MSI-X, which makes them
> incompatible with device assignment without device specific quirks to
> enable virtualization of these paths.
> 
> If you have an objection to this patch, please show me how preventing
> direct CPU access to the MSI-X vector table provides any kind of
> security guarantee of the contents of the vector table and also prove
> to me that a device cannot spoof a DMA write that is indistinguishable
> from one associated with an actual interrupt, regardless of the
> contents of the MSI-X vector table.  Thanks,
> 

I'm not object to the whole patch series. As replied above, my point
is just that current condition of allowing mmap MSI-X in this patch is not 
accurate, but my argument on security manner is not correct. Thanks
for your elaboration to make it clear.

Thanks
Kevin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-12 Thread Alex Williamson
On Fri, 13 May 2016 02:33:18 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Friday, May 13, 2016 1:48 AM
> > 
> > On Thu, 12 May 2016 04:53:19 +
> > "Tian, Kevin"  wrote:
> >   
> > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > Sent: Thursday, May 12, 2016 10:21 AM
> > > >
> > > > On Thu, 12 May 2016 01:19:44 +
> > > > "Tian, Kevin"  wrote:
> > > >  
> > > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > > Sent: Wednesday, May 11, 2016 11:54 PM
> > > > > >
> > > > > > On Wed, 11 May 2016 06:29:06 +
> > > > > > "Tian, Kevin"  wrote:
> > > > > >  
> > > > > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > > > > Sent: Thursday, May 05, 2016 11:05 PM
> > > > > > > >
> > > > > > > > On Thu, 5 May 2016 12:15:46 +
> > > > > > > > "Tian, Kevin"  wrote:
> > > > > > > >  
> > > > > > > > > > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > > > > > > > > > Sent: Thursday, May 05, 2016 7:43 PM
> > > > > > > > > >
> > > > > > > > > > Hi David and Kevin,
> > > > > > > > > >
> > > > > > > > > > On 2016/5/5 17:54, David Laight wrote:
> > > > > > > > > >  
> > > > > > > > > > > From: Tian, Kevin  
> > > > > > > > > > >> Sent: 05 May 2016 10:37  
> > > > > > > > > > > ...  
> > > > > > > > > > >>> Acutually, we are not aimed at accessing MSI-X table 
> > > > > > > > > > >>> from
> > > > > > > > > > >>> guest. So I think it's safe to passthrough MSI-X table 
> > > > > > > > > > >>> if we
> > > > > > > > > > >>> can make sure guest kernel would not touch MSI-X table 
> > > > > > > > > > >>> in
> > > > > > > > > > >>> normal code path such as para-virtualized guest kernel 
> > > > > > > > > > >>> on PPC64.
> > > > > > > > > > >>>  
> > > > > > > > > > >> Then how do you prevent malicious guest kernel accessing 
> > > > > > > > > > >> it?  
> > > > > > > > > > > Or a malicious guest driver for an ethernet card setting 
> > > > > > > > > > > up
> > > > > > > > > > > the receive buffer ring to contain a single word entry 
> > > > > > > > > > > that
> > > > > > > > > > > contains the address associated with an MSI-X interrupt 
> > > > > > > > > > > and
> > > > > > > > > > > then using a loopback mode to cause a specific packet be
> > > > > > > > > > > received that writes the required word through that 
> > > > > > > > > > > address.
> > > > > > > > > > >
> > > > > > > > > > > Remember the PCIe cycle for an interrupt is a normal 
> > > > > > > > > > > memory write
> > > > > > > > > > > cycle.
> > > > > > > > > > >
> > > > > > > > > > >   David
> > > > > > > > > > >  
> > > > > > > > > >
> > > > > > > > > > If we have enough permission to load a malicious driver or
> > > > > > > > > > kernel, we can easily break the guest without exposed
> > > > > > > > > > MSI-X table.
> > > > > > > > > >
> > > > > > > > > > I think it should be safe to expose MSI-X table if we can
> > > > > > > > > > make sure that malicious guest driver/kernel can't use
> > > > > > > > > > the MSI-X table to break other guest or host. The
> > > > > > > > > > capability of IRQ remapping could provide this
> > > > > > > > > > kind of protection.
> > > > > > > > > >  
> > > > > > > > >
> > > > > > > > > With IRQ remapping it doesn't mean you can pass through MSI-X
> > > > > > > > > structure to guest. I know actual IRQ remapping might be 
> > > > > > > > > platform
> > > > > > > > > specific, but at least for Intel VT-d specification, MSI-X 
> > > > > > > > > entry must
> > > > > > > > > be configured with a remappable format by host kernel which
> > > > > > > > > contains an index into IRQ remapping table. The index will 
> > > > > > > > > find a
> > > > > > > > > IRQ remapping entry which controls interrupt routing for a 
> > > > > > > > > specific
> > > > > > > > > device. If you allow a malicious program random index into 
> > > > > > > > > MSI-X
> > > > > > > > > entry of assigned device, the hole is obvious...
> > > > > > > > >
> > > > > > > > > Above might make sense only for a IRQ remapping implementation
> > > > > > > > > which doesn't rely on extended MSI-X format (e.g. simply 
> > > > > > > > > based on
> > > > > > > > > BDF). If that's the case for PPC, then you should build MSI-X
> > > > > > > > > passthrough based on this fact instead of general IRQ 
> > > > > > > > > remapping
> > > > > > > > > enabled or not.  
> > > > > > > >
> > > > > > > > I don't think anyone is expecting that we can expose the MSI-X 
> > > > > > > > vector
> > > > > > > > table to the guest and the guest can make direct use of it.  
> > > > > > > > The end
> > > > > > > > goal here is that the guest on a power system is already
> > > > > > > > paravirtualized to not program the device MSI-X by directly 
> > > > > > > > writing to
> > > > > > > > the MSI-X vector table.  They have hypercalls for this since 
> > > > > > > 

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-12 Thread Tian, Kevin
> From: Tian, Kevin
> Sent: Friday, May 13, 2016 10:33 AM
> 
> > means.  The MSI-X vector table of a device is always considered
> > untrusted which is why we require user opt-ins to subvert that
> > protection.  Thanks,
> >
> 
> I only partially agree with this statement since there is different
> trust level for kernel space driver and user space driver.
> 
> OS may choose to trust kernel space driver then it can enable IRQ
> remapping only for load balance purpose w/o source id verfification.
> In such case MSI-X vector table is trusted and fully under kernel
> control. Allowing to mmap MSI-X table in user space definitely
> breaks that boundary.
> 
> Anyway my point is simple. Let's ignore how Linux kernel implements
> IRQ remapping on x86 (which may change time to time), and just
> focus on architectural possibility. Non-x86 platform may implement
> IRQ remapping completely separate from device side, then checking
> availability of IRQ remapping is enough to decide whether mmap
> MSI-X table is safe. x86 with VT-d can be configured to a mode
> requiring host control of both MSI-X entry and IRQ remapping hardware
> (without source id check). In such case it's insufficient to make
> decision simply based on IRQ remapping availability. We need a way
> to query from IRQ remapping module whether it's actually safe to
> mmap MSI-X.
> 

Or another way is to have VFIO explicitly request intel-iommu to
enforce sid check when IRQ remapping is enabled...

Thanks
Kevin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-12 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Friday, May 13, 2016 1:48 AM
> 
> On Thu, 12 May 2016 04:53:19 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > Sent: Thursday, May 12, 2016 10:21 AM
> > >
> > > On Thu, 12 May 2016 01:19:44 +
> > > "Tian, Kevin"  wrote:
> > >
> > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > Sent: Wednesday, May 11, 2016 11:54 PM
> > > > >
> > > > > On Wed, 11 May 2016 06:29:06 +
> > > > > "Tian, Kevin"  wrote:
> > > > >
> > > > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > > > Sent: Thursday, May 05, 2016 11:05 PM
> > > > > > >
> > > > > > > On Thu, 5 May 2016 12:15:46 +
> > > > > > > "Tian, Kevin"  wrote:
> > > > > > >
> > > > > > > > > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > > > > > > > > Sent: Thursday, May 05, 2016 7:43 PM
> > > > > > > > >
> > > > > > > > > Hi David and Kevin,
> > > > > > > > >
> > > > > > > > > On 2016/5/5 17:54, David Laight wrote:
> > > > > > > > >
> > > > > > > > > > From: Tian, Kevin
> > > > > > > > > >> Sent: 05 May 2016 10:37
> > > > > > > > > > ...
> > > > > > > > > >>> Acutually, we are not aimed at accessing MSI-X table from
> > > > > > > > > >>> guest. So I think it's safe to passthrough MSI-X table if 
> > > > > > > > > >>> we
> > > > > > > > > >>> can make sure guest kernel would not touch MSI-X table in
> > > > > > > > > >>> normal code path such as para-virtualized guest kernel on 
> > > > > > > > > >>> PPC64.
> > > > > > > > > >>>
> > > > > > > > > >> Then how do you prevent malicious guest kernel accessing 
> > > > > > > > > >> it?
> > > > > > > > > > Or a malicious guest driver for an ethernet card setting up
> > > > > > > > > > the receive buffer ring to contain a single word entry that
> > > > > > > > > > contains the address associated with an MSI-X interrupt and
> > > > > > > > > > then using a loopback mode to cause a specific packet be
> > > > > > > > > > received that writes the required word through that address.
> > > > > > > > > >
> > > > > > > > > > Remember the PCIe cycle for an interrupt is a normal memory 
> > > > > > > > > > write
> > > > > > > > > > cycle.
> > > > > > > > > >
> > > > > > > > > > David
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > If we have enough permission to load a malicious driver or
> > > > > > > > > kernel, we can easily break the guest without exposed
> > > > > > > > > MSI-X table.
> > > > > > > > >
> > > > > > > > > I think it should be safe to expose MSI-X table if we can
> > > > > > > > > make sure that malicious guest driver/kernel can't use
> > > > > > > > > the MSI-X table to break other guest or host. The
> > > > > > > > > capability of IRQ remapping could provide this
> > > > > > > > > kind of protection.
> > > > > > > > >
> > > > > > > >
> > > > > > > > With IRQ remapping it doesn't mean you can pass through MSI-X
> > > > > > > > structure to guest. I know actual IRQ remapping might be 
> > > > > > > > platform
> > > > > > > > specific, but at least for Intel VT-d specification, MSI-X 
> > > > > > > > entry must
> > > > > > > > be configured with a remappable format by host kernel which
> > > > > > > > contains an index into IRQ remapping table. The index will find 
> > > > > > > > a
> > > > > > > > IRQ remapping entry which controls interrupt routing for a 
> > > > > > > > specific
> > > > > > > > device. If you allow a malicious program random index into MSI-X
> > > > > > > > entry of assigned device, the hole is obvious...
> > > > > > > >
> > > > > > > > Above might make sense only for a IRQ remapping implementation
> > > > > > > > which doesn't rely on extended MSI-X format (e.g. simply based 
> > > > > > > > on
> > > > > > > > BDF). If that's the case for PPC, then you should build MSI-X
> > > > > > > > passthrough based on this fact instead of general IRQ remapping
> > > > > > > > enabled or not.
> > > > > > >
> > > > > > > I don't think anyone is expecting that we can expose the MSI-X 
> > > > > > > vector
> > > > > > > table to the guest and the guest can make direct use of it.  The 
> > > > > > > end
> > > > > > > goal here is that the guest on a power system is already
> > > > > > > paravirtualized to not program the device MSI-X by directly 
> > > > > > > writing to
> > > > > > > the MSI-X vector table.  They have hypercalls for this since they
> > > > > > > always run virtualized.  Therefore a) they never intend to touch 
> > > > > > > the
> > > > > > > MSI-X vector table and b) they have sufficient isolation that a 
> > > > > > > guest
> > > > > > > can only hurt itself by doing so.
> > > > > > >
> > > > > > > On x86 we don't have a), our method of programming the MSI-X 
> > > > > > > vector
> > > > > > > table is to directly write to it. Therefore we will always 
> > > > > > > require QEMU
> > > > > > > to place a 

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-12 Thread Alex Williamson
On Thu, 12 May 2016 04:53:19 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Thursday, May 12, 2016 10:21 AM
> > 
> > On Thu, 12 May 2016 01:19:44 +
> > "Tian, Kevin"  wrote:
> >   
> > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > Sent: Wednesday, May 11, 2016 11:54 PM
> > > >
> > > > On Wed, 11 May 2016 06:29:06 +
> > > > "Tian, Kevin"  wrote:
> > > >  
> > > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > > Sent: Thursday, May 05, 2016 11:05 PM
> > > > > >
> > > > > > On Thu, 5 May 2016 12:15:46 +
> > > > > > "Tian, Kevin"  wrote:
> > > > > >  
> > > > > > > > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > > > > > > > Sent: Thursday, May 05, 2016 7:43 PM
> > > > > > > >
> > > > > > > > Hi David and Kevin,
> > > > > > > >
> > > > > > > > On 2016/5/5 17:54, David Laight wrote:
> > > > > > > >  
> > > > > > > > > From: Tian, Kevin  
> > > > > > > > >> Sent: 05 May 2016 10:37  
> > > > > > > > > ...  
> > > > > > > > >>> Acutually, we are not aimed at accessing MSI-X table from
> > > > > > > > >>> guest. So I think it's safe to passthrough MSI-X table if we
> > > > > > > > >>> can make sure guest kernel would not touch MSI-X table in
> > > > > > > > >>> normal code path such as para-virtualized guest kernel on 
> > > > > > > > >>> PPC64.
> > > > > > > > >>>  
> > > > > > > > >> Then how do you prevent malicious guest kernel accessing it? 
> > > > > > > > >>  
> > > > > > > > > Or a malicious guest driver for an ethernet card setting up
> > > > > > > > > the receive buffer ring to contain a single word entry that
> > > > > > > > > contains the address associated with an MSI-X interrupt and
> > > > > > > > > then using a loopback mode to cause a specific packet be
> > > > > > > > > received that writes the required word through that address.
> > > > > > > > >
> > > > > > > > > Remember the PCIe cycle for an interrupt is a normal memory 
> > > > > > > > > write
> > > > > > > > > cycle.
> > > > > > > > >
> > > > > > > > >   David
> > > > > > > > >  
> > > > > > > >
> > > > > > > > If we have enough permission to load a malicious driver or
> > > > > > > > kernel, we can easily break the guest without exposed
> > > > > > > > MSI-X table.
> > > > > > > >
> > > > > > > > I think it should be safe to expose MSI-X table if we can
> > > > > > > > make sure that malicious guest driver/kernel can't use
> > > > > > > > the MSI-X table to break other guest or host. The
> > > > > > > > capability of IRQ remapping could provide this
> > > > > > > > kind of protection.
> > > > > > > >  
> > > > > > >
> > > > > > > With IRQ remapping it doesn't mean you can pass through MSI-X
> > > > > > > structure to guest. I know actual IRQ remapping might be platform
> > > > > > > specific, but at least for Intel VT-d specification, MSI-X entry 
> > > > > > > must
> > > > > > > be configured with a remappable format by host kernel which
> > > > > > > contains an index into IRQ remapping table. The index will find a
> > > > > > > IRQ remapping entry which controls interrupt routing for a 
> > > > > > > specific
> > > > > > > device. If you allow a malicious program random index into MSI-X
> > > > > > > entry of assigned device, the hole is obvious...
> > > > > > >
> > > > > > > Above might make sense only for a IRQ remapping implementation
> > > > > > > which doesn't rely on extended MSI-X format (e.g. simply based on
> > > > > > > BDF). If that's the case for PPC, then you should build MSI-X
> > > > > > > passthrough based on this fact instead of general IRQ remapping
> > > > > > > enabled or not.  
> > > > > >
> > > > > > I don't think anyone is expecting that we can expose the MSI-X 
> > > > > > vector
> > > > > > table to the guest and the guest can make direct use of it.  The end
> > > > > > goal here is that the guest on a power system is already
> > > > > > paravirtualized to not program the device MSI-X by directly writing 
> > > > > > to
> > > > > > the MSI-X vector table.  They have hypercalls for this since they
> > > > > > always run virtualized.  Therefore a) they never intend to touch the
> > > > > > MSI-X vector table and b) they have sufficient isolation that a 
> > > > > > guest
> > > > > > can only hurt itself by doing so.
> > > > > >
> > > > > > On x86 we don't have a), our method of programming the MSI-X vector
> > > > > > table is to directly write to it. Therefore we will always require 
> > > > > > QEMU
> > > > > > to place a MemoryRegion over the vector table to intercept those
> > > > > > accesses.  However with interrupt remapping, we do have b) on x86, 
> > > > > > which
> > > > > > means that we don't need to be so strict in disallowing user 
> > > > > > accesses
> > > > > > to the MSI-X vector table.  It's not useful for configuring MSI-X on
> > > > > > the device, but the user should only be able 

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-11 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Thursday, May 12, 2016 10:21 AM
> 
> On Thu, 12 May 2016 01:19:44 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > Sent: Wednesday, May 11, 2016 11:54 PM
> > >
> > > On Wed, 11 May 2016 06:29:06 +
> > > "Tian, Kevin"  wrote:
> > >
> > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > Sent: Thursday, May 05, 2016 11:05 PM
> > > > >
> > > > > On Thu, 5 May 2016 12:15:46 +
> > > > > "Tian, Kevin"  wrote:
> > > > >
> > > > > > > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > > > > > > Sent: Thursday, May 05, 2016 7:43 PM
> > > > > > >
> > > > > > > Hi David and Kevin,
> > > > > > >
> > > > > > > On 2016/5/5 17:54, David Laight wrote:
> > > > > > >
> > > > > > > > From: Tian, Kevin
> > > > > > > >> Sent: 05 May 2016 10:37
> > > > > > > > ...
> > > > > > > >>> Acutually, we are not aimed at accessing MSI-X table from
> > > > > > > >>> guest. So I think it's safe to passthrough MSI-X table if we
> > > > > > > >>> can make sure guest kernel would not touch MSI-X table in
> > > > > > > >>> normal code path such as para-virtualized guest kernel on 
> > > > > > > >>> PPC64.
> > > > > > > >>>
> > > > > > > >> Then how do you prevent malicious guest kernel accessing it?
> > > > > > > > Or a malicious guest driver for an ethernet card setting up
> > > > > > > > the receive buffer ring to contain a single word entry that
> > > > > > > > contains the address associated with an MSI-X interrupt and
> > > > > > > > then using a loopback mode to cause a specific packet be
> > > > > > > > received that writes the required word through that address.
> > > > > > > >
> > > > > > > > Remember the PCIe cycle for an interrupt is a normal memory 
> > > > > > > > write
> > > > > > > > cycle.
> > > > > > > >
> > > > > > > > David
> > > > > > > >
> > > > > > >
> > > > > > > If we have enough permission to load a malicious driver or
> > > > > > > kernel, we can easily break the guest without exposed
> > > > > > > MSI-X table.
> > > > > > >
> > > > > > > I think it should be safe to expose MSI-X table if we can
> > > > > > > make sure that malicious guest driver/kernel can't use
> > > > > > > the MSI-X table to break other guest or host. The
> > > > > > > capability of IRQ remapping could provide this
> > > > > > > kind of protection.
> > > > > > >
> > > > > >
> > > > > > With IRQ remapping it doesn't mean you can pass through MSI-X
> > > > > > structure to guest. I know actual IRQ remapping might be platform
> > > > > > specific, but at least for Intel VT-d specification, MSI-X entry 
> > > > > > must
> > > > > > be configured with a remappable format by host kernel which
> > > > > > contains an index into IRQ remapping table. The index will find a
> > > > > > IRQ remapping entry which controls interrupt routing for a specific
> > > > > > device. If you allow a malicious program random index into MSI-X
> > > > > > entry of assigned device, the hole is obvious...
> > > > > >
> > > > > > Above might make sense only for a IRQ remapping implementation
> > > > > > which doesn't rely on extended MSI-X format (e.g. simply based on
> > > > > > BDF). If that's the case for PPC, then you should build MSI-X
> > > > > > passthrough based on this fact instead of general IRQ remapping
> > > > > > enabled or not.
> > > > >
> > > > > I don't think anyone is expecting that we can expose the MSI-X vector
> > > > > table to the guest and the guest can make direct use of it.  The end
> > > > > goal here is that the guest on a power system is already
> > > > > paravirtualized to not program the device MSI-X by directly writing to
> > > > > the MSI-X vector table.  They have hypercalls for this since they
> > > > > always run virtualized.  Therefore a) they never intend to touch the
> > > > > MSI-X vector table and b) they have sufficient isolation that a guest
> > > > > can only hurt itself by doing so.
> > > > >
> > > > > On x86 we don't have a), our method of programming the MSI-X vector
> > > > > table is to directly write to it. Therefore we will always require 
> > > > > QEMU
> > > > > to place a MemoryRegion over the vector table to intercept those
> > > > > accesses.  However with interrupt remapping, we do have b) on x86, 
> > > > > which
> > > > > means that we don't need to be so strict in disallowing user accesses
> > > > > to the MSI-X vector table.  It's not useful for configuring MSI-X on
> > > > > the device, but the user should only be able to hurt themselves by
> > > > > writing it directly.  x86 doesn't really get anything out of this
> > > > > change, but it helps this special case on power pretty significantly
> > > > > aiui.  Thanks,
> > > > >
> > > >
> > > > Allowing guest direct write to MSI-x table has system-wide impact.
> > > > As I explained earlier, hypervisor needs to control "interrupt_index"
> 

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-11 Thread Alex Williamson
On Thu, 12 May 2016 01:19:44 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Wednesday, May 11, 2016 11:54 PM
> > 
> > On Wed, 11 May 2016 06:29:06 +
> > "Tian, Kevin"  wrote:
> >   
> > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > Sent: Thursday, May 05, 2016 11:05 PM
> > > >
> > > > On Thu, 5 May 2016 12:15:46 +
> > > > "Tian, Kevin"  wrote:
> > > >  
> > > > > > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > > > > > Sent: Thursday, May 05, 2016 7:43 PM
> > > > > >
> > > > > > Hi David and Kevin,
> > > > > >
> > > > > > On 2016/5/5 17:54, David Laight wrote:
> > > > > >  
> > > > > > > From: Tian, Kevin  
> > > > > > >> Sent: 05 May 2016 10:37  
> > > > > > > ...  
> > > > > > >>> Acutually, we are not aimed at accessing MSI-X table from
> > > > > > >>> guest. So I think it's safe to passthrough MSI-X table if we
> > > > > > >>> can make sure guest kernel would not touch MSI-X table in
> > > > > > >>> normal code path such as para-virtualized guest kernel on PPC64.
> > > > > > >>>  
> > > > > > >> Then how do you prevent malicious guest kernel accessing it?  
> > > > > > > Or a malicious guest driver for an ethernet card setting up
> > > > > > > the receive buffer ring to contain a single word entry that
> > > > > > > contains the address associated with an MSI-X interrupt and
> > > > > > > then using a loopback mode to cause a specific packet be
> > > > > > > received that writes the required word through that address.
> > > > > > >
> > > > > > > Remember the PCIe cycle for an interrupt is a normal memory write
> > > > > > > cycle.
> > > > > > >
> > > > > > >   David
> > > > > > >  
> > > > > >
> > > > > > If we have enough permission to load a malicious driver or
> > > > > > kernel, we can easily break the guest without exposed
> > > > > > MSI-X table.
> > > > > >
> > > > > > I think it should be safe to expose MSI-X table if we can
> > > > > > make sure that malicious guest driver/kernel can't use
> > > > > > the MSI-X table to break other guest or host. The
> > > > > > capability of IRQ remapping could provide this
> > > > > > kind of protection.
> > > > > >  
> > > > >
> > > > > With IRQ remapping it doesn't mean you can pass through MSI-X
> > > > > structure to guest. I know actual IRQ remapping might be platform
> > > > > specific, but at least for Intel VT-d specification, MSI-X entry must
> > > > > be configured with a remappable format by host kernel which
> > > > > contains an index into IRQ remapping table. The index will find a
> > > > > IRQ remapping entry which controls interrupt routing for a specific
> > > > > device. If you allow a malicious program random index into MSI-X
> > > > > entry of assigned device, the hole is obvious...
> > > > >
> > > > > Above might make sense only for a IRQ remapping implementation
> > > > > which doesn't rely on extended MSI-X format (e.g. simply based on
> > > > > BDF). If that's the case for PPC, then you should build MSI-X
> > > > > passthrough based on this fact instead of general IRQ remapping
> > > > > enabled or not.  
> > > >
> > > > I don't think anyone is expecting that we can expose the MSI-X vector
> > > > table to the guest and the guest can make direct use of it.  The end
> > > > goal here is that the guest on a power system is already
> > > > paravirtualized to not program the device MSI-X by directly writing to
> > > > the MSI-X vector table.  They have hypercalls for this since they
> > > > always run virtualized.  Therefore a) they never intend to touch the
> > > > MSI-X vector table and b) they have sufficient isolation that a guest
> > > > can only hurt itself by doing so.
> > > >
> > > > On x86 we don't have a), our method of programming the MSI-X vector
> > > > table is to directly write to it. Therefore we will always require QEMU
> > > > to place a MemoryRegion over the vector table to intercept those
> > > > accesses.  However with interrupt remapping, we do have b) on x86, which
> > > > means that we don't need to be so strict in disallowing user accesses
> > > > to the MSI-X vector table.  It's not useful for configuring MSI-X on
> > > > the device, but the user should only be able to hurt themselves by
> > > > writing it directly.  x86 doesn't really get anything out of this
> > > > change, but it helps this special case on power pretty significantly
> > > > aiui.  Thanks,
> > > >  
> > >
> > > Allowing guest direct write to MSI-x table has system-wide impact.
> > > As I explained earlier, hypervisor needs to control "interrupt_index"
> > > programmed in MSI-X entry, which is used to associate a specific
> > > IRQ remapping entry. Now if you expose whole MSI-x table to guest,
> > > it can program random index into MSI-X entry to associate with
> > > any IRQ remapping entry and then there won't be any isolation per se.
> > >
> > > You can check "5.5.2 MSI and MSI-X 

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-11 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Wednesday, May 11, 2016 11:54 PM
> 
> On Wed, 11 May 2016 06:29:06 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > Sent: Thursday, May 05, 2016 11:05 PM
> > >
> > > On Thu, 5 May 2016 12:15:46 +
> > > "Tian, Kevin"  wrote:
> > >
> > > > > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > > > > Sent: Thursday, May 05, 2016 7:43 PM
> > > > >
> > > > > Hi David and Kevin,
> > > > >
> > > > > On 2016/5/5 17:54, David Laight wrote:
> > > > >
> > > > > > From: Tian, Kevin
> > > > > >> Sent: 05 May 2016 10:37
> > > > > > ...
> > > > > >>> Acutually, we are not aimed at accessing MSI-X table from
> > > > > >>> guest. So I think it's safe to passthrough MSI-X table if we
> > > > > >>> can make sure guest kernel would not touch MSI-X table in
> > > > > >>> normal code path such as para-virtualized guest kernel on PPC64.
> > > > > >>>
> > > > > >> Then how do you prevent malicious guest kernel accessing it?
> > > > > > Or a malicious guest driver for an ethernet card setting up
> > > > > > the receive buffer ring to contain a single word entry that
> > > > > > contains the address associated with an MSI-X interrupt and
> > > > > > then using a loopback mode to cause a specific packet be
> > > > > > received that writes the required word through that address.
> > > > > >
> > > > > > Remember the PCIe cycle for an interrupt is a normal memory write
> > > > > > cycle.
> > > > > >
> > > > > > David
> > > > > >
> > > > >
> > > > > If we have enough permission to load a malicious driver or
> > > > > kernel, we can easily break the guest without exposed
> > > > > MSI-X table.
> > > > >
> > > > > I think it should be safe to expose MSI-X table if we can
> > > > > make sure that malicious guest driver/kernel can't use
> > > > > the MSI-X table to break other guest or host. The
> > > > > capability of IRQ remapping could provide this
> > > > > kind of protection.
> > > > >
> > > >
> > > > With IRQ remapping it doesn't mean you can pass through MSI-X
> > > > structure to guest. I know actual IRQ remapping might be platform
> > > > specific, but at least for Intel VT-d specification, MSI-X entry must
> > > > be configured with a remappable format by host kernel which
> > > > contains an index into IRQ remapping table. The index will find a
> > > > IRQ remapping entry which controls interrupt routing for a specific
> > > > device. If you allow a malicious program random index into MSI-X
> > > > entry of assigned device, the hole is obvious...
> > > >
> > > > Above might make sense only for a IRQ remapping implementation
> > > > which doesn't rely on extended MSI-X format (e.g. simply based on
> > > > BDF). If that's the case for PPC, then you should build MSI-X
> > > > passthrough based on this fact instead of general IRQ remapping
> > > > enabled or not.
> > >
> > > I don't think anyone is expecting that we can expose the MSI-X vector
> > > table to the guest and the guest can make direct use of it.  The end
> > > goal here is that the guest on a power system is already
> > > paravirtualized to not program the device MSI-X by directly writing to
> > > the MSI-X vector table.  They have hypercalls for this since they
> > > always run virtualized.  Therefore a) they never intend to touch the
> > > MSI-X vector table and b) they have sufficient isolation that a guest
> > > can only hurt itself by doing so.
> > >
> > > On x86 we don't have a), our method of programming the MSI-X vector
> > > table is to directly write to it. Therefore we will always require QEMU
> > > to place a MemoryRegion over the vector table to intercept those
> > > accesses.  However with interrupt remapping, we do have b) on x86, which
> > > means that we don't need to be so strict in disallowing user accesses
> > > to the MSI-X vector table.  It's not useful for configuring MSI-X on
> > > the device, but the user should only be able to hurt themselves by
> > > writing it directly.  x86 doesn't really get anything out of this
> > > change, but it helps this special case on power pretty significantly
> > > aiui.  Thanks,
> > >
> >
> > Allowing guest direct write to MSI-x table has system-wide impact.
> > As I explained earlier, hypervisor needs to control "interrupt_index"
> > programmed in MSI-X entry, which is used to associate a specific
> > IRQ remapping entry. Now if you expose whole MSI-x table to guest,
> > it can program random index into MSI-X entry to associate with
> > any IRQ remapping entry and then there won't be any isolation per se.
> >
> > You can check "5.5.2 MSI and MSI-X Register Programming" in VT-d
> > spec.
> 
> I think you're extrapolating beyond the vfio interface.  The change
> here is to remove the vfio protection of the MSI-X vector table when
> the system provides interrupt isolation.  The argument is that this is
> safe to do because the hardware 

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-11 Thread Alex Williamson
On Wed, 11 May 2016 06:29:06 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Thursday, May 05, 2016 11:05 PM
> > 
> > On Thu, 5 May 2016 12:15:46 +
> > "Tian, Kevin"  wrote:
> >   
> > > > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > > > Sent: Thursday, May 05, 2016 7:43 PM
> > > >
> > > > Hi David and Kevin,
> > > >
> > > > On 2016/5/5 17:54, David Laight wrote:
> > > >  
> > > > > From: Tian, Kevin  
> > > > >> Sent: 05 May 2016 10:37  
> > > > > ...  
> > > > >>> Acutually, we are not aimed at accessing MSI-X table from
> > > > >>> guest. So I think it's safe to passthrough MSI-X table if we
> > > > >>> can make sure guest kernel would not touch MSI-X table in
> > > > >>> normal code path such as para-virtualized guest kernel on PPC64.
> > > > >>>  
> > > > >> Then how do you prevent malicious guest kernel accessing it?  
> > > > > Or a malicious guest driver for an ethernet card setting up
> > > > > the receive buffer ring to contain a single word entry that
> > > > > contains the address associated with an MSI-X interrupt and
> > > > > then using a loopback mode to cause a specific packet be
> > > > > received that writes the required word through that address.
> > > > >
> > > > > Remember the PCIe cycle for an interrupt is a normal memory write
> > > > > cycle.
> > > > >
> > > > >   David
> > > > >  
> > > >
> > > > If we have enough permission to load a malicious driver or
> > > > kernel, we can easily break the guest without exposed
> > > > MSI-X table.
> > > >
> > > > I think it should be safe to expose MSI-X table if we can
> > > > make sure that malicious guest driver/kernel can't use
> > > > the MSI-X table to break other guest or host. The
> > > > capability of IRQ remapping could provide this
> > > > kind of protection.
> > > >  
> > >
> > > With IRQ remapping it doesn't mean you can pass through MSI-X
> > > structure to guest. I know actual IRQ remapping might be platform
> > > specific, but at least for Intel VT-d specification, MSI-X entry must
> > > be configured with a remappable format by host kernel which
> > > contains an index into IRQ remapping table. The index will find a
> > > IRQ remapping entry which controls interrupt routing for a specific
> > > device. If you allow a malicious program random index into MSI-X
> > > entry of assigned device, the hole is obvious...
> > >
> > > Above might make sense only for a IRQ remapping implementation
> > > which doesn't rely on extended MSI-X format (e.g. simply based on
> > > BDF). If that's the case for PPC, then you should build MSI-X
> > > passthrough based on this fact instead of general IRQ remapping
> > > enabled or not.  
> > 
> > I don't think anyone is expecting that we can expose the MSI-X vector
> > table to the guest and the guest can make direct use of it.  The end
> > goal here is that the guest on a power system is already
> > paravirtualized to not program the device MSI-X by directly writing to
> > the MSI-X vector table.  They have hypercalls for this since they
> > always run virtualized.  Therefore a) they never intend to touch the
> > MSI-X vector table and b) they have sufficient isolation that a guest
> > can only hurt itself by doing so.
> > 
> > On x86 we don't have a), our method of programming the MSI-X vector
> > table is to directly write to it. Therefore we will always require QEMU
> > to place a MemoryRegion over the vector table to intercept those
> > accesses.  However with interrupt remapping, we do have b) on x86, which
> > means that we don't need to be so strict in disallowing user accesses
> > to the MSI-X vector table.  It's not useful for configuring MSI-X on
> > the device, but the user should only be able to hurt themselves by
> > writing it directly.  x86 doesn't really get anything out of this
> > change, but it helps this special case on power pretty significantly
> > aiui.  Thanks,
> >   
> 
> Allowing guest direct write to MSI-x table has system-wide impact.
> As I explained earlier, hypervisor needs to control "interrupt_index"
> programmed in MSI-X entry, which is used to associate a specific
> IRQ remapping entry. Now if you expose whole MSI-x table to guest, 
> it can program random index into MSI-X entry to associate with 
> any IRQ remapping entry and then there won't be any isolation per se.
> 
> You can check "5.5.2 MSI and MSI-X Register Programming" in VT-d
> spec.

I think you're extrapolating beyond the vfio interface.  The change
here is to remove the vfio protection of the MSI-X vector table when
the system provides interrupt isolation.  The argument is that this is
safe to do because the hardware protects the host from erroneous and
malicious user programming, but it is not meant to provide a means to
program MSI-X directly through the vector table.  This is effectively
the same as general DMA programming, if the vfio programming model is
not followed the device 

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-11 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Thursday, May 05, 2016 11:05 PM
> 
> On Thu, 5 May 2016 12:15:46 +
> "Tian, Kevin"  wrote:
> 
> > > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > > Sent: Thursday, May 05, 2016 7:43 PM
> > >
> > > Hi David and Kevin,
> > >
> > > On 2016/5/5 17:54, David Laight wrote:
> > >
> > > > From: Tian, Kevin
> > > >> Sent: 05 May 2016 10:37
> > > > ...
> > > >>> Acutually, we are not aimed at accessing MSI-X table from
> > > >>> guest. So I think it's safe to passthrough MSI-X table if we
> > > >>> can make sure guest kernel would not touch MSI-X table in
> > > >>> normal code path such as para-virtualized guest kernel on PPC64.
> > > >>>
> > > >> Then how do you prevent malicious guest kernel accessing it?
> > > > Or a malicious guest driver for an ethernet card setting up
> > > > the receive buffer ring to contain a single word entry that
> > > > contains the address associated with an MSI-X interrupt and
> > > > then using a loopback mode to cause a specific packet be
> > > > received that writes the required word through that address.
> > > >
> > > > Remember the PCIe cycle for an interrupt is a normal memory write
> > > > cycle.
> > > >
> > > > David
> > > >
> > >
> > > If we have enough permission to load a malicious driver or
> > > kernel, we can easily break the guest without exposed
> > > MSI-X table.
> > >
> > > I think it should be safe to expose MSI-X table if we can
> > > make sure that malicious guest driver/kernel can't use
> > > the MSI-X table to break other guest or host. The
> > > capability of IRQ remapping could provide this
> > > kind of protection.
> > >
> >
> > With IRQ remapping it doesn't mean you can pass through MSI-X
> > structure to guest. I know actual IRQ remapping might be platform
> > specific, but at least for Intel VT-d specification, MSI-X entry must
> > be configured with a remappable format by host kernel which
> > contains an index into IRQ remapping table. The index will find a
> > IRQ remapping entry which controls interrupt routing for a specific
> > device. If you allow a malicious program random index into MSI-X
> > entry of assigned device, the hole is obvious...
> >
> > Above might make sense only for a IRQ remapping implementation
> > which doesn't rely on extended MSI-X format (e.g. simply based on
> > BDF). If that's the case for PPC, then you should build MSI-X
> > passthrough based on this fact instead of general IRQ remapping
> > enabled or not.
> 
> I don't think anyone is expecting that we can expose the MSI-X vector
> table to the guest and the guest can make direct use of it.  The end
> goal here is that the guest on a power system is already
> paravirtualized to not program the device MSI-X by directly writing to
> the MSI-X vector table.  They have hypercalls for this since they
> always run virtualized.  Therefore a) they never intend to touch the
> MSI-X vector table and b) they have sufficient isolation that a guest
> can only hurt itself by doing so.
> 
> On x86 we don't have a), our method of programming the MSI-X vector
> table is to directly write to it. Therefore we will always require QEMU
> to place a MemoryRegion over the vector table to intercept those
> accesses.  However with interrupt remapping, we do have b) on x86, which
> means that we don't need to be so strict in disallowing user accesses
> to the MSI-X vector table.  It's not useful for configuring MSI-X on
> the device, but the user should only be able to hurt themselves by
> writing it directly.  x86 doesn't really get anything out of this
> change, but it helps this special case on power pretty significantly
> aiui.  Thanks,
> 

Allowing guest direct write to MSI-x table has system-wide impact.
As I explained earlier, hypervisor needs to control "interrupt_index"
programmed in MSI-X entry, which is used to associate a specific
IRQ remapping entry. Now if you expose whole MSI-x table to guest, 
it can program random index into MSI-X entry to associate with 
any IRQ remapping entry and then there won't be any isolation per se.

You can check "5.5.2 MSI and MSI-X Register Programming" in VT-d
spec.

Thanks
Kevin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-06 Thread Alex Williamson
On Fri, 6 May 2016 16:35:38 +1000
Alexey Kardashevskiy  wrote:

> On 05/06/2016 01:05 AM, Alex Williamson wrote:
> > On Thu, 5 May 2016 12:15:46 +
> > "Tian, Kevin"  wrote:
> >  
> >>> From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> >>> Sent: Thursday, May 05, 2016 7:43 PM
> >>>
> >>> Hi David and Kevin,
> >>>
> >>> On 2016/5/5 17:54, David Laight wrote:
> >>>  
>  From: Tian, Kevin  
> > Sent: 05 May 2016 10:37  
>  ...  
> >> Acutually, we are not aimed at accessing MSI-X table from
> >> guest. So I think it's safe to passthrough MSI-X table if we
> >> can make sure guest kernel would not touch MSI-X table in
> >> normal code path such as para-virtualized guest kernel on PPC64.
> >>  
> > Then how do you prevent malicious guest kernel accessing it?  
>  Or a malicious guest driver for an ethernet card setting up
>  the receive buffer ring to contain a single word entry that
>  contains the address associated with an MSI-X interrupt and
>  then using a loopback mode to cause a specific packet be
>  received that writes the required word through that address.
> 
>  Remember the PCIe cycle for an interrupt is a normal memory write
>  cycle.
> 
>   David
>   
> >>>
> >>> If we have enough permission to load a malicious driver or
> >>> kernel, we can easily break the guest without exposed
> >>> MSI-X table.
> >>>
> >>> I think it should be safe to expose MSI-X table if we can
> >>> make sure that malicious guest driver/kernel can't use
> >>> the MSI-X table to break other guest or host. The
> >>> capability of IRQ remapping could provide this
> >>> kind of protection.
> >>>  
> >>
> >> With IRQ remapping it doesn't mean you can pass through MSI-X
> >> structure to guest. I know actual IRQ remapping might be platform
> >> specific, but at least for Intel VT-d specification, MSI-X entry must
> >> be configured with a remappable format by host kernel which
> >> contains an index into IRQ remapping table. The index will find a
> >> IRQ remapping entry which controls interrupt routing for a specific
> >> device. If you allow a malicious program random index into MSI-X
> >> entry of assigned device, the hole is obvious...
> >>
> >> Above might make sense only for a IRQ remapping implementation
> >> which doesn't rely on extended MSI-X format (e.g. simply based on
> >> BDF). If that's the case for PPC, then you should build MSI-X
> >> passthrough based on this fact instead of general IRQ remapping
> >> enabled or not.  
> >
> > I don't think anyone is expecting that we can expose the MSI-X vector
> > table to the guest and the guest can make direct use of it.  The end
> > goal here is that the guest on a power system is already
> > paravirtualized to not program the device MSI-X by directly writing to
> > the MSI-X vector table.  They have hypercalls for this since they
> > always run virtualized.  Therefore a) they never intend to touch the
> > MSI-X vector table and b) they have sufficient isolation that a guest
> > can only hurt itself by doing so.
> >
> > On x86 we don't have a), our method of programming the MSI-X vector
> > table is to directly write to it. Therefore we will always require QEMU
> > to place a MemoryRegion over the vector table to intercept those
> > accesses.  However with interrupt remapping, we do have b) on x86, which
> > means that we don't need to be so strict in disallowing user accesses
> > to the MSI-X vector table.  It's not useful for configuring MSI-X on
> > the device, but the user should only be able to hurt themselves by
> > writing it directly.  x86 doesn't really get anything out of this
> > change, but it helps this special case on power pretty significantly
> > aiui.  Thanks,  
> 
> Excellent short overview, saved :)
> 
> How do we proceed with these patches? Nobody seems objecting them but also 
> nobody seems taking them either...

Well, this series is still based on some non-upstream patches, so...
Once that dependency is resolved this series should probably be split
into functional areas for acceptance by the appropriate subsystem
maintainers.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-06 Thread Alexey Kardashevskiy

On 05/06/2016 01:05 AM, Alex Williamson wrote:

On Thu, 5 May 2016 12:15:46 +
"Tian, Kevin"  wrote:


From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
Sent: Thursday, May 05, 2016 7:43 PM

Hi David and Kevin,

On 2016/5/5 17:54, David Laight wrote:


From: Tian, Kevin

Sent: 05 May 2016 10:37

...

Acutually, we are not aimed at accessing MSI-X table from
guest. So I think it's safe to passthrough MSI-X table if we
can make sure guest kernel would not touch MSI-X table in
normal code path such as para-virtualized guest kernel on PPC64.


Then how do you prevent malicious guest kernel accessing it?

Or a malicious guest driver for an ethernet card setting up
the receive buffer ring to contain a single word entry that
contains the address associated with an MSI-X interrupt and
then using a loopback mode to cause a specific packet be
received that writes the required word through that address.

Remember the PCIe cycle for an interrupt is a normal memory write
cycle.

David



If we have enough permission to load a malicious driver or
kernel, we can easily break the guest without exposed
MSI-X table.

I think it should be safe to expose MSI-X table if we can
make sure that malicious guest driver/kernel can't use
the MSI-X table to break other guest or host. The
capability of IRQ remapping could provide this
kind of protection.



With IRQ remapping it doesn't mean you can pass through MSI-X
structure to guest. I know actual IRQ remapping might be platform
specific, but at least for Intel VT-d specification, MSI-X entry must
be configured with a remappable format by host kernel which
contains an index into IRQ remapping table. The index will find a
IRQ remapping entry which controls interrupt routing for a specific
device. If you allow a malicious program random index into MSI-X
entry of assigned device, the hole is obvious...

Above might make sense only for a IRQ remapping implementation
which doesn't rely on extended MSI-X format (e.g. simply based on
BDF). If that's the case for PPC, then you should build MSI-X
passthrough based on this fact instead of general IRQ remapping
enabled or not.


I don't think anyone is expecting that we can expose the MSI-X vector
table to the guest and the guest can make direct use of it.  The end
goal here is that the guest on a power system is already
paravirtualized to not program the device MSI-X by directly writing to
the MSI-X vector table.  They have hypercalls for this since they
always run virtualized.  Therefore a) they never intend to touch the
MSI-X vector table and b) they have sufficient isolation that a guest
can only hurt itself by doing so.

On x86 we don't have a), our method of programming the MSI-X vector
table is to directly write to it. Therefore we will always require QEMU
to place a MemoryRegion over the vector table to intercept those
accesses.  However with interrupt remapping, we do have b) on x86, which
means that we don't need to be so strict in disallowing user accesses
to the MSI-X vector table.  It's not useful for configuring MSI-X on
the device, but the user should only be able to hurt themselves by
writing it directly.  x86 doesn't really get anything out of this
change, but it helps this special case on power pretty significantly
aiui.  Thanks,


Excellent short overview, saved :)

How do we proceed with these patches? Nobody seems objecting them but also 
nobody seems taking them either...





--
Alexey
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-05 Thread Alex Williamson
On Thu, 5 May 2016 12:15:46 +
"Tian, Kevin"  wrote:

> > From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> > Sent: Thursday, May 05, 2016 7:43 PM
> > 
> > Hi David and Kevin,
> > 
> > On 2016/5/5 17:54, David Laight wrote:
> >   
> > > From: Tian, Kevin  
> > >> Sent: 05 May 2016 10:37  
> > > ...  
> > >>> Acutually, we are not aimed at accessing MSI-X table from
> > >>> guest. So I think it's safe to passthrough MSI-X table if we
> > >>> can make sure guest kernel would not touch MSI-X table in
> > >>> normal code path such as para-virtualized guest kernel on PPC64.
> > >>>  
> > >> Then how do you prevent malicious guest kernel accessing it?  
> > > Or a malicious guest driver for an ethernet card setting up
> > > the receive buffer ring to contain a single word entry that
> > > contains the address associated with an MSI-X interrupt and
> > > then using a loopback mode to cause a specific packet be
> > > received that writes the required word through that address.
> > >
> > > Remember the PCIe cycle for an interrupt is a normal memory write
> > > cycle.
> > >
> > >   David
> > >  
> > 
> > If we have enough permission to load a malicious driver or
> > kernel, we can easily break the guest without exposed
> > MSI-X table.
> > 
> > I think it should be safe to expose MSI-X table if we can
> > make sure that malicious guest driver/kernel can't use
> > the MSI-X table to break other guest or host. The
> > capability of IRQ remapping could provide this
> > kind of protection.
> >   
> 
> With IRQ remapping it doesn't mean you can pass through MSI-X
> structure to guest. I know actual IRQ remapping might be platform
> specific, but at least for Intel VT-d specification, MSI-X entry must
> be configured with a remappable format by host kernel which
> contains an index into IRQ remapping table. The index will find a
> IRQ remapping entry which controls interrupt routing for a specific
> device. If you allow a malicious program random index into MSI-X 
> entry of assigned device, the hole is obvious...
> 
> Above might make sense only for a IRQ remapping implementation 
> which doesn't rely on extended MSI-X format (e.g. simply based on 
> BDF). If that's the case for PPC, then you should build MSI-X 
> passthrough based on this fact instead of general IRQ remapping 
> enabled or not.

I don't think anyone is expecting that we can expose the MSI-X vector
table to the guest and the guest can make direct use of it.  The end
goal here is that the guest on a power system is already
paravirtualized to not program the device MSI-X by directly writing to
the MSI-X vector table.  They have hypercalls for this since they
always run virtualized.  Therefore a) they never intend to touch the
MSI-X vector table and b) they have sufficient isolation that a guest
can only hurt itself by doing so.

On x86 we don't have a), our method of programming the MSI-X vector
table is to directly write to it. Therefore we will always require QEMU
to place a MemoryRegion over the vector table to intercept those
accesses.  However with interrupt remapping, we do have b) on x86, which
means that we don't need to be so strict in disallowing user accesses
to the MSI-X vector table.  It's not useful for configuring MSI-X on
the device, but the user should only be able to hurt themselves by
writing it directly.  x86 doesn't really get anything out of this
change, but it helps this special case on power pretty significantly
aiui.  Thanks,

Alex
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-05 Thread Yongji Xie

On 2016/5/5 20:15, Tian, Kevin wrote:


From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
Sent: Thursday, May 05, 2016 7:43 PM

Hi David and Kevin,

On 2016/5/5 17:54, David Laight wrote:


From: Tian, Kevin

Sent: 05 May 2016 10:37

...

Acutually, we are not aimed at accessing MSI-X table from
guest. So I think it's safe to passthrough MSI-X table if we
can make sure guest kernel would not touch MSI-X table in
normal code path such as para-virtualized guest kernel on PPC64.


Then how do you prevent malicious guest kernel accessing it?

Or a malicious guest driver for an ethernet card setting up
the receive buffer ring to contain a single word entry that
contains the address associated with an MSI-X interrupt and
then using a loopback mode to cause a specific packet be
received that writes the required word through that address.

Remember the PCIe cycle for an interrupt is a normal memory write
cycle.

David


If we have enough permission to load a malicious driver or
kernel, we can easily break the guest without exposed
MSI-X table.

I think it should be safe to expose MSI-X table if we can
make sure that malicious guest driver/kernel can't use
the MSI-X table to break other guest or host. The
capability of IRQ remapping could provide this
kind of protection.


With IRQ remapping it doesn't mean you can pass through MSI-X
structure to guest. I know actual IRQ remapping might be platform
specific, but at least for Intel VT-d specification, MSI-X entry must
be configured with a remappable format by host kernel which
contains an index into IRQ remapping table. The index will find a
IRQ remapping entry which controls interrupt routing for a specific
device. If you allow a malicious program random index into MSI-X
entry of assigned device, the hole is obvious...


Do you mean we can trigger MSIs that correspond to interrupt
IDs of other devices by writing to MSI-X table although IRQ
remapping is enabled?

On PPC64, there is a mapping between MSIs and PE num
which can be used to identify a PCI device on PHB. So the
hardware can ensure a given pci device can only shoot the
MSIs assigned for it.  Isn't there a similar mapping in IRQ
remapping table on Intel.

Thanks,
Yongji


Above might make sense only for a IRQ remapping implementation
which doesn't rely on extended MSI-X format (e.g. simply based on
BDF). If that's the case for PPC, then you should build MSI-X
passthrough based on this fact instead of general IRQ remapping
enabled or not.

Thanks
Kevin


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-05 Thread Tian, Kevin
> From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> Sent: Thursday, May 05, 2016 7:43 PM
> 
> Hi David and Kevin,
> 
> On 2016/5/5 17:54, David Laight wrote:
> 
> > From: Tian, Kevin
> >> Sent: 05 May 2016 10:37
> > ...
> >>> Acutually, we are not aimed at accessing MSI-X table from
> >>> guest. So I think it's safe to passthrough MSI-X table if we
> >>> can make sure guest kernel would not touch MSI-X table in
> >>> normal code path such as para-virtualized guest kernel on PPC64.
> >>>
> >> Then how do you prevent malicious guest kernel accessing it?
> > Or a malicious guest driver for an ethernet card setting up
> > the receive buffer ring to contain a single word entry that
> > contains the address associated with an MSI-X interrupt and
> > then using a loopback mode to cause a specific packet be
> > received that writes the required word through that address.
> >
> > Remember the PCIe cycle for an interrupt is a normal memory write
> > cycle.
> >
> > David
> >
> 
> If we have enough permission to load a malicious driver or
> kernel, we can easily break the guest without exposed
> MSI-X table.
> 
> I think it should be safe to expose MSI-X table if we can
> make sure that malicious guest driver/kernel can't use
> the MSI-X table to break other guest or host. The
> capability of IRQ remapping could provide this
> kind of protection.
> 

With IRQ remapping it doesn't mean you can pass through MSI-X
structure to guest. I know actual IRQ remapping might be platform
specific, but at least for Intel VT-d specification, MSI-X entry must
be configured with a remappable format by host kernel which
contains an index into IRQ remapping table. The index will find a
IRQ remapping entry which controls interrupt routing for a specific
device. If you allow a malicious program random index into MSI-X 
entry of assigned device, the hole is obvious...

Above might make sense only for a IRQ remapping implementation 
which doesn't rely on extended MSI-X format (e.g. simply based on 
BDF). If that's the case for PPC, then you should build MSI-X 
passthrough based on this fact instead of general IRQ remapping 
enabled or not.

Thanks
Kevin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-05 Thread Yongji Xie

On 2016/5/5 17:36, Tian, Kevin wrote:


From: Yongji Xie
Sent: Tuesday, May 03, 2016 3:34 PM

On 2016/5/3 14:22, Tian, Kevin wrote:


From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
Sent: Tuesday, May 03, 2016 2:08 PM

On 2016/5/3 13:34, Tian, Kevin wrote:


From: Yongji Xie
Sent: Wednesday, April 27, 2016 8:43 PM

This patch enables mmapping MSI-X tables if hardware supports
interrupt remapping which can ensure that a given pci device
can only shoot the MSIs assigned for it.

With MSI-X table mmapped, we also need to expose the
read/write interface which will be used to access MSI-X table.

Signed-off-by: Yongji Xie 

A curious question here. Does "allow to mmap MSI-X" essentially
mean that KVM guest can directly read/write physical MSI-X
structure then?

Thanks
Kevin


Here we just allow to mmap MSI-X table in kernel. It doesn't
mean all KVM guest can directly read/write physical MSI-X
structure. This should be decided by QEMU. For PPC64
platform, we would allow to passthrough the MSI-X table
because we know guest kernel would not write physical
MSI-X structure when enabling MSI.


A bit confused here. If guest kernel doesn't need to write
physical MSI-X structure, what's the point of passing through
the table then?

We want to allow the MSI-X table because there may be
some critical registers in the same page as the MSI-X table.
We have to handle the mmio access to these register in QEMU
rather than in guest if mmapping MSI-X table is disallowed.

So you mean critical registers in same MMIO BAR as MSI-X
table, instead of two MMIO BARs in same page (the latter I
suppose with your whole patchset it won't happen then)?


Yes. That's what I mean!

Thanks,
Yongji

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-05 Thread Yongji Xie

Hi David and Kevin,

On 2016/5/5 17:54, David Laight wrote:


From: Tian, Kevin

Sent: 05 May 2016 10:37

...

Acutually, we are not aimed at accessing MSI-X table from
guest. So I think it's safe to passthrough MSI-X table if we
can make sure guest kernel would not touch MSI-X table in
normal code path such as para-virtualized guest kernel on PPC64.


Then how do you prevent malicious guest kernel accessing it?

Or a malicious guest driver for an ethernet card setting up
the receive buffer ring to contain a single word entry that
contains the address associated with an MSI-X interrupt and
then using a loopback mode to cause a specific packet be
received that writes the required word through that address.

Remember the PCIe cycle for an interrupt is a normal memory write
cycle.

David



If we have enough permission to load a malicious driver or
kernel, we can easily break the guest without exposed
MSI-X table.

I think it should be safe to expose MSI-X table if we can
make sure that malicious guest driver/kernel can't use
the MSI-X table to break other guest or host. The
capability of IRQ remapping could provide this
kind of protection.

Thanks,
Yongji

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-05 Thread David Laight
From: Tian, Kevin
> Sent: 05 May 2016 10:37
...
> > Acutually, we are not aimed at accessing MSI-X table from
> > guest. So I think it's safe to passthrough MSI-X table if we
> > can make sure guest kernel would not touch MSI-X table in
> > normal code path such as para-virtualized guest kernel on PPC64.
> >
> 
> Then how do you prevent malicious guest kernel accessing it?

Or a malicious guest driver for an ethernet card setting up
the receive buffer ring to contain a single word entry that
contains the address associated with an MSI-X interrupt and
then using a loopback mode to cause a specific packet be
received that writes the required word through that address.

Remember the PCIe cycle for an interrupt is a normal memory write
cycle.

David

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-05 Thread Tian, Kevin
> From: Yongji Xie
> Sent: Tuesday, May 03, 2016 3:34 PM
> 
> On 2016/5/3 14:22, Tian, Kevin wrote:
> 
> >> From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> >> Sent: Tuesday, May 03, 2016 2:08 PM
> >>
> >> On 2016/5/3 13:34, Tian, Kevin wrote:
> >>
>  From: Yongji Xie
>  Sent: Wednesday, April 27, 2016 8:43 PM
> 
>  This patch enables mmapping MSI-X tables if hardware supports
>  interrupt remapping which can ensure that a given pci device
>  can only shoot the MSIs assigned for it.
> 
>  With MSI-X table mmapped, we also need to expose the
>  read/write interface which will be used to access MSI-X table.
> 
>  Signed-off-by: Yongji Xie 
> >>> A curious question here. Does "allow to mmap MSI-X" essentially
> >>> mean that KVM guest can directly read/write physical MSI-X
> >>> structure then?
> >>>
> >>> Thanks
> >>> Kevin
> >>>
> >> Here we just allow to mmap MSI-X table in kernel. It doesn't
> >> mean all KVM guest can directly read/write physical MSI-X
> >> structure. This should be decided by QEMU. For PPC64
> >> platform, we would allow to passthrough the MSI-X table
> >> because we know guest kernel would not write physical
> >> MSI-X structure when enabling MSI.
> >>
> > A bit confused here. If guest kernel doesn't need to write
> > physical MSI-X structure, what's the point of passing through
> > the table then?
> 
> We want to allow the MSI-X table because there may be
> some critical registers in the same page as the MSI-X table.
> We have to handle the mmio access to these register in QEMU
> rather than in guest if mmapping MSI-X table is disallowed.

So you mean critical registers in same MMIO BAR as MSI-X
table, instead of two MMIO BARs in same page (the latter I
suppose with your whole patchset it won't happen then)?

> 
> > I think the key whether MSI-X table can be passed through
> > is related to where hypervisor control is deployed. At least
> > for x86:
> >
> > - When irq remapping is not enabled, host/hypervisor needs
> > to control physical interrupt message including vector/dest/etc.
> > directly in MSI-X structure, so we cannot allow a guest to
> > access it;
> >
> > - when irq remapping is enabled, host/hypervisor can control
> > interrupt routing in irq remapping table. However MSI-X
> > also needs to be configured as remappable format. In this
> > manner we also cannot allow direct access from guest.
> >
> > The only sane case to pass through MSI-X structure, is a
> > mechanism similar to irq remapping but w/o need to change
> > original MSI-X format so direct access from guest side is
> > safe. Is it the case in PPC64?
> >
> > Thanks
> > Kevin
> 
> Acutually, we are not aimed at accessing MSI-X table from
> guest. So I think it's safe to passthrough MSI-X table if we
> can make sure guest kernel would not touch MSI-X table in
> normal code path such as para-virtualized guest kernel on PPC64.
> 

Then how do you prevent malicious guest kernel accessing it?

Thanks
Kevin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-03 Thread Yongji Xie

On 2016/5/3 14:22, Tian, Kevin wrote:


From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
Sent: Tuesday, May 03, 2016 2:08 PM

On 2016/5/3 13:34, Tian, Kevin wrote:


From: Yongji Xie
Sent: Wednesday, April 27, 2016 8:43 PM

This patch enables mmapping MSI-X tables if hardware supports
interrupt remapping which can ensure that a given pci device
can only shoot the MSIs assigned for it.

With MSI-X table mmapped, we also need to expose the
read/write interface which will be used to access MSI-X table.

Signed-off-by: Yongji Xie 

A curious question here. Does "allow to mmap MSI-X" essentially
mean that KVM guest can directly read/write physical MSI-X
structure then?

Thanks
Kevin


Here we just allow to mmap MSI-X table in kernel. It doesn't
mean all KVM guest can directly read/write physical MSI-X
structure. This should be decided by QEMU. For PPC64
platform, we would allow to passthrough the MSI-X table
because we know guest kernel would not write physical
MSI-X structure when enabling MSI.


A bit confused here. If guest kernel doesn't need to write
physical MSI-X structure, what's the point of passing through
the table then?


We want to allow the MSI-X table because there may be
some critical registers in the same page as the MSI-X table.
We have to handle the mmio access to these register in QEMU
rather than in guest if mmapping MSI-X table is disallowed.


I think the key whether MSI-X table can be passed through
is related to where hypervisor control is deployed. At least
for x86:

- When irq remapping is not enabled, host/hypervisor needs
to control physical interrupt message including vector/dest/etc.
directly in MSI-X structure, so we cannot allow a guest to
access it;

- when irq remapping is enabled, host/hypervisor can control
interrupt routing in irq remapping table. However MSI-X
also needs to be configured as remappable format. In this
manner we also cannot allow direct access from guest.

The only sane case to pass through MSI-X structure, is a
mechanism similar to irq remapping but w/o need to change
original MSI-X format so direct access from guest side is
safe. Is it the case in PPC64?

Thanks
Kevin


Acutually, we are not aimed at accessing MSI-X table from
guest. So I think it's safe to passthrough MSI-X table if we
can make sure guest kernel would not touch MSI-X table in
normal code path such as para-virtualized guest kernel on PPC64.

Thanks,
Yongji

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-03 Thread Tian, Kevin
> From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com]
> Sent: Tuesday, May 03, 2016 2:08 PM
> 
> On 2016/5/3 13:34, Tian, Kevin wrote:
> 
> >> From: Yongji Xie
> >> Sent: Wednesday, April 27, 2016 8:43 PM
> >>
> >> This patch enables mmapping MSI-X tables if hardware supports
> >> interrupt remapping which can ensure that a given pci device
> >> can only shoot the MSIs assigned for it.
> >>
> >> With MSI-X table mmapped, we also need to expose the
> >> read/write interface which will be used to access MSI-X table.
> >>
> >> Signed-off-by: Yongji Xie 
> > A curious question here. Does "allow to mmap MSI-X" essentially
> > mean that KVM guest can directly read/write physical MSI-X
> > structure then?
> >
> > Thanks
> > Kevin
> >
> 
> Here we just allow to mmap MSI-X table in kernel. It doesn't
> mean all KVM guest can directly read/write physical MSI-X
> structure. This should be decided by QEMU. For PPC64
> platform, we would allow to passthrough the MSI-X table
> because we know guest kernel would not write physical
> MSI-X structure when enabling MSI.
> 

A bit confused here. If guest kernel doesn't need to write
physical MSI-X structure, what's the point of passing through
the table then?

I think the key whether MSI-X table can be passed through
is related to where hypervisor control is deployed. At least
for x86:

- When irq remapping is not enabled, host/hypervisor needs
to control physical interrupt message including vector/dest/etc.
directly in MSI-X structure, so we cannot allow a guest to 
access it;

- when irq remapping is enabled, host/hypervisor can control
interrupt routing in irq remapping table. However MSI-X
also needs to be configured as remappable format. In this
manner we also cannot allow direct access from guest.

The only sane case to pass through MSI-X structure, is a
mechanism similar to irq remapping but w/o need to change
original MSI-X format so direct access from guest side is
safe. Is it the case in PPC64?

Thanks
Kevin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-03 Thread Yongji Xie

On 2016/5/3 13:34, Tian, Kevin wrote:


From: Yongji Xie
Sent: Wednesday, April 27, 2016 8:43 PM

This patch enables mmapping MSI-X tables if hardware supports
interrupt remapping which can ensure that a given pci device
can only shoot the MSIs assigned for it.

With MSI-X table mmapped, we also need to expose the
read/write interface which will be used to access MSI-X table.

Signed-off-by: Yongji Xie 

A curious question here. Does "allow to mmap MSI-X" essentially
mean that KVM guest can directly read/write physical MSI-X
structure then?

Thanks
Kevin



Here we just allow to mmap MSI-X table in kernel. It doesn't
mean all KVM guest can directly read/write physical MSI-X
structure. This should be decided by QEMU. For PPC64
platform, we would allow to passthrough the MSI-X table
because we know guest kernel would not write physical
MSI-X structure when enabling MSI.

Thanks,
Yongji

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-02 Thread Tian, Kevin
> From: Yongji Xie
> Sent: Wednesday, April 27, 2016 8:43 PM
> 
> This patch enables mmapping MSI-X tables if hardware supports
> interrupt remapping which can ensure that a given pci device
> can only shoot the MSIs assigned for it.
> 
> With MSI-X table mmapped, we also need to expose the
> read/write interface which will be used to access MSI-X table.
> 
> Signed-off-by: Yongji Xie 

A curious question here. Does "allow to mmap MSI-X" essentially
mean that KVM guest can directly read/write physical MSI-X
structure then?

Thanks
Kevin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev