On Sun, 17 Mar 2019 14:36:13 +,
Zenghui Yu wrote:
>
> Currently, IRQFD on arm still uses the deferred workqueue mechanism
> to inject interrupts into guest, which will likely lead to a busy
> context-switching from/to the kworker thread. This overhead is for
> no purpose (only in my view
Hi,
On 3/17/19 3:50 PM, Raslan, KarimAllah wrote:
> On Sun, 2019-03-17 at 14:36 +, Zenghui Yu wrote:
>> Currently, IRQFD on arm still uses the deferred workqueue mechanism
>> to inject interrupts into guest, which will likely lead to a busy
>> context-switching from/to the kworker thread.
New ioctls were introduced to pass information about guest stage1
to the host through VFIO. Let's document the nested stage control.
Signed-off-by: Eric Auger
---
v2 -> v3:
- document the new fault API
v1 -> v2:
- use the new ioctl names
- add doc related to fault handling
---
This patch registers a fault handler which records faults in
a circular buffer and then signals an eventfd. This buffer is
exposed within the fault region.
Signed-off-by: Eric Auger
---
v3 -> v4:
- move iommu_unregister_device_fault_handler to vfio_pci_release
---
drivers/vfio/pci/vfio_pci.c
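
To illustrate the "record faults in a circular buffer, then signal" flow described above, here is a minimal userspace sketch. It is not the code from the patch: the record layout, the ring capacity, and the names fault_ring, fault_ring_push and fault_ring_pop are all hypothetical, and a plain counter stands in for the eventfd.

```c
#include <stddef.h>
#include <stdint.h>

#define FAULT_RING_SIZE 4u  /* hypothetical capacity; must be a power of two */

/* Hypothetical fault record; the real layout is defined by the series' uapi. */
struct fault_record {
    uint64_t addr;
    uint32_t reason;
};

struct fault_ring {
    struct fault_record slots[FAULT_RING_SIZE];
    uint32_t prod;      /* free-running producer index, advanced by the handler */
    uint32_t cons;      /* free-running consumer index, advanced by the reader */
    unsigned signals;   /* stand-in for the eventfd: counts notifications */
};

/* Producer side: record one fault, then notify. Returns 0, or -1 if full. */
static int fault_ring_push(struct fault_ring *r, const struct fault_record *f)
{
    if (r->prod - r->cons == FAULT_RING_SIZE)
        return -1;                       /* ring full: this fault is dropped */
    r->slots[r->prod & (FAULT_RING_SIZE - 1)] = *f;
    r->prod++;
    r->signals++;                        /* where the eventfd would be signalled */
    return 0;
}

/* Consumer side: pop the oldest fault. Returns 0, or -1 if empty. */
static int fault_ring_pop(struct fault_ring *r, struct fault_record *out)
{
    if (r->prod == r->cons)
        return -1;                       /* nothing pending */
    *out = r->slots[r->cons & (FAULT_RING_SIZE - 1)];
    r->cons++;
    return 0;
}
```

Keeping both indices free-running and masking only on access lets "full" and "empty" be told apart without sacrificing a slot; whether the actual series does it this way is not stated in the snippet.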
This patch adds two new regions aiming to handle nested mode
translation faults.
The first region (two host kernel pages) is read-only from the
user-space perspective. The first page contains a header
that provides information about the circular buffer located in the
second page. The circular
The bind/unbind_guest_msi() callbacks check the domain
is NESTED and redirect to the dma-iommu implementation.
Signed-off-by: Eric Auger
---
drivers/iommu/arm-smmu-v3.c | 44 +
1 file changed, 44 insertions(+)
diff --git a/drivers/iommu/arm-smmu-v3.c
Implement domain-selective and page-selective IOTLB invalidations.
Signed-off-by: Eric Auger
---
v3 -> v4:
- adapt to changes in the uapi
- add support for leaf parameter
- do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context
anymore
v2 -> v3:
- replace __arm_smmu_tlb_sync
When a stage 1 related fault event is read from the event queue,
let's propagate it to potential external fault listeners, i.e. users
who registered a fault handler.
Signed-off-by: Eric Auger
---
v4 -> v5:
- s/IOMMU_FAULT_PERM_INST/IOMMU_FAULT_PERM_EXEC
---
drivers/iommu/arm-smmu-v3.c | 169
The Producer Fault region contains the fault queue in the second page.
There is a benefit in letting userspace mmap this area. So let's expose
this mmappable area through a sparse mmap entry and implement the mmap
operation.
Signed-off-by: Eric Auger
---
drivers/vfio/pci/vfio_pci.c | 61
Add a new VFIO_PCI_DMA_FAULT_IRQ_INDEX index. This allows
setting/unsetting an eventfd that will be triggered when DMA translation
faults are detected at physical level when the nested mode is used.
Signed-off-by: Eric Auger
---
drivers/vfio/pci/vfio_pci.c | 3 +++
From: Jean-Philippe Brucker
When handling faults from the event or PRI queue, we need to find the
struct device associated to a SID. Add a rb_tree to keep track of SIDs.
Signed-off-by: Jean-Philippe Brucker
---
drivers/iommu/arm-smmu-v3.c | 136 ++--
1 file
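
As a rough picture of the SID lookup described above, here is a small sketch. A plain unbalanced binary search tree stands in for the kernel rb_tree, and the names sid_node, sid_insert and sid_find, as well as the dev_name field, are hypothetical and not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* One tree node per stream ID (SID). */
struct sid_node {
    uint32_t sid;
    const char *dev_name;            /* stand-in for a struct device pointer */
    struct sid_node *left, *right;
};

/* Insert n keyed by SID; an already registered SID is left untouched. */
static void sid_insert(struct sid_node **root, struct sid_node *n)
{
    while (*root) {
        if (n->sid == (*root)->sid)
            return;                  /* duplicate SID: keep the existing node */
        root = n->sid < (*root)->sid ? &(*root)->left : &(*root)->right;
    }
    n->left = n->right = NULL;
    *root = n;
}

/* Return the node registered for sid, or NULL if the SID is unknown. */
static const struct sid_node *sid_find(const struct sid_node *root, uint32_t sid)
{
    while (root && root->sid != sid)
        root = sid < root->sid ? root->left : root->right;
    return root;
}
```

A fault handler would call something like sid_find() with the SID taken from the event record; a balanced tree such as the rb_tree keeps that lookup logarithmic even when SIDs are registered in sorted order, which this simplified tree does not guarantee.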
Up to now, when the type was UNMANAGED, we used to
allocate IOVA pages within a range provided by the user.
This does not work in nested mode.
If both the host and the guest are exposed with SMMUs, each
would allocate an IOVA. The guest allocates an IOVA (gIOVA)
to map onto the guest MSI doorbell
On attach_pasid_table() we program STE S1 related info set
by the guest into the actual physical STEs. At minimum
we need to program the context descriptor GPA and compute
whether the stage1 is translated/bypassed or aborted.
Signed-off-by: Eric Auger
---
v3 -> v4:
- adapt to changes in
To allow nested stage support, we need to store both
stage 1 and stage 2 configurations (and remove the former
union).
A nested setup is characterized by both s1_cfg and s2_cfg
set.
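
A minimal sketch of that rule, assuming both configurations are kept as independently optional pointers once the union is gone. The type and field names below (s1_cfg, s2_cfg, smmu_domain_cfg, cd_table, vttbr, cfg_is_nested) are illustrative only and heavily reduced compared with the driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced configuration types. */
struct s1_cfg { unsigned long cd_table; };
struct s2_cfg { unsigned long vttbr; };

/* Both stages are stored side by side instead of sharing a union. */
struct smmu_domain_cfg {
    struct s1_cfg *s1_cfg;   /* NULL when stage 1 is not configured */
    struct s2_cfg *s2_cfg;   /* NULL when stage 2 is not configured */
};

/* A setup counts as nested only when both stages are configured. */
static bool cfg_is_nested(const struct smmu_domain_cfg *c)
{
    return c->s1_cfg != NULL && c->s2_cfg != NULL;
}
```
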
We introduce a new ste.abort field that will be set upon
guest stage1 configuration passing. If s1_cfg is NULL and
From: "Liu, Yi L"
This patch adds VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE ioctl
which aims to pass/withdraw the virtual iommu guest configuration
to/from the VFIO driver down to the iommu subsystem.
Signed-off-by: Jacob Pan
Signed-off-by: Liu, Yi L
Signed-off-by: Eric Auger
---
v3 -> v4:
-
From: Jacob Pan
Device faults detected by IOMMU can be reported outside the IOMMU
subsystem for further processing. This patch introduces
a generic device fault data structure.
The fault can be either an unrecoverable fault or a page request,
also referred to as a recoverable fault.
We only
From: Jean-Philippe Brucker
When removing a mapping from a domain, we need to send an invalidation to
all devices that might have stored it in their Address Translation Cache
(ATC). In addition, with SVM, we'll need to invalidate context descriptors
of all devices attached to a live domain.
From: Jacob Pan
In virtualization use case, when a guest is assigned
a PCI host device, protected by a virtual IOMMU on the guest,
the physical IOMMU must be programmed to be consistent with
the guest mappings. If the physical IOMMU supports two
translation stages it makes sense to program guest
From: Jacob Pan
Traditionally, device specific faults are detected and handled within
their own device drivers. When IOMMU is enabled, faults such as DMA
related transactions are detected by IOMMU. There is no generic
reporting mechanism to report faults back to the in-kernel device
driver or
This patch adds the VFIO_IOMMU_BIND/UNBIND_MSI ioctls, which aim
to pass/withdraw the guest MSI binding to/from the host.
Signed-off-by: Eric Auger
---
v3 -> v4:
- add UNBIND
- unwind on BIND error
v2 -> v3:
- adapt to new proto of bind_guest_msi
- directly use vfio_iommu_for_each_dev
v1 -> v2:
On ARM, MSIs are translated by the SMMU. An IOVA is allocated
for each MSI doorbell. If both the host and the guest are exposed
with SMMUs, we end up with 2 different IOVAs, one allocated by each.
The guest allocates an IOVA (gIOVA) to map onto the guest MSI
doorbell (gDB). The host allocates another IOVA
From: "Liu, Yi L"
When the guest "owns" the stage 1 translation structures, the host
IOMMU driver has no knowledge of caching structure updates unless
the guest invalidation requests are trapped and passed down to the
host.
This patch adds the VFIO_IOMMU_CACHE_INVALIDATE ioctl which aims
at
From: "Liu, Yi L"
In any virtualization use case, when the first translation stage
is "owned" by the guest OS, the host IOMMU driver has no knowledge
of caching structure updates unless the guest invalidation activities
are trapped by the virtualizer and passed down to the host.
Since the
This series allows a virtualizer to program the nested stage mode.
This is useful when both the host and the guest are exposed with
an SMMUv3 and a PCI device is assigned to the guest using VFIO.
In this mode, the physical IOMMU must be programmed to translate
the two stages: the one set up by
From: Jacob Pan
DMA faults can be detected by IOMMU at device level. Adding a pointer
to struct device allows IOMMU subsystem to report relevant faults
back to the device driver for further handling.
For direct assigned device (or user space drivers), guest OS holds
responsibility to handle and
Hi Jacob,
On 3/15/19 7:37 PM, Jacob Pan wrote:
> On Fri, 15 Mar 2019 17:08:49 +0100
> Eric Auger wrote:
>
>> From: "Liu, Yi L"
>>
>> In any virtualization use case, when the first translation stage
>> is "owned" by the guest OS, the host IOMMU driver has no knowledge
>> of caching structure
Currently, IRQFD on arm still uses the deferred workqueue mechanism
to inject interrupts into guest, which will likely lead to a busy
context-switching from/to the kworker thread. This overhead is for
no purpose (only in my view ...) and will result in an interrupt
performance degradation.
Hi Suzuki,
On 2019/3/15 22:56, Suzuki K Poulose wrote:
Hi Zhengui,
On 15/03/2019 08:21, Zheng Xiang wrote:
Hi Suzuki,
I have tested this patch, VM doesn't hang and we get expected WARNING
log:
Thanks for the quick testing !
However, we also get the following unexpected log:
[
Hi Suzuki,
On 2019/3/15 22:56, Suzuki K Poulose wrote:
Hi Zhengui,
s/Zhengui/Zheng/
(I think you must have wanted to say "Hi" to Zheng :-) )
I have looked into your patch and the kernel log, and I believe that
your patch had already addressed this issue. But I think we can do it
a little better
Hi Suzuki,
On 2019/3/12 17:52, Suzuki K Poulose wrote:
commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
made the checks to skip huge mappings stricter. However it introduced
a bug where we still use huge mappings, ignoring the flag to
use PTE mappings, by not