Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-04-25 Thread Sumit Gupta




>> I have worked around the issue by filtering out the request in
>> __clean_dcache_guest_page() when the pfn is not valid. As the patch
>> wasn't posted to the community, I reverted it as well.
>
> That's papering over the real issue, and this mapping path needs
> fixing as it was only ever expected to be called for CoW.
>
> Can you please try the following patch and let me know if that fixes
> the issue for good?



Hi Marc,

Thank you for the patch. This patch fixed the crash for me.
For the formal patch, please add:

Tested-by: Sumit Gupta 



> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 77cb2d28f2a4..b62dd40a4083 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1147,7 +1147,8 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
>  	 * We've moved a page around, probably through CoW, so let's treat it
>  	 * just like a translation fault and clean the cache to the PoC.
>  	 */
> -	clean_dcache_guest_page(pfn, PAGE_SIZE);
> +	if (!kvm_is_device_pfn(pfn))
> +		clean_dcache_guest_page(pfn, PAGE_SIZE);
>  	handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &pte);
>  	return 0;
>  }


> --
> Without deviation from the norm, progress is not possible.




Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-04-24 Thread Sumit Gupta





>>> Did that patch cause any issue, or is it just not needed on your system?
>>> It fixes a hypothetical problem with the way ATS is implemented.
>>> Maybe I actually observed it on an old software model, I don't
>>> remember. Either way it's unlikely to go upstream but I'd like to know
>>> if I should drop it from my tree.



>> Had to revert the same patch "mm: notify remote TLBs when dirtying a PTE" to
>> avoid the crash below [1]. I am not sure about the cause yet.


> I noticed this issue earlier with the patch referenced here and
> root-caused it as below. It happens after a vfio_mmap request from QEMU
> for the PCIe device, during the access of the VA, when the PTE access
> flags are updated.
>
> kvm_mmu_notifier_change_pte() --> kvm_set_spte_hva() -->
> clean_dcache_guest_page()
>
> The validation model doesn't support the FWB capability.
> __clean_dcache_guest_page() attempts a dcache flush on the PCIe BAR
> address (not a valid pfn) through page_address(), which has no page
> table mapping, so the flush takes an exception.
>
> I have worked around the issue by filtering out the request in
> __clean_dcache_guest_page() when the pfn is not valid. As the patch
> wasn't posted to the community, I reverted it as well.


Thank you, Krishna, for sharing the analysis.

Best Regards,
Sumit Gupta


Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-04-24 Thread Sumit Gupta

Hi Jean,



> Hi Sumit,
>
> On Thu, Apr 22, 2021 at 08:34:38PM +0530, Sumit Gupta wrote:
>
>> Had to revert the patch "mm: notify remote TLBs when dirtying a PTE".
>
> Did that patch cause any issue, or is it just not needed on your system?
> It fixes a hypothetical problem with the way ATS is implemented. Maybe I
> actually observed it on an old software model, I don't remember. Either
> way it's unlikely to go upstream but I'd like to know if I should drop it
> from my tree.


I tried the nested SMMUv3 patches v15 (Eric's branch:
v5.12-rc6-jean-iopf-14-2stage-v15) on top of your current sva patches
with kernel 5.12.0-rc8.
I had to revert the same patch, "mm: notify remote TLBs when dirtying a
PTE", to avoid the crash below [1]. I am not sure about the cause yet.

I didn't get the crash after reverting the patch, and nested translations
worked.
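
For context, the guest in this setup runs with an emulated vSMMU and the
NVMe controller passed through. A hypothetical invocation sketch using
only standard QEMU options (the host BDF and everything elided are
placeholders, and the nested-stage plumbing itself needs Eric's matching
QEMU branch):

  qemu-system-aarch64 -M virt,iommu=smmuv3 -enable-kvm -cpu host \
      -device vfio-pci,host=0000:00:02.0 ...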

[1]
[   11.730943] arm-smmu-v3 905.smmuv3: ias 44-bit, oas 44-bit (features 0x8305)
[   11.833791] arm-smmu-v3 905.smmuv3: allocated 524288 entries for cmdq
[   11.979456] arm-smmu-v3 905.smmuv3: allocated 524288 entries for evtq

[   12.048895] cacheinfo: Unable to detect cache hierarchy for CPU 0
[   12.234175] loop: module loaded
[   12.279552] megasas: 07.714.04.00-rc1
[   12.408831] nvme :00:02.0: Adding to iommu group 0
[   12.488063] nvme nvme0: pci function :00:02.0
[   12.525887] nvme :00:02.0: enabling device ( -> 0002)
[   12.612159] physmap-flash 0.flash: physmap platform flash device: [mem 0x-0x03ff]
[ 1721.586943] Unable to handle kernel paging request at virtual address 617f8000

[ 1721.587263] Mem abort info:
[ 1721.587776]   ESR = 0x96000145
[ 1721.587968]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 1721.588416]   SET = 0, FnV = 0
[ 1721.588672]   EA = 0, S1PTW = 0
[ 1721.588863] Data abort info:
[ 1721.589120]   ISV = 0, ISS = 0x0145
[ 1721.589311]   CM = 1, WnR = 1
[ 1721.589568] swapper pgtable: 64k pages, 48-bit VAs, pgdp=00011128
[ 1721.589951] [617f8000] pgd=, p4d=, pud=

[ 1721.590592] Internal error: Oops: 96000145 [#1] PREEMPT SMP
[ 1721.590912] Modules linked in:
[ 1721.591232] CPU: 0 PID: 664 Comm: qemu-system-aar Not tainted 5.12.0-rc8-tegra-229886-g4786d4a20d7 #22

[ 1721.591680] pstate: a045 (NzCv daif +PAN -UAO -TCO BTYPE=--)
[ 1721.592128] pc : __flush_dcache_area+0x20/0x38
[ 1721.592511] lr : kvm_set_spte_hva+0x64/0xc8
[ 1721.592832] sp : 8000145cfc30
[ 1721.593087] x29: 8000145cfc30 x28: 95221c80
[ 1721.593599] x27: 0002 x26: a3711c88
[ 1721.594112] x25: 9333a740 x24: 01e861800f53
[ 1721.594624] x23: b832 x22: 0001
[ 1721.595136] x21: b832 x20: a1268000
[ 1721.595647] x19: 800011c95000 x18: 
[ 1721.596160] x17:  x16: 
[ 1721.596608] x15:  x14: 
[ 1721.597120] x13:  x12: 
[ 1721.597568] x11:  x10: 
[ 1721.598080] x9 :  x8 : 9333a740
[ 1721.598592] x7 : 07fd000b8320 x6 : 815bc190
[ 1721.599104] x5 : 00011b06 x4 : 
[ 1721.599552] x3 : 003f x2 : 0040
[ 1721.600064] x1 : 617f8001 x0 : 617f8000
[ 1721.600576] Call trace:
[ 1721.600768]  __flush_dcache_area+0x20/0x38
[ 1721.601216]  kvm_mmu_notifier_change_pte+0x5c/0xa8
[ 1721.601601]  __mmu_notifier_change_pte+0x60/0xa0
[ 1721.601985]  __handle_mm_fault+0x740/0xde8
[ 1721.602367]  handle_mm_fault+0xe8/0x238
[ 1721.602751]  do_page_fault+0x160/0x3a8
[ 1721.603200]  do_mem_abort+0x40/0xb0
[ 1721.603520]  el0_da+0x20/0x30
[ 1721.603967]  el0_sync_handler+0x68/0xd0
[ 1721.604416]  el0_sync+0x154/0x180
[ 1721.604864] Code: 9ac32042 8b010001 d1000443 8a23 (d50b7e20)
[ 1721.605184] ---[ end trace 7678eb97889b6fbd ]---
[ 1721.605504] Kernel panic - not syncing: Oops: Fatal exception
[ 1721.605824] Kernel Offset: disabled
[ 1721.606016] CPU features: 0x00340216,6280a018
[ 1721.606335] Memory Limit: 2909 MB
[ 1721.606656] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
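
For reference, a hedged decode of the abort above (per the ARMv8 ESR_ELx
layout):

  ESR = 0x96000145 -> EC = 0x25 (data abort, current EL), ISS = 0x145:
    CM = 1       taken on a cache maintenance instruction
    WnR = 1      reported as a write (cache maintenance aborts always are)
    DFSC = 0x05  level 1 translation fault
  The faulting opcode (d50b7e20) is "dc civac, x0", i.e. the
  clean+invalidate that __flush_dcache_area() issued on the unmapped
  linear-map address of the BAR pfn, matching the level 1 translation
  fault reported above.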



RE: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-04-24 Thread Krishna Reddy
>> Did that patch cause any issue, or is it just not needed on your system?
>> It fixes a hypothetical problem with the way ATS is implemented.
>> Maybe I actually observed it on an old software model, I don't
>> remember. Either way it's unlikely to go upstream but I'd like to know
>> if I should drop it from my tree.

> Had to revert the same patch "mm: notify remote TLBs when dirtying a PTE" to
> avoid the crash below [1]. I am not sure about the cause yet.

I noticed this issue earlier with the patch referenced here and root-caused
it as below. It happens after a vfio_mmap request from QEMU for the PCIe
device, during the access of the VA, when the PTE access flags are updated.

kvm_mmu_notifier_change_pte() --> kvm_set_spte_hva() -->
clean_dcache_guest_page()

The validation model doesn't support the FWB capability.
__clean_dcache_guest_page() attempts a dcache flush on the PCIe BAR address
(not a valid pfn) through page_address(), which has no page table mapping,
so the flush takes an exception.

I have worked around the issue by filtering out the request in
__clean_dcache_guest_page() when the pfn is not valid. As the patch wasn't
posted to the community, I reverted it as well.
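
A minimal sketch of that workaround, assuming the v5.12 shape of
__clean_dcache_guest_page() in arch/arm64/include/asm/kvm_mmu.h (the
pfn_valid() guard is the local change described above; it was never posted,
and Marc's reply below carries the proper fix):

static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
{
	void *va;

	/* Workaround: a vfio-mapped PCIe BAR has no struct page and no
	 * linear-map address, so page_address() would return an unmapped
	 * VA; skip the flush for such pfns. */
	if (!pfn_valid(pfn))
		return;

	va = page_address(pfn_to_page(pfn));
	kvm_flush_dcache_to_poc(va, size);
}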

-KR



Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-04-24 Thread Marc Zyngier
On Fri, 23 Apr 2021 18:58:23 +0100,
Krishna Reddy  wrote:
> 
> >> Did that patch cause any issue, or is it just not needed on your system?
> >> It fixes a hypothetical problem with the way ATS is implemented.
> >> Maybe I actually observed it on an old software model, I don't
> >> remember. Either way it's unlikely to go upstream but I'd like to know
> >> if I should drop it from my tree.
> 
> > Had to revert the same patch "mm: notify remote TLBs when dirtying a PTE" to
> > avoid the crash below [1]. I am not sure about the cause yet.
> 
> I noticed this issue earlier with the patch referenced here and
> root-caused it as below. It happens after a vfio_mmap request from
> QEMU for the PCIe device, during the access of the VA, when the PTE
> access flags are updated.
> 
> kvm_mmu_notifier_change_pte() --> kvm_set_spte_hva() -->
> clean_dcache_guest_page()
> 
> The validation model doesn't support the FWB capability.
> __clean_dcache_guest_page() attempts a dcache flush on the PCIe BAR
> address (not a valid pfn) through page_address(), which has no page
> table mapping, so the flush takes an exception.
> 
> I have worked around the issue by filtering out the request in
> __clean_dcache_guest_page() when the pfn is not valid. As the patch
> wasn't posted to the community, I reverted it as well.

That's papering over the real issue, and this mapping path needs
fixing as it was only ever expected to be called for CoW.

Can you please try the following patch and let me know if that fixes
the issue for good?

Thanks,

M.

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 77cb2d28f2a4..b62dd40a4083 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1147,7 +1147,8 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
 	 * We've moved a page around, probably through CoW, so let's treat it
 	 * just like a translation fault and clean the cache to the PoC.
 	 */
-	clean_dcache_guest_page(pfn, PAGE_SIZE);
+	if (!kvm_is_device_pfn(pfn))
+		clean_dcache_guest_page(pfn, PAGE_SIZE);
 	handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &pte);
 	return 0;
 }
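
For context, kvm_is_device_pfn() in arch/arm64/kvm/mmu.c at this point
reduces to a pfn_valid() check, which is why the fix also covers the
vfio-mapped BAR case (no struct page, so no linear mapping to clean):

static bool kvm_is_device_pfn(kvm_pfn_t pfn)
{
	return !pfn_valid(pfn);
}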


-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-04-23 Thread Jean-Philippe Brucker
Hi Sumit,

On Thu, Apr 22, 2021 at 08:34:38PM +0530, Sumit Gupta wrote:
> Had to revert the patch "mm: notify remote TLBs when dirtying a PTE".

Did that patch cause any issue, or is it just not needed on your system?
It fixes a hypothetical problem with the way ATS is implemented. Maybe I
actually observed it on an old software model, I don't remember. Either
way it's unlikely to go upstream but I'd like to know if I should drop it
from my tree.

Thanks,
Jean


Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-04-22 Thread Sumit Gupta
Hi Eric,
I have validated v14 of the patch series from the branch
"jean_sva_current_2stage_v14".
Verified nested translations with an NVMe PCI device assigned to a QEMU 5.2
guest. Had to revert the patch "mm: notify remote TLBs when dirtying a PTE".

Tested-by: Sumit Gupta 


Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-03-19 Thread Auger Eric
Hi Krishna,

On 3/18/21 1:16 AM, Krishna Reddy wrote:
> Tested-by: Krishna Reddy 
> 
> Validated nested translations with NVMe PCI device assigned to Guest VM. 
> Tested with both v12 and v13 of Jean-Philippe's patches as base.

Many thanks for that.
> 
>> This is based on Jean-Philippe's
>> [PATCH v12 00/10] iommu: I/O page faults for SMMUv3
>> https://lore.kernel.org/linux-arm-kernel/YBfij71tyYvh8LhB@myrica/T/
> 
> With Jean-Philippe's V13, Patch 12 of this series has a conflict that had to 
> be resolved manually.

Yep, I will respin accordingly.

Best Regards

Eric
> 
> -KR
> 
> 



RE: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-03-19 Thread Krishna Reddy
Tested-by: Krishna Reddy 

Validated nested translations with NVMe PCI device assigned to Guest VM. 
Tested with both v12 and v13 of Jean-Philippe's patches as base.

> This is based on Jean-Philippe's
> [PATCH v12 00/10] iommu: I/O page faults for SMMUv3
> https://lore.kernel.org/linux-arm-kernel/YBfij71tyYvh8LhB@myrica/T/

With Jean-Philippe's V13, Patch 12 of this series has a conflict that had to be 
resolved manually.

-KR




Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-02-25 Thread Auger Eric
Hi Shameer, all

On 2/23/21 9:56 PM, Eric Auger wrote:
> This series brings the IOMMU part of HW nested paging support
> in the SMMUv3. The VFIO part is submitted separately.
> 
> This is based on Jean-Philippe's
> [PATCH v12 00/10] iommu: I/O page faults for SMMUv3
> https://lore.kernel.org/linux-arm-kernel/YBfij71tyYvh8LhB@myrica/T/
> 
> The IOMMU API is extended to support 2 new API functionalities:
> 1) pass the guest stage 1 configuration
> 2) pass stage 1 MSI bindings
> 
> Then those capabilities get implemented in the SMMUv3 driver.
> 
> The virtualizer passes information through the VFIO user API,
> which cascades it to the iommu subsystem. This allows the guest
> to own the stage 1 tables and context descriptors (the so-called PASID
> table) while the host owns the stage 2 tables and main configuration
> structures (STE).
> 
> Best Regards
> 
> Eric
> 
> This series can be found at:
> https://github.com/eauger/linux/tree/v5.11-stallv12-2stage-v14
> (including the VFIO part in its last version: v12)
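
In rough C terms, the two API functionalities listed above take the
following shape (a sketch paraphrased from the patch titles; the
authoritative signatures and uapi structs are in patches 1-2 and
include/uapi/linux/iommu.h of the series):

/* 1) pass the guest stage 1 configuration (the so-called PASID table) */
int iommu_attach_pasid_table(struct iommu_domain *domain,
			     struct iommu_pasid_table_config *cfg);
void iommu_detach_pasid_table(struct iommu_domain *domain);

/* 2) pass stage 1 MSI bindings (guest IOVA -> guest doorbell GPA) */
int iommu_bind_guest_msi(struct iommu_domain *domain,
			 dma_addr_t giova, phys_addr_t gpa, size_t size);
void iommu_unbind_guest_msi(struct iommu_domain *domain, dma_addr_t giova);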

As committed, I have rebased the iommu + vfio part on top of Jean's
sva/current (5.11-rc4).

https://github.com/eauger/linux/tree/jean_sva_current_2stage_v14

I have not tested the SVA bits, but I have checked that there is no
regression from my PoV.

From the QEMU perspective it works off the shelf with that branch, but if
you want to use other SVA-related IOCTLs please remember to update the
linux headers.
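
For the header update, QEMU ships a helper script; a sketch, run from a
QEMU checkout (the kernel tree path is a placeholder):

  ./scripts/update-linux-headers.sh /path/to/jean_sva_current_2stage_v14 .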

Again thank you to all of you who reviewed and tested the previous version.

Thanks

Eric
> 
> The VFIO series is sent separately.
> 
> History:
> 
> Previous version (v13):
> https://github.com/eauger/linux/tree/5.10-rc4-2stage-v13
> 
> v13 -> v14:
> - Took into account all the received comments, I think. Great
>   thanks to all the testers for their effort and occasional
>   fixes. I am really grateful to you!
> - numerous fixes including guest running in
>   noiommu, iommu.strict=0, iommu.passthrough=on,
>   enable_unsafe_noiommu_mode
> 
> v12 -> v13:
> - fixed compilation issue with CONFIG_ARM_SMMU_V3_SVA
>   reported by Shameer. This urged me to revisit patch 4,
>   "iommu/smmuv3: Allow s1 and s2 configs to coexist", where
>   s1_cfg and s2_cfg are no longer dynamically allocated.
>   Instead I use a new "set" field in the existing structs
> - fixed 2 others config checks
> - Updated "iommu/arm-smmu-v3: Maintain a SID->device structure"
>   according to the last version
> 
> v11 -> v12:
> - rebase on top of v5.10-rc4
> 
> Eric Auger (13):
>   iommu: Introduce attach/detach_pasid_table API
>   iommu: Introduce bind/unbind_guest_msi
>   iommu/smmuv3: Allow s1 and s2 configs to coexist
>   iommu/smmuv3: Get prepared for nested stage support
>   iommu/smmuv3: Implement attach/detach_pasid_table
>   iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
>   iommu/smmuv3: Implement cache_invalidate
>   dma-iommu: Implement NESTED_MSI cookie
>   iommu/smmuv3: Nested mode single MSI doorbell per domain enforcement
>   iommu/smmuv3: Enforce incompatibility between nested mode and HW MSI
> regions
>   iommu/smmuv3: Implement bind/unbind_guest_msi
>   iommu/smmuv3: report additional recoverable faults
>   iommu/smmuv3: Accept configs with more than one context descriptor
> 
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 444 ++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  14 +-
>  drivers/iommu/dma-iommu.c   | 142 ++-
>  drivers/iommu/iommu.c   | 106 +
>  include/linux/dma-iommu.h   |  16 +
>  include/linux/iommu.h   |  47 +++
>  include/uapi/linux/iommu.h  |  54 +++
>  7 files changed, 781 insertions(+), 42 deletions(-)
> 
