On Tue, Nov 18, 2025 at 07:29:22PM +0800, Baolu Lu wrote:
> On 11/18/2025 3:47 PM, Tian, Kevin wrote:
> > > From: Baolu Lu <[email protected]>
> > > Sent: Tuesday, November 18, 2025 2:24 PM
> > > 
> > > On 11/18/25 12:04, Tian, Kevin wrote:
> > > > > 46 bits is not particularly big... Hmm, I wonder if we have some issue
> > > > > with the sign-extend? iommupt does that properly and IIRC the old code
> > > > > did not. Which of the page table formats is this using second stage or
> > > > > first stage?
> > > > Assume it's first stage for kernel IOVA, if available in hw
> > > 
> > > It's the first stage (x86_64 fmt) according to the PASID entry setup:
> > > 
> > > IOMMU dmar0: Root Table Address: 0x105a82000
> > > B.D.F     Root_entry                              Context_entry
> > >           PASID   PASID_table_entry
> > > 00:02.0   0x0000000000000000:0x0000000105a85001
> > > 0x0000000000000000:0x0000000105a84405     0
> > > 0x0000000105a86000:0x0000000000000002:0x0000000000000049
> > > 
> > 
> > so the 3rd experiment (if the former two doesn't show difference) is
> > to force using second stage to see whether it's caused by the
> > sign-extend logic.
> 
> I hardcoded the driver to always use the second stage for paging domain
> translation, and it works now.
> 
> IOMMU dmar0: Root Table Address: 0x1049b6000
> B.D.F Root_entry                              Context_entry                   
>         PASID   PASID_table_entry
> 00:02.0       0x0000000000000000:0x00000001049ba001
> 0x0000000000000000:0x00000001049b9405 0
> 0x0000000000000000:0x0000000000000002:0x00000001049bb089

Okay, that is a great finding!

So either it is something about the sign extend or something about
the x86_64 format. vtdss shares all of the cache/iotlb flushing code,
so since vtdss works we can say the flushing is working.
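
For reference, here is a minimal user-space sketch of what "sign extend"
means for a first stage (x86_64 format) address. It is not the iommupt
implementation; sign_extend_va() and va_is_canonical() are hypothetical
names made up for illustration. With a 48-bit VA, bit 47 is copied into
bits 63:48, so a 46-bit IOVA is already canonical and should pass through
unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: keep the low vasz_lg2 bits of va
 * and copy bit (vasz_lg2 - 1) into all of the bits above it.
 */
static uint64_t sign_extend_va(uint64_t va, unsigned int vasz_lg2)
{
	uint64_t low_mask, sign_bit;

	assert(vasz_lg2 >= 1 && vasz_lg2 <= 64);
	if (vasz_lg2 == 64)
		return va;

	low_mask = (UINT64_C(1) << vasz_lg2) - 1;
	sign_bit = UINT64_C(1) << (vasz_lg2 - 1);

	va &= low_mask;
	/* XOR then subtract propagates the sign bit upwards */
	return (va ^ sign_bit) - sign_bit;
}

/* An address is canonical if sign extension leaves it unchanged. */
static bool va_is_canonical(uint64_t va, unsigned int vasz_lg2)
{
	return sign_extend_va(va, vasz_lg2) == va;
}
```

If the map path and the walk path disagree about whether the upper bits
are part of the address, this is the kind of check that would show it.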

1) Can you run the test with CONFIG_DEBUG_GENERIC_PT=y? Let's see if
   pt_check_install_leaf_args() fails?

2) Let's try disabling the sign extend function:

--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2818,8 +2818,7 @@ intel_iommu_domain_alloc_first_stage(struct device *dev,
        else
                cfg.common.hw_max_vasz_lg2 = 48;
        cfg.common.hw_max_oasz_lg2 = 52;
-       cfg.common.features = BIT(PT_FEAT_SIGN_EXTEND) |
-                             BIT(PT_FEAT_FLUSH_RANGE);
+       cfg.common.features = BIT(PT_FEAT_FLUSH_RANGE);
        /* First stage always uses scalable mode */
        if (!ecap_smpwc(iommu->ecap))
                cfg.common.features |= BIT(PT_FEAT_DMA_INCOHERENT);

3) Let's validate the mapping:

--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2572,6 +2572,21 @@ int iommu_map_nosync(struct iommu_domain *domain, unsigned long iova,
        else
                trace_map(orig_iova, orig_paddr, orig_size);
 
+       if (!ret) {
+               paddr = orig_paddr;
+               for (iova = orig_iova; iova < orig_iova + orig_size; iova += PAGE_SIZE) {
+                       phys_addr_t pt_paddr = ops->iova_to_phys(domain, iova);
+
+                       if (pt_paddr != paddr) {
+                               pr_warn("mapping: Bad physical storage %lx != %lx at %lx\n",
+                                       (unsigned long)paddr,
+                                       (unsigned long)pt_paddr, iova);
+                               break;
+                       }
+                       paddr += PAGE_SIZE;
+               }
+       }
+

  Maybe the physical is getting truncated for some reason?
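
  To make the truncation idea concrete, here is a stand-alone toy model of
  the same read-back loop. Everything in it is hypothetical (the toy_*
  names, the fixed 4G offset, the 32-bit truncation); it only illustrates
  how the check above would flag a lookup that loses the upper bits of the
  physical address, and says nothing about where the real bug is.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE UINT64_C(4096)

/* Stand-in for ops->iova_to_phys(); hypothetical signature. */
typedef uint64_t (*toy_iova_to_phys_fn)(uint64_t iova);

/*
 * Walk the range page by page, as in the debug hunk above, and return the
 * first IOVA whose translation does not match, or UINT64_MAX if all match.
 */
static uint64_t toy_find_bad_mapping(toy_iova_to_phys_fn lookup,
				     uint64_t iova, uint64_t paddr,
				     uint64_t size)
{
	uint64_t off;

	for (off = 0; off < size; off += TOY_PAGE_SIZE) {
		if (lookup(iova + off) != paddr + off)
			return iova + off;
	}
	return UINT64_MAX;
}

/* Correct toy translation: IOVA plus a fixed 4G offset. */
static uint64_t toy_lookup_ok(uint64_t iova)
{
	return iova + UINT64_C(0x100000000);
}

/* Broken toy translation: same mapping squeezed through 32 bits. */
static uint64_t toy_lookup_truncated(uint64_t iova)
{
	return (uint32_t)(iova + UINT64_C(0x100000000));
}
```

  With the truncating lookup the very first page mismatches, which is
  what the pr_warn() in the hunk would report.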

4) Please collect the map/unmap traces, including the return code

Jason
