On Tue Dec 10 19, Lu Baolu wrote:
Hi,

On 12/10/19 8:52 AM, Jerry Snitselaar wrote:
On Sun Dec 08 19, Lu Baolu wrote:
Hi,

On 12/7/19 10:41 AM, Jerry Snitselaar wrote:
On Fri Dec 06 19, Jerry Snitselaar wrote:
On Sat Dec 07 19, Lu Baolu wrote:
Hi Jerry,

On 12/6/19 3:24 PM, Jerry Snitselaar wrote:
On Fri Dec 06 19, Lu Baolu wrote:
[snip]

Can you please try the change below? Let's check whether the offending
address has been mapped for device 01.00.2.

$ git diff
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..d9daf66be849 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -663,6 +663,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
                       ret = iommu_map(domain, addr, addr, pg_size, entry->prot);
                       if (ret)
                               goto out;
+
+                       dev_info(dev, "Setting identity map [0x%Lx - 0x%Lx] for group %d\n", addr, addr + pg_size, group->id);
               }

       }
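
For readers following along, here is a rough user-space sketch of the loop the patch above instruments: a reserved region is identity mapped one page at a time, and the walk stops at the first failed map call. This is not the kernel code. identity_map_region(), map_fn, map_ok(), map_fail_at() and the 4 KiB PG_SIZE are hypothetical names and assumptions made up for illustration, with map_fn standing in for iommu_map().

```c
#include <stddef.h>
#include <stdint.h>

#define PG_SIZE 0x1000ULL /* assumption: 4 KiB minimum IOMMU page size */

/* Hypothetical stand-in for iommu_map(): 0 on success, nonzero on failure. */
typedef int (*map_fn)(uint64_t iova, uint64_t paddr, uint64_t size, void *ctx);

/* Round x up to the next multiple of PG_SIZE (PG_SIZE is a power of two). */
static uint64_t pg_align_up(uint64_t x)
{
    return (x + PG_SIZE - 1) & ~(PG_SIZE - 1);
}

/*
 * Identity map [start, start + length) page by page (iova == paddr).
 * Stops at the first failing map call and returns its error code;
 * *mapped receives the number of pages mapped before that point.
 */
static int identity_map_region(uint64_t start, uint64_t length,
                               map_fn map, void *ctx, size_t *mapped)
{
    uint64_t addr = pg_align_up(start);
    uint64_t end = pg_align_up(start + length);

    *mapped = 0;
    for (; addr < end; addr += PG_SIZE) {
        int ret = map(addr, addr, PG_SIZE, ctx);

        if (ret)
            return ret; /* remaining pages stay unmapped */
        (*mapped)++;
    }
    return 0;
}

/* Test callback: every map request succeeds. */
static int map_ok(uint64_t iova, uint64_t paddr, uint64_t size, void *ctx)
{
    (void)iova; (void)paddr; (void)size; (void)ctx;
    return 0;
}

/* Test callback: fails only for the iova that ctx points to. */
static int map_fail_at(uint64_t iova, uint64_t paddr, uint64_t size, void *ctx)
{
    (void)paddr; (void)size;
    return iova == *(const uint64_t *)ctx ? -1 : 0;
}
```

With the RMRR quoted in this thread (base 0xbdf6f000, end 0xbdf7efff, so a length of 0x10000), the walk covers 16 pages when every call succeeds. A single failing page ends the walk early and leaves the rest of the region unmapped.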

I suspect that device 01.00.2 is not in the device scope of

[    4.485108] DMAR: RMRR base: 0x000000bdf6f000 end: 0x000000bdf7efff

By the way, does device 01.00.2 work well after binding the driver?


When I boot it with passthrough it doesn't get to a point where I can
log in. I think the serial console on these systems is tied to the iLO,
so the conserver connection could be making things
worse. Unfortunately the system is remote. I should have more time now
to focus on debugging this.

Attaching console output for the above patch.

It seems that device 01.00.2 isn't in the scope of RMRR [base:
0x000000bdf6f000 end: 0x000000bdf7efff]. But it still tries to access
an address within it, hence the faults are generated.

You can check it with the ACPI DMAR table.

Best regards,
baolu


I believe it is the 3rd endpoint device entry in dmar data below.
So question about request_default_domain_for_dev. Since a dma mapping
is already done for 1.00.0, and that sets the default_domain for the
group (I think), won't it bail out for 1.00.2 at this check?

    if (group->default_domain && group->default_domain->type == type)
        goto out;


Or I guess request_default_domain_for_dev wouldn't even be called for 1.00.2.
intel_iommu_add_device wouldn't call either of the request functions for
1.00.2, since domain->type would be dma from 1.00.0, and
device_def_domain_type should return dma.

Can you please add some debug messages and check what really happens
here?

Best regards,
baolu


[   25.000544] pci 0000:01:00.0: Adding to iommu group 25
[   25.502243] pci 0000:01:00.0: DMAR: domain->type is identity  << intel_iommu_add_device (alloced in iommu_group_get_for_dev)
[   25.504239] pci 0000:01:00.0: DMAR: device default domain type is dma. requesting dma domain  << intel_iommu_add_device
[   25.507954] pci 0000:01:00.0: Using iommu dma mapping    << request_default_domain_for_dev  (now default domain for group is dma)
[   25.509765] pci 0000:01:00.1: Adding to iommu group 25
[   25.511514] pci 0000:01:00.1: DMAR: domain->type is dma  << intel_iommu_add_device
[   25.513263] pci 0000:01:00.1: DMAR: device default domain type is identity. requesting identity domain  << intel_iommu_add_device
[   25.516435] pci 0000:01:00.1: don't change mappings of existing devices.    << request_default_domain_for_dev
[   25.518669] pci 0000:01:00.1: DMAR: Device uses a private identity domain.  << intel_iommu_add_device
[   25.521061] pci 0000:01:00.2: Adding to iommu group 25
[   25.522791] pci 0000:01:00.2: DMAR: domain->type is dma  << intel_iommu_add_device
[   25.524706] pci 0000:01:00.4: Adding to iommu group 25
[   25.526458] pci 0000:01:00.4: DMAR: domain->type is dma  << intel_iommu_add_device
[   25.528213] pci 0000:01:00.4: DMAR: device default domain type is identity. requesting identity domain  << intel_iommu_add_device
[   25.531284] pci 0000:01:00.4: don't change mappings of existing devices.    << request_default_domain_for_dev
[   25.533500] pci 0000:01:00.4: DMAR: Device uses a private identity domain.  << intel_iommu_add_device

So the domain type is dma after 01:00.0 gets added, and when
intel_iommu_add_device is called for 01:00.2 it will go into the if
section. Since the device default domain type for 01:00.2 is dma
nothing happens in there, and it goes on to 01:00.4. Is the "private
identity domain" message really accurate since everyone will use
si_domain? Adding some more debugging.

The facts that we have seen:

1) 01.00.2 uses the default domain in group 25. The domain type of this
  default domain is DMA.

2) iommu_group_create_direct_mappings() *should* be called when adding
  01.00.2 into group 25. As a result, the RMRR for this device *should*
  be identity mapped.

3) By checking DMAR table, RMRR (0x000000bdf6f000 ~ 0x000000bdf7efff) is
  reported for device 01.00.2.

The problem is that RMRR (0x000000bdf6f000 ~ 0x000000bdf7efff) hasn't
actually been mapped. As a result, IOMMU faults are generated when the
device tries to access this range.

So I guess you could add more debug messages to check why
iommu_group_create_direct_mappings() doesn't do the right thing?

Best regards,
baolu


A call to iommu_map is failing.

[   36.686881] pci 0000:01:00.2: iommu_group_add_device: calling iommu_group_create_direct_mappings
[   36.689843] pci 0000:01:00.2: iommu_group_create_direct_mappings: iterating through mappings
[   36.692757] pci 0000:01:00.2: iommu_group_create_direct_mappings: calling apply_resv_region
[   36.695526] pci 0000:01:00.2: e_direct_mappings: entry type is direct
[   37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0x00000000bddde000 pgsize 0x1000
[   37.201357] pci 0000:01:00.2: iommu_group_create_direct_mappings: iommu_map failed
[   37.203973] pci 0000:01:00.2: iommu_group_create_direct_mappings: leaving func
[   37.206385] pci 0000:01:00.2: iommu_group_add_device: calling __iommu_attach_device
[   37.208950] pci 0000:01:00.2: Adding to iommu group 25
[   37.210660] pci 0000:01:00.2: DMAR: domain->type is dma

Also fails for 01:00.4:

[   37.212448] pci 0000:01:00.4: iommu_group_add_device: calling iommu_group_create_direct_mappings
[   37.215382] pci 0000:01:00.4: iommu_group_create_direct_mappings: iterating through mappings
[   37.218170] pci 0000:01:00.4: iommu_group_create_direct_mappings: calling apply_resv_region
[   37.220933] pci 0000:01:00.4: iommu_group_create_direct_mappings: entry type is direct-relaxable
[   37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0x00000000bddde000 pgsize 0x1000
[   37.226857] pci 0000:01:00.4: iommu_group_create_direct_mappings: iommu_map failed
[   37.229300] pci 0000:01:00.4: iommu_group_create_direct_mappings: leaving func
[   37.231648] pci 0000:01:00.4: iommu_group_add_device: calling __iommu_attach_device
[   37.234194] pci 0000:01:00.4: Adding to iommu group 25
[   37.236192] pci 0000:01:00.4: DMAR: domain->type is dma
[   37.237958] pci 0000:01:00.4: DMAR: device default domain type is identity. requesting identity domain
[   37.241061] pci 0000:01:00.4: don't change mappings of existing d37.489870] 
pci 0000:01:00.4: DMAR: Device uses a private identity domain.

There is an RMRR for 0xbddde000-0xbdddefff:

[63Ah 1594   2]                Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596   2]                       Length : 0036

[63Eh 1598   2]                     Reserved : 0000
[640h 1600   2]           PCI Segment Number : 0000
[642h 1602   8]                 Base Address : 00000000BDDDE000
[64Ah 1610   8]          End Address (limit) : 00000000BDDDEFFF

[652h 1618   1]            Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619   1]                 Entry Length : 0A
[654h 1620   2]                     Reserved : 0000
[656h 1622   1]               Enumeration ID : 00
[657h 1623   1]               PCI Bus Number : 00

[658h 1624   2]                     PCI Path : 1C,07

[65Ah 1626   2]                     PCI Path : 00,00


[65Ch 1628   1]            Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629   1]                 Entry Length : 0A
[65Eh 1630   2]                     Reserved : 0000
[660h 1632   1]               Enumeration ID : 00
[661h 1633   1]               PCI Bus Number : 00

[662h 1634   2]                     PCI Path : 1C,07

[664h 1636   2]                     PCI Path : 00,02


[666h 1638   1]            Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639   1]                 Entry Length : 0A
[668h 1640   2]                     Reserved : 0000
[66Ah 1642   1]               Enumeration ID : 00
[66Bh 1643   1]               PCI Bus Number : 00

[66Ch 1644   2]                     PCI Path : 1C,07

[66Eh 1646   2]                     PCI Path : 00,04
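
As a quick sanity check on the numbers above, the following sketch tests whether an address falls inside an RMRR given its base and inclusive end (limit) address, and computes the region size. struct rmrr_range, rmrr_contains() and rmrr_size() are hypothetical helpers written for this illustration, not kernel or ACPICA interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* An RMRR as reported in the DMAR table: base and inclusive end address. */
struct rmrr_range {
    uint64_t base;
    uint64_t end; /* inclusive limit, as in "End Address (limit)" */
};

/* True if addr lies within [base, end]. */
static bool rmrr_contains(const struct rmrr_range *r, uint64_t addr)
{
    return addr >= r->base && addr <= r->end;
}

/* Size of the region in bytes; the end address is inclusive, hence the +1. */
static uint64_t rmrr_size(const struct rmrr_range *r)
{
    return r->end - r->base + 1;
}
```

For base 0xBDDDE000 and end 0xBDDDEFFF this gives a size of 0x1000, which matches the pgsize in the failing iommu_map call, and the failing iova 0xbddde000 is the first address of that region. The RMRR discussed earlier in the thread, which starts at 0xbdf6f000, is a different region and lies outside this range.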
