On Tue Dec 10 19, Lu Baolu wrote:
Hi,
On 12/10/19 8:52 AM, Jerry Snitselaar wrote:
On Sun Dec 08 19, Lu Baolu wrote:
Hi,
On 12/7/19 10:41 AM, Jerry Snitselaar wrote:
On Fri Dec 06 19, Jerry Snitselaar wrote:
On Sat Dec 07 19, Lu Baolu wrote:
Hi Jerry,
On 12/6/19 3:24 PM, Jerry Snitselaar wrote:
On Fri Dec 06 19, Lu Baolu wrote:
[snip]
Can you please try the change below? Let's check whether the offending
address has been mapped for device 01.00.2.
$ git diff
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..d9daf66be849 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -663,6 +663,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 		ret = iommu_map(domain, addr, addr, pg_size, entry->prot);
 		if (ret)
 			goto out;
+
+		dev_info(dev, "Setting identity map [0x%Lx - 0x%Lx] for group %d\n",
+			 addr, addr + pg_size, group->id);
 	}
 }
I suspect that device 01.00.2 is not in the device scope of:
[ 4.485108] DMAR: RMRR base: 0x000000bdf6f000 end: 0x000000bdf7efff
By the way, does device 01.00.2 work well after binding the driver?
When I boot it with passthrough it doesn't get to a point where I can
log in. I think the serial console on these systems is tied to the iLO,
so the conserver connection could be making things worse. Unfortunately
the system is remote. I should have more time now to focus on debugging
this.
Attaching console output for the above patch.
It seems that device 01.00.2 isn't in the scope of RMRR [base:
0x000000bdf6f000 end: 0x000000bdf7efff], but it still tries to access
addresses within it, hence the faults that are generated.
You can check this in the ACPI DMAR table.
Best regards,
baolu
I believe it is the 3rd endpoint device entry in dmar data below.
So, a question about request_default_domain_for_dev: since a DMA mapping
has already been done for 1.00.0, and that sets the default_domain for
the group (I think), won't it bail out for 1.00.2 at this check?
if (group->default_domain && group->default_domain->type == type)
	goto out;
Or I guess request_default_domain_for_dev wouldn't even be called for
1.00.2: intel_iommu_add_device wouldn't call either of the request
functions for 1.00.2, since domain->type would be dma from 1.00.0 and
device_def_domain_type should return dma.
Can you please add some debug messages and check what really happens
here?
Best regards,
baolu
[ 25.000544] pci 0000:01:00.0: Adding to iommu group 25
[ 25.502243] pci 0000:01:00.0: DMAR: domain->type is identity << intel_iommu_add_device (alloced in iommu_group_get_for_dev)
[ 25.504239] pci 0000:01:00.0: DMAR: device default domain type is dma. requesting dma domain << intel_iommu_add_device
[ 25.507954] pci 0000:01:00.0: Using iommu dma mapping << request_default_domain_for_dev (now default domain for group is dma)
[ 25.509765] pci 0000:01:00.1: Adding to iommu group 25
[ 25.511514] pci 0000:01:00.1: DMAR: domain->type is dma << intel_iommu_add_device
[ 25.513263] pci 0000:01:00.1: DMAR: device default domain type is identity. requesting identity domain << intel_iommu_add_device
[ 25.516435] pci 0000:01:00.1: don't change mappings of existing devices. << request_default_domain_for_dev
[ 25.518669] pci 0000:01:00.1: DMAR: Device uses a private identity domain. << intel_iommu_add_device
[ 25.521061] pci 0000:01:00.2: Adding to iommu group 25
[ 25.522791] pci 0000:01:00.2: DMAR: domain->type is dma << intel_iommu_add_device
[ 25.524706] pci 0000:01:00.4: Adding to iommu group 25
[ 25.526458] pci 0000:01:00.4: DMAR: domain->type is dma << intel_iommu_add_device
[ 25.528213] pci 0000:01:00.4: DMAR: device default domain type is identity. requesting identity domain << intel_iommu_add_device
[ 25.531284] pci 0000:01:00.4: don't change mappings of existing devices. << request_default_domain_for_dev
[ 25.533500] pci 0000:01:00.4: DMAR: Device uses a private identity domain. << intel_iommu_add_device
So the domain type is dma after 01:00.0 gets added, and when
intel_iommu_add_device is called for 01:00.2 it goes into the if block.
Since the device default domain type for 01:00.2 is dma, nothing happens
there, and it moves on to 01:00.4. Is the "private identity domain"
message really accurate, since everyone will use si_domain? Adding some
more debugging.
The facts that we have seen:
1) 01.00.2 uses the default domain in group 25. The domain type of this
   default domain is DMA.
2) iommu_group_create_direct_mappings() *should* be called when adding
   01.00.2 into group 25. As a result, the RMRR for this device *should*
   be identity mapped.
3) Checking the DMAR table, RMRR (0x000000bdf6f000 ~ 0x000000bdf7efff) is
   reported for device 01.00.2.
The problem is that RMRR (0x000000bdf6f000 ~ 0x000000bdf7efff) hasn't
actually been mapped, and as a result IOMMU faults are generated when
the device tries to access this range.
So I guess you could add more debug messages to check why
iommu_group_create_direct_mappings() doesn't do the right thing?
Best regards,
baolu
A call to iommu_map is failing.
[ 36.686881] pci 0000:01:00.2: iommu_group_add_device: calling iommu_group_create_direct_mappings
[ 36.689843] pci 0000:01:00.2: iommu_group_create_direct_mappings: iterating through mappings
[ 36.692757] pci 0000:01:00.2: iommu_group_create_direct_mappings: calling apply_resv_region
[ 36.695526] pci 0000:01:00.2: iommu_group_create_direct_mappings: entry type is direct
[ 37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0x00000000bddde000 pgsize 0x1000
[ 37.201357] pci 0000:01:00.2: iommu_group_create_direct_mappings: iommu_map failed
[ 37.203973] pci 0000:01:00.2: iommu_group_create_direct_mappings: leaving func
[ 37.206385] pci 0000:01:00.2: iommu_group_add_device: calling __iommu_attach_device
[ 37.208950] pci 0000:01:00.2: Adding to iommu group 25
[ 37.210660] pci 0000:01:00.2: DMAR: domain->type is dma
Also fails for 01:00.4:
[ 37.212448] pci 0000:01:00.4: iommu_group_add_device: calling iommu_group_create_direct_mappings
[ 37.215382] pci 0000:01:00.4: iommu_group_create_direct_mappings: iterating through mappings
[ 37.218170] pci 0000:01:00.4: iommu_group_create_direct_mappings: calling apply_resv_region
[ 37.220933] pci 0000:01:00.4: iommu_group_create_direct_mappings: entry type is direct-relaxable
[ 37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0x00000000bddde000 pgsize 0x1000
[ 37.226857] pci 0000:01:00.4: iommu_group_create_direct_mappings: iommu_map failed
[ 37.229300] pci 0000:01:00.4: iommu_group_create_direct_mappings: leaving func
[ 37.231648] pci 0000:01:00.4: iommu_group_add_device: calling __iommu_attach_device
[ 37.234194] pci 0000:01:00.4: Adding to iommu group 25
[ 37.236192] pci 0000:01:00.4: DMAR: domain->type is dma
[ 37.237958] pci 0000:01:00.4: DMAR: device default domain type is identity. requesting identity domain
[ 37.241061] pci 0000:01:00.4: don't change mappings of existing devices.
[ 37.489870] pci 0000:01:00.4: DMAR: Device uses a private identity domain.
There is an RMRR for 0xbddde000-0xbdddefff:
[63Ah 1594 2] Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596 2] Length : 0036
[63Eh 1598 2] Reserved : 0000
[640h 1600 2] PCI Segment Number : 0000
[642h 1602 8] Base Address : 00000000BDDDE000
[64Ah 1610 8] End Address (limit) : 00000000BDDDEFFF
[652h 1618 1] Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619 1] Entry Length : 0A
[654h 1620 2] Reserved : 0000
[656h 1622 1] Enumeration ID : 00
[657h 1623 1] PCI Bus Number : 00
[658h 1624 2] PCI Path : 1C,07
[65Ah 1626 2] PCI Path : 00,00
[65Ch 1628 1] Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629 1] Entry Length : 0A
[65Eh 1630 2] Reserved : 0000
[660h 1632 1] Enumeration ID : 00
[661h 1633 1] PCI Bus Number : 00
[662h 1634 2] PCI Path : 1C,07
[664h 1636 2] PCI Path : 00,02
[666h 1638 1] Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639 1] Entry Length : 0A
[668h 1640 2] Reserved : 0000
[66Ah 1642 1] Enumeration ID : 00
[66Bh 1643 1] PCI Bus Number : 00
[66Ch 1644 2] PCI Path : 1C,07
[66Eh 1646 2] PCI Path : 00,04
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu