Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-22 Thread Lu Baolu

On 7/22/20 7:45 AM, Limonciello, Mario wrote:




-----Original Message-----
From: Lu Baolu 
Sent: Tuesday, July 21, 2020 6:07 PM
To: Limonciello, Mario; Joerg Roedel
Cc: baolu...@linux.intel.com; Ashok Raj; linux-ker...@vger.kernel.org;
sta...@vger.kernel.org; Koba Ko; iommu@lists.linux-foundation.org
Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated
iommu



Hi Limonciello,

On 7/21/20 10:44 PM, Limonciello, Mario wrote:

-----Original Message-----
From: iommu  On Behalf Of Lu
Baolu
Sent: Monday, July 20, 2020 7:17 PM
To: Joerg Roedel
Cc: Ashok Raj;linux-ker...@vger.kernel.org;sta...@vger.kernel.org; Koba
Ko;iommu@lists.linux-foundation.org
Subject: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated
iommu

The VT-d spec requires (10.4.4 Global Command Register, TE field) that:

Hardware implementations supporting DMA draining must drain any in-flight
DMA read/write requests queued within the Root-Complex before completing
the translation enable command and reflecting the status of the command
through the TES field in the Global Status register.

Unfortunately, some integrated graphics devices fail to do so after some
kind of power state transition. As a result, the system might get stuck in
iommu_disable_translation(), waiting for the completion of the TE transition.

This provides a quirk list for those devices and skips TE disabling if
the quirk hits.
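
For readers following the thread, the quirk match and the skip condition described above can be sketched in user-space C. This is only a minimal sketch under assumptions: `struct unit_state`, `igfx_family_is_quirky()` and `should_skip_te_disable()` are hypothetical names invented for illustration. The device-ID family bytes and the shape of the skip condition are taken from the patch quoted later in this thread. The IS_GFX_DEVICE() and risky_device() checks from the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* High bytes of the PCI device IDs that the patch treats as quirky. */
static const uint8_t quirky_igfx_families[] = {
    0x45, 0x46, 0x4c, 0x4e, 0x8a, 0x98, 0x9a,
};

/* Mirrors the "ver = (dev->device >> 8) & 0xff" comparison from the patch. */
static bool igfx_family_is_quirky(uint16_t device_id)
{
    uint8_t ver = (uint8_t)((device_id >> 8) & 0xff);

    for (size_t i = 0; i < sizeof(quirky_igfx_families); i++) {
        if (quirky_igfx_families[i] == ver)
            return true;
    }
    return false;
}

/* Hypothetical, simplified stand-in for the per-unit state the patch consults. */
struct unit_state {
    bool gfx_dedicated; /* the unit covers only graphics devices */
    bool read_drain;    /* capability: DMA read draining */
    bool write_drain;   /* capability: DMA write draining */
};

/* True when disabling translation would be skipped for this unit. */
static bool should_skip_te_disable(bool quirk_hit, const struct unit_state *u)
{
    assert(u != NULL);
    return quirk_hit && u->gfx_dedicated && (u->read_drain || u->write_drain);
}
```

With this sketch, a device ID whose high byte is 0x8a or 0x9a matches the list, and the skip applies only when the unit is also graphics-dedicated and advertises read or write draining.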

Fixes:https://bugzilla.kernel.org/show_bug.cgi?id=208363

That one is for TGL.

I think you also want to add this one for ICL:
Fixes:https://bugzilla.kernel.org/show_bug.cgi?id=206571



Do you mean someone has tested that this patch also fixes the problem
described in 206571?



Yes, confusingly https://bugzilla.kernel.org/show_bug.cgi?id=208363#c31 actually
is the XPS 9300 ICL system and issue.

I also have a private confirmation from another person that it resolves it for
them on another ICL platform.



Okay! Thank you very much! I just posted v2 with this tag added.

Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-22 Thread Sasha Levin
Hi

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag
fixing commit: .

The bot has tested the following trees: v5.7.9, v5.4.52, v4.19.133, v4.14.188, 
v4.9.230, v4.4.230.

v5.7.9: Failed to apply! Possible dependencies:
Unable to calculate

v5.4.52: Failed to apply! Possible dependencies:
Unable to calculate

v4.19.133: Failed to apply! Possible dependencies:
e5e04d051979d ("iommu/vt-d: Check whether device requires bounce buffer")

v4.14.188: Failed to apply! Possible dependencies:
85319dcc8955f ("iommu/vt-d: Add for_each_device_domain() helper")
9ddbfb42138d8 ("iommu/vt-d: Move device_domain_info to header")
e5e04d051979d ("iommu/vt-d: Check whether device requires bounce buffer")

v4.9.230: Failed to apply! Possible dependencies:
161b28aae1651 ("iommu/vt-d: Make sure IOMMUs are off when intel_iommu=off")
61012985eb132 ("iommu/vt-d: Use lo_hi_readq() / lo_hi_writeq()")
85319dcc8955f ("iommu/vt-d: Add for_each_device_domain() helper")
9ddbfb42138d8 ("iommu/vt-d: Move device_domain_info to header")
a7fdb6e648fb1 ("iommu/vt-d: Fix crash when accessing VT-d sysfs entries")
b0119e870837d ("iommu: Introduce new 'struct iommu_device'")
b316d02a13c3a ("iommu/vt-d: Unwrap __get_valid_domain_for_dev()")
bfd20f1cc8501 ("x86, iommu/vt-d: Add an option to disable Intel IOMMU force 
on")
e5e04d051979d ("iommu/vt-d: Check whether device requires bounce buffer")

v4.4.230: Failed to apply! Possible dependencies:
0824c5920b16f ("iommu/vt-d: avoid dev iotlb logic for domains with no dev 
iotlbs")
161b28aae1651 ("iommu/vt-d: Make sure IOMMUs are off when intel_iommu=off")
314f1dc140844 ("iommu/vt-d: refactoring of deferred flush entries")
53c92d793395f ("iommu: of: enforce const-ness of struct iommu_ops")
57f98d2f61e19 ("iommu: Introduce iommu_fwspec")
592033790e827 ("iommu/vt-d: Check the return value of 
iommu_device_create()")
85319dcc8955f ("iommu/vt-d: Add for_each_device_domain() helper")
8d54d6c8b8f3e ("iommu/amd: Implement apply_dm_region call-back")
9ddbfb42138d8 ("iommu/vt-d: Move device_domain_info to header")
a7fdb6e648fb1 ("iommu/vt-d: Fix crash when accessing VT-d sysfs entries")
aa4732406e129 ("iommu/vt-d: per-cpu deferred invalidation queues")
b0119e870837d ("iommu: Introduce new 'struct iommu_device'")
b996444cf35e7 ("iommu/of: Handle iommu-map property for PCI")
bc8474549e94e ("iommu/vt-d: Fix up error handling in alloc_iommu")
bfd20f1cc8501 ("x86, iommu/vt-d: Add an option to disable Intel IOMMU force 
on")
e5e04d051979d ("iommu/vt-d: Check whether device requires bounce buffer")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

-- 
Thanks
Sasha


Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-22 Thread Miao, Jun
Hi Lu, Limonciello,

Yesterday I verified the issue with the patch, and I subscribed to the iommu
list only today. This is my test log.

[Hardware info]

 Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz   1.20GHz
 ICLSFWR1.R00.3162.A00.1904162000
 BIOS Information

 BIOS Vendor Intel

   Core Version 1.5.2.0 RP01
   Client Silicon Version   0.2.0.15
   Project Version  ICLSFWR1.R00.3162.A00.1904162000
   Build Date   20:00 04/16/2019

   Board Name   IceLake U DDR4 SODIMM PD RVP TLC

   Processor Information
   Name IceLake UL

[S3(mem) failed]

$ echo deep > /sys/power/mem_sleep

$ rtcwake -m mem -s 10

ACPI: EC: interrupt blocked
e1000e :00:1f.6: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 14317
usecs
ec PNP0C09:00: acpi_ec_suspend_noirq+0x0/0x50 returned 0 after 355319 usecs
wdat_wdt wdat_wdt: calling wdat_wdt_suspend_noirq+0x0/0x66 [wdat_wdt] @ 347,
parent: platform
ahci :00:17.0: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 383843
usecs
intel-lpss :00:1e.3: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
384062 usecs
wdat_wdt wdat_wdt: wdat_wdt_suspend_noirq+0x0/0x66 [wdat_wdt] returned 0
after 11 usecs
intel-lpss :00:1e.0: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
414466 usecs
xhci_hcd :00:14.0: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
414023 usecs
sdhci-pci :00:14.5: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
429325 usecs
pcieport :00:07.3: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
429026 usecs
pcieport :00:07.1: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
429675 usecs
pcieport :00:07.2: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
430309 usecs
pcieport :00:07.0: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
430213 usecs
thunderbolt :00:0d.2: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
432523 usecs
thunderbolt :00:0d.3: pci_pm_suspend_noirq+0x0/0x250 returned 0 after
432815 usecs
ACPI: Preparing to enter system sleep state S3
ACPI: EC: event blocked
ACPI: EC: EC stopped
PM: Saving platform NVS memory
Disabling non-boot CPUs ...
smpboot: CPU 1 is now offline
smpboot: CPU 2 is now offline
smpboot: CPU 3 is now offline
smpboot: CPU 4 is now offline
smpboot: CPU 5 is now offline
smpboot: CPU 6 is now offline
smpboot: CPU 7 is now offline
PM: Calling mce_syscore_suspend+0x0/0x20
PM: Calling nmi_suspend+0x0/0x20
PM: Calling timekeeping_suspend+0x0/0x2d0
PM: Calling save_ioapic_entries+0x0/0x90
PM: Calling i8259A_suspend+0x0/0x30
PM: Calling iommu_suspend+0x0/0x1b0
Kernel panic - not syncing: DMAR hardware is malfunctioning
CPU: 0 PID: 347 Comm: rtcwake Not tainted 5.4.0-yocto-standard #124
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4
SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3162.A00.1904162000 04/16/2019
Call Trace:
  dump_stack+0x59/0x75
  panic+0xff/0x2d4
  iommu_disable_translation+0x88/0x90
  iommu_suspend+0x12f/0x1b0
  syscore_suspend+0x6c/0x220
  suspend_devices_and_enter+0x313/0x840
  pm_suspend+0x30d/0x390
  state_store+0x82/0xf0
  kobj_attr_store+0x12/0x20
  sysfs_kf_write+0x3c/0x50
  kernfs_fop_write+0x11d/0x190
  __vfs_write+0x1b/0x40
  vfs_write+0xc6/0x1d0
  ksys_write+0x5e/0xe0
  __x64_sys_write+0x1a/0x20
  do_syscall_64+0x4d/0x150
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f97b8080113
Code: 8b 15 81 bd 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00
64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff
77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
RSP: 002b:7ffcfa6f48b8 EFLAGS: 0246 ORIG_RAX: 0001
RAX: ffda RBX: 0004 RCX: 7f97b8080113
RDX: 0004 RSI: 55e7db03b700 RDI: 0004
RBP: 55e7db03b700 R08: 55e7db03b700 R09: 0004
R10: 0004 R11: 0246 R12: 0004
R13: 55e7db039380 R14: 0004 R15: 7f97b814d700
Kernel Offset: 0x38a0 from 0x8100 (relocation range:
0x8000-0xbfff)
---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning ]---

[S3 succeeded with the patch]

sh-5.0# uname -a
Linux intel-x86-64 5.8.0-rc6-yoctodev-standard+ #128 SMP PREEMPT Tue Jul 21 
12:14:39 CST 2020 x86_64 x86_64 x86_64 GNU/Linux
sh-5.0#

sh-5.0# lsmod |grep -i thunderbolt
intel_wmi_thunderbolt    16384  0
thunderbolt             167936  0
wmi                      24576  2 intel_wmi_thunderbolt,wmi_bmof
sh-5.0#
sh-5.0#
sh-5.0#
sh-5.0# modinfo thunderbolt
filename:       /lib/modules/5.8.0-rc6-yoctodev-standard+/kernel/drivers/thunderbolt/thunderbolt.ko
license:        GPL
alias:  pci:v*d*sv*sd*bc0Csc03i40*
alias:  pci:v8086d9A1Dsv*sd*bc*sc*i*
alias:  pci:v8086d9A1Bsv*sd*bc*sc*i*
alias:  pci:v8086d8A0Dsv*sd*bc*sc*i*
alias:  pci:v8086d8A17sv*sd*bc*sc*i*
alias:  

Re: Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-22 Thread Jun Miao

On 7/22/20 11:07 AM, Lu Baolu wrote:

On 7/22/20 11:03 AM, Jun Miao wrote:

On 7/22/20 10:40 AM, Lu Baolu wrote:

Hi Jun,

On 7/22/20 10:26 AM, Miao, Jun wrote:

Kernel panic - not syncing: DMAR hardware is malfunctioning
CPU: 0 PID: 347 Comm: rtcwake Not tainted 5.4.0-yocto-standard #124
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake 
U DDR4

SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3162.A00.1904162000 04/16/2019
Call Trace:
   dump_stack+0x59/0x75
   panic+0xff/0x2d4
   iommu_disable_translation+0x88/0x90
   iommu_suspend+0x12f/0x1b0
   syscore_suspend+0x6c/0x220
   suspend_devices_and_enter+0x313/0x840
   pm_suspend+0x30d/0x390
   state_store+0x82/0xf0
   kobj_attr_store+0x12/0x20
   sysfs_kf_write+0x3c/0x50
   kernfs_fop_write+0x11d/0x190
   __vfs_write+0x1b/0x40
   vfs_write+0xc6/0x1d0
   ksys_write+0x5e/0xe0
   __x64_sys_write+0x1a/0x20
   do_syscall_64+0x4d/0x150
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f97b8080113
Code: 8b 15 81 bd 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 
0f 1f 00
64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 
00 f0 ff ff

77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
RSP: 002b:7ffcfa6f48b8 EFLAGS: 0246 ORIG_RAX: 
0001

RAX: ffda RBX: 0004 RCX: 7f97b8080113
RDX: 0004 RSI: 55e7db03b700 RDI: 0004
RBP: 55e7db03b700 R08: 55e7db03b700 R09: 0004
R10: 0004 R11: 0246 R12: 0004
R13: 55e7db039380 R14: 0004 R15: 7f97b814d700
Kernel Offset: 0x38a0 from 0x8100 (relocation range:
0x8000-0xbfff)
---[ end Kernel panic - not syncing: DMAR hardware is 
malfunctioning ]---




Do you mean that the system hangs in iommu_disable_translation() without
this fix?


Yes. The call trace shows it, and the DMARD_GCMD_RGS value I read is also
wrong without this patch.


Okay! Thanks a lot for confirming this.

Best regards,
baolu


[S3 successfully with the patch]


And, this failure disappeared after you applied this fix?

Yes. The log is too long, so I included only the head and tail. This failure
disappeared.


Best regards,
baolu


Re: Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-22 Thread Jun Miao

On 7/22/20 10:40 AM, Lu Baolu wrote:

Hi Jun,

On 7/22/20 10:26 AM, Miao, Jun wrote:

Kernel panic - not syncing: DMAR hardware is malfunctioning
CPU: 0 PID: 347 Comm: rtcwake Not tainted 5.4.0-yocto-standard #124
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U 
DDR4

SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3162.A00.1904162000 04/16/2019
Call Trace:
   dump_stack+0x59/0x75
   panic+0xff/0x2d4
   iommu_disable_translation+0x88/0x90
   iommu_suspend+0x12f/0x1b0
   syscore_suspend+0x6c/0x220
   suspend_devices_and_enter+0x313/0x840
   pm_suspend+0x30d/0x390
   state_store+0x82/0xf0
   kobj_attr_store+0x12/0x20
   sysfs_kf_write+0x3c/0x50
   kernfs_fop_write+0x11d/0x190
   __vfs_write+0x1b/0x40
   vfs_write+0xc6/0x1d0
   ksys_write+0x5e/0xe0
   __x64_sys_write+0x1a/0x20
   do_syscall_64+0x4d/0x150
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f97b8080113
Code: 8b 15 81 bd 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 
0f 1f 00
64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 
f0 ff ff

77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
RSP: 002b:7ffcfa6f48b8 EFLAGS: 0246 ORIG_RAX: 0001
RAX: ffda RBX: 0004 RCX: 7f97b8080113
RDX: 0004 RSI: 55e7db03b700 RDI: 0004
RBP: 55e7db03b700 R08: 55e7db03b700 R09: 0004
R10: 0004 R11: 0246 R12: 0004
R13: 55e7db039380 R14: 0004 R15: 7f97b814d700
Kernel Offset: 0x38a0 from 0x8100 (relocation range:
0x8000-0xbfff)
---[ end Kernel panic - not syncing: DMAR hardware is 
malfunctioning ]---




Do you mean that the system hangs in iommu_disable_translation() without
this fix?


Yes. The call trace shows it, and the DMARD_GCMD_RGS value I read is also
wrong without this patch.

[S3 successfully with the patch]


And, this failure disappeared after you applied this fix?

Best regards,
baolu


Re: Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-21 Thread Lu Baolu

On 7/22/20 11:03 AM, Jun Miao wrote:

On 7/22/20 10:40 AM, Lu Baolu wrote:

Hi Jun,

On 7/22/20 10:26 AM, Miao, Jun wrote:

Kernel panic - not syncing: DMAR hardware is malfunctioning
CPU: 0 PID: 347 Comm: rtcwake Not tainted 5.4.0-yocto-standard #124
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U 
DDR4

SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3162.A00.1904162000 04/16/2019
Call Trace:
   dump_stack+0x59/0x75
   panic+0xff/0x2d4
   iommu_disable_translation+0x88/0x90
   iommu_suspend+0x12f/0x1b0
   syscore_suspend+0x6c/0x220
   suspend_devices_and_enter+0x313/0x840
   pm_suspend+0x30d/0x390
   state_store+0x82/0xf0
   kobj_attr_store+0x12/0x20
   sysfs_kf_write+0x3c/0x50
   kernfs_fop_write+0x11d/0x190
   __vfs_write+0x1b/0x40
   vfs_write+0xc6/0x1d0
   ksys_write+0x5e/0xe0
   __x64_sys_write+0x1a/0x20
   do_syscall_64+0x4d/0x150
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f97b8080113
Code: 8b 15 81 bd 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 
0f 1f 00
64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 
f0 ff ff

77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
RSP: 002b:7ffcfa6f48b8 EFLAGS: 0246 ORIG_RAX: 0001
RAX: ffda RBX: 0004 RCX: 7f97b8080113
RDX: 0004 RSI: 55e7db03b700 RDI: 0004
RBP: 55e7db03b700 R08: 55e7db03b700 R09: 0004
R10: 0004 R11: 0246 R12: 0004
R13: 55e7db039380 R14: 0004 R15: 7f97b814d700
Kernel Offset: 0x38a0 from 0x8100 (relocation range:
0x8000-0xbfff)
---[ end Kernel panic - not syncing: DMAR hardware is 
malfunctioning ]---




Do you mean that the system hangs in iommu_disable_translation() without
this fix?


Yes. The call trace shows it, and the DMARD_GCMD_RGS value I read is also
wrong without this patch.


Okay! Thanks a lot for confirming this.

Best regards,
baolu


[S3 successfully with the patch]


And, this failure disappeared after you applied this fix?

Best regards,
baolu


Re: Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-21 Thread Lu Baolu

Hi Jun,

On 7/22/20 10:26 AM, Miao, Jun wrote:

Kernel panic - not syncing: DMAR hardware is malfunctioning
CPU: 0 PID: 347 Comm: rtcwake Not tainted 5.4.0-yocto-standard #124
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4
SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3162.A00.1904162000 04/16/2019
Call Trace:
   dump_stack+0x59/0x75
   panic+0xff/0x2d4
   iommu_disable_translation+0x88/0x90
   iommu_suspend+0x12f/0x1b0
   syscore_suspend+0x6c/0x220
   suspend_devices_and_enter+0x313/0x840
   pm_suspend+0x30d/0x390
   state_store+0x82/0xf0
   kobj_attr_store+0x12/0x20
   sysfs_kf_write+0x3c/0x50
   kernfs_fop_write+0x11d/0x190
   __vfs_write+0x1b/0x40
   vfs_write+0xc6/0x1d0
   ksys_write+0x5e/0xe0
   __x64_sys_write+0x1a/0x20
   do_syscall_64+0x4d/0x150
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f97b8080113
Code: 8b 15 81 bd 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00
64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff
77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
RSP: 002b:7ffcfa6f48b8 EFLAGS: 0246 ORIG_RAX: 0001
RAX: ffda RBX: 0004 RCX: 7f97b8080113
RDX: 0004 RSI: 55e7db03b700 RDI: 0004
RBP: 55e7db03b700 R08: 55e7db03b700 R09: 0004
R10: 0004 R11: 0246 R12: 0004
R13: 55e7db039380 R14: 0004 R15: 7f97b814d700
Kernel Offset: 0x38a0 from 0x8100 (relocation range:
0x8000-0xbfff)
---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning ]---




Do you mean that the system hangs in iommu_disable_translation() without
this fix?



[S3 successfully with the patch]


And, this failure disappeared after you applied this fix?

Best regards,
baolu


RE: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-21 Thread Limonciello, Mario



> -----Original Message-----
> From: Lu Baolu 
> Sent: Tuesday, July 21, 2020 6:07 PM
> To: Limonciello, Mario; Joerg Roedel
> Cc: baolu...@linux.intel.com; Ashok Raj; linux-ker...@vger.kernel.org;
> sta...@vger.kernel.org; Koba Ko; iommu@lists.linux-foundation.org
> Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated
> iommu
> 
> 
> 
> Hi Limonciello,
> 
> On 7/21/20 10:44 PM, Limonciello, Mario wrote:
> >> -----Original Message-----
> >> From: iommu  On Behalf Of Lu
> >> Baolu
> >> Sent: Monday, July 20, 2020 7:17 PM
> >> To: Joerg Roedel
> >> Cc: Ashok Raj;linux-ker...@vger.kernel.org;sta...@vger.kernel.org; Koba
> >> Ko;iommu@lists.linux-foundation.org
> >> Subject: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated
> >> iommu
> >>
> >> The VT-d spec requires (10.4.4 Global Command Register, TE field) that:
> >>
> >> Hardware implementations supporting DMA draining must drain any in-flight
> >> DMA read/write requests queued within the Root-Complex before completing
> >> the translation enable command and reflecting the status of the command
> >> through the TES field in the Global Status register.
> >>
> >> Unfortunately, some integrated graphics devices fail to do so after some
> >> kind of power state transition. As a result, the system might get stuck in
> >> iommu_disable_translation(), waiting for the completion of the TE transition.
> >>
> >> This provides a quirk list for those devices and skips TE disabling if
> >> the quirk hits.
> >>
> >> Fixes:https://bugzilla.kernel.org/show_bug.cgi?id=208363
> > That one is for TGL.
> >
> > I think you also want to add this one for ICL:
> > Fixes:https://bugzilla.kernel.org/show_bug.cgi?id=206571
> >
> 
> Do you mean someone has tested that this patch also fixes the problem
> described in 206571?
> 

Yes, confusingly https://bugzilla.kernel.org/show_bug.cgi?id=208363#c31 actually
is the XPS 9300 ICL system and issue.

I also have a private confirmation from another person that it resolves it for
them on another ICL platform.

Christian, maybe you can add a tested by clause for the ICL testing.



Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-21 Thread Lu Baolu

Hi Limonciello,

On 7/21/20 10:44 PM, Limonciello, Mario wrote:

-----Original Message-----
From: iommu  On Behalf Of Lu
Baolu
Sent: Monday, July 20, 2020 7:17 PM
To: Joerg Roedel
Cc: Ashok Raj;linux-ker...@vger.kernel.org;sta...@vger.kernel.org; Koba
Ko;iommu@lists.linux-foundation.org
Subject: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated
iommu

The VT-d spec requires (10.4.4 Global Command Register, TE field) that:

Hardware implementations supporting DMA draining must drain any in-flight
DMA read/write requests queued within the Root-Complex before completing
the translation enable command and reflecting the status of the command
through the TES field in the Global Status register.

Unfortunately, some integrated graphics devices fail to do so after some
kind of power state transition. As a result, the system might get stuck in
iommu_disable_translation(), waiting for the completion of the TE transition.

This provides a quirk list for those devices and skips TE disabling if
the quirk hits.

Fixes:https://bugzilla.kernel.org/show_bug.cgi?id=208363

That one is for TGL.

I think you also want to add this one for ICL:
Fixes:https://bugzilla.kernel.org/show_bug.cgi?id=206571



Do you mean someone has tested that this patch also fixes the problem
described in 206571?

Best regards,
baolu


RE: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-21 Thread Limonciello, Mario
> -----Original Message-----
> From: iommu  On Behalf Of Lu
> Baolu
> Sent: Monday, July 20, 2020 7:17 PM
> To: Joerg Roedel
> Cc: Ashok Raj; linux-ker...@vger.kernel.org; sta...@vger.kernel.org; Koba
> Ko; iommu@lists.linux-foundation.org
> Subject: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated
> iommu
> 
> The VT-d spec requires (10.4.4 Global Command Register, TE field) that:
> 
> Hardware implementations supporting DMA draining must drain any in-flight
> DMA read/write requests queued within the Root-Complex before completing
> the translation enable command and reflecting the status of the command
> through the TES field in the Global Status register.
> 
> Unfortunately, some integrated graphics devices fail to do so after some
> kind of power state transition. As a result, the system might get stuck in
> iommu_disable_translation(), waiting for the completion of the TE transition.
> 
> This provides a quirk list for those devices and skips TE disabling if
> the quirk hits.
> 
> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208363
That one is for TGL.

I think you also want to add this one for ICL:
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206571

> Tested-by: Koba Ko 
> Cc: Ashok Raj 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/dmar.c  |  1 +
>  drivers/iommu/intel/iommu.c | 27 +++
>  include/linux/dmar.h|  1 +
>  include/linux/intel-iommu.h |  2 ++
>  4 files changed, 31 insertions(+)
> 
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 683b812c5c47..16f47041f1bf 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -1102,6 +1102,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
>   }
> 
>   drhd->iommu = iommu;
> + iommu->drhd = drhd;
> 
>   return 0;
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 98390a6d8113..11418b14cc3f 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -356,6 +356,7 @@ static int intel_iommu_strict;
>  static int intel_iommu_superpage = 1;
>  static int iommu_identity_mapping;
>  static int intel_no_bounce;
> +static int iommu_skip_te_disable;
> 
>  #define IDENTMAP_GFX 2
>  #define IDENTMAP_AZALIA  4
> @@ -1633,6 +1634,10 @@ static void iommu_disable_translation(struct
> intel_iommu *iommu)
>   u32 sts;
>   unsigned long flag;
> 
> + if (iommu_skip_te_disable && iommu->drhd->gfx_dedicated &&
> + (cap_read_drain(iommu->cap) || cap_write_drain(iommu->cap)))
> + return;
> +
>   raw_spin_lock_irqsave(&iommu->register_lock, flag);
>   iommu->gcmd &= ~DMA_GCMD_TE;
>   writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
> @@ -4043,6 +4048,7 @@ static void __init init_no_remapping_devices(void)
> 
>   /* This IOMMU has *only* gfx devices. Either bypass it or
>  set the gfx_mapped flag, as appropriate */
> + drhd->gfx_dedicated = 1;
>   if (!dmar_map_gfx) {
>   drhd->ignored = 1;
>   for_each_active_dev_scope(drhd->devices,
> @@ -6160,6 +6166,27 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL,
> 0x0044, quirk_calpella_no_shadow_g
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0062,
> quirk_calpella_no_shadow_gtt);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x006a,
> quirk_calpella_no_shadow_gtt);
> 
> +static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
> +{
> + unsigned short ver;
> +
> + if (!IS_GFX_DEVICE(dev))
> + return;
> +
> + ver = (dev->device >> 8) & 0xff;
> + if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
> + ver != 0x4e && ver != 0x8a && ver != 0x98 &&
> + ver != 0x9a)
> + return;
> +
> + if (risky_device(dev))
> + return;
> +
> + pci_info(dev, "Skip IOMMU disabling for graphics\n");
> + iommu_skip_te_disable = 1;
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
> quirk_igfx_skip_te_disable);
> +
>  /* On Tylersburg chipsets, some BIOSes have been known to enable the
> ISOCH DMAR unit for the Azalia sound device, but not give it any
> TLB entries, which causes it to deadlock. Check for that.  We do
> diff --git a/include/linux/dmar.h b/include/linux/dmar.h
> index d7bf029df737..65565820328a 100644
> --- a/include/linux/dmar.h
> +++ b/include/linux/dmar.h
> @@ -48,6 +48,7 @@ struct dmar_drhd_unit {
>   u16 segment;/* PCI domain   */
>   u8  ignored:1;  /* ignore drhd  */
>   u8  include_all:1;
> + u8  gfx_dedicated:1;/* graphic dedicated*/
>   struct intel_iommu *iommu;
>  };
> 
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index bf6009a344f5..329629e1e9de 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@