Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"

2021-04-07 Thread Joerg Roedel
Hi Paul,

On Thu, Mar 18, 2021 at 10:20:16AM +0100, Paul Menzel wrote:
> Jörg, I know you are probably busy, but the patch was applied to the stable
> series (v5.11.7). There are still too many open questions regarding the
> patch, and Suravee has not yet addressed the comments. It’d be great if you
> could revert it.

We are currently discussing the next steps here. Maybe the retry logic
can be removed entirely.

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"

2021-03-18 Thread Paul Menzel

Dear Jörg, dear Suravee,


On 03.03.21 at 15:10, Alexander Monakov wrote:
> On Wed, 3 Mar 2021, Suravee Suthikulpanit wrote:
>
> > > Additionally, alternative proposed solutions [1] were not considered or
> > > discussed.
> > >
> > > [1]:https://lore.kernel.org/linux-iommu/alpine.lnx.2.20.13.2006030935570.3...@monopod.intra.ispras.ru/
> >
> > This check has been introduced early on to detect a HW issue for
> > certain platforms in the past, where the performance counters are not
> > accessible and would result in silent failure when trying to use the
> > counters. This is considered legacy code, and can be removed if we
> > decide to no longer provide a sanity check for such cases.
>
> Which platforms? There is no such information in the code or the commit
> messages that introduced this.
>
> According to AMD's documentation, presence of performance counters is
> indicated by "PCSup" bit in the "EFR" register. I don't think the driver
> should second-guess that. If there were platforms where the CPU or the
> firmware lied to the OS (EFR[PCSup] was 1, but counters were not present),
> I think that should have been handled in a more explicit manner, e.g.
> via matching broken CPUs by cpuid.


Suravee, could you please answer the questions?

Jörg, I know you are probably busy, but the patch was applied to the
stable series (v5.11.7). There are still too many open questions
regarding the patch, and Suravee has not yet addressed the comments.
It’d be great if you could revert it.



Kind regards,

Paul


Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"

2021-03-03 Thread Alexander Monakov
On Wed, 3 Mar 2021, Suravee Suthikulpanit wrote:

> > Additionally, alternative proposed solutions [1] were not considered or
> > discussed.
> > 
> > [1]:https://lore.kernel.org/linux-iommu/alpine.lnx.2.20.13.2006030935570.3...@monopod.intra.ispras.ru/
> 
> This check has been introduced early on to detect a HW issue for
> certain platforms in the past, where the performance counters are not
> accessible and would result in silent failure when trying to use the
> counters. This is considered legacy code, and can be removed if we
> decide to no longer provide a sanity check for such cases.

Which platforms? There is no such information in the code or the commit
messages that introduced this.

According to AMD's documentation, presence of performance counters is
indicated by "PCSup" bit in the "EFR" register. I don't think the driver
should second-guess that. If there were platforms where the CPU or the
firmware lied to the OS (EFR[PCSup] was 1, but counters were not present),
I think that should have been handled in a more explicit manner, e.g.
via matching broken CPUs by cpuid.
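
To illustrate the approach argued for above, here is a minimal sketch that trusts the capability bit and excludes known-bad parts through an explicit quirk table. It is not the driver's code. The bit position (PCSup as bit 9 of EFR, mirroring the kernel's FEATURE_PC definition) is stated as an assumption, and the names `EFR_PCSUP`, `struct cpu_id`, `efr_reports_perf_counters()` and `perf_counters_usable()` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: PCSup is bit 9 of the Extended Feature Register (EFR). */
#define EFR_PCSUP (UINT64_C(1) << 9)

/* Hypothetical identifier for a CPU, as it could be derived from cpuid. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
};

/* Trust the hardware-reported capability bit. */
bool efr_reports_perf_counters(uint64_t efr)
{
	return (efr & EFR_PCSUP) != 0;
}

/*
 * Counters are usable when EFR[PCSup] is set and the CPU is not listed
 * in an explicit table of parts known to misreport the bit.
 */
bool perf_counters_usable(uint64_t efr, struct cpu_id cpu,
			  const struct cpu_id *broken, size_t n_broken)
{
	assert(broken != NULL || n_broken == 0);

	if (!efr_reports_perf_counters(efr))
		return false;

	for (size_t i = 0; i < n_broken; i++) {
		if (broken[i].family == cpu.family &&
		    broken[i].model == cpu.model)
			return false;
	}
	return true;
}
```

With such a table, the set of affected platforms would be documented in the code itself, which is the information asked for above.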

Alexander


Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"

2021-03-03 Thread Suravee Suthikulpanit

Paul,

On 3/3/21 7:11 PM, Paul Menzel wrote:
> This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.
>
> The commit adds up to 100 ms to the boot process, which is not mentioned
> in the commit message, and is making up more than 20 % on current
> systems, where the Linux kernel takes 500 ms.


The 100 msec (5 * 20ms) is only for the worst-case scenario. For most cases,
the delay is not applicable. In addition, this patch has been shown to fix the
issue for some users in the field.
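
The worst-case figure can be checked with a small userspace sketch of a bounded retry loop. This is an illustration only, not the kernel code: `probe_with_retries()`, `succeed_after()` and the callback type are hypothetical, and the sleep is replaced by adding to a counter so the total wait can be inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe callback: returns true once the counter responds. */
typedef bool (*probe_fn)(void *ctx);

/*
 * Probe up to `retries` times, waiting `delay_ms` after each failed attempt.
 * Returns the total time waited in milliseconds and reports success in *ok.
 * The wait is only accounted for here; a driver would sleep instead.
 */
unsigned int probe_with_retries(probe_fn probe, void *ctx,
				unsigned int retries, unsigned int delay_ms,
				bool *ok)
{
	unsigned int waited_ms = 0;

	assert(probe != NULL && ok != NULL);
	*ok = false;

	for (unsigned int i = 0; i < retries; i++) {
		if (probe(ctx)) {
			*ok = true;
			break;
		}
		waited_ms += delay_ms;
	}
	return waited_ms;
}

/* Test probe: fails until the int pointed to by ctx has counted down to 0. */
bool succeed_after(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

With 5 retries of 20 ms each, a probe that never succeeds waits 100 ms in total, a probe that succeeds on the first attempt waits 0 ms, and ten retries against a counter that never responds give the 200 ms mentioned in the report.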



>  [0.00] Linux version 5.11.0-10281-g19b4f3edd5c9 (root@a2ab663d937e) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1) #138 SMP Wed Feb 24 11:28:17 UTC 2021
>  […]
>  [0.106422] smpboot: CPU0: AMD Ryzen 3 2200G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
>  […]
>  [0.291257] pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
>  […]
>
> Also, it does not fix the problem on an MSI B350M MORTAR with AMD Ryzen
> 3 2200G (even with ten retries, resulting in 200 ms time-out).


We are still investigating the root cause of the long delay for the IOMMU
performance counter unit to disable power-gating and allow access to
the performance counters. If your concern is the number of retries,
we can try to reduce it.



>  [0.401152] pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
>
> Additionally, alternative proposed solutions [1] were not considered or
> discussed.
>
> [1]:https://lore.kernel.org/linux-iommu/alpine.lnx.2.20.13.2006030935570.3...@monopod.intra.ispras.ru/


This check has been introduced early on to detect a HW issue for certain
platforms in the past, where the performance counters are not accessible
and would result in silent failure when trying to use the counters. This is
considered legacy code, and can be removed if we decide to no longer provide
a sanity check for such cases.
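
For readers unfamiliar with the check, the write/read-back probe can be sketched in userspace against a simulated register. This is a sketch under assumptions, not the driver code: `struct fake_counter`, `counter_access()` and `counter_is_writable()` are hypothetical, and an inaccessible register is modeled as silently ignoring writes and reading back 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated counter register; `accessible` models the HW issue. */
struct fake_counter {
	uint64_t value;
	bool accessible;
};

/*
 * Stand-in for a register accessor: always returns 0 (no error), but an
 * inaccessible register drops writes and reads as 0, i.e. fails silently.
 */
int counter_access(struct fake_counter *c, uint64_t *value, bool is_write)
{
	assert(c != NULL && value != NULL);

	if (is_write) {
		if (c->accessible)
			c->value = *value;
	} else {
		*value = c->accessible ? c->value : 0;
	}
	return 0;
}

/* Save the register, write a test pattern, read it back, then restore. */
bool counter_is_writable(struct fake_counter *c)
{
	uint64_t val = 0xabcd, val2 = 0, save_reg = 0;

	if (counter_access(c, &save_reg, false))
		return false;

	if (counter_access(c, &val, true) ||
	    counter_access(c, &val2, false) ||
	    val != val2)
		return false;

	if (counter_access(c, &save_reg, true))
		return false;

	return true;
}
```

The accessor reports no error in either case, so only the comparison of the value read back against the test pattern reveals the silent failure.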

Regards,
Suravee

[PATCH] Revert "iommu/amd: Fix performance counter initialization"

2021-03-03 Thread Paul Menzel
This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.

The commit adds up to 100 ms to the boot process, which is not mentioned
in the commit message, and is making up more than 20 % on current
systems, where the Linux kernel takes 500 ms.

[0.00] Linux version 5.11.0-10281-g19b4f3edd5c9 (root@a2ab663d937e) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1) #138 SMP Wed Feb 24 11:28:17 UTC 2021
[…]
[0.106422] smpboot: CPU0: AMD Ryzen 3 2200G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
[…]
[0.291257] pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
[…]

Also, it does not fix the problem on an MSI B350M MORTAR with AMD Ryzen
3 2200G (even with ten retries, resulting in 200 ms time-out).

[0.401152] pci :00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.

Additionally, alternative proposed solutions [1] were not considered or
discussed.

[1]: https://lore.kernel.org/linux-iommu/alpine.lnx.2.20.13.2006030935570.3...@monopod.intra.ispras.ru/

Cc: Suravee Suthikulpanit 
Cc: Tj (Elloe Linux) 
Cc: Shuah Khan 
Cc: Alexander Monakov 
Cc: David Coe 
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Paul Menzel 
---
 drivers/iommu/amd/init.c | 45 ++--
 1 file changed, 11 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9126efcbaf2c..af195f11d254 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -12,7 +12,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -257,8 +256,6 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-   u8 fxn, u64 *value, bool is_write);
 
 static bool amd_iommu_pre_enabled = true;
 
@@ -1717,11 +1714,13 @@ static int __init init_iommu_all(struct acpi_table_header *table)
return 0;
 }
 
-static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
+static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
+   u8 fxn, u64 *value, bool is_write);
+
+static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
-   int retry;
struct pci_dev *pdev = iommu->dev;
-   u64 val = 0xabcd, val2 = 0, save_reg, save_src;
+   u64 val = 0xabcd, val2 = 0, save_reg = 0;
 
if (!iommu_feature(iommu, FEATURE_PC))
return;
@@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
amd_iommu_pc_present = true;
 
/* save the value to restore, if writable */
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
-   iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
-   goto pc_false;
-
-   /*
-* Disable power gating by programing the performance counter
-* source to 20 (i.e. counts the reads and writes from/to IOMMU
-* Reserved Register [MMIO Offset 1FF8h] that are ignored.),
-* which never get incremented during this init phase.
-* (Note: The event is also deprecated.)
-*/
-   val = 20;
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
+   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
goto pc_false;
 
/* Check if the performance counters can be written to */
-   val = 0xabcd;
-   for (retry = 5; retry; retry--) {
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
-   iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
-   val2)
-   break;
-
-   /* Wait about 20 msec for power gating to disable and retry. */
-   msleep(20);
-   }
-
-   /* restore */
-   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
-   iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
+   if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
+   (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
+   (val != val2))
goto pc_false;
 
-   if (val != val2)
+   /* restore */
+   if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
goto pc_false;
 
pci_info(pdev, "IOMMU performance counters supported\n");
-- 
2.30.1
