RE: [PATCH] drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver

2021-09-09 Thread Chen, Guchun
[Public]

Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Friday, September 10, 2021 11:18 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Lazar, Lijo 
; Quan, Evan ; Pelloux-prayer, 
Pierre-eric 
Subject: [PATCH] drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound 
driver

The current RUNPM mechanism relies on the PMFW to master the timing for BACO 
entry/exit. That needs cooperation from the sound driver in the form of a dstate 
change notification for function 1 (audio). Otherwise (when the sound driver is 
missing), BACO cannot be kicked in correctly and a hang is observed on RUNPM exit.

By switching back to the legacy message approach when the sound driver is missing, 
we are able to fix the runpm hang observed for the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded

Change-Id: I0e44fef11349b5e45e6102913eb46c8c7d279c65
Signed-off-by: Evan Quan 
Reported-by: Pierre-Eric Pelloux-Prayer 
---
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 24 +--
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  4 ++--
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 21 
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  2 ++
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 7bc90f841a11..bcafccf7f07a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2272,7 +2272,27 @@ static int navi10_baco_enter(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
 
-   if (adev->in_runpm)
+   /*
+    * This addresses the case below:
+    *   amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded
+    *
+    * For NAVI10 and later ASICs, we rely on PMFW to handle the runpm. To
+    * make that possible, PMFW needs to acknowledge the dstate transition
+    * process for both the gfx (function 0) and audio (function 1) functions
+    * of the ASIC.
+    *
+    * The PCI device's initial runpm status is RUNPM_SUSPENDED, and so is
+    * that of the device representing the audio function of the ASIC. That
+    * means even if the sound driver (snd_hda_intel) has not been loaded yet,
+    * it is still possible for a runpm suspend to be kicked off on the ASIC.
+    * However, without the dstate transition notification from the audio
+    * function, PMFW cannot handle the BACO entry/exit correctly, and that
+    * will cause a driver hang on runpm resume.
+    *
+    * To address this, we revert to the legacy message approach (the driver
+    * masters the timing for BACO entry/exit) when the sound driver is
+    * missing.
+    */
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev))
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_BACO);
else
return smu_v11_0_baco_enter(smu);
@@ -2282,7 +2302,7 @@ static int navi10_baco_exit(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
 
-   if (adev->in_runpm) {
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev)) {
/* Wait for PMFW handling for the Dstate change */
msleep(10);
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_ULPS); 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 43c7580a4ea6..f9b730c5ba9e 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -2361,7 +2361,7 @@ static int sienna_cichlid_baco_enter(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
 
-   if (adev->in_runpm)
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev))
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_BACO);
else
return smu_v11_0_baco_enter(smu);
@@ -2371,7 +2371,7 @@ static int sienna_cichlid_baco_exit(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
 
-   if (adev->in_runpm) {
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev)) {
/* Wait for PMFW handling for the Dstate change */
msleep(10);
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_ULPS); 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 69da9a7b665f..d61403e917df 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -1055,3 +1055,24 @@ int smu_cmn_set_mp1_state(struct smu_context *smu,
 
return ret;
 }
+
+bool smu_cmn_is_audio_func_enabled(struct amdgpu_device *adev)
+{
+   struct pci_dev *p = NULL;
+   bool snd_driver_loaded;
+
+   /*
+* If the ASIC comes with no audio function, we always 

RE: [PATCH v3 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent

2021-09-09 Thread Chen, Guchun
[Public]

Move PCIe atomic detection from kgf2kfd_probe into kgf2kfd_device_init because 
the MEC firmware is not loaded yet at the probe stage

A spelling typo: s/kgf2kfd_device_init/kgd2kfd_device_init/

With the above fixed, the patch is: Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Felix 
Kuehling
Sent: Friday, September 10, 2021 1:10 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Shaoyun 
Subject: Re: [PATCH v3 1/1] drm/amdkfd: make needs_pcie_atomics FW-version 
dependent

Am 2021-09-08 um 6:48 p.m. schrieb Felix Kuehling:
> On some GPUs the PCIe atomic requirement for KFD depends on the MEC 
> firmware version. Add a firmware version check for this. The minimum 
> firmware version that works without atomics can be updated in the 
> device_info structure for each GPU type.
>
> Move PCIe atomic detection from kgf2kfd_probe into kgf2kfd_device_init 
> because the MEC firmware is not loaded yet at the probe stage.
>
> Signed-off-by: Felix Kuehling 
I tested this change on a Sienna Cichlid on a system without PCIe atomics, both 
with the old and the new firmware. This version of the change should be good to 
go if I can get an R-b.

Thanks,
  Felix


> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 44 -
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h   |  1 +
>  2 files changed, 29 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 16a57b70cc1a..30fde852af19 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -468,6 +468,7 @@ static const struct kfd_device_info navi10_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 145,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -487,6 +488,7 @@ static const struct kfd_device_info navi12_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 145,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -506,6 +508,7 @@ static const struct kfd_device_info navi14_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 145,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -525,6 +528,7 @@ static const struct kfd_device_info 
> sienna_cichlid_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 4,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -544,6 +548,7 @@ static const struct kfd_device_info 
> navy_flounder_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -562,7 +567,8 @@ static const struct kfd_device_info vangogh_device_info = 
> {
>   .mqd_size_aligned = MQD_SIZE_ALIGNED,
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
> - .needs_pci_atomics = false,
> + .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 1,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 2,
> @@ -582,6 +588,7 @@ static const struct kfd_device_info 
> dimgrey_cavefish_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -601,6 +608,7 @@ static const struct kfd_device_info 
> beige_goby_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 1,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -619,7 +627,8 @@ static const struct kfd_device_info 
> yellow_carp_device_info = {
>   .mqd_size_aligned = MQD_SIZE_ALIGNED,
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
> - .needs_pci_atomics = false,
> + .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 1,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 2,
> @@ -708,20 +717,6 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>   if (!kfd)
>   return NULL;
>  
> - /* Allow BIF to recode atomics to PCIe 3.0 

Re: [PATCH v3 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent

2021-09-09 Thread Felix Kuehling
Am 2021-09-08 um 6:48 p.m. schrieb Felix Kuehling:
> On some GPUs the PCIe atomic requirement for KFD depends on the MEC
> firmware version. Add a firmware version check for this. The minimum
> firmware version that works without atomics can be updated in the
> device_info structure for each GPU type.
>
> Move PCIe atomic detection from kgf2kfd_probe into kgf2kfd_device_init
> because the MEC firmware is not loaded yet at the probe stage.
>
> Signed-off-by: Felix Kuehling 
I tested this change on a Sienna Cichlid on a system without PCIe
atomics, both with the old and the new firmware. This version of the
change should be good to go if I can get an R-b.

Thanks,
  Felix


> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 44 -
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h   |  1 +
>  2 files changed, 29 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 16a57b70cc1a..30fde852af19 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -468,6 +468,7 @@ static const struct kfd_device_info navi10_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 145,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -487,6 +488,7 @@ static const struct kfd_device_info navi12_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 145,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -506,6 +508,7 @@ static const struct kfd_device_info navi14_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 145,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -525,6 +528,7 @@ static const struct kfd_device_info 
> sienna_cichlid_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 4,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -544,6 +548,7 @@ static const struct kfd_device_info 
> navy_flounder_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -562,7 +567,8 @@ static const struct kfd_device_info vangogh_device_info = 
> {
>   .mqd_size_aligned = MQD_SIZE_ALIGNED,
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
> - .needs_pci_atomics = false,
> + .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 1,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 2,
> @@ -582,6 +588,7 @@ static const struct kfd_device_info 
> dimgrey_cavefish_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 2,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -601,6 +608,7 @@ static const struct kfd_device_info 
> beige_goby_device_info = {
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
>   .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 1,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 8,
> @@ -619,7 +627,8 @@ static const struct kfd_device_info 
> yellow_carp_device_info = {
>   .mqd_size_aligned = MQD_SIZE_ALIGNED,
>   .needs_iommu_device = false,
>   .supports_cwsr = true,
> - .needs_pci_atomics = false,
> + .needs_pci_atomics = true,
> + .no_atomic_fw_version = 92,
>   .num_sdma_engines = 1,
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 2,
> @@ -708,20 +717,6 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>   if (!kfd)
>   return NULL;
>  
> - /* Allow BIF to recode atomics to PCIe 3.0 AtomicOps.
> -  * 32 and 64-bit requests are possible and must be
> -  * supported.
> -  */
> - kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd);
> - if (device_info->needs_pci_atomics &&
> - !kfd->pci_atomic_requested) {
> - dev_info(kfd_device,
> -  "skipped device %x:%x, PCI rejects atomics\n",
> -  pdev->vendor, pdev->device);
> - kfree(kfd);
> - return NULL;
> - }
> -
>   kfd->kgd = kgd;
>   

Re: [PATCH] drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver

2021-09-09 Thread Lazar, Lijo

On 9/10/2021 8:47 AM, Evan Quan wrote:

The current RUNPM mechanism relies on the PMFW to master the timing for
BACO entry/exit. That needs cooperation from the sound driver in the form
of a dstate change notification for function 1 (audio). Otherwise (when
the sound driver is missing), BACO cannot be kicked in correctly and a
hang is observed on RUNPM exit.

By switching back to the legacy message approach when the sound driver is
missing, we are able to fix the runpm hang observed for the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded

Change-Id: I0e44fef11349b5e45e6102913eb46c8c7d279c65
Signed-off-by: Evan Quan 
Reported-by: Pierre-Eric Pelloux-Prayer 


Reviewed-by: Lijo Lazar 

Thanks,
Lijo


---
  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 24 +--
  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  4 ++--
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 21 
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  2 ++
  4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 7bc90f841a11..bcafccf7f07a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2272,7 +2272,27 @@ static int navi10_baco_enter(struct smu_context *smu)
  {
struct amdgpu_device *adev = smu->adev;
  
-	if (adev->in_runpm)
+   /*
+    * This addresses the case below:
+    *   amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded
+    *
+    * For NAVI10 and later ASICs, we rely on PMFW to handle the runpm. To
+    * make that possible, PMFW needs to acknowledge the dstate transition
+    * process for both the gfx (function 0) and audio (function 1) functions
+    * of the ASIC.
+    *
+    * The PCI device's initial runpm status is RUNPM_SUSPENDED, and so is
+    * that of the device representing the audio function of the ASIC. That
+    * means even if the sound driver (snd_hda_intel) has not been loaded yet,
+    * it is still possible for a runpm suspend to be kicked off on the ASIC.
+    * However, without the dstate transition notification from the audio
+    * function, PMFW cannot handle the BACO entry/exit correctly, and that
+    * will cause a driver hang on runpm resume.
+    *
+    * To address this, we revert to the legacy message approach (the driver
+    * masters the timing for BACO entry/exit) when the sound driver is
+    * missing.
+    */
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev))
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_BACO);
else
return smu_v11_0_baco_enter(smu);
@@ -2282,7 +2302,7 @@ static int navi10_baco_exit(struct smu_context *smu)
  {
struct amdgpu_device *adev = smu->adev;
  
-	if (adev->in_runpm) {
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev)) {
/* Wait for PMFW handling for the Dstate change */
msleep(10);
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_ULPS);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 43c7580a4ea6..f9b730c5ba9e 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -2361,7 +2361,7 @@ static int sienna_cichlid_baco_enter(struct smu_context 
*smu)
  {
struct amdgpu_device *adev = smu->adev;
  
-	if (adev->in_runpm)
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev))
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_BACO);
else
return smu_v11_0_baco_enter(smu);
@@ -2371,7 +2371,7 @@ static int sienna_cichlid_baco_exit(struct smu_context 
*smu)
  {
struct amdgpu_device *adev = smu->adev;
  
-	if (adev->in_runpm) {
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev)) {
/* Wait for PMFW handling for the Dstate change */
msleep(10);
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_ULPS);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 69da9a7b665f..d61403e917df 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -1055,3 +1055,24 @@ int smu_cmn_set_mp1_state(struct smu_context *smu,
  
  	return ret;

  }
+
+bool smu_cmn_is_audio_func_enabled(struct amdgpu_device *adev)
+{
+   struct pci_dev *p = NULL;
+   bool snd_driver_loaded;
+
+   /*
+* If the ASIC comes with no audio function, we always assume
+* it is "enabled".
+*/
+   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number, 1);
+   if (!p)
+   return true;
+
+   snd_driver_loaded = pci_is_enabled(p) ? true : false;
+
+   

Re: [PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when array bounds

2021-09-09 Thread Pan, Xinhui
[AMD Official Use Only]

Looks good to me.
But maybe BUILD_BUG_ON would work too, and would be a more reasonable way to detect such wrong usage.

From: Chen, Guchun 
Sent: Friday, September 10, 2021 12:30:14 PM
To: amd-gfx@lists.freedesktop.org ; 
dri-de...@lists.freedesktop.org ; Koenig, 
Christian ; Pan, Xinhui ; 
Deucher, Alexander 
Cc: Chen, Guchun ; Shi, Leslie 
Subject: [PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when array 
bounds

Vendors may define their own memory types on top of TTM_PL_PRIV, but they
call ttm_set_driver_manager directly without checking the mem_type value
when setting up a memory manager. Add such a check to catch the case where
the access goes out of the array bounds.

Signed-off-by: Leslie Shi 
Signed-off-by: Guchun Chen 
---
 include/drm/ttm/ttm_device.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 7a0f561c57ee..24ad76ca8022 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -308,6 +308,7 @@ ttm_manager_type(struct ttm_device *bdev, int mem_type)
 static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
   struct ttm_resource_manager *manager)
 {
+   BUG_ON(type >= TTM_NUM_MEM_TYPES);
 bdev->man_drv[type] = manager;
 }

--
2.17.1



[PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when array bounds

2021-09-09 Thread Guchun Chen
Vendors may define their own memory types on top of TTM_PL_PRIV, but they
call ttm_set_driver_manager directly without checking the mem_type value
when setting up a memory manager. Add such a check to catch the case where
the access goes out of the array bounds.

Signed-off-by: Leslie Shi 
Signed-off-by: Guchun Chen 
---
 include/drm/ttm/ttm_device.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 7a0f561c57ee..24ad76ca8022 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -308,6 +308,7 @@ ttm_manager_type(struct ttm_device *bdev, int mem_type)
 static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
  struct ttm_resource_manager *manager)
 {
+   BUG_ON(type >= TTM_NUM_MEM_TYPES);
bdev->man_drv[type] = manager;
 }
 
-- 
2.17.1



[PATCH] drm/amdgpu: use generic fb helpers instead of setting up AMD own's.

2021-09-09 Thread Evan Quan
With the shadow buffer support from the generic framebuffer emulation, it
is now possible to have runpm kicked in when there is no console update.

Change-Id: I285472c9100ee6f649d3f3f3548f402b9cd34eaf
Signed-off-by: Evan Quan 
Acked-by: Christian König 
--
v1->v2:
  - rename amdgpu_align_pitch as amdgpu_gem_align_pitch to align with
other APIs from the same file (Alex)
---
 drivers/gpu/drm/amd/amdgpu/Makefile |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  11 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  13 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c  | 388 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  30 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h|  20 -
 7 files changed, 50 insertions(+), 426 deletions(-)
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 8d0748184a14..73a2151ee43f 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -45,7 +45,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_atombios.o atombios_crtc.o amdgpu_connectors.o \
atom.o amdgpu_fence.o amdgpu_ttm.o amdgpu_object.o amdgpu_gart.o \
amdgpu_encoders.o amdgpu_display.o amdgpu_i2c.o \
-   amdgpu_fb.o amdgpu_gem.o amdgpu_ring.o \
+   amdgpu_gem.o amdgpu_ring.o \
amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \
atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \
atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 682d459e992a..bcc308b7f826 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3695,8 +3695,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* Get a log2 for easy divisions. */
adev->mm_stats.log2_max_MBps = ilog2(max(1u, max_MBps));
 
-   amdgpu_fbdev_init(adev);
-
r = amdgpu_pm_sysfs_init(adev);
if (r) {
adev->pm_sysfs_en = false;
@@ -3854,8 +3852,6 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
amdgpu_ucode_sysfs_fini(adev);
sysfs_remove_files(>dev->kobj, amdgpu_dev_attributes);
 
-   amdgpu_fbdev_fini(adev);
-
amdgpu_irq_fini_hw(adev);
 
amdgpu_device_ip_fini_early(adev);
@@ -3931,7 +3927,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
drm_kms_helper_poll_disable(dev);
 
if (fbcon)
-   amdgpu_fbdev_set_suspend(adev, 1);
+   drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
 
cancel_delayed_work_sync(>delayed_init_work);
 
@@ -4009,7 +4005,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
fbcon)
flush_delayed_work(>delayed_init_work);
 
if (fbcon)
-   amdgpu_fbdev_set_suspend(adev, 0);
+   drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, false);
 
drm_kms_helper_poll_enable(dev);
 
@@ -4638,7 +4634,7 @@ int amdgpu_do_asic_reset(struct list_head 
*device_list_handle,
if (r)
goto out;
 
-   amdgpu_fbdev_set_suspend(tmp_adev, 0);
+   drm_fb_helper_set_suspend_unlocked(adev_to_drm(tmp_adev)->fb_helper, false);
 
/*
 * The GPU enters bad state once faulty pages
@@ -5025,7 +5021,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 */
amdgpu_unregister_gpu_instance(tmp_adev);
 
-   amdgpu_fbdev_set_suspend(tmp_adev, 1);
+   drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
 
/* disable ras on ALL IPs */
if (!need_emergency_restart &&
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 7a7316731911..58bfc7f00d76 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1572,13 +1572,10 @@ int amdgpu_display_suspend_helper(struct amdgpu_device 
*adev)
continue;
}
robj = gem_to_amdgpu_bo(fb->obj[0]);
-   /* don't unpin kernel fb objects */
-   if (!amdgpu_fbdev_robj_is_fb(adev, robj)) {
-   r = amdgpu_bo_reserve(robj, true);
-   if (r == 0) {
-   amdgpu_bo_unpin(robj);
-   amdgpu_bo_unreserve(robj);
-   }
+   r = amdgpu_bo_reserve(robj, true);
+   if (r == 0) {
+   amdgpu_bo_unpin(robj);
+   amdgpu_bo_unreserve(robj);

[PATCH] drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver

2021-09-09 Thread Evan Quan
The current RUNPM mechanism relies on the PMFW to master the timing for
BACO entry/exit. That needs cooperation from the sound driver in the form
of a dstate change notification for function 1 (audio). Otherwise (when
the sound driver is missing), BACO cannot be kicked in correctly and a
hang is observed on RUNPM exit.

By switching back to the legacy message approach when the sound driver is
missing, we are able to fix the runpm hang observed for the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded

Change-Id: I0e44fef11349b5e45e6102913eb46c8c7d279c65
Signed-off-by: Evan Quan 
Reported-by: Pierre-Eric Pelloux-Prayer 
---
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 24 +--
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  4 ++--
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 21 
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  2 ++
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 7bc90f841a11..bcafccf7f07a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2272,7 +2272,27 @@ static int navi10_baco_enter(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
 
-   if (adev->in_runpm)
+   /*
+    * This addresses the case below:
+    *   amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded
+    *
+    * For NAVI10 and later ASICs, we rely on PMFW to handle the runpm. To
+    * make that possible, PMFW needs to acknowledge the dstate transition
+    * process for both the gfx (function 0) and audio (function 1) functions
+    * of the ASIC.
+    *
+    * The PCI device's initial runpm status is RUNPM_SUSPENDED, and so is
+    * that of the device representing the audio function of the ASIC. That
+    * means even if the sound driver (snd_hda_intel) has not been loaded yet,
+    * it is still possible for a runpm suspend to be kicked off on the ASIC.
+    * However, without the dstate transition notification from the audio
+    * function, PMFW cannot handle the BACO entry/exit correctly, and that
+    * will cause a driver hang on runpm resume.
+    *
+    * To address this, we revert to the legacy message approach (the driver
+    * masters the timing for BACO entry/exit) when the sound driver is
+    * missing.
+    */
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev))
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_BACO);
else
return smu_v11_0_baco_enter(smu);
@@ -2282,7 +2302,7 @@ static int navi10_baco_exit(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
 
-   if (adev->in_runpm) {
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev)) {
/* Wait for PMFW handling for the Dstate change */
msleep(10);
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_ULPS);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 43c7580a4ea6..f9b730c5ba9e 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -2361,7 +2361,7 @@ static int sienna_cichlid_baco_enter(struct smu_context 
*smu)
 {
struct amdgpu_device *adev = smu->adev;
 
-   if (adev->in_runpm)
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev))
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_BACO);
else
return smu_v11_0_baco_enter(smu);
@@ -2371,7 +2371,7 @@ static int sienna_cichlid_baco_exit(struct smu_context 
*smu)
 {
struct amdgpu_device *adev = smu->adev;
 
-   if (adev->in_runpm) {
+   if (adev->in_runpm && smu_cmn_is_audio_func_enabled(adev)) {
/* Wait for PMFW handling for the Dstate change */
msleep(10);
return smu_v11_0_baco_set_armd3_sequence(smu, BACO_SEQ_ULPS);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 69da9a7b665f..d61403e917df 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -1055,3 +1055,24 @@ int smu_cmn_set_mp1_state(struct smu_context *smu,
 
return ret;
 }
+
+bool smu_cmn_is_audio_func_enabled(struct amdgpu_device *adev)
+{
+   struct pci_dev *p = NULL;
+   bool snd_driver_loaded;
+
+   /*
+* If the ASIC comes with no audio function, we always assume
+* it is "enabled".
+*/
+   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number, 1);
+   if (!p)
+   return true;
+
+   snd_driver_loaded = pci_is_enabled(p) ? true : false;
+
+   pci_dev_put(p);
+
+   return snd_driver_loaded;
+}
diff --git 

[PATCH 4/4] drm/amdgpu: VCN avoid memory allocation during IB test

2021-09-09 Thread xinhui pan
Allocate the extra message from the direct IB pool.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 99 +++--
 1 file changed, 45 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 561296a85b43..b60d5f01fdae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -541,15 +541,14 @@ int amdgpu_vcn_dec_sw_ring_test_ring(struct amdgpu_ring 
*ring)
 }
 
 static int amdgpu_vcn_dec_send_msg(struct amdgpu_ring *ring,
-  struct amdgpu_bo *bo,
-  struct dma_fence **fence)
+   struct amdgpu_ib *ib_msg,
+   struct dma_fence **fence)
 {
struct amdgpu_device *adev = ring->adev;
struct dma_fence *f = NULL;
struct amdgpu_job *job;
struct amdgpu_ib *ib;
-   uint64_t addr;
-   void *msg = NULL;
+   uint64_t addr = ib_msg->gpu_addr;
int i, r;
 
r = amdgpu_job_alloc_with_ib(adev, 64,
@@ -558,8 +557,6 @@ static int amdgpu_vcn_dec_send_msg(struct amdgpu_ring *ring,
goto err;
 
ib = >ibs[0];
-   addr = amdgpu_bo_gpu_offset(bo);
-   msg = amdgpu_bo_kptr(bo);
ib->ptr[0] = PACKET0(adev->vcn.internal.data0, 0);
ib->ptr[1] = addr;
ib->ptr[2] = PACKET0(adev->vcn.internal.data1, 0);
@@ -576,9 +573,7 @@ static int amdgpu_vcn_dec_send_msg(struct amdgpu_ring *ring,
if (r)
goto err_free;
 
-   amdgpu_bo_fence(bo, f, false);
-   amdgpu_bo_unreserve(bo);
-   amdgpu_bo_free_kernel(, NULL, (void **));
+   amdgpu_ib_free(adev, ib_msg, f);
 
if (fence)
*fence = dma_fence_get(f);
@@ -588,27 +583,26 @@ static int amdgpu_vcn_dec_send_msg(struct amdgpu_ring 
*ring,
 
 err_free:
amdgpu_job_free(job);
-
 err:
-   amdgpu_bo_unreserve(bo);
-   amdgpu_bo_free_kernel(, NULL, (void **));
+   amdgpu_ib_free(adev, ib_msg, f);
return r;
 }
 
 static int amdgpu_vcn_dec_get_create_msg(struct amdgpu_ring *ring, uint32_t 
handle,
-struct amdgpu_bo **bo)
+   struct amdgpu_ib *ib)
 {
struct amdgpu_device *adev = ring->adev;
uint32_t *msg;
int r, i;
 
-   *bo = NULL;
-   r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE,
- AMDGPU_GEM_DOMAIN_VRAM,
- bo, NULL, (void **));
+   memset(ib, 0, sizeof(*ib));
+   r = amdgpu_ib_get(adev, NULL, PAGE_SIZE,
+   AMDGPU_IB_POOL_DIRECT,
+   ib);
if (r)
return r;
 
+   msg = ib->ptr;
msg[0] = cpu_to_le32(0x0028);
msg[1] = cpu_to_le32(0x0038);
msg[2] = cpu_to_le32(0x0001);
@@ -630,19 +624,20 @@ static int amdgpu_vcn_dec_get_create_msg(struct 
amdgpu_ring *ring, uint32_t hand
 }
 
 static int amdgpu_vcn_dec_get_destroy_msg(struct amdgpu_ring *ring, uint32_t 
handle,
- struct amdgpu_bo **bo)
+ struct amdgpu_ib *ib)
 {
struct amdgpu_device *adev = ring->adev;
uint32_t *msg;
int r, i;
 
-   *bo = NULL;
-   r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE,
- AMDGPU_GEM_DOMAIN_VRAM,
- bo, NULL, (void **));
+   memset(ib, 0, sizeof(*ib));
+   r = amdgpu_ib_get(adev, NULL, PAGE_SIZE,
+   AMDGPU_IB_POOL_DIRECT,
+   ib);
if (r)
return r;
 
+   msg = ib->ptr;
msg[0] = cpu_to_le32(0x0028);
msg[1] = cpu_to_le32(0x0018);
msg[2] = cpu_to_le32(0x);
@@ -658,21 +653,21 @@ static int amdgpu_vcn_dec_get_destroy_msg(struct 
amdgpu_ring *ring, uint32_t han
 int amdgpu_vcn_dec_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 {
struct dma_fence *fence = NULL;
-   struct amdgpu_bo *bo;
+   struct amdgpu_ib ib;
long r;
 
-   r = amdgpu_vcn_dec_get_create_msg(ring, 1, );
+   r = amdgpu_vcn_dec_get_create_msg(ring, 1, );
if (r)
goto error;
 
-   r = amdgpu_vcn_dec_send_msg(ring, bo, NULL);
+   r = amdgpu_vcn_dec_send_msg(ring, , NULL);
if (r)
goto error;
-   r = amdgpu_vcn_dec_get_destroy_msg(ring, 1, );
+   r = amdgpu_vcn_dec_get_destroy_msg(ring, 1, );
if (r)
goto error;
 
-   r = amdgpu_vcn_dec_send_msg(ring, bo, );
+   r = amdgpu_vcn_dec_send_msg(ring, , );
if (r)
goto error;
 
@@ -688,8 +683,8 @@ int amdgpu_vcn_dec_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
 }
 
 static int amdgpu_vcn_dec_sw_send_msg(struct amdgpu_ring *ring,
-  struct amdgpu_bo 

[PATCH 3/4] drm/amdgpu: VCE avoid memory allocation during IB test

2021-09-09 Thread xinhui pan
Allocate the extra msg from the direct IB pool.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index e9fdf49d69e8..45d98694db18 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -82,7 +82,6 @@ MODULE_FIRMWARE(FIRMWARE_VEGA20);
 
 static void amdgpu_vce_idle_work_handler(struct work_struct *work);
 static int amdgpu_vce_get_create_msg(struct amdgpu_ring *ring, uint32_t handle,
-struct amdgpu_bo *bo,
 struct dma_fence **fence);
 static int amdgpu_vce_get_destroy_msg(struct amdgpu_ring *ring, uint32_t 
handle,
  bool direct, struct dma_fence **fence);
@@ -441,7 +440,6 @@ void amdgpu_vce_free_handles(struct amdgpu_device *adev, 
struct drm_file *filp)
  * Open up a stream for HW test
  */
 static int amdgpu_vce_get_create_msg(struct amdgpu_ring *ring, uint32_t handle,
-struct amdgpu_bo *bo,
 struct dma_fence **fence)
 {
const unsigned ib_size_dw = 1024;
@@ -451,14 +449,13 @@ static int amdgpu_vce_get_create_msg(struct amdgpu_ring 
*ring, uint32_t handle,
uint64_t addr;
int i, r;
 
-   r = amdgpu_job_alloc_with_ib(ring->adev, ib_size_dw * 4,
+   r = amdgpu_job_alloc_with_ib(ring->adev, ib_size_dw * 4 + PAGE_SIZE,
 AMDGPU_IB_POOL_DIRECT, );
if (r)
return r;
 
ib = >ibs[0];
-
-   addr = amdgpu_bo_gpu_offset(bo);
+   addr = ib->gpu_addr + ib_size_dw * 4;
 
/* stitch together an VCE create msg */
ib->length_dw = 0;
@@ -1134,20 +1131,13 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
 int amdgpu_vce_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 {
struct dma_fence *fence = NULL;
-   struct amdgpu_bo *bo = NULL;
long r;
 
/* skip vce ring1/2 ib test for now, since it's not reliable */
if (ring != >adev->vce.ring[0])
return 0;
 
-   r = amdgpu_bo_create_reserved(ring->adev, 512, PAGE_SIZE,
- AMDGPU_GEM_DOMAIN_VRAM,
- , NULL, NULL);
-   if (r)
-   return r;
-
-   r = amdgpu_vce_get_create_msg(ring, 1, bo, NULL);
+   r = amdgpu_vce_get_create_msg(ring, 1, NULL);
if (r)
goto error;
 
@@ -1163,8 +1153,6 @@ int amdgpu_vce_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
 
 error:
dma_fence_put(fence);
-   amdgpu_bo_unreserve(bo);
-   amdgpu_bo_free_kernel(, NULL, NULL);
return r;
 }
 
-- 
2.25.1



[PATCH 2/4] drm/amdgpu: UVD avoid memory allocation during IB test

2021-09-09 Thread xinhui pan
Move the BO allocation into sw_init.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 75 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h |  1 +
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   |  8 +--
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   |  8 +--
 4 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index d451c359606a..e2eaac941d37 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -141,6 +141,8 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
const char *fw_name;
const struct common_firmware_header *hdr;
unsigned family_id;
+   struct amdgpu_bo *bo = NULL;
+   void *addr;
int i, j, r;
 
INIT_DELAYED_WORK(>uvd.idle_work, amdgpu_uvd_idle_work_handler);
@@ -298,9 +300,34 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
adev->uvd.filp[i] = NULL;
}
 
+   r = amdgpu_bo_create_reserved(adev, 128 << 10, PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_GTT,
+   , NULL, );
+   if (r)
+   return r;
+
/* from uvd v5.0 HW addressing capacity increased to 64 bits */
-   if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 5, 
0))
+   if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 5, 
0)) {
adev->uvd.address_64_bit = true;
+   amdgpu_bo_kunmap(bo);
+   amdgpu_bo_unpin(bo);
+   r = amdgpu_bo_pin_restricted(bo, AMDGPU_GEM_DOMAIN_VRAM,
+   0, 256 << 20);
+   if (r) {
+   amdgpu_bo_unreserve(bo);
+   amdgpu_bo_unref();
+   return r;
+   }
+   r = amdgpu_bo_kmap(bo, );
+   if (r) {
+   amdgpu_bo_unpin(bo);
+   amdgpu_bo_unreserve(bo);
+   amdgpu_bo_unref();
+   return r;
+   }
+   }
+   adev->uvd.ib_bo = bo;
+   amdgpu_bo_unreserve(bo);
 
switch (adev->asic_type) {
case CHIP_TONGA:
@@ -342,6 +369,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
for (i = 0; i < AMDGPU_MAX_UVD_ENC_RINGS; ++i)
amdgpu_ring_fini(>uvd.inst[j].ring_enc[i]);
}
+   amdgpu_bo_free_kernel(>uvd.ib_bo, NULL, NULL);
release_firmware(adev->uvd.fw);
 
return 0;
@@ -1080,23 +1108,10 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring 
*ring, struct amdgpu_bo *bo,
unsigned offset_idx = 0;
unsigned offset[3] = { UVD_BASE_SI, 0, 0 };
 
-   amdgpu_bo_kunmap(bo);
-   amdgpu_bo_unpin(bo);
-
-   if (!ring->adev->uvd.address_64_bit) {
-   struct ttm_operation_ctx ctx = { true, false };
-
-   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_VRAM);
-   amdgpu_uvd_force_into_uvd_segment(bo);
-   r = ttm_bo_validate(>tbo, >placement, );
-   if (r)
-   goto err;
-   }
-
r = amdgpu_job_alloc_with_ib(adev, 64, direct ? AMDGPU_IB_POOL_DIRECT :
 AMDGPU_IB_POOL_DELAYED, );
if (r)
-   goto err;
+   return r;
 
if (adev->asic_type >= CHIP_VEGA10) {
offset_idx = 1 + ring->me;
@@ -1148,8 +1163,6 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, 
struct amdgpu_bo *bo,
}
 
amdgpu_bo_fence(bo, f, false);
-   amdgpu_bo_unreserve(bo);
-   amdgpu_bo_unref();
 
if (fence)
*fence = dma_fence_get(f);
@@ -1159,10 +1172,6 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, 
struct amdgpu_bo *bo,
 
 err_free:
amdgpu_job_free(job);
-
-err:
-   amdgpu_bo_unreserve(bo);
-   amdgpu_bo_unref();
return r;
 }
 
@@ -1173,16 +1182,15 @@ int amdgpu_uvd_get_create_msg(struct amdgpu_ring *ring, 
uint32_t handle,
  struct dma_fence **fence)
 {
struct amdgpu_device *adev = ring->adev;
-   struct amdgpu_bo *bo = NULL;
+   struct amdgpu_bo *bo = adev->uvd.ib_bo;
uint32_t *msg;
int r, i;
 
-   r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE,
- AMDGPU_GEM_DOMAIN_GTT,
- , NULL, (void **));
+   r = ttm_bo_reserve(>tbo, true, true, NULL);
if (r)
return r;
 
+   msg = amdgpu_bo_kptr(bo);
/* stitch together an UVD create msg */
msg[0] = cpu_to_le32(0x0de4);
msg[1] = cpu_to_le32(0x);
@@ -1198,23 +1206,25 @@ int amdgpu_uvd_get_create_msg(struct amdgpu_ring *ring, 
uint32_t handle,
for (i = 11; i < 1024; ++i)
msg[i] = cpu_to_le32(0x0);
 
-   return 

[PATCH 1/4] drm/amdgpu: Increase direct IB pool size

2021-09-09 Thread xinhui pan
The direct IB pool is used for the VCE/VCN IB extra msg too. Increase its
size to AMDGPU_IB_POOL_SIZE.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index c076a6b9a5a2..9274f32c3661 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -307,13 +307,9 @@ int amdgpu_ib_pool_init(struct amdgpu_device *adev)
return 0;
 
for (i = 0; i < AMDGPU_IB_POOL_MAX; i++) {
-   if (i == AMDGPU_IB_POOL_DIRECT)
-   size = PAGE_SIZE * 6;
-   else
-   size = AMDGPU_IB_POOL_SIZE;
-
r = amdgpu_sa_bo_manager_init(adev, >ib_pools[i],
- size, AMDGPU_GPU_PAGE_SIZE,
+ AMDGPU_IB_POOL_SIZE,
+ AMDGPU_GPU_PAGE_SIZE,
  AMDGPU_GEM_DOMAIN_GTT);
if (r)
goto error;
-- 
2.25.1



Re: [PATCH 0/4] Fix stack usage of DML

2021-09-09 Thread Nathan Chancellor
On Wed, Sep 08, 2021 at 09:00:19PM -0400, Harry Wentland wrote:
> With the '-Werror' enablement patch the amdgpu build was failing
> on clang builds because a bunch of functions were blowing past
> the 1024 byte stack frame default. Due to this we also noticed
> that a lot of functions were passing large structs by value
> instead of by pointer.
> 
> This series attempts to fix this.
> 
> There is still one remaining function that blows the 1024 limit by 40 bytes:
> 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3397:6:
>  
> error: stack frame size of 1064 bytes in function 
> 'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than=]
> 
> This will be a slightly more challenging fix but I'll see if we can get it
> below 1024 by breaking it into smaller functions.
> 
> With this series I can build amdgpu with CC=clang and a stack frame limit of 
> 1064.
> 
> This series boots on a Radeon RX 5500 XT.
> 
> Harry Wentland (4):
>   drm/amd/display: Pass display_pipe_params_st as const in DML
>   drm/amd/display: Pass all structs in display_rq_dlg_helpers by pointer
>   drm/amd/display: Fix rest of pass-by-value structs in DML
>   drm/amd/display: Allocate structs needed by dcn_bw_calc_rq_dlg_ttu in
> pipe_ctx

This series resolves some warnings that were reported on our issue
tracker for 32-bit x86. I do see some other warnings in amdgpu with
clang in various configurations but this is a great start. Thank you for
taking a look at them. For the series:

Build-tested-by: Nathan Chancellor 

>  .../gpu/drm/amd/display/dc/calcs/dcn_calcs.c  |  55 ++--
>  .../drm/amd/display/dc/dcn20/dcn20_resource.c |   2 +-
>  .../dc/dml/dcn20/display_rq_dlg_calc_20.c | 158 +--
>  .../dc/dml/dcn20/display_rq_dlg_calc_20.h |   4 +-
>  .../dc/dml/dcn20/display_rq_dlg_calc_20v2.c   | 156 +--
>  .../dc/dml/dcn20/display_rq_dlg_calc_20v2.h   |   4 +-
>  .../dc/dml/dcn21/display_rq_dlg_calc_21.c | 156 +--
>  .../dc/dml/dcn21/display_rq_dlg_calc_21.h |   4 +-
>  .../dc/dml/dcn30/display_rq_dlg_calc_30.c | 132 -
>  .../dc/dml/dcn30/display_rq_dlg_calc_30.h |   4 +-
>  .../dc/dml/dcn31/display_rq_dlg_calc_31.c | 166 ++--
>  .../dc/dml/dcn31/display_rq_dlg_calc_31.h |   4 +-
>  .../drm/amd/display/dc/dml/display_mode_lib.h |   4 +-
>  .../display/dc/dml/display_rq_dlg_helpers.c   | 256 +-
>  .../display/dc/dml/display_rq_dlg_helpers.h   |  20 +-
>  .../display/dc/dml/dml1_display_rq_dlg_calc.c | 246 -
>  .../display/dc/dml/dml1_display_rq_dlg_calc.h |  10 +-
>  .../gpu/drm/amd/display/dc/inc/core_types.h   |   3 +
>  18 files changed, 695 insertions(+), 689 deletions(-)
> 
> -- 
> 2.33.0


Re: [PATCH] drm/amd/display: Add NULL checks for vblank workqueue

2021-09-09 Thread Harry Wentland



On 2021-09-07 9:41 p.m., Mike Lothian wrote:
> Hi
> 
> I've just tested this out against Linus's tree and it seems to fix things
> 

Thanks.

> Out of interest does Tonga have GPU reset when things go wrong?
> 

Not sure. Alex might know.

Harry

> Thanks
> 
> Mike
> 
> On Tue, 7 Sept 2021 at 15:20, Harry Wentland  wrote:
>>
>>
>>
>> On 2021-09-07 10:10 a.m., Nicholas Kazlauskas wrote:
>>> [Why]
>>> If we're running a headless config with 0 links then the vblank
>>> workqueue will be NULL - causing a NULL pointer exception during
>>> any commit.
>>>
>>> [How]
>>> Guard access to the workqueue if it's NULL and don't queue or flush
>>> work if it is.
>>>
>>> Cc: Roman Li 
>>> Cc: Wayne Lin 
>>> Cc: Harry Wentland 
>>> Reported-by: Mike Lothian 
>>> BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1700 Fixes: 
>>> 91f86d4cce2 ("drm/amd/display: Use vblank control events for PSR 
>>> enable/disable")
>>> Signed-off-by: Nicholas Kazlauskas 
>>
>> Reviewed-by: Harry Wentland 
>>
>> Harry
>>
>>> ---
>>>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 32 +++
>>>  1 file changed, 18 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>> index 8837259215d..46e08736f94 100644
>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>> @@ -6185,21 +6185,23 @@ static inline int dm_set_vblank(struct drm_crtc 
>>> *crtc, bool enable)
>>>   return 0;
>>>
>>>  #if defined(CONFIG_DRM_AMD_DC_DCN)
>>> - work = kzalloc(sizeof(*work), GFP_ATOMIC);
>>> - if (!work)
>>> - return -ENOMEM;
>>> + if (dm->vblank_control_workqueue) {
>>> + work = kzalloc(sizeof(*work), GFP_ATOMIC);
>>> + if (!work)
>>> + return -ENOMEM;
>>>
>>> - INIT_WORK(>work, vblank_control_worker);
>>> - work->dm = dm;
>>> - work->acrtc = acrtc;
>>> - work->enable = enable;
>>> + INIT_WORK(>work, vblank_control_worker);
>>> + work->dm = dm;
>>> + work->acrtc = acrtc;
>>> + work->enable = enable;
>>>
>>> - if (acrtc_state->stream) {
>>> - dc_stream_retain(acrtc_state->stream);
>>> - work->stream = acrtc_state->stream;
>>> - }
>>> + if (acrtc_state->stream) {
>>> + dc_stream_retain(acrtc_state->stream);
>>> + work->stream = acrtc_state->stream;
>>> + }
>>>
>>> - queue_work(dm->vblank_control_workqueue, >work);
>>> + queue_work(dm->vblank_control_workqueue, >work);
>>> + }
>>>  #endif
>>>
>>>   return 0;
>>> @@ -8809,7 +8811,8 @@ static void amdgpu_dm_commit_planes(struct 
>>> drm_atomic_state *state,
>>>* If PSR or idle optimizations are enabled then flush out
>>>* any pending work before hardware programming.
>>>*/
>>> - flush_workqueue(dm->vblank_control_workqueue);
>>> + if (dm->vblank_control_workqueue)
>>> + flush_workqueue(dm->vblank_control_workqueue);
>>>  #endif
>>>
>>>   bundle->stream_update.stream = acrtc_state->stream;
>>> @@ -9144,7 +9147,8 @@ static void amdgpu_dm_atomic_commit_tail(struct 
>>> drm_atomic_state *state)
>>>   /* if there mode set or reset, disable eDP PSR */
>>>   if (mode_set_reset_required) {
>>>  #if defined(CONFIG_DRM_AMD_DC_DCN)
>>> - flush_workqueue(dm->vblank_control_workqueue);
>>> + if (dm->vblank_control_workqueue)
>>> + flush_workqueue(dm->vblank_control_workqueue);
>>>  #endif
>>>   amdgpu_dm_psr_disable_all(dm);
>>>   }
>>>
>>



Re: [PATCH] drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count

2021-09-09 Thread Lyude Paul
Reviewed-by: Lyude Paul 

On Thu, 2021-09-09 at 18:56 +0200, Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> This was unusual; normally, inline functions are declared static as
> well, and defined in a header file if used by multiple compilation
> units. The latter would be more involved in this case, so just drop
> the inline declaration for now.
> 
> Fixes compile failure building for ppc64le on RHEL 8:
> 
> In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32,
>  from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33:
> ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function
> ‘amdgpu_ras_recovery_init’:
> ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining
> failed in call
>  to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not
> available
>    90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
>   | ^~~
> ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here
>  1985 | max_eeprom_records_len =
> amdgpu_ras_eeprom_max_record_count();
>   | 
> ^
> 
> # The function is called amdgpu_ras_eeprom_get_record_max_length on
> # stable branches
> Fixes: c84d46707ebb "drm/amdgpu: validate bad page threshold in ras(v3)"
> Signed-off-by: Michel Dänzer 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 194590252bb9..210f30867870 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -756,7 +756,7 @@ int amdgpu_ras_eeprom_read(struct
> amdgpu_ras_eeprom_control *control,
> return res;
>  }
>  
> -inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
> +uint32_t amdgpu_ras_eeprom_max_record_count(void)
>  {
> return RAS_MAX_RECORD_COUNT;
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> index f95fc61b3021..6bb00578bfbb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
> @@ -120,7 +120,7 @@ int amdgpu_ras_eeprom_read(struct
> amdgpu_ras_eeprom_control *control,
>  int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
>  struct eeprom_table_record *records, const u32
> num);
>  
> -inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
> +uint32_t amdgpu_ras_eeprom_max_record_count(void);
>  
>  void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control
> *control);
>  

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



Re: [PATCH] drm/amd/display: dc_assert_fp_enabled assert only if FPU is not enabled

2021-09-09 Thread Harry Wentland
On 2021-09-09 12:55 p.m., Anson Jacob wrote:
> Assert only when FPU is not enabled.
> 
> Fixes: e549f77c1965 ("drm/amd/display: Add DC_FP helper to check FPU state")
> Signed-off-by: Anson Jacob 
> Cc: Christian König 
> Cc: Hersen Wu 
> Cc: Harry Wentland 
> Cc: Rodrigo Siqueira 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> index c9f47d167472..b1bf80da3a55 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> @@ -62,7 +62,7 @@ inline void dc_assert_fp_enabled(void)
>   depth = *pcpu;
>   put_cpu_ptr(_recursion_depth);
>  
> - ASSERT(depth > 1);
> + ASSERT(depth >= 1);
>  }
>  
>  /**
> 



RE: [PATCH] drm/amdgpu: Get atomicOps info from Host for sriov setup

2021-09-09 Thread Liu, Shaoyun
[AMD Official Use Only]

Thanks for the review. I accepted your comments and will send another change 
list for review once your change is in. 

Regards
Shaoyun.liu


-Original Message-
From: Kuehling, Felix  
Sent: Thursday, September 9, 2021 12:18 PM
To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Get atomicOps info from Host for sriov setup

On 2021-09-09 11:59 a.m., shaoyunl wrote:
> The AtomicOp Requester Enable bit is reserved in VFs and the PF value 
> applies to all associated VFs, so the guest driver cannot directly enable 
> AtomicOps for a VF; it depends on the PF to enable them. In the current 
> design, the amdgpu driver will get the enabled AtomicOps bits through the 
> private pf2vf data.
>
> Signed-off-by: shaoyunl 
> Change-Id: Ifdbcb4396d64e3f3cbf6bcbf7ab9c7b2cb061052
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 20 ++--  
> drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h |  4 +++-
>  2 files changed, 21 insertions(+), 3 deletions(-)  mode change 100644 
> => 100755 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>  mode change 100644 => 100755 
> drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> old mode 100644
> new mode 100755
> index 653bd8fdaa33..a0d2b9eb84fc
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2167,8 +2167,6 @@ static int amdgpu_device_ip_early_init(struct 
> amdgpu_device *adev)
>   return -EINVAL;
>   }
>  
> - amdgpu_amdkfd_device_probe(adev);
> -
>   adev->pm.pp_feature = amdgpu_pp_feature_mask;
>   if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
>   adev->pm.pp_feature &= ~PP_GFXOFF_MASK; @@ -3562,6 +3560,24 @@ 
> int 
> amdgpu_device_init(struct amdgpu_device *adev,
>   if (r)
>   return r;
>  
> + /* enable PCIE atomic ops */
> + if (amdgpu_sriov_bios(adev))
> + adev->have_atomics_support = (((struct amd_sriov_msg_pf2vf_info 
> *)
> + 
> adev->virt.fw_reserve.p_pf2vf)->pcie_atomic_ops_enabled_flags ==
> + (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | 
> PCI_EXP_DEVCAP2_ATOMIC_COMP64))
> + ? TRUE : FALSE;

Please don't use this "condition ? TRUE : FALSE" idiom. Just "condition"
is good enough.


> + else
> + adev->have_atomics_support =
> + pci_enable_atomic_ops_to_root(adev->pdev,
> +   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +   PCI_EXP_DEVCAP2_ATOMIC_COMP64)
> + ? FALSE : TRUE;

Same as above, but in this case it's "!condition". Also, I would have expected 
that you remove the other call to pci_enable_atomic_ops_to_root from this 
function.


> + if (adev->have_atomics_support = false )

This should be "==", but even better would be "if
(!adev->have_atomics_support) ...

That said, the message below may be redundant. The PCIe atomic check in 
kgd2kfd_device_init already prints an error message if atomics are required by 
the GPU but not supported. If you really want to print it for information on 
GPUs where it's not required, use dev_info so the message clearly shows which 
GPU in a multi-GPU system it refers to.


> + DRM_INFO("PCIE atomic ops is not supported\n");
> +
> + amdgpu_amdkfd_device_probe(adev);

This should not be necessary. I just sent another patch for review that moves 
the PCIe atomic check in KFD into kgd2kfd_device_init:
"drm/amdkfd: make needs_pcie_atomics FW-version dependent". So 
amdgpu_amdkfd_device_probe can stay where it is, if you can wait a few days for 
my change to go in first.

Regards,
  Felix


> +
> +
>   /* doorbell bar mapping and doorbell index init*/
>   amdgpu_device_doorbell_init(adev);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> old mode 100644
> new mode 100755
> index a434c71fde8e..995899191288
> --- a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> @@ -204,8 +204,10 @@ struct amd_sriov_msg_pf2vf_info {
>   } mm_bw_management[AMD_SRIOV_MSG_RESERVE_VCN_INST];
>   /* UUID info */
>   struct amd_sriov_msg_uuid_info uuid_info;
> + /* pcie atomic Ops info */
> + uint32_t pcie_atomic_ops_enabled_flags;
>   /* reserved */
> - uint32_t reserved[256 - 47];
> + uint32_t reserved[256 - 48];
>  };
>  
>  struct amd_sriov_msg_vf2pf_info_header {

Re: [PATCH] drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count

2021-09-09 Thread Michel Dänzer


Apologies, I messed up the recipients in this patch; please follow up to the 
fixed version I just sent.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


[PATCH] drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count

2021-09-09 Thread Michel Dänzer
From: Michel Dänzer 

This was unusual; normally, inline functions are declared static as
well, and defined in a header file if used by multiple compilation
units. The latter would be more involved in this case, so just drop
the inline declaration for now.

Fixes compile failure building for ppc64le on RHEL 8:

In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32,
 from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function 
‘amdgpu_ras_recovery_init’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed 
in call
 to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not 
available
   90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
  | ^~~
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here
 1985 | max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count();
  |  
^

# The function is called amdgpu_ras_eeprom_get_record_max_length on
# stable branches
Fixes: c84d46707ebb "drm/amdgpu: validate bad page threshold in ras(v3)"
Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 194590252bb9..210f30867870 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -756,7 +756,7 @@ int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control 
*control,
return res;
 }
 
-inline uint32_t amdgpu_ras_eeprom_max_record_count(void)
+uint32_t amdgpu_ras_eeprom_max_record_count(void)
 {
return RAS_MAX_RECORD_COUNT;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
index f95fc61b3021..6bb00578bfbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
@@ -120,7 +120,7 @@ int amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control 
*control,
 int amdgpu_ras_eeprom_append(struct amdgpu_ras_eeprom_control *control,
 struct eeprom_table_record *records, const u32 
num);
 
-inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
+uint32_t amdgpu_ras_eeprom_max_record_count(void);
 
 void amdgpu_ras_debugfs_set_ret_size(struct amdgpu_ras_eeprom_control 
*control);
 
-- 
2.33.0



[PATCH] drm/amd/display: dc_assert_fp_enabled assert only if FPU is not enabled

2021-09-09 Thread Anson Jacob
Assert only when FPU is not enabled.

Fixes: e549f77c1965 ("drm/amd/display: Add DC_FP helper to check FPU state")
Signed-off-by: Anson Jacob 
Cc: Christian König 
Cc: Hersen Wu 
Cc: Harry Wentland 
Cc: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
index c9f47d167472..b1bf80da3a55 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -62,7 +62,7 @@ inline void dc_assert_fp_enabled(void)
depth = *pcpu;
put_cpu_ptr(_recursion_depth);
 
-   ASSERT(depth > 1);
+   ASSERT(depth >= 1);
 }
 
 /**
-- 
2.25.1





Re: [PATCH 3/3] drm/amdgpu: move iommu_resume before ip init/resume

2021-09-09 Thread James Zhu



On 2021-09-07 6:20 p.m., Felix Kuehling wrote:

On 2021-09-07 4:30 p.m., James Zhu wrote:


On 2021-09-07 1:53 p.m., Felix Kuehling wrote:

On 2021-09-07 1:51 p.m., Felix Kuehling wrote:

On 2021-09-07 1:22 p.m., James Zhu wrote:

On 2021-09-07 12:48 p.m., Felix Kuehling wrote:

On 2021-09-07 12:07 p.m., James Zhu wrote:

Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Fixed Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211277

I think the change is OK. But I don't understand how the IOMMUv2
initialization sequence could affect a crash in DM. The display should
not depend on IOMMUv2 at all. What am I missing?

[JZ] It is a weird issue. Disabling the VCN IP block, disabling the
gpu_off feature, or setting pci=noats can all fix the DM crash. Also,
the issue occurred quite randomly: sometimes after a few
suspend/resume cycles, sometimes after a few hundred S/R cycles. The
maximum I saw was 2422 S/R cycles.

But every time DM crashes, I can see one or two IOMMU errors beforehand:

AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address= flags=0x0070]

This error is not from IOMMUv2 doing GVA to GPA translations. It's from
IOMMUv1 doing GPA to SPA translation. This error points to an invalid
physical (GVA) address being used by the GPU to access random system

Oops: s/GVA/GPA
memory it shouldn't be accessing (because there is no valid DMA mapping).

On AMD systems, IOMMUv1 tends to be in pass-through mode when IOMMUv2 is
enabled. It's possible that the earlier initialization of IOMMUv2 hides
the problem by putting the IOMMU into passthrough mode. I don't think
this patch series is a valid solution.

[JZ] Good to know, thanks! So amd_iommu_init_device is for v2 only.

And it is supposed to be safe to call amd_iommu_init_device after amdgpu
IP init/resume without any interference.


Yes, it's supposed to. But with your results below, this is getting very
confusing. It's as if the IOMMUv2 initialization has some unintended
side effects if it happens at the wrong moment during resume. If you
want to debug this further, you'll probably need to work with the server
team that's working on the IOMMU driver. I'm not sure it's worth the
trouble.


[JZ] Can you point me to the right person on the server team? I hope they
can review the patches and the issue. Thanks!

Also, on Ray's advice, I used ignore_crat=1 during modprobe to get IOMMU
v2 passthrough, and that also fixed the issue.


The series is

Reviewed-by: Felix Kuehling 



You can probably fix the problem with this kernel boot parameter: iommu=pt

[JZ] Still not working after applying iommu=pt:

BOOT_IMAGE=/boot/vmlinuz-5.8.0-41-generic
root=UUID=030a18fe-22f0-49be-818f-192093d543b5 quiet splash
modprobe.blacklist=amdgpu iommu=pt 3
[    0.612117] iommu: Default domain type: Passthrough (set via
kernel command line)
[  354.067871] amdgpu :04:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x address=0x32de00040 flags=0x0070]
[  354.067884] amdgpu :04:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x address=0x32de4 flags=0x0070]
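The log excerpts traded back and forth in this thread can be checked
mechanically. A minimal shell sketch (the sample text below stands in for
real `dmesg` output; in practice you would pipe `dmesg` instead):

```shell
# Sample log excerpt standing in for `dmesg` output (assumption: real use
# would read the live kernel log rather than this literal string).
log='iommu: Default domain type: Passthrough (set via kernel command line)
amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x32de00040 flags=0x0070]'

# Extract the active default IOMMU domain type reported at boot.
domain=$(printf '%s\n' "$log" | sed -n 's/.*Default domain type: \([A-Za-z-]*\).*/\1/p')
echo "domain=$domain"

# Count IO_PAGE_FAULT events in the captured log.
faults=$(printf '%s\n' "$log" | grep -c 'IO_PAGE_FAULT')
echo "faults=$faults"
```

This makes it easy to confirm, per boot, which domain type (Passthrough
vs. Translated) was actually in effect when a fault was logged.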


And you can probably reproduce it even with this patch series if instead
you add: iommu=nopt amd_iommu=force_isolation

[JZ] Could not set both iommu=nopt and amd_iommu=force_isolation
together. (Does that mean something?)

BOOT_IMAGE=/boot/vmlinuz-5.13.0-custom+
root=UUID=030a18fe-22f0-49be-818f-192093d543b5 quiet splash
modprobe.blacklist=amdgpu iommu=nopt amd_iommu=force_isolation 3
[    0.294242] iommu: Default domain type: Translated (set via kernel
command line)
[    0.350675] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4
counters/bank).
[  106.403927] amdgpu :04:00.0: amdgpu: amdgpu_device_ip_resume
failed (-6).
[  106.403931] PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -6
[  106.403941] amdgpu :04:00.0: PM: failed to resume async: error -6


This is weird. Is this happening during resume or driver init?

[JZ] this happened during init, not resume.




iommu=nopt: passed at least 200 S/R cycles

BOOT_IMAGE=/boot/vmlinuz-5.13.0-custom+
root=UUID=030a18fe-22f0-49be-818f-192093d543b5 quiet splash
modprobe.blacklist=amdgpu iommu=nopt 3
[    0.294242] iommu: Default domain type: Translated (set via kernel
command line)


Interesting. That's the opposite of what I would have expected.



amd_iommu=force_isolation: failed at the 1st resume

BOOT_IMAGE=/boot/vmlinuz-5.13.0-custom+
root=UUID=030a18fe-22f0-49be-818f-192093d543b5 quiet splash
modprobe.blacklist=amdgpu amd_iommu=force_isolation 3
[    0.294242] iommu: Default domain type: Translated

[   49.513262] PM: suspend entry (deep)
[   49.514404] Filesystems sync: 0.001 seconds
[   49.514668] Freezing user space processes ...
[   69.523111] Freezing of tasks failed after 20.008 seconds (2 tasks
refusing to freeze, wq_busy=0):
[   69.523163] task:gnome-shell state:D stack:    0 pid: 2196
ppid:  2108 flags:0x0004


I've never seen 

Re: [PATCH] drm/amdgpu: Get atomicOps info from Host for sriov setup

2021-09-09 Thread Felix Kuehling
Am 2021-09-09 um 11:59 a.m. schrieb shaoyunl:
> The AtomicOp Requester Enable bit is reserved in VFs, and the PF value
> applies to all associated VFs, so the guest driver cannot directly enable
> AtomicOps for a VF; it depends on the PF to enable them. In the current
> design, the amdgpu driver will get the enabled AtomicOps bits through
> private pf2vf data.
>
> Signed-off-by: shaoyunl 
> Change-Id: Ifdbcb4396d64e3f3cbf6bcbf7ab9c7b2cb061052
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 20 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h |  4 +++-
>  2 files changed, 21 insertions(+), 3 deletions(-)
>  mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>  mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> old mode 100644
> new mode 100755
> index 653bd8fdaa33..a0d2b9eb84fc
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2167,8 +2167,6 @@ static int amdgpu_device_ip_early_init(struct 
> amdgpu_device *adev)
>   return -EINVAL;
>   }
>  
> - amdgpu_amdkfd_device_probe(adev);
> -
>   adev->pm.pp_feature = amdgpu_pp_feature_mask;
>   if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
>   adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
> @@ -3562,6 +3560,24 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   if (r)
>   return r;
>  
> + /* enable PCIE atomic ops */
> + if (amdgpu_sriov_bios(adev))
> + adev->have_atomics_support = (((struct amd_sriov_msg_pf2vf_info 
> *)
> + 
> adev->virt.fw_reserve.p_pf2vf)->pcie_atomic_ops_enabled_flags ==
> + (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | 
> PCI_EXP_DEVCAP2_ATOMIC_COMP64))
> + ? TRUE : FALSE;

Please don't use this "condition ? TRUE : FALSE" idiom. Just "condition"
is good enough.


> + else
> + adev->have_atomics_support =
> + pci_enable_atomic_ops_to_root(adev->pdev,
> +   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +   PCI_EXP_DEVCAP2_ATOMIC_COMP64)
> + ? FALSE : TRUE;

Same as above, but in this case it's "!condition". Also, I would have
expected that you remove the other call to pci_enable_atomic_ops_to_root
from this function.


> + if (adev->have_atomics_support = false )

This should be "==", but even better would be "if
(!adev->have_atomics_support) ...

That said, the message below may be redundant. The PCIe atomic check in
kgd2kfd_device_init already prints an error message if atomics are
required by the GPU but not supported. If you really want to print it
for information on GPUs where it's not required, use dev_info so the
message clearly shows which GPU in a multi-GPU system it refers to.


> + DRM_INFO("PCIE atomic ops is not supported\n");
> +
> + amdgpu_amdkfd_device_probe(adev);

This should not be necessary. I just sent another patch for review that
moves the PCIe atomic check in KFD into kgd2kfd_device_init:
"drm/amdkfd: make needs_pcie_atomics FW-version dependent". So
amdgpu_amdkfd_device_probe can stay where it is, if you can wait a few
days for my change to go in first.

Regards,
  Felix


> +
> +
>   /* doorbell bar mapping and doorbell index init*/
>   amdgpu_device_doorbell_init(adev);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> old mode 100644
> new mode 100755
> index a434c71fde8e..995899191288
> --- a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> @@ -204,8 +204,10 @@ struct amd_sriov_msg_pf2vf_info {
>   } mm_bw_management[AMD_SRIOV_MSG_RESERVE_VCN_INST];
>   /* UUID info */
>   struct amd_sriov_msg_uuid_info uuid_info;
> + /* pcie atomic Ops info */
> + uint32_t pcie_atomic_ops_enabled_flags;
>   /* reserved */
> - uint32_t reserved[256 - 47];
> + uint32_t reserved[256 - 48];
>  };
>  
>  struct amd_sriov_msg_vf2pf_info_header {
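The boolean cleanups Felix asks for above can be sketched outside the
kernel tree as follows. This is a standalone sketch: the
PCI_EXP_DEVCAP2_* values are copied from the kernel's pci_regs.h, and the
helper names are hypothetical, not part of amdgpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP32 0x0080
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64 0x0100

#define ATOMIC_CAPS (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | \
		     PCI_EXP_DEVCAP2_ATOMIC_COMP64)

/* SR-IOV path: the comparison itself is already a boolean -- no
 * "? TRUE : FALSE" needed, as the review points out. */
static bool atomics_from_pf2vf(uint32_t enabled_flags)
{
	return enabled_flags == ATOMIC_CAPS;
}

/* Bare-metal path: pci_enable_atomic_ops_to_root() returns 0 on
 * success, so "!ret" expresses supported/unsupported directly. */
static bool atomics_from_root_port(int ret)
{
	return !ret;
}
```

The follow-up check then reads `if (!adev->have_atomics_support)`, which
avoids the `=`-vs-`==` bug in the posted patch entirely.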


[PATCH] drm/amdgpu: Get atomicOps info from Host for sriov setup

2021-09-09 Thread shaoyunl
The AtomicOp Requester Enable bit is reserved in VFs, and the PF value
applies to all associated VFs, so the guest driver cannot directly enable
AtomicOps for a VF; it depends on the PF to enable them. In the current
design, the amdgpu driver will get the enabled AtomicOps bits through
private pf2vf data.

Signed-off-by: shaoyunl 
Change-Id: Ifdbcb4396d64e3f3cbf6bcbf7ab9c7b2cb061052
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 20 ++--
 drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h |  4 +++-
 2 files changed, 21 insertions(+), 3 deletions(-)
 mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
 mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
old mode 100644
new mode 100755
index 653bd8fdaa33..a0d2b9eb84fc
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2167,8 +2167,6 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
return -EINVAL;
}
 
-   amdgpu_amdkfd_device_probe(adev);
-
adev->pm.pp_feature = amdgpu_pp_feature_mask;
if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
@@ -3562,6 +3560,24 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (r)
return r;
 
+   /* enable PCIE atomic ops */
+   if (amdgpu_sriov_bios(adev))
+   adev->have_atomics_support = (((struct amd_sriov_msg_pf2vf_info 
*)
+   
adev->virt.fw_reserve.p_pf2vf)->pcie_atomic_ops_enabled_flags ==
+   (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | 
PCI_EXP_DEVCAP2_ATOMIC_COMP64))
+   ? TRUE : FALSE;
+   else
+   adev->have_atomics_support =
+   pci_enable_atomic_ops_to_root(adev->pdev,
+ PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+ PCI_EXP_DEVCAP2_ATOMIC_COMP64)
+   ? FALSE : TRUE;
+   if (adev->have_atomics_support = false )
+   DRM_INFO("PCIE atomic ops is not supported\n");
+
+   amdgpu_amdkfd_device_probe(adev);
+
+
/* doorbell bar mapping and doorbell index init*/
amdgpu_device_doorbell_init(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h 
b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
old mode 100644
new mode 100755
index a434c71fde8e..995899191288
--- a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
@@ -204,8 +204,10 @@ struct amd_sriov_msg_pf2vf_info {
} mm_bw_management[AMD_SRIOV_MSG_RESERVE_VCN_INST];
/* UUID info */
struct amd_sriov_msg_uuid_info uuid_info;
+   /* pcie atomic Ops info */
+   uint32_t pcie_atomic_ops_enabled_flags;
/* reserved */
-   uint32_t reserved[256 - 47];
+   uint32_t reserved[256 - 48];
 };
 
 struct amd_sriov_msg_vf2pf_info_header {
-- 
2.17.1
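The reserved-array bookkeeping in the pf2vf change above (shrinking
`reserved` from `256 - 47` to `256 - 48` when a dword is added) can be
guarded at compile time. A standalone sketch, with the field layout
simplified and the names hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Skeleton of the pf2vf info block: the real amdgpu struct has many more
 * fields; only the size bookkeeping is illustrated here. */
#define PF2VF_TOTAL_DWORDS 256
#define PF2VF_USED_DWORDS  48	/* was 47 before the new flags field */

struct pf2vf_sketch {
	uint32_t used[PF2VF_USED_DWORDS - 1];	/* pre-existing fields */
	uint32_t pcie_atomic_ops_enabled_flags;	/* the newly added field */
	uint32_t reserved[PF2VF_TOTAL_DWORDS - PF2VF_USED_DWORDS];
};

/* Adding a field must shrink "reserved" so the shared PF/VF ABI size is
 * unchanged; this fails the build if the two edits ever get out of sync. */
static_assert(sizeof(struct pf2vf_sketch) ==
	      PF2VF_TOTAL_DWORDS * sizeof(uint32_t),
	      "pf2vf block must stay exactly 256 dwords");
```

A check of this shape catches the classic mistake of adding a field
without adjusting the reserved tail.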



Re: [PATCH] Enable '-Werror' by default for all kernel builds

2021-09-09 Thread Guenter Roeck

On 9/9/21 12:30 AM, Christian König wrote:

Am 09.09.21 um 08:07 schrieb Guenter Roeck:

On 9/8/21 10:58 PM, Christoph Hellwig wrote:

On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:

It'd be good to avoid. It has helped uncover build issues with KASAN in
the past. Or at least make it dependent on the problematic architecture.
For example if arm is a problem, something like this:


I'm also seeing quite a few stack size warnings with KASAN on x86_64
without COMPILE_TEST using gcc 10.2.1 from Debian. In fact there are a
few warnings without KASAN, but with KASAN there are a lot more.
I'll try to find some time to dig into them.

While we're at it, with -Werror something like this is really futile:

drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function 
‘amdgpu_bo_support_uswc’:
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning
Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to 
write-combining [-Wcpp
   493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better 
performance \
   |  ^~~


Ah, yes good point!



I have been wondering if all those #warning "errors" should either
be removed or be replaced with "#pragma message".


Well, we started to add those warnings because people compiled their kernels
without CONFIG_MTRR and CONFIG_X86_PAT and then wondered why the performance
of the display driver was so crappy.

When those warnings now generate an error that you have to disable
explicitly, that might not be bad at all.

It at least points people to this setting and makes it really clear that they 
are doing something very unusual and need to keep in mind that it might not 
have the desired result.



That specific warning is surrounded with "#ifndef CONFIG_COMPILE_TEST"
so it doesn't really matter because it doesn't cause test build failures.
Of course, we could do the same for any #warning which does now
cause a test build failure.

Guenter
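The #warning-vs-#pragma-message trade-off discussed above can be sketched
in a standalone translation unit. The CONFIG_* symbols below stand in for
the real Kconfig defines:

```c
#include <assert.h>

/* A performance hint that should not break randconfig builds under
 * -Werror. When the relevant option is off and this is not a
 * COMPILE_TEST build, emit an informational message: unlike #warning,
 * #pragma message (supported by gcc and clang) is not promoted to an
 * error by -Werror. */
#if !defined(CONFIG_X86_PAT) && !defined(CONFIG_COMPILE_TEST)
#pragma message "Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance"
#endif

static int uswc_supported(void)
{
#ifdef CONFIG_X86_PAT
	return 1;	/* write-combining available */
#else
	return 0;	/* fall back to uncached mappings */
#endif
}
```

The `#ifndef CONFIG_COMPILE_TEST` guard keeps allyesconfig/randconfig
builds quiet while still nudging real users toward the right options.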


Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()

2021-09-09 Thread Christophe Leroy




On 9/8/21 10:58 PM, Tom Lendacky wrote:

Introduce a powerpc version of the cc_platform_has() function. This will
be used to replace the powerpc mem_encrypt_active() implementation, so
the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
attribute.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: Tom Lendacky 
---
  arch/powerpc/platforms/pseries/Kconfig   |  1 +
  arch/powerpc/platforms/pseries/Makefile  |  2 ++
  arch/powerpc/platforms/pseries/cc_platform.c | 26 
  3 files changed, 29 insertions(+)
  create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c

diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 5e037df2a3a1..2e57391e0778 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -159,6 +159,7 @@ config PPC_SVM
select SWIOTLB
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
+   select ARCH_HAS_CC_PLATFORM
help
 There are certain POWER platforms which support secure guests using
 the Protected Execution Facility, with the help of an Ultravisor
diff --git a/arch/powerpc/platforms/pseries/Makefile 
b/arch/powerpc/platforms/pseries/Makefile
index 4cda0ef87be0..41d8aee98da4 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -31,3 +31,5 @@ obj-$(CONFIG_FA_DUMP) += rtas-fadump.o
  
  obj-$(CONFIG_SUSPEND)		+= suspend.o

  obj-$(CONFIG_PPC_VAS) += vas.o
+
+obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o
diff --git a/arch/powerpc/platforms/pseries/cc_platform.c 
b/arch/powerpc/platforms/pseries/cc_platform.c
new file mode 100644
index ..e8021af83a19
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/cc_platform.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Confidential Computing Platform Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+
+bool cc_platform_has(enum cc_attr attr)
+{


Please keep this function inline, as mem_encrypt_active() is.



+   switch (attr) {
+   case CC_ATTR_MEM_ENCRYPT:
+   return is_secure_guest();
+
+   default:
+   return false;
+   }
+}
+EXPORT_SYMBOL_GPL(cc_platform_has);



Re: [PATCH v3 0/8] Implement generic cc_platform_has() helper function

2021-09-09 Thread Christian Borntraeger




On 09.09.21 00:58, Tom Lendacky wrote:

This patch series provides a generic helper function, cc_platform_has(),
to replace the sme_active(), sev_active(), sev_es_active() and
mem_encrypt_active() functions.

It is expected that as new confidential computing technologies are
added to the kernel, they can all be covered by a single function call
instead of a collection of specific function calls all called from the
same locations.

The powerpc and s390 patches have been compile tested only. Can the
folks copied on this series verify that nothing breaks for them.


Is there a tree somewhere?

 Also,

a new file, arch/powerpc/platforms/pseries/cc_platform.c, has been
created for powerpc to hold the out of line function.

Cc: Andi Kleen 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Daniel Vetter 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Airlie 
Cc: Heiko Carstens 
Cc: Ingo Molnar 
Cc: Joerg Roedel 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Thomas Zimmermann 
Cc: Vasily Gorbik 
Cc: VMware Graphics 
Cc: Will Deacon 
Cc: Christoph Hellwig 

---

Patches based on:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
   4b93c544e90e ("thunderbolt: test: split up test cases in 
tb_test_credit_alloc_all")

Changes since v2:
- Changed the name from prot_guest_has() to cc_platform_has()
- Took the cc_platform_has() function out of line. Created two new files,
   cc_platform.c, in both x86 and ppc to implement the function. As a
   result, also changed the attribute defines into enums.
- Removed any received Reviewed-by's and Acked-by's given changes in this
   version.
- Added removal of new instances of mem_encrypt_active() usage in powerpc
   arch.
- Based on latest Linux tree to pick up powerpc changes related to the
   mem_encrypt_active() function.

Changes since v1:
- Moved some arch ioremap functions within #ifdef CONFIG_AMD_MEM_ENCRYPT
   in prep for use of prot_guest_has() by TDX.
- Added type includes to the protected_guest.h header file to prevent
   build errors outside of x86.
- Made amd_prot_guest_has() EXPORT_SYMBOL_GPL
- Used amd_prot_guest_has() in place of checking sme_me_mask in the
   arch/x86/mm/mem_encrypt.c file.

Tom Lendacky (8):
   x86/ioremap: Selectively build arch override encryption functions
   mm: Introduce a function to check for confidential computing features
   x86/sev: Add an x86 version of cc_platform_has()
   powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
   x86/sme: Replace occurrences of sme_active() with cc_platform_has()
   x86/sev: Replace occurrences of sev_active() with cc_platform_has()
   x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
   treewide: Replace the use of mem_encrypt_active() with
 cc_platform_has()

  arch/Kconfig |  3 +
  arch/powerpc/include/asm/mem_encrypt.h   |  5 --
  arch/powerpc/platforms/pseries/Kconfig   |  1 +
  arch/powerpc/platforms/pseries/Makefile  |  2 +
  arch/powerpc/platforms/pseries/cc_platform.c | 26 ++
  arch/powerpc/platforms/pseries/svm.c |  5 +-
  arch/s390/include/asm/mem_encrypt.h  |  2 -
  arch/x86/Kconfig |  1 +
  arch/x86/include/asm/io.h|  8 ++
  arch/x86/include/asm/kexec.h |  2 +-
  arch/x86/include/asm/mem_encrypt.h   | 14 +---
  arch/x86/kernel/Makefile |  3 +
  arch/x86/kernel/cc_platform.c| 21 +
  arch/x86/kernel/crash_dump_64.c  |  4 +-
  arch/x86/kernel/head64.c |  4 +-
  arch/x86/kernel/kvm.c|  3 +-
  arch/x86/kernel/kvmclock.c   |  4 +-
  arch/x86/kernel/machine_kexec_64.c   | 19 +++--
  arch/x86/kernel/pci-swiotlb.c|  9 +-
  arch/x86/kernel/relocate_kernel_64.S |  2 +-
  arch/x86/kernel/sev.c|  6 +-
  arch/x86/kvm/svm/svm.c   |  3 +-
  arch/x86/mm/ioremap.c| 18 ++--
  arch/x86/mm/mem_encrypt.c| 57 +++--
  arch/x86/mm/mem_encrypt_identity.c   |  3 +-
  arch/x86/mm/pat/set_memory.c |  3 +-
  arch/x86/platform/efi/efi_64.c   |  9 +-
  arch/x86/realmode/init.c |  8 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  4 +-
  drivers/gpu/drm/drm_cache.c  |  4 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  |  4 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_msg.c  |  6 +-
  drivers/iommu/amd/init.c |  7 +-
  drivers/iommu/amd/iommu.c|  3 +-
  drivers/iommu/amd/iommu_v2.c |  3 +-
  drivers/iommu/iommu.c|  3 +-
  fs/proc/vmcore.c | 

Re: [PATCH v3 8/8] treewide: Replace the use of mem_encrypt_active() with cc_platform_has()

2021-09-09 Thread Christophe Leroy




On 9/8/21 10:58 PM, Tom Lendacky wrote:


diff --git a/arch/powerpc/include/asm/mem_encrypt.h 
b/arch/powerpc/include/asm/mem_encrypt.h
index ba9dab07c1be..2f26b8fc8d29 100644
--- a/arch/powerpc/include/asm/mem_encrypt.h
+++ b/arch/powerpc/include/asm/mem_encrypt.h
@@ -10,11 +10,6 @@
  
  #include 
  
-static inline bool mem_encrypt_active(void)

-{
-   return is_secure_guest();
-}
-
  static inline bool force_dma_unencrypted(struct device *dev)
  {
return is_secure_guest();
diff --git a/arch/powerpc/platforms/pseries/svm.c 
b/arch/powerpc/platforms/pseries/svm.c
index 87f001b4c4e4..c083ecbbae4d 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -8,6 +8,7 @@
  
  #include 

  #include 
+#include 
  #include 
  #include 
  #include 
@@ -63,7 +64,7 @@ void __init svm_swiotlb_init(void)
  
  int set_memory_encrypted(unsigned long addr, int numpages)

  {
-   if (!mem_encrypt_active())
+   if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return 0;
  
  	if (!PAGE_ALIGNED(addr))

@@ -76,7 +77,7 @@ int set_memory_encrypted(unsigned long addr, int numpages)
  
  int set_memory_decrypted(unsigned long addr, int numpages)

  {
-   if (!mem_encrypt_active())
+   if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return 0;
  
  	if (!PAGE_ALIGNED(addr))


This change unnecessarily complicates the two functions. This is due to
cc_platform_has() being out of line. It should really remain inline.


Before the change we got:

 <.set_memory_encrypted>:
   0:   7d 20 00 a6 mfmsr   r9
   4:   75 29 00 40 andis.  r9,r9,64
   8:   41 82 00 48 beq 50 <.set_memory_encrypted+0x50>
   c:   78 69 04 20 clrldi  r9,r3,48
  10:   2c 29 00 00 cmpdi   r9,0
  14:   40 82 00 4c bne 60 <.set_memory_encrypted+0x60>
  18:   7c 08 02 a6 mflrr0
  1c:   7c 85 23 78 mr  r5,r4
  20:   78 64 85 02 rldicl  r4,r3,48,20
  24:   61 23 f1 34 ori r3,r9,61748
  28:   f8 01 00 10 std r0,16(r1)
  2c:   f8 21 ff 91 stdur1,-112(r1)
  30:   48 00 00 01 bl  30 <.set_memory_encrypted+0x30>
30: R_PPC64_REL24   .ucall_norets
  34:   60 00 00 00 nop
  38:   38 60 00 00 li  r3,0
  3c:   38 21 00 70 addir1,r1,112
  40:   e8 01 00 10 ld  r0,16(r1)
  44:   7c 08 03 a6 mtlrr0
  48:   4e 80 00 20 blr
  50:   38 60 00 00 li  r3,0
  54:   4e 80 00 20 blr
  60:   38 60 ff ea li  r3,-22
  64:   4e 80 00 20 blr

After the change we get:

 <.set_memory_encrypted>:
   0:   7c 08 02 a6 mflrr0
   4:   fb c1 ff f0 std r30,-16(r1)
   8:   fb e1 ff f8 std r31,-8(r1)
   c:   7c 7f 1b 78 mr  r31,r3
  10:   38 60 00 00 li  r3,0
  14:   7c 9e 23 78 mr  r30,r4
  18:   f8 01 00 10 std r0,16(r1)
  1c:   f8 21 ff 81 stdur1,-128(r1)
  20:   48 00 00 01 bl  20 <.set_memory_encrypted+0x20>
20: R_PPC64_REL24   .cc_platform_has
  24:   60 00 00 00 nop
  28:   2c 23 00 00 cmpdi   r3,0
  2c:   41 82 00 44 beq 70 <.set_memory_encrypted+0x70>
  30:   7b e9 04 20 clrldi  r9,r31,48
  34:   2c 29 00 00 cmpdi   r9,0
  38:   40 82 00 58 bne 90 <.set_memory_encrypted+0x90>
  3c:   38 60 00 00 li  r3,0
  40:   7f c5 f3 78 mr  r5,r30
  44:   7b e4 85 02 rldicl  r4,r31,48,20
  48:   60 63 f1 34 ori r3,r3,61748
  4c:   48 00 00 01 bl  4c <.set_memory_encrypted+0x4c>
4c: R_PPC64_REL24   .ucall_norets
  50:   60 00 00 00 nop
  54:   38 60 00 00 li  r3,0
  58:   38 21 00 80 addir1,r1,128
  5c:   e8 01 00 10 ld  r0,16(r1)
  60:   eb c1 ff f0 ld  r30,-16(r1)
  64:   eb e1 ff f8 ld  r31,-8(r1)
  68:   7c 08 03 a6 mtlrr0
  6c:   4e 80 00 20 blr
  70:   38 21 00 80 addir1,r1,128
  74:   38 60 00 00 li  r3,0
  78:   e8 01 00 10 ld  r0,16(r1)
  7c:   eb c1 ff f0 ld  r30,-16(r1)
  80:   eb e1 ff f8 ld  r31,-8(r1)
  84:   7c 08 03 a6 mtlrr0
  88:   4e 80 00 20 blr
  90:   38 60 ff ea li  r3,-22
  94:   4b ff ff c4 b   58 <.set_memory_encrypted+0x58>



Re: [PATCH v3 2/8] mm: Introduce a function to check for confidential computing features

2021-09-09 Thread Christophe Leroy




On 9/8/21 10:58 PM, Tom Lendacky wrote:

In prep for other confidential computing technologies, introduce a generic
helper function, cc_platform_has(), that can be used to check for specific


I have a little problem with that naming.

For me CC has always meant Compiler Collection.


active confidential computing attributes, like memory encryption. This is
intended to eliminate having to add multiple technology-specific checks to
the code (e.g. if (sev_active() || tdx_active())).

Co-developed-by: Andi Kleen 
Signed-off-by: Andi Kleen 
Co-developed-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Tom Lendacky 
---
  arch/Kconfig|  3 ++
  include/linux/cc_platform.h | 88 +
  2 files changed, 91 insertions(+)
  create mode 100644 include/linux/cc_platform.h

diff --git a/arch/Kconfig b/arch/Kconfig
index 3743174da870..ca7c359e5da8 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1234,6 +1234,9 @@ config RELR
  config ARCH_HAS_MEM_ENCRYPT
bool
  
+config ARCH_HAS_CC_PLATFORM

+   bool
+
  config HAVE_SPARSE_SYSCALL_NR
 bool
 help
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
new file mode 100644
index ..253f3ea66cd8
--- /dev/null
+++ b/include/linux/cc_platform.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Confidential Computing Platform Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _CC_PLATFORM_H
+#define _CC_PLATFORM_H
+
+#include 
+#include 
+
+/**
+ * enum cc_attr - Confidential computing attributes
+ *
+ * These attributes represent confidential computing features that are
+ * currently active.
+ */
+enum cc_attr {
+   /**
+* @CC_ATTR_MEM_ENCRYPT: Memory encryption is active
+*
+* The platform/OS is running with active memory encryption. This
+* includes running either as a bare-metal system or a hypervisor
+* and actively using memory encryption or as a guest/virtual machine
+* and actively using memory encryption.
+*
+* Examples include SME, SEV and SEV-ES.
+*/
+   CC_ATTR_MEM_ENCRYPT,
+
+   /**
+* @CC_ATTR_HOST_MEM_ENCRYPT: Host memory encryption is active
+*
+* The platform/OS is running as a bare-metal system or a hypervisor
+* and actively using memory encryption.
+*
+* Examples include SME.
+*/
+   CC_ATTR_HOST_MEM_ENCRYPT,
+
+   /**
+* @CC_ATTR_GUEST_MEM_ENCRYPT: Guest memory encryption is active
+*
+* The platform/OS is running as a guest/virtual machine and actively
+* using memory encryption.
+*
+* Examples include SEV and SEV-ES.
+*/
+   CC_ATTR_GUEST_MEM_ENCRYPT,
+
+   /**
+* @CC_ATTR_GUEST_STATE_ENCRYPT: Guest state encryption is active
+*
+* The platform/OS is running as a guest/virtual machine and actively
+* using memory encryption and register state encryption.
+*
+* Examples include SEV-ES.
+*/
+   CC_ATTR_GUEST_STATE_ENCRYPT,
+};
+
+#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
+
+/**
+ * cc_platform_has() - Checks if the specified cc_attr attribute is active
+ * @attr: Confidential computing attribute to check
+ *
+ * The cc_platform_has() function will return an indicator as to whether the
+ * specified Confidential Computing attribute is currently active.
+ *
+ * Context: Any context
+ * Return:
+ * * TRUE  - Specified Confidential Computing attribute is active
+ * * FALSE - Specified Confidential Computing attribute is not active
+ */
+bool cc_platform_has(enum cc_attr attr);


This declaration makes it impossible for architectures to define this
function inline.


For such a function, having it inline would make more sense, as it would
allow GCC to perform constant folding and avoid the overhead of calling
a sub-function.



+
+#else  /* !CONFIG_ARCH_HAS_CC_PLATFORM */
+
+static inline bool cc_platform_has(enum cc_attr attr) { return false; }
+
+#endif /* CONFIG_ARCH_HAS_CC_PLATFORM */
+
+#endif /* _CC_PLATFORM_H */
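Christophe's inlining point can be illustrated with a standalone sketch
of the header's two halves. This mirrors the patch above in a
self-contained form; the CONFIG macro is a stand-in for the real Kconfig
symbol, and only the !CONFIG path is actually compiled here:

```c
#include <assert.h>
#include <stdbool.h>

enum cc_attr {
	CC_ATTR_MEM_ENCRYPT,
	CC_ATTR_HOST_MEM_ENCRYPT,
	CC_ATTR_GUEST_MEM_ENCRYPT,
	CC_ATTR_GUEST_STATE_ENCRYPT,
};

#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
/* Out-of-line variant: every call site pays a real function call. */
bool cc_platform_has(enum cc_attr attr);
#else
/* Inline stub: with the config off, the compiler folds the constant
 * "false" and can drop the guarded code at each call site entirely. */
static inline bool cc_platform_has(enum cc_attr attr)
{
	(void)attr;
	return false;
}
#endif

/* Caller in the style of set_memory_encrypted() from the series. */
static int set_memory_encrypted_sketch(void)
{
	if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
		return 0;	/* the only reachable path in this sketch */
	return -1;		/* placeholder for the real ucall */
}
```

With the inline stub, the early-return branch is resolved at compile
time; with the out-of-line declaration, the call survives into the
generated code, which is exactly the overhead the disassembly in this
thread shows.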



Re: [PATCH v3 8/8] treewide: Replace the use of mem_encrypt_active() with cc_platform_has()

2021-09-09 Thread Tom Lendacky

On 9/9/21 2:25 AM, Christophe Leroy wrote:



On 9/8/21 10:58 PM, Tom Lendacky wrote:


diff --git a/arch/powerpc/include/asm/mem_encrypt.h 
b/arch/powerpc/include/asm/mem_encrypt.h

index ba9dab07c1be..2f26b8fc8d29 100644
--- a/arch/powerpc/include/asm/mem_encrypt.h
+++ b/arch/powerpc/include/asm/mem_encrypt.h
@@ -10,11 +10,6 @@
  #include 
-static inline bool mem_encrypt_active(void)
-{
-    return is_secure_guest();
-}
-
  static inline bool force_dma_unencrypted(struct device *dev)
  {
  return is_secure_guest();
diff --git a/arch/powerpc/platforms/pseries/svm.c 
b/arch/powerpc/platforms/pseries/svm.c

index 87f001b4c4e4..c083ecbbae4d 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -8,6 +8,7 @@
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -63,7 +64,7 @@ void __init svm_swiotlb_init(void)
  int set_memory_encrypted(unsigned long addr, int numpages)
  {
-    if (!mem_encrypt_active())
+    if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
  return 0;
  if (!PAGE_ALIGNED(addr))
@@ -76,7 +77,7 @@ int set_memory_encrypted(unsigned long addr, int 
numpages)

  int set_memory_decrypted(unsigned long addr, int numpages)
  {
-    if (!mem_encrypt_active())
+    if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
  return 0;
  if (!PAGE_ALIGNED(addr))


This change unnecessarily complicates the two functions. This is due to 
cc_platform_has() being out of line. It should really remain inline.


Please see previous discussion(s) on this series for why the function is
implemented out of line and for the naming:

V1: https://lore.kernel.org/lkml/cover.1627424773.git.thomas.lenda...@amd.com/

V2: https://lore.kernel.org/lkml/cover.1628873970.git.thomas.lenda...@amd.com/

Thanks,
Tom



Before the change we got:

 <.set_memory_encrypted>:
    0:    7d 20 00 a6 mfmsr   r9
    4:    75 29 00 40 andis.  r9,r9,64
    8:    41 82 00 48 beq 50 <.set_memory_encrypted+0x50>
    c:    78 69 04 20 clrldi  r9,r3,48
   10:    2c 29 00 00 cmpdi   r9,0
   14:    40 82 00 4c bne 60 <.set_memory_encrypted+0x60>
   18:    7c 08 02 a6 mflr    r0
   1c:    7c 85 23 78 mr  r5,r4
   20:    78 64 85 02 rldicl  r4,r3,48,20
   24:    61 23 f1 34 ori r3,r9,61748
   28:    f8 01 00 10 std r0,16(r1)
   2c:    f8 21 ff 91 stdu    r1,-112(r1)
   30:    48 00 00 01 bl  30 <.set_memory_encrypted+0x30>
     30: R_PPC64_REL24    .ucall_norets
   34:    60 00 00 00 nop
   38:    38 60 00 00 li  r3,0
   3c:    38 21 00 70 addi    r1,r1,112
   40:    e8 01 00 10 ld  r0,16(r1)
   44:    7c 08 03 a6 mtlr    r0
   48:    4e 80 00 20 blr
   50:    38 60 00 00 li  r3,0
   54:    4e 80 00 20 blr
   60:    38 60 ff ea li  r3,-22
   64:    4e 80 00 20 blr

After the change we get:

 <.set_memory_encrypted>:
    0:    7c 08 02 a6 mflr    r0
    4:    fb c1 ff f0 std r30,-16(r1)
    8:    fb e1 ff f8 std r31,-8(r1)
    c:    7c 7f 1b 78 mr  r31,r3
   10:    38 60 00 00 li  r3,0
   14:    7c 9e 23 78 mr  r30,r4
   18:    f8 01 00 10 std r0,16(r1)
   1c:    f8 21 ff 81 stdu    r1,-128(r1)
   20:    48 00 00 01 bl  20 <.set_memory_encrypted+0x20>
     20: R_PPC64_REL24    .cc_platform_has
   24:    60 00 00 00 nop
   28:    2c 23 00 00 cmpdi   r3,0
   2c:    41 82 00 44 beq 70 <.set_memory_encrypted+0x70>
   30:    7b e9 04 20 clrldi  r9,r31,48
   34:    2c 29 00 00 cmpdi   r9,0
   38:    40 82 00 58 bne 90 <.set_memory_encrypted+0x90>
   3c:    38 60 00 00 li  r3,0
   40:    7f c5 f3 78 mr  r5,r30
   44:    7b e4 85 02 rldicl  r4,r31,48,20
   48:    60 63 f1 34 ori r3,r3,61748
   4c:    48 00 00 01 bl  4c <.set_memory_encrypted+0x4c>
     4c: R_PPC64_REL24    .ucall_norets
   50:    60 00 00 00 nop
   54:    38 60 00 00 li  r3,0
   58:    38 21 00 80 addi    r1,r1,128
   5c:    e8 01 00 10 ld  r0,16(r1)
   60:    eb c1 ff f0 ld  r30,-16(r1)
   64:    eb e1 ff f8 ld  r31,-8(r1)
   68:    7c 08 03 a6 mtlr    r0
   6c:    4e 80 00 20 blr
   70:    38 21 00 80 addi    r1,r1,128
   74:    38 60 00 00 li  r3,0
   78:    e8 01 00 10 ld  r0,16(r1)
   7c:    eb c1 ff f0 ld  r30,-16(r1)
   80:    eb e1 ff f8 ld  r31,-8(r1)
   84:    7c 08 03 a6 mtlr    r0
   88:    4e 80 00 20 blr
   90:    38 60 ff ea li  r3,-22
   94:    4b ff ff c4 b   58 <.set_memory_encrypted+0x58>



Re: [PATCH v3 0/8] Implement generic cc_platform_has() helper function

2021-09-09 Thread Tom Lendacky

On 9/9/21 2:32 AM, Christian Borntraeger wrote:



On 09.09.21 00:58, Tom Lendacky wrote:

This patch series provides a generic helper function, cc_platform_has(),
to replace the sme_active(), sev_active(), sev_es_active() and
mem_encrypt_active() functions.

It is expected that as new confidential computing technologies are
added to the kernel, they can all be covered by a single function call
instead of a collection of specific function calls all called from the
same locations.

The powerpc and s390 patches have been compile tested only. Can the
folks copied on this series verify that nothing breaks for them.


Is there a tree somewhere?


I pushed it up to github:

https://github.com/AMDESE/linux/tree/prot-guest-has-v3

Thanks,
Tom



Also, a new file, arch/powerpc/platforms/pseries/cc_platform.c, has been
created for powerpc to hold the out of line function.

Cc: Andi Kleen 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Daniel Vetter 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Airlie 
Cc: Heiko Carstens 
Cc: Ingo Molnar 
Cc: Joerg Roedel 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Thomas Zimmermann 
Cc: Vasily Gorbik 
Cc: VMware Graphics 
Cc: Will Deacon 
Cc: Christoph Hellwig 

---

Patches based on:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
   4b93c544e90e ("thunderbolt: test: split up test cases in tb_test_credit_alloc_all")


Changes since v2:
- Changed the name from prot_guest_has() to cc_platform_has()
- Took the cc_platform_has() function out of line. Created two new files,
   cc_platform.c, in both x86 and ppc to implement the function. As a
   result, also changed the attribute defines into enums.
- Removed any received Reviewed-by's and Acked-by's given changes in this
   version.
- Added removal of new instances of mem_encrypt_active() usage in powerpc
   arch.
- Based on latest Linux tree to pick up powerpc changes related to the
   mem_encrypt_active() function.

Changes since v1:
- Moved some arch ioremap functions within #ifdef CONFIG_AMD_MEM_ENCRYPT
   in prep for use of prot_guest_has() by TDX.
- Added type includes to the protected_guest.h header file to prevent
   build errors outside of x86.
- Made amd_prot_guest_has() EXPORT_SYMBOL_GPL
- Used amd_prot_guest_has() in place of checking sme_me_mask in the
   arch/x86/mm/mem_encrypt.c file.

Tom Lendacky (8):
   x86/ioremap: Selectively build arch override encryption functions
   mm: Introduce a function to check for confidential computing features
   x86/sev: Add an x86 version of cc_platform_has()
   powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
   x86/sme: Replace occurrences of sme_active() with cc_platform_has()
   x86/sev: Replace occurrences of sev_active() with cc_platform_has()
   x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
   treewide: Replace the use of mem_encrypt_active() with
 cc_platform_has()

  arch/Kconfig |  3 +
  arch/powerpc/include/asm/mem_encrypt.h   |  5 --
  arch/powerpc/platforms/pseries/Kconfig   |  1 +
  arch/powerpc/platforms/pseries/Makefile  |  2 +
  arch/powerpc/platforms/pseries/cc_platform.c | 26 ++
  arch/powerpc/platforms/pseries/svm.c |  5 +-
  arch/s390/include/asm/mem_encrypt.h  |  2 -
  arch/x86/Kconfig |  1 +
  arch/x86/include/asm/io.h    |  8 ++
  arch/x86/include/asm/kexec.h |  2 +-
  arch/x86/include/asm/mem_encrypt.h   | 14 +---
  arch/x86/kernel/Makefile |  3 +
  arch/x86/kernel/cc_platform.c    | 21 +
  arch/x86/kernel/crash_dump_64.c  |  4 +-
  arch/x86/kernel/head64.c |  4 +-
  arch/x86/kernel/kvm.c    |  3 +-
  arch/x86/kernel/kvmclock.c   |  4 +-
  arch/x86/kernel/machine_kexec_64.c   | 19 +++--
  arch/x86/kernel/pci-swiotlb.c    |  9 +-
  arch/x86/kernel/relocate_kernel_64.S |  2 +-
  arch/x86/kernel/sev.c    |  6 +-
  arch/x86/kvm/svm/svm.c   |  3 +-
  arch/x86/mm/ioremap.c    | 18 ++--
  arch/x86/mm/mem_encrypt.c    | 57 +++--
  arch/x86/mm/mem_encrypt_identity.c   |  3 +-
  arch/x86/mm/pat/set_memory.c |  3 +-
  arch/x86/platform/efi/efi_64.c   |  9 +-
  

Re: [PATCH] Enable '-Werror' by default for all kernel builds

2021-09-09 Thread Arnd Bergmann
On Thu, Sep 9, 2021 at 1:43 PM Marco Elver  wrote:
> On Thu, 9 Sept 2021 at 13:00, Arnd Bergmann  wrote:
> > On Thu, Sep 9, 2021 at 12:54 PM Marco Elver  wrote:
> > > On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig wrote:
> > > > On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > > > > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > > > > the past. Or at least make it dependent on the problematic architecture.
> > > > > For example if arm is a problem, something like this:
> > > >
> > > > I'm also seeing quite a few stack size warnings with KASAN on x86_64
> > > > without COMPILE_TEST using gcc 10.2.1 from Debian.  In fact there are a
> > > > few warnings without KASAN, but with KASAN there are a lot more.
> > > > I'll try to find some time to dig into them.
> > >
> > > Right, this reminded me that we actually at least double the real
> > > stack size for KASAN builds, because it inherently requires more stack
> > > space. I think we need Wframe-larger-than to match that, otherwise
> > > we'll just keep having this problem:
> > >
> > > https://lkml.kernel.org/r/20210909104925.809674-1-el...@google.com
> >
> > The problem with this is that it completely defeats the point of the
> > stack size warnings in allmodconfig kernels when they have KASAN
> > enabled and end up missing obvious code bugs in drivers that put
> > large structures on the stack. Let's not go there.
>
> Sure, but the reality is that the real stack size is already doubled
> for KASAN. And that should be reflected in Wframe-larger-than.

I don't think "double" is an accurate description of what is going on;
it's much more complex than that. There are some functions
that completely explode with KASAN_STACK enabled on clang,
and many other function instances that don't grow much at all.

I've been building randconfig kernels for a long time with KASAN_STACK
enabled on gcc, with the limit increased to 1440 bytes for 32-bit
and left at the normal 2048 bytes for 64-bit. I have
some patches to address the outliers and should go through and
resend some of those.

With the same limits and patches using clang, and KASAN=y but
KASAN_STACK=n I also get no warnings in randconfig builds,
but KASAN_STACK on clang doesn't really seem to have a good
limit that would make an allmodconfig kernel build with no warnings.

These are the worst offenders I see based on configuration, using
an 32-bit ARM allmodconfig with my fixups:

gcc-11, KASAN, no KASAN_STACK, FRAME_WARN=1024:
(nothing)

gcc-11, KASAN_STACK:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:782:1:
warning: the frame size of 1416 bytes is larger than 1024 bytes
[-Wframe-larger-than=]
drivers/media/dvb-frontends/mxl5xx.c:1575:1: warning: the frame size
of 1240 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/mtd/nftlcore.c:468:1: warning: the frame size of 1232 bytes is
larger than 1024 bytes [-Wframe-larger-than=]
drivers/char/ipmi/ipmi_msghandler.c:4880:1: warning: the frame size of
1232 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/mtd/chips/cfi_cmdset_0001.c:1870:1: warning: the frame size of
1224 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/net/wireless/ath/ath9k/ar9003_paprd.c:749:1: warning: the
frame size of 1216 bytes is larger than 1024 bytes
[-Wframe-larger-than=]
drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c:136:1: warning:
the frame size of 1216 bytes is larger than 1024 bytes
[-Wframe-larger-than=]
drivers/ntb/hw/idt/ntb_hw_idt.c:1116:1: warning: the frame size of
1200 bytes is larger than 1024 bytes [-Wframe-larger-than=]
net/dcb/dcbnl.c:1172:1: warning: the frame size of 1192 bytes is
larger than 1024 bytes [-Wframe-larger-than=]
fs/select.c:1042:1: warning: the frame size of 1192 bytes is larger
than 1024 bytes [-Wframe-larger-than=]

clang-12 KASAN, no KASAN_STACK, FRAME_WARN=1024:

kernel/trace/trace_events_hist.c:4601:13: error: stack frame size 1384
exceeds limit 1024 in function 'hist_trigger_print_key'
[-Werror,-Wframe-larger-than]
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3045:6:
error: stack frame size 1384 exceeds limit 1024 in function 'bw_calcs'
[-Werror,-Wframe-larger-than]
drivers/staging/fbtft/fbtft-core.c:992:5: error: stack frame size 1208
exceeds limit 1024 in function 'fbtft_init_display'
[-Werror,-Wframe-larger-than]
crypto/wp512.c:782:13: error: stack frame size 1176 exceeds limit 1024
in function 'wp512_process_buffer' [-Werror,-Wframe-larger-than]
drivers/staging/fbtft/fbtft-core.c:902:12: error: stack frame size
1080 exceeds limit 1024 in function 'fbtft_init_display_from_property'
[-Werror,-Wframe-larger-than]
drivers/mtd/chips/cfi_cmdset_0001.c:1872:12: error: stack frame size
1064 exceeds limit 1024 in function 'cfi_intelext_writev'
[-Werror,-Wframe-larger-than]
drivers/staging/rtl8723bs/core/rtw_security.c:1288:5: error: stack
frame size 1040 exceeds limit 1024 in function 

[PATCH AUTOSEL 4.4 22/35] gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ]

The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
   addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index 31a676376d73..3490d300bed2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -340,7 +340,7 @@ static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan 
*i2c_bus,
 void
 amdgpu_i2c_router_select_ddc_port(struct amdgpu_connector *amdgpu_connector)
 {
-   u8 val;
+   u8 val = 0;
 
if (!amdgpu_connector->router.ddc_valid)
return;
-- 
2.30.2



[PATCH AUTOSEL 4.9 32/48] gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ]

The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
   addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index 91d367399956..a334eb7dbff4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -339,7 +339,7 @@ static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan 
*i2c_bus,
 void
 amdgpu_i2c_router_select_ddc_port(const struct amdgpu_connector 
*amdgpu_connector)
 {
-   u8 val;
+   u8 val = 0;
 
if (!amdgpu_connector->router.ddc_valid)
return;
-- 
2.30.2



[PATCH AUTOSEL 4.14 36/59] gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ]

The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
   addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index f2739995c335..199eccee0b0b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -338,7 +338,7 @@ static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan 
*i2c_bus,
 void
 amdgpu_i2c_router_select_ddc_port(const struct amdgpu_connector 
*amdgpu_connector)
 {
-   u8 val;
+   u8 val = 0;
 
if (!amdgpu_connector->router.ddc_valid)
return;
-- 
2.30.2



Subject: [PATCH 1/1] drm/amdgpu: Update RAS trigger error block support

2021-09-09 Thread Clements, John
[AMD Official Use Only]

Submitting patch to update RAS trigger error to support additional blocks


0002-drm-amdgpu-Update-RAS-trigger-error-block-support.patch
Description: 0002-drm-amdgpu-Update-RAS-trigger-error-block-support.patch


[PATCH AUTOSEL 4.19 46/74] gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ]

The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
   addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index f2739995c335..199eccee0b0b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -338,7 +338,7 @@ static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan 
*i2c_bus,
 void
 amdgpu_i2c_router_select_ddc_port(const struct amdgpu_connector 
*amdgpu_connector)
 {
-   u8 val;
+   u8 val = 0;
 
if (!amdgpu_connector->router.ddc_valid)
return;
-- 
2.30.2



[PATCH AUTOSEL 4.19 16/74] drm/amd/amdgpu: Update debugfs link_settings output link_rate field in hex

2021-09-09 Thread Sasha Levin
From: Anson Jacob 

[ Upstream commit 1a394b3c3de2577f200cb623c52a5c2b82805cec ]

link_rate is updated via debugfs using hex values, set it to output
in hex as well.

eg: Resolution: 1920x1080@144Hz
cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x14  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred:  0  0x0  0

echo "4 0x1e" > /sys/kernel/debug/dri/0/DP-1/link_settings

cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x1e  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred:  4  0x1e  0

Signed-off-by: Anson Jacob 
Reviewed-by: Harry Wentland 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c| 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 0d9e410ca01e..dbfe5623997d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -92,29 +92,29 @@ static ssize_t dp_link_settings_read(struct file *f, char 
__user *buf,
 
rd_buf_ptr = rd_buf;
 
-   str_len = strlen("Current:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Current:  %d  %d  %d  ",
+   str_len = strlen("Current:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Current:  %d  0x%x  %d  ",
link->cur_link_settings.lane_count,
link->cur_link_settings.link_rate,
link->cur_link_settings.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Verified:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Verified:  %d  %d  %d  ",
+   str_len = strlen("Verified:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Verified:  %d  0x%x  %d  ",
link->verified_link_cap.lane_count,
link->verified_link_cap.link_rate,
link->verified_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Reported:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Reported:  %d  %d  %d  ",
+   str_len = strlen("Reported:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Reported:  %d  0x%x  %d  ",
link->reported_link_cap.lane_count,
link->reported_link_cap.link_rate,
link->reported_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Preferred:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  %d  %d\n",
+   str_len = strlen("Preferred:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  0x%x  %d\n",
link->preferred_link_setting.lane_count,
link->preferred_link_setting.link_rate,
link->preferred_link_setting.link_spread);
-- 
2.30.2



[PATCH AUTOSEL 5.4 099/109] drm/amdkfd: Account for SH/SE count when setting up cu masks.

2021-09-09 Thread Sasha Levin
From: Sean Keely 

[ Upstream commit 1ec06c2dee679e9f089e78ed20cb74ee90155f61 ]

On systems with multiple SH per SE compute_static_thread_mgmt_se#
is split into independent masks, one for each SH, in the upper and
lower 16 bits.  We need to detect this and apply cu masking to each
SH.  The cu mask bits are assigned first to each SE, then to
alternate SHs, then finally to higher CU id.  This ensures that
the maximum number of SPIs are engaged as early as possible while
balancing CU assignment to each SH.

v2: Use max SH/SE rather than max SH in cu_per_sh.

v3: Fix comment blocks, ensure se_mask is initially zero filled,
and correctly assign se.sh.cu positions to unset bits in cu_mask.

Signed-off-by: Sean Keely 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 84 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h |  1 +
 2 files changed, 64 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 88813dad731f..c021519af810 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -98,36 +98,78 @@ void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
uint32_t *se_mask)
 {
struct kfd_cu_info cu_info;
-   uint32_t cu_per_se[KFD_MAX_NUM_SE] = {0};
-   int i, se, sh, cu = 0;
-
+   uint32_t cu_per_sh[KFD_MAX_NUM_SE][KFD_MAX_NUM_SH_PER_SE] = {0};
+   int i, se, sh, cu;
	amdgpu_amdkfd_get_cu_info(mm->dev->kgd, &cu_info);
 
if (cu_mask_count > cu_info.cu_active_number)
cu_mask_count = cu_info.cu_active_number;
 
+   /* Exceeding these bounds corrupts the stack and indicates a coding 
error.
+* Returning with no CU's enabled will hang the queue, which should be
+* attention grabbing.
+*/
+   if (cu_info.num_shader_engines > KFD_MAX_NUM_SE) {
+   pr_err("Exceeded KFD_MAX_NUM_SE, chip reports %d\n", 
cu_info.num_shader_engines);
+   return;
+   }
+   if (cu_info.num_shader_arrays_per_engine > KFD_MAX_NUM_SH_PER_SE) {
+   pr_err("Exceeded KFD_MAX_NUM_SH, chip reports %d\n",
+   cu_info.num_shader_arrays_per_engine * 
cu_info.num_shader_engines);
+   return;
+   }
+   /* Count active CUs per SH.
+*
+* Some CUs in an SH may be disabled.   HW expects disabled CUs to be
+* represented in the high bits of each SH's enable mask (the upper and 
lower
+* 16 bits of se_mask) and will take care of the actual distribution of
+* disabled CUs within each SH automatically.
+* Each half of se_mask must be filled only on bits 
0-cu_per_sh[se][sh]-1.
+*
+* See note on Arcturus cu_bitmap layout in gfx_v9_0_get_cu_info.
+*/
for (se = 0; se < cu_info.num_shader_engines; se++)
for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++)
-   cu_per_se[se] += hweight32(cu_info.cu_bitmap[se % 4][sh 
+ (se / 4)]);
-
-   /* Symmetrically map cu_mask to all SEs:
-* cu_mask[0] bit0 -> se_mask[0] bit0;
-* cu_mask[0] bit1 -> se_mask[1] bit0;
-* ... (if # SE is 4)
-* cu_mask[0] bit4 -> se_mask[0] bit1;
+   cu_per_sh[se][sh] = hweight32(cu_info.cu_bitmap[se % 
4][sh + (se / 4)]);
+
+   /* Symmetrically map cu_mask to all SEs & SHs:
+* se_mask programs up to 2 SH in the upper and lower 16 bits.
+*
+* Examples
+* Assuming 1 SH/SE, 4 SEs:
+* cu_mask[0] bit0 -> se_mask[0] bit0
+* cu_mask[0] bit1 -> se_mask[1] bit0
+* ...
+* cu_mask[0] bit4 -> se_mask[0] bit1
+* ...
+*
+* Assuming 2 SH/SE, 4 SEs
+* cu_mask[0] bit0 -> se_mask[0] bit0 (SE0,SH0,CU0)
+* cu_mask[0] bit1 -> se_mask[1] bit0 (SE1,SH0,CU0)
+* ...
+* cu_mask[0] bit4 -> se_mask[0] bit16 (SE0,SH1,CU0)
+* cu_mask[0] bit5 -> se_mask[1] bit16 (SE1,SH1,CU0)
+* ...
+* cu_mask[0] bit8 -> se_mask[0] bit1 (SE0,SH0,CU1)
 * ...
+*
+* First ensure all CUs are disabled, then enable user specified CUs.
 */
-   se = 0;
-   for (i = 0; i < cu_mask_count; i++) {
-   if (cu_mask[i / 32] & (1 << (i % 32)))
-   se_mask[se] |= 1 << cu;
-
-   do {
-   se++;
-   if (se == cu_info.num_shader_engines) {
-   se = 0;
-   cu++;
+   for (i = 0; i < cu_info.num_shader_engines; i++)
+   se_mask[i] = 0;
+
+   i = 0;
+   for (cu = 0; cu < 16; cu++) {
+   for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++) {
+   for (se = 0; se < 

[PATCH AUTOSEL 5.4 063/109] drm/display: fix possible null-pointer dereference in dcn10_set_clock()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit 554594567b1fa3da74f88ec7b2dc83d000c58e98 ]

The variable dc->clk_mgr is checked in:
  if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)

This indicates dc->clk_mgr can be NULL.
However, it is dereferenced in:
if (!dc->clk_mgr->funcs->get_clock)

To fix this null-pointer dereference, check dc->clk_mgr and the function
pointer dc->clk_mgr->funcs->get_clock earlier, and return if one of them
is NULL.

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 60123db7ba02..bc5ebea1abed 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -3264,13 +3264,12 @@ static enum dc_status dcn10_set_clock(struct dc *dc,
struct dc_clock_config clock_cfg = {0};
	struct dc_clocks *current_clocks = &context->bw_ctx.bw.dcn.clk;
 
-   if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)
-   dc->clk_mgr->funcs->get_clock(dc->clk_mgr,
-   context, clock_type, &clock_cfg);
-
-   if (!dc->clk_mgr->funcs->get_clock)
+   if (!dc->clk_mgr || !dc->clk_mgr->funcs->get_clock)
return DC_FAIL_UNSUPPORTED_1;
 
+   dc->clk_mgr->funcs->get_clock(dc->clk_mgr,
+   context, clock_type, &clock_cfg);
+
if (clk_khz > clock_cfg.max_clock_khz)
return DC_FAIL_CLK_EXCEED_MAX;
 
@@ -3288,7 +3287,7 @@ static enum dc_status dcn10_set_clock(struct dc *dc,
else
return DC_ERROR_UNEXPECTED;
 
-   if (dc->clk_mgr && dc->clk_mgr->funcs->update_clocks)
+   if (dc->clk_mgr->funcs->update_clocks)
dc->clk_mgr->funcs->update_clocks(dc->clk_mgr,
context, true);
return DC_OK;
-- 
2.30.2



[PATCH AUTOSEL 5.4 062/109] gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ]

The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
   addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index 70dbe343f51d..89cecdba81ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -339,7 +339,7 @@ static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan 
*i2c_bus,
 void
 amdgpu_i2c_router_select_ddc_port(const struct amdgpu_connector 
*amdgpu_connector)
 {
-   u8 val;
+   u8 val = 0;
 
if (!amdgpu_connector->router.ddc_valid)
return;
-- 
2.30.2



[PATCH AUTOSEL 5.4 025/109] drm/amd/amdgpu: Update debugfs link_settings output link_rate field in hex

2021-09-09 Thread Sasha Levin
From: Anson Jacob 

[ Upstream commit 1a394b3c3de2577f200cb623c52a5c2b82805cec ]

link_rate is updated via debugfs using hex values, set it to output
in hex as well.

eg: Resolution: 1920x1080@144Hz
cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x14  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred:  0  0x0  0

echo "4 0x1e" > /sys/kernel/debug/dri/0/DP-1/link_settings

cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x1e  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred:  4  0x1e  0

Signed-off-by: Anson Jacob 
Reviewed-by: Harry Wentland 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c| 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index f3dfb2887ae0..2cdcefab2d7d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -95,29 +95,29 @@ static ssize_t dp_link_settings_read(struct file *f, char 
__user *buf,
 
rd_buf_ptr = rd_buf;
 
-   str_len = strlen("Current:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Current:  %d  %d  %d  ",
+   str_len = strlen("Current:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Current:  %d  0x%x  %d  ",
link->cur_link_settings.lane_count,
link->cur_link_settings.link_rate,
link->cur_link_settings.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Verified:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Verified:  %d  %d  %d  ",
+   str_len = strlen("Verified:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Verified:  %d  0x%x  %d  ",
link->verified_link_cap.lane_count,
link->verified_link_cap.link_rate,
link->verified_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Reported:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Reported:  %d  %d  %d  ",
+   str_len = strlen("Reported:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Reported:  %d  0x%x  %d  ",
link->reported_link_cap.lane_count,
link->reported_link_cap.link_rate,
link->reported_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Preferred:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  %d  %d\n",
+   str_len = strlen("Preferred:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  0x%x  %d\n",
link->preferred_link_setting.lane_count,
link->preferred_link_setting.link_rate,
link->preferred_link_setting.link_spread);
-- 
2.30.2



[PATCH AUTOSEL 5.4 023/109] drm/amd/display: Fix timer_per_pixel unit error

2021-09-09 Thread Sasha Levin
From: Oliver Logush 

[ Upstream commit 23e55639b87fb16a9f0f66032ecb57060df6c46c ]

[why]
The units of the time_per_pixel variable were incorrect, this had to be
changed for the code to properly function.

[how]
The change was very straightforward, only required one line of code to
be changed where the calculation was done.

Acked-by: Rodrigo Siqueira 
Signed-off-by: Oliver Logush 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 2b1175bb2dae..d2ea4c003d44 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2232,7 +2232,7 @@ void dcn20_set_mcif_arb_params(
	wb_arb_params->cli_watermark[k] = get_wm_writeback_urgent(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000;
	wb_arb_params->pstate_watermark[k] = get_wm_writeback_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000;
}
-   wb_arb_params->time_per_pixel = 16.0 / context->res_ctx.pipe_ctx[i].stream->phy_pix_clk; /* 4 bit fraction, ms */
+   wb_arb_params->time_per_pixel = 16.0 * 1000 / (context->res_ctx.pipe_ctx[i].stream->phy_pix_clk / 1000); /* 4 bit fraction, ms */
wb_arb_params->slice_lines = 32;
wb_arb_params->arbitration_slice = 2;
wb_arb_params->max_scaled_time = 
dcn20_calc_max_scaled_time(wb_arb_params->time_per_pixel,
-- 
2.30.2



[PATCH AUTOSEL 5.4 001/109] drm/amdgpu: Fix amdgpu_ras_eeprom_init()

2021-09-09 Thread Sasha Levin
From: Luben Tuikov 

[ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ]

No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes to the lower layers.

Cc: Andrey Grodzovsky 
Cc: Alexander Deucher 
Signed-off-by: Luben Tuikov 
Acked-by: Alexander Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 8a32b5c93778..bd7ae3e130b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -138,7 +138,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control 
*control)
return ret;
}
 
-   __decode_table_header_from_buff(hdr, &buff[2]);
+   __decode_table_header_from_buff(hdr, buff);
 
if (hdr->header == EEPROM_TABLE_HDR_VAL) {
control->num_recs = (hdr->tbl_size - EEPROM_TABLE_HEADER_SIZE) /
-- 
2.30.2



[PATCH AUTOSEL 5.10 158/176] drm/amdkfd: Account for SH/SE count when setting up cu masks.

2021-09-09 Thread Sasha Levin
From: Sean Keely 

[ Upstream commit 1ec06c2dee679e9f089e78ed20cb74ee90155f61 ]

On systems with multiple SH per SE compute_static_thread_mgmt_se#
is split into independent masks, one for each SH, in the upper and
lower 16 bits.  We need to detect this and apply cu masking to each
SH.  The cu mask bits are assigned first to each SE, then to
alternate SHs, then finally to higher CU id.  This ensures that
the maximum number of SPIs are engaged as early as possible while
balancing CU assignment to each SH.

v2: Use max SH/SE rather than max SH in cu_per_sh.

v3: Fix comment blocks, ensure se_mask is initially zero filled,
and correctly assign se.sh.cu positions to unset bits in cu_mask.

Signed-off-by: Sean Keely 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 84 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h |  1 +
 2 files changed, 64 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 88813dad731f..c021519af810 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -98,36 +98,78 @@ void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
uint32_t *se_mask)
 {
struct kfd_cu_info cu_info;
-   uint32_t cu_per_se[KFD_MAX_NUM_SE] = {0};
-   int i, se, sh, cu = 0;
-
+   uint32_t cu_per_sh[KFD_MAX_NUM_SE][KFD_MAX_NUM_SH_PER_SE] = {0};
+   int i, se, sh, cu;
	amdgpu_amdkfd_get_cu_info(mm->dev->kgd, &cu_info);
 
if (cu_mask_count > cu_info.cu_active_number)
cu_mask_count = cu_info.cu_active_number;
 
+   /* Exceeding these bounds corrupts the stack and indicates a coding 
error.
+* Returning with no CU's enabled will hang the queue, which should be
+* attention grabbing.
+*/
+   if (cu_info.num_shader_engines > KFD_MAX_NUM_SE) {
+   pr_err("Exceeded KFD_MAX_NUM_SE, chip reports %d\n", 
cu_info.num_shader_engines);
+   return;
+   }
+   if (cu_info.num_shader_arrays_per_engine > KFD_MAX_NUM_SH_PER_SE) {
+   pr_err("Exceeded KFD_MAX_NUM_SH, chip reports %d\n",
+   cu_info.num_shader_arrays_per_engine * 
cu_info.num_shader_engines);
+   return;
+   }
+   /* Count active CUs per SH.
+*
+* Some CUs in an SH may be disabled.   HW expects disabled CUs to be
+* represented in the high bits of each SH's enable mask (the upper and 
lower
+* 16 bits of se_mask) and will take care of the actual distribution of
+* disabled CUs within each SH automatically.
+* Each half of se_mask must be filled only on bits 
0-cu_per_sh[se][sh]-1.
+*
+* See note on Arcturus cu_bitmap layout in gfx_v9_0_get_cu_info.
+*/
for (se = 0; se < cu_info.num_shader_engines; se++)
for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++)
-   cu_per_se[se] += hweight32(cu_info.cu_bitmap[se % 4][sh 
+ (se / 4)]);
-
-   /* Symmetrically map cu_mask to all SEs:
-* cu_mask[0] bit0 -> se_mask[0] bit0;
-* cu_mask[0] bit1 -> se_mask[1] bit0;
-* ... (if # SE is 4)
-* cu_mask[0] bit4 -> se_mask[0] bit1;
+   cu_per_sh[se][sh] = hweight32(cu_info.cu_bitmap[se % 
4][sh + (se / 4)]);
+
+   /* Symmetrically map cu_mask to all SEs & SHs:
+* se_mask programs up to 2 SH in the upper and lower 16 bits.
+*
+* Examples
+* Assuming 1 SH/SE, 4 SEs:
+* cu_mask[0] bit0 -> se_mask[0] bit0
+* cu_mask[0] bit1 -> se_mask[1] bit0
+* ...
+* cu_mask[0] bit4 -> se_mask[0] bit1
+* ...
+*
+* Assuming 2 SH/SE, 4 SEs
+* cu_mask[0] bit0 -> se_mask[0] bit0 (SE0,SH0,CU0)
+* cu_mask[0] bit1 -> se_mask[1] bit0 (SE1,SH0,CU0)
+* ...
+* cu_mask[0] bit4 -> se_mask[0] bit16 (SE0,SH1,CU0)
+* cu_mask[0] bit5 -> se_mask[1] bit16 (SE1,SH1,CU0)
+* ...
+* cu_mask[0] bit8 -> se_mask[0] bit1 (SE0,SH0,CU1)
 * ...
+*
+* First ensure all CUs are disabled, then enable user specified CUs.
 */
-   se = 0;
-   for (i = 0; i < cu_mask_count; i++) {
-   if (cu_mask[i / 32] & (1 << (i % 32)))
-   se_mask[se] |= 1 << cu;
-
-   do {
-   se++;
-   if (se == cu_info.num_shader_engines) {
-   se = 0;
-   cu++;
+   for (i = 0; i < cu_info.num_shader_engines; i++)
+   se_mask[i] = 0;
+
+   i = 0;
+   for (cu = 0; cu < 16; cu++) {
+   for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++) {
+   for (se = 0; se < 

[PATCH AUTOSEL 5.10 106/176] drm/display: fix possible null-pointer dereference in dcn10_set_clock()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit 554594567b1fa3da74f88ec7b2dc83d000c58e98 ]

The variable dc->clk_mgr is checked in:
  if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)

This indicates dc->clk_mgr can be NULL.
However, it is dereferenced in:
if (!dc->clk_mgr->funcs->get_clock)

To fix this null-pointer dereference, check dc->clk_mgr and the function
pointer dc->clk_mgr->funcs->get_clock earlier, and return if one of them
is NULL.

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 0d1e7b56fb39..532f6a1145b5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -3740,13 +3740,12 @@ enum dc_status dcn10_set_clock(struct dc *dc,
struct dc_clock_config clock_cfg = {0};
	struct dc_clocks *current_clocks = &context->bw_ctx.bw.dcn.clk;
 
-   if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)
-   dc->clk_mgr->funcs->get_clock(dc->clk_mgr,
-   context, clock_type, 
&clock_cfg);
-
-   if (!dc->clk_mgr->funcs->get_clock)
+   if (!dc->clk_mgr || !dc->clk_mgr->funcs->get_clock)
return DC_FAIL_UNSUPPORTED_1;
 
+   dc->clk_mgr->funcs->get_clock(dc->clk_mgr,
+   context, clock_type, &clock_cfg);
+
if (clk_khz > clock_cfg.max_clock_khz)
return DC_FAIL_CLK_EXCEED_MAX;
 
@@ -3764,7 +3763,7 @@ enum dc_status dcn10_set_clock(struct dc *dc,
else
return DC_ERROR_UNEXPECTED;
 
-   if (dc->clk_mgr && dc->clk_mgr->funcs->update_clocks)
+   if (dc->clk_mgr->funcs->update_clocks)
dc->clk_mgr->funcs->update_clocks(dc->clk_mgr,
context, true);
return DC_OK;
-- 
2.30.2



[PATCH AUTOSEL 5.10 105/176] gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ]

The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
          addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index 47cad23a6b9e..b91d3d29b410 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -339,7 +339,7 @@ static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan 
*i2c_bus,
 void
 amdgpu_i2c_router_select_ddc_port(const struct amdgpu_connector 
*amdgpu_connector)
 {
-   u8 val;
+   u8 val = 0;
 
if (!amdgpu_connector->router.ddc_valid)
return;
-- 
2.30.2



[PATCH AUTOSEL 5.10 098/176] drm/amd/display: fix incorrect CM/TF programming sequence in dwb

2021-09-09 Thread Sasha Levin
From: Roy Chan 

[ Upstream commit 781e1e23131cce56fb557e6ec2260480a6bd08cc ]

[How]
The programming sequence was for an old ASIC.
The correct programming sequence should be similar to the one
used in MPC. The fix is copied from the MPC programming sequence.

Reviewed-by: Anthony Koo 
Acked-by: Anson Jacob 
Signed-off-by: Roy Chan 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../drm/amd/display/dc/dcn30/dcn30_dwb_cm.c   | 90 +--
 1 file changed, 64 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
index 8593145379d9..6d621f07be48 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
@@ -49,6 +49,11 @@
 static void dwb3_get_reg_field_ogam(struct dcn30_dwbc *dwbc30,
struct dcn3_xfer_func_reg *reg)
 {
+   reg->shifts.field_region_start_base = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_BASE_B;
+   reg->masks.field_region_start_base = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_BASE_B;
+   reg->shifts.field_offset = dwbc30->dwbc_shift->DWB_OGAM_RAMA_OFFSET_B;
+   reg->masks.field_offset = dwbc30->dwbc_mask->DWB_OGAM_RAMA_OFFSET_B;
+
reg->shifts.exp_region0_lut_offset = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION0_LUT_OFFSET;
reg->masks.exp_region0_lut_offset = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION0_LUT_OFFSET;
reg->shifts.exp_region0_num_segments = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION0_NUM_SEGMENTS;
@@ -66,8 +71,6 @@ static void dwb3_get_reg_field_ogam(struct dcn30_dwbc *dwbc30,
reg->masks.field_region_end_base = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_END_BASE_B;
reg->shifts.field_region_linear_slope = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_SLOPE_B;
reg->masks.field_region_linear_slope = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_SLOPE_B;
-   reg->masks.field_offset = dwbc30->dwbc_mask->DWB_OGAM_RAMA_OFFSET_B;
-   reg->shifts.field_offset = dwbc30->dwbc_shift->DWB_OGAM_RAMA_OFFSET_B;
reg->shifts.exp_region_start = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_B;
reg->masks.exp_region_start = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_B;
reg->shifts.exp_resion_start_segment = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_SEGMENT_B;
@@ -147,18 +150,19 @@ static enum dc_lut_mode dwb3_get_ogam_current(
uint32_t state_mode;
uint32_t ram_select;
 
-   REG_GET(DWB_OGAM_CONTROL,
-   DWB_OGAM_MODE, &state_mode);
-   REG_GET(DWB_OGAM_CONTROL,
-   DWB_OGAM_SELECT, &ram_select);
+   REG_GET_2(DWB_OGAM_CONTROL,
+   DWB_OGAM_MODE_CURRENT, &state_mode,
+   DWB_OGAM_SELECT_CURRENT, &ram_select);
 
if (state_mode == 0) {
mode = LUT_BYPASS;
} else if (state_mode == 2) {
if (ram_select == 0)
mode = LUT_RAM_A;
-   else
+   else if (ram_select == 1)
mode = LUT_RAM_B;
+   else
+   mode = LUT_BYPASS;
} else {
// Reserved value
mode = LUT_BYPASS;
@@ -172,10 +176,10 @@ static void dwb3_configure_ogam_lut(
struct dcn30_dwbc *dwbc30,
bool is_ram_a)
 {
-   REG_UPDATE(DWB_OGAM_LUT_CONTROL,
-   DWB_OGAM_LUT_READ_COLOR_SEL, 7);
-   REG_UPDATE(DWB_OGAM_CONTROL,
-   DWB_OGAM_SELECT, is_ram_a == true ? 0 : 1);
+   REG_UPDATE_2(DWB_OGAM_LUT_CONTROL,
+   DWB_OGAM_LUT_WRITE_COLOR_MASK, 7,
+   DWB_OGAM_LUT_HOST_SEL, (is_ram_a == true) ? 0 : 1);
+
REG_SET(DWB_OGAM_LUT_INDEX, 0, DWB_OGAM_LUT_INDEX, 0);
 }
 
@@ -185,17 +189,45 @@ static void dwb3_program_ogam_pwl(struct dcn30_dwbc 
*dwbc30,
 {
uint32_t i;
 
-// triple base implementation
-   for (i = 0; i < num/2; i++) {
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].blue_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].blue_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].blue_reg);
+   uint32_t last_base_value_red = rgb[num-1].red_reg + 
rgb[num-1].delta_red_reg;
+   uint32_t last_base_value_green = rgb[num-1].green_reg + 

[PATCH AUTOSEL 5.10 097/176] drm/amd/display: fix missing writeback disablement if plane is removed

2021-09-09 Thread Sasha Levin
From: Roy Chan 

[ Upstream commit 82367e7f22d085092728f45fd5fbb15e3fb997c0 ]

[Why]
If the plane has been removed, the writeback disablement logic
doesn't run.

[How]
Fix the logic order.

Acked-by: Anson Jacob 
Signed-off-by: Roy Chan 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 14 --
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 12 +++-
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 9d3ccdd35582..79a2b9c785f0 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1704,13 +1704,15 @@ void dcn20_program_front_end_for_ctx(
dcn20_program_pipe(dc, pipe, context);
pipe = pipe->bottom_pipe;
}
-   /* Program secondary blending tree and writeback pipes 
*/
-   pipe = &context->res_ctx.pipe_ctx[i];
-   if (!pipe->prev_odm_pipe && pipe->stream->num_wb_info > 0
-   && (pipe->update_flags.raw || 
pipe->plane_state->update_flags.raw || pipe->stream->update_flags.raw)
-   && 
hws->funcs.program_all_writeback_pipes_in_tree)
-   
hws->funcs.program_all_writeback_pipes_in_tree(dc, pipe->stream, context);
}
+   /* Program secondary blending tree and writeback pipes */
+   pipe = &context->res_ctx.pipe_ctx[i];
+   if (!pipe->top_pipe && !pipe->prev_odm_pipe
+   && pipe->stream && pipe->stream->num_wb_info > 0
+   && (pipe->update_flags.raw || 
(pipe->plane_state && pipe->plane_state->update_flags.raw)
+   || pipe->stream->update_flags.raw)
+   && 
hws->funcs.program_all_writeback_pipes_in_tree)
+   hws->funcs.program_all_writeback_pipes_in_tree(dc, 
pipe->stream, context);
}
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index 97909d5aab34..22c77e96f6a5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -396,12 +396,22 @@ void dcn30_program_all_writeback_pipes_in_tree(
for (i_pipe = 0; i_pipe < dc->res_pool->pipe_count; 
i_pipe++) {
struct pipe_ctx *pipe_ctx = 
&context->res_ctx.pipe_ctx[i_pipe];
 
+   if (!pipe_ctx->plane_state)
+   continue;
+
if (pipe_ctx->plane_state == 
wb_info.writeback_source_plane) {
wb_info.mpcc_inst = 
pipe_ctx->plane_res.mpcc_inst;
break;
}
}
-   ASSERT(wb_info.mpcc_inst != -1);
+
+   if (wb_info.mpcc_inst == -1) {
+   /* Disable writeback pipe and disconnect from 
MPCC
+* if source plane has been removed
+*/
+   dc->hwss.disable_writeback(dc, 
wb_info.dwb_pipe_inst);
+   continue;
+   }
 
ASSERT(wb_info.dwb_pipe_inst < 
dc->res_pool->res_cap->num_dwb);
dwb = dc->res_pool->dwbc[wb_info.dwb_pipe_inst];
-- 
2.30.2



[PATCH AUTOSEL 5.10 041/176] drm/amd/amdgpu: Update debugfs link_settings output link_rate field in hex

2021-09-09 Thread Sasha Levin
From: Anson Jacob 

[ Upstream commit 1a394b3c3de2577f200cb623c52a5c2b82805cec ]

link_rate is updated via debugfs using hex values, set it to output
in hex as well.

eg: Resolution: 1920x1080@144Hz
cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x14  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred: 
 0  0x0  0

echo "4 0x1e" > /sys/kernel/debug/dri/0/DP-1/link_settings

cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x1e  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred: 
 4  0x1e  0

Signed-off-by: Anson Jacob 
Reviewed-by: Harry Wentland 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c| 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index e02a55fc1382..fbb65c95464b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -197,29 +197,29 @@ static ssize_t dp_link_settings_read(struct file *f, char 
__user *buf,
 
rd_buf_ptr = rd_buf;
 
-   str_len = strlen("Current:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Current:  %d  %d  %d  ",
+   str_len = strlen("Current:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Current:  %d  0x%x  %d  ",
link->cur_link_settings.lane_count,
link->cur_link_settings.link_rate,
link->cur_link_settings.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Verified:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Verified:  %d  %d  %d  ",
+   str_len = strlen("Verified:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Verified:  %d  0x%x  %d  ",
link->verified_link_cap.lane_count,
link->verified_link_cap.link_rate,
link->verified_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Reported:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Reported:  %d  %d  %d  ",
+   str_len = strlen("Reported:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Reported:  %d  0x%x  %d  ",
link->reported_link_cap.lane_count,
link->reported_link_cap.link_rate,
link->reported_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Preferred:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  %d  %d\n",
+   str_len = strlen("Preferred:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  0x%x  %d\n",
link->preferred_link_setting.lane_count,
link->preferred_link_setting.link_rate,
link->preferred_link_setting.link_spread);
-- 
2.30.2



[PATCH AUTOSEL 5.10 040/176] drm/amdgpu: Fix a printing message

2021-09-09 Thread Sasha Levin
From: Oak Zeng 

[ Upstream commit 95f71f12aa45d65b7f2ccab95569795edffd379a ]

The log message "PSP loading VCN firmware" is misleading because
people might think the driver is loading VCN firmware. Actually, when this
message is printed, the driver is just preparing some VCN ucode, not loading
VCN firmware yet. The actual VCN firmware loading happens in the PSP block
hw_init. Fix the message.

Signed-off-by: Oak Zeng 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index aa8ae0ca62f9..e8737fa438f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -120,7 +120,7 @@ static int vcn_v1_0_sw_init(void *handle)
adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index fc939d4f4841..f493b5c3d382 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -122,7 +122,7 @@ static int vcn_v2_0_sw_init(void *handle)
adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 2c328362eee3..ce64d4016f90 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -152,7 +152,7 @@ static int vcn_v2_5_sw_init(void *handle)
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), 
PAGE_SIZE);
}
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index c9c888be1228..2099f6ebd833 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -148,7 +148,7 @@ static int vcn_v3_0_sw_init(void *handle)
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), 
PAGE_SIZE);
}
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
-- 
2.30.2



[PATCH AUTOSEL 5.10 032/176] drm/amd/display: Fix timer_per_pixel unit error

2021-09-09 Thread Sasha Levin
From: Oliver Logush 

[ Upstream commit 23e55639b87fb16a9f0f66032ecb57060df6c46c ]

[why]
The units of the time_per_pixel variable were incorrect; this had to be
corrected for the code to function properly.

[how]
The change is straightforward: only the one line where the calculation is
done needed to be changed.

Acked-by: Rodrigo Siqueira 
Signed-off-by: Oliver Logush 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index cfe85ba1018e..5dbc290bcbe8 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2455,7 +2455,7 @@ void dcn20_set_mcif_arb_params(
wb_arb_params->cli_watermark[k] = 
get_wm_writeback_urgent(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000;
wb_arb_params->pstate_watermark[k] = 
get_wm_writeback_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 
1000;
}
-   wb_arb_params->time_per_pixel = 16.0 / 
context->res_ctx.pipe_ctx[i].stream->phy_pix_clk; /* 4 bit fraction, ms */
+   wb_arb_params->time_per_pixel = 16.0 * 1000 / 
(context->res_ctx.pipe_ctx[i].stream->phy_pix_clk / 1000); /* 4 bit fraction, 
ms */
wb_arb_params->slice_lines = 32;
wb_arb_params->arbitration_slice = 2;
wb_arb_params->max_scaled_time = 
dcn20_calc_max_scaled_time(wb_arb_params->time_per_pixel,
-- 
2.30.2



[PATCH AUTOSEL 5.10 002/176] drm/amdgpu: Fix amdgpu_ras_eeprom_init()

2021-09-09 Thread Sasha Levin
From: Luben Tuikov 

[ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ]

No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes to the lower layers.

Cc: Andrey Grodzovsky 
Cc: Alexander Deucher 
Signed-off-by: Luben Tuikov 
Acked-by: Alexander Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 0e64c39a2372..7c3efc5f1be0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -305,7 +305,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control 
*control,
return ret;
}
 
-   __decode_table_header_from_buff(hdr, &buff[2]);
+   __decode_table_header_from_buff(hdr, buff);
 
if (hdr->header == EEPROM_TABLE_HDR_VAL) {
control->num_recs = (hdr->tbl_size - EEPROM_TABLE_HEADER_SIZE) /
-- 
2.30.2



[PATCH AUTOSEL 5.13 198/219] drm/amdkfd: Account for SH/SE count when setting up cu masks.

2021-09-09 Thread Sasha Levin
From: Sean Keely 

[ Upstream commit 1ec06c2dee679e9f089e78ed20cb74ee90155f61 ]

On systems with multiple SH per SE compute_static_thread_mgmt_se#
is split into independent masks, one for each SH, in the upper and
lower 16 bits.  We need to detect this and apply cu masking to each
SH.  The cu mask bits are assigned first to each SE, then to
alternate SHs, then finally to higher CU id.  This ensures that
the maximum number of SPIs are engaged as early as possible while
balancing CU assignment to each SH.

v2: Use max SH/SE rather than max SH in cu_per_sh.

v3: Fix comment blocks, ensure se_mask is initially zero filled,
and correctly assign se.sh.cu positions to unset bits in cu_mask.

Signed-off-by: Sean Keely 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 84 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h |  1 +
 2 files changed, 64 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 88813dad731f..c021519af810 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -98,36 +98,78 @@ void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
uint32_t *se_mask)
 {
struct kfd_cu_info cu_info;
-   uint32_t cu_per_se[KFD_MAX_NUM_SE] = {0};
-   int i, se, sh, cu = 0;
-
+   uint32_t cu_per_sh[KFD_MAX_NUM_SE][KFD_MAX_NUM_SH_PER_SE] = {0};
+   int i, se, sh, cu;
	amdgpu_amdkfd_get_cu_info(mm->dev->kgd, &cu_info);
 
if (cu_mask_count > cu_info.cu_active_number)
cu_mask_count = cu_info.cu_active_number;
 
+   /* Exceeding these bounds corrupts the stack and indicates a coding 
error.
+* Returning with no CU's enabled will hang the queue, which should be
+* attention grabbing.
+*/
+   if (cu_info.num_shader_engines > KFD_MAX_NUM_SE) {
+   pr_err("Exceeded KFD_MAX_NUM_SE, chip reports %d\n", 
cu_info.num_shader_engines);
+   return;
+   }
+   if (cu_info.num_shader_arrays_per_engine > KFD_MAX_NUM_SH_PER_SE) {
+   pr_err("Exceeded KFD_MAX_NUM_SH, chip reports %d\n",
+   cu_info.num_shader_arrays_per_engine * 
cu_info.num_shader_engines);
+   return;
+   }
+   /* Count active CUs per SH.
+*
+* Some CUs in an SH may be disabled.   HW expects disabled CUs to be
+* represented in the high bits of each SH's enable mask (the upper and 
lower
+* 16 bits of se_mask) and will take care of the actual distribution of
+* disabled CUs within each SH automatically.
+* Each half of se_mask must be filled only on bits 
0-cu_per_sh[se][sh]-1.
+*
+* See note on Arcturus cu_bitmap layout in gfx_v9_0_get_cu_info.
+*/
for (se = 0; se < cu_info.num_shader_engines; se++)
for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++)
-   cu_per_se[se] += hweight32(cu_info.cu_bitmap[se % 4][sh 
+ (se / 4)]);
-
-   /* Symmetrically map cu_mask to all SEs:
-* cu_mask[0] bit0 -> se_mask[0] bit0;
-* cu_mask[0] bit1 -> se_mask[1] bit0;
-* ... (if # SE is 4)
-* cu_mask[0] bit4 -> se_mask[0] bit1;
+   cu_per_sh[se][sh] = hweight32(cu_info.cu_bitmap[se % 
4][sh + (se / 4)]);
+
+   /* Symmetrically map cu_mask to all SEs & SHs:
+* se_mask programs up to 2 SH in the upper and lower 16 bits.
+*
+* Examples
+* Assuming 1 SH/SE, 4 SEs:
+* cu_mask[0] bit0 -> se_mask[0] bit0
+* cu_mask[0] bit1 -> se_mask[1] bit0
+* ...
+* cu_mask[0] bit4 -> se_mask[0] bit1
+* ...
+*
+* Assuming 2 SH/SE, 4 SEs
+* cu_mask[0] bit0 -> se_mask[0] bit0 (SE0,SH0,CU0)
+* cu_mask[0] bit1 -> se_mask[1] bit0 (SE1,SH0,CU0)
+* ...
+* cu_mask[0] bit4 -> se_mask[0] bit16 (SE0,SH1,CU0)
+* cu_mask[0] bit5 -> se_mask[1] bit16 (SE1,SH1,CU0)
+* ...
+* cu_mask[0] bit8 -> se_mask[0] bit1 (SE0,SH0,CU1)
 * ...
+*
+* First ensure all CUs are disabled, then enable user specified CUs.
 */
-   se = 0;
-   for (i = 0; i < cu_mask_count; i++) {
-   if (cu_mask[i / 32] & (1 << (i % 32)))
-   se_mask[se] |= 1 << cu;
-
-   do {
-   se++;
-   if (se == cu_info.num_shader_engines) {
-   se = 0;
-   cu++;
+   for (i = 0; i < cu_info.num_shader_engines; i++)
+   se_mask[i] = 0;
+
+   i = 0;
+   for (cu = 0; cu < 16; cu++) {
+   for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++) {
+   for (se = 0; se < 

[PATCH AUTOSEL 5.13 130/219] drm/display: fix possible null-pointer dereference in dcn10_set_clock()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit 554594567b1fa3da74f88ec7b2dc83d000c58e98 ]

The variable dc->clk_mgr is checked in:
  if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)

This indicates dc->clk_mgr can be NULL.
However, it is dereferenced in:
if (!dc->clk_mgr->funcs->get_clock)

To fix this null-pointer dereference, check dc->clk_mgr and the function
pointer dc->clk_mgr->funcs->get_clock earlier, and return if one of them
is NULL.

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 7c939c0a977b..29f61a8d3e29 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -3938,13 +3938,12 @@ enum dc_status dcn10_set_clock(struct dc *dc,
struct dc_clock_config clock_cfg = {0};
	struct dc_clocks *current_clocks = &context->bw_ctx.bw.dcn.clk;
 
-   if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)
-   dc->clk_mgr->funcs->get_clock(dc->clk_mgr,
-   context, clock_type, 
&clock_cfg);
-
-   if (!dc->clk_mgr->funcs->get_clock)
+   if (!dc->clk_mgr || !dc->clk_mgr->funcs->get_clock)
return DC_FAIL_UNSUPPORTED_1;
 
+   dc->clk_mgr->funcs->get_clock(dc->clk_mgr,
+   context, clock_type, &clock_cfg);
+
if (clk_khz > clock_cfg.max_clock_khz)
return DC_FAIL_CLK_EXCEED_MAX;
 
@@ -3962,7 +3961,7 @@ enum dc_status dcn10_set_clock(struct dc *dc,
else
return DC_ERROR_UNEXPECTED;
 
-   if (dc->clk_mgr && dc->clk_mgr->funcs->update_clocks)
+   if (dc->clk_mgr->funcs->update_clocks)
dc->clk_mgr->funcs->update_clocks(dc->clk_mgr,
context, true);
return DC_OK;
-- 
2.30.2



[PATCH AUTOSEL 5.13 129/219] gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ]

The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
          addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index bca45a15..82608df43396 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -339,7 +339,7 @@ static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan 
*i2c_bus,
 void
 amdgpu_i2c_router_select_ddc_port(const struct amdgpu_connector 
*amdgpu_connector)
 {
-   u8 val;
+   u8 val = 0;
 
if (!amdgpu_connector->router.ddc_valid)
return;
-- 
2.30.2



[PATCH AUTOSEL 5.13 119/219] drm/amd/display: fix incorrect CM/TF programming sequence in dwb

2021-09-09 Thread Sasha Levin
From: Roy Chan 

[ Upstream commit 781e1e23131cce56fb557e6ec2260480a6bd08cc ]

[How]
The programming sequence was for an old ASIC.
The correct programming sequence should be similar to the one
used in MPC. The fix is copied from the MPC programming sequence.

Reviewed-by: Anthony Koo 
Acked-by: Anson Jacob 
Signed-off-by: Roy Chan 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../drm/amd/display/dc/dcn30/dcn30_dwb_cm.c   | 90 +--
 1 file changed, 64 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
index 3fe9e41e4dbd..6a3d3a0ec0a3 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
@@ -49,6 +49,11 @@
 static void dwb3_get_reg_field_ogam(struct dcn30_dwbc *dwbc30,
struct dcn3_xfer_func_reg *reg)
 {
+   reg->shifts.field_region_start_base = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_BASE_B;
+   reg->masks.field_region_start_base = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_BASE_B;
+   reg->shifts.field_offset = dwbc30->dwbc_shift->DWB_OGAM_RAMA_OFFSET_B;
+   reg->masks.field_offset = dwbc30->dwbc_mask->DWB_OGAM_RAMA_OFFSET_B;
+
reg->shifts.exp_region0_lut_offset = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION0_LUT_OFFSET;
reg->masks.exp_region0_lut_offset = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION0_LUT_OFFSET;
reg->shifts.exp_region0_num_segments = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION0_NUM_SEGMENTS;
@@ -66,8 +71,6 @@ static void dwb3_get_reg_field_ogam(struct dcn30_dwbc *dwbc30,
reg->masks.field_region_end_base = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_END_BASE_B;
reg->shifts.field_region_linear_slope = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_SLOPE_B;
reg->masks.field_region_linear_slope = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_SLOPE_B;
-   reg->masks.field_offset = dwbc30->dwbc_mask->DWB_OGAM_RAMA_OFFSET_B;
-   reg->shifts.field_offset = dwbc30->dwbc_shift->DWB_OGAM_RAMA_OFFSET_B;
reg->shifts.exp_region_start = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_B;
reg->masks.exp_region_start = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_B;
reg->shifts.exp_resion_start_segment = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_SEGMENT_B;
@@ -147,18 +150,19 @@ static enum dc_lut_mode dwb3_get_ogam_current(
uint32_t state_mode;
uint32_t ram_select;
 
-   REG_GET(DWB_OGAM_CONTROL,
-   DWB_OGAM_MODE, _mode);
-   REG_GET(DWB_OGAM_CONTROL,
-   DWB_OGAM_SELECT, _select);
+   REG_GET_2(DWB_OGAM_CONTROL,
+   DWB_OGAM_MODE_CURRENT, _mode,
+   DWB_OGAM_SELECT_CURRENT, _select);
 
if (state_mode == 0) {
mode = LUT_BYPASS;
} else if (state_mode == 2) {
if (ram_select == 0)
mode = LUT_RAM_A;
-   else
+   else if (ram_select == 1)
mode = LUT_RAM_B;
+   else
+   mode = LUT_BYPASS;
} else {
// Reserved value
mode = LUT_BYPASS;
@@ -172,10 +176,10 @@ static void dwb3_configure_ogam_lut(
struct dcn30_dwbc *dwbc30,
bool is_ram_a)
 {
-   REG_UPDATE(DWB_OGAM_LUT_CONTROL,
-   DWB_OGAM_LUT_READ_COLOR_SEL, 7);
-   REG_UPDATE(DWB_OGAM_CONTROL,
-   DWB_OGAM_SELECT, is_ram_a == true ? 0 : 1);
+   REG_UPDATE_2(DWB_OGAM_LUT_CONTROL,
+   DWB_OGAM_LUT_WRITE_COLOR_MASK, 7,
+   DWB_OGAM_LUT_HOST_SEL, (is_ram_a == true) ? 0 : 1);
+
REG_SET(DWB_OGAM_LUT_INDEX, 0, DWB_OGAM_LUT_INDEX, 0);
 }
 
@@ -185,17 +189,45 @@ static void dwb3_program_ogam_pwl(struct dcn30_dwbc 
*dwbc30,
 {
uint32_t i;
 
-// triple base implementation
-   for (i = 0; i < num/2; i++) {
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].blue_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].blue_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].blue_reg);
+   uint32_t last_base_value_red = rgb[num-1].red_reg + 
rgb[num-1].delta_red_reg;
+   uint32_t last_base_value_green = rgb[num-1].green_reg + 

[PATCH AUTOSEL 5.13 118/219] drm/amd/display: fix missing writeback disablement if plane is removed

2021-09-09 Thread Sasha Levin
From: Roy Chan 

[ Upstream commit 82367e7f22d085092728f45fd5fbb15e3fb997c0 ]

[Why]
If the plane has been removed, the writeback disablement logic
doesn't run.

[How]
Fix the logic order.

Acked-by: Anson Jacob 
Signed-off-by: Roy Chan 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 14 --
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 12 +++-
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 793554e61c52..03b941e76de2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1703,13 +1703,15 @@ void dcn20_program_front_end_for_ctx(
dcn20_program_pipe(dc, pipe, context);
pipe = pipe->bottom_pipe;
}
-   /* Program secondary blending tree and writeback pipes 
*/
-   pipe = >res_ctx.pipe_ctx[i];
-   if (!pipe->prev_odm_pipe && pipe->stream->num_wb_info > 0
-   && (pipe->update_flags.raw || 
pipe->plane_state->update_flags.raw || pipe->stream->update_flags.raw)
-   && 
hws->funcs.program_all_writeback_pipes_in_tree)
-   
hws->funcs.program_all_writeback_pipes_in_tree(dc, pipe->stream, context);
}
+   /* Program secondary blending tree and writeback pipes */
+   pipe = >res_ctx.pipe_ctx[i];
+   if (!pipe->top_pipe && !pipe->prev_odm_pipe
+   && pipe->stream && pipe->stream->num_wb_info > 0
+   && (pipe->update_flags.raw || 
(pipe->plane_state && pipe->plane_state->update_flags.raw)
+   || pipe->stream->update_flags.raw)
+   && 
hws->funcs.program_all_writeback_pipes_in_tree)
+   hws->funcs.program_all_writeback_pipes_in_tree(dc, 
pipe->stream, context);
}
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index d53f8b39699b..37944f94c693 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -396,12 +396,22 @@ void dcn30_program_all_writeback_pipes_in_tree(
for (i_pipe = 0; i_pipe < dc->res_pool->pipe_count; 
i_pipe++) {
struct pipe_ctx *pipe_ctx = 
>res_ctx.pipe_ctx[i_pipe];
 
+   if (!pipe_ctx->plane_state)
+   continue;
+
if (pipe_ctx->plane_state == 
wb_info.writeback_source_plane) {
wb_info.mpcc_inst = 
pipe_ctx->plane_res.mpcc_inst;
break;
}
}
-   ASSERT(wb_info.mpcc_inst != -1);
+
+   if (wb_info.mpcc_inst == -1) {
+   /* Disable writeback pipe and disconnect from 
MPCC
+* if source plane has been removed
+*/
+   dc->hwss.disable_writeback(dc, 
wb_info.dwb_pipe_inst);
+   continue;
+   }
 
ASSERT(wb_info.dwb_pipe_inst < 
dc->res_pool->res_cap->num_dwb);
dwb = dc->res_pool->dwbc[wb_info.dwb_pipe_inst];
-- 
2.30.2



[PATCH AUTOSEL 5.13 054/219] drm/amd/amdgpu: Update debugfs link_settings output link_rate field in hex

2021-09-09 Thread Sasha Levin
From: Anson Jacob 

[ Upstream commit 1a394b3c3de2577f200cb623c52a5c2b82805cec ]

link_rate is updated via debugfs using hex values; set it to output
in hex as well.

eg: Resolution: 1920x1080@144Hz
cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x14  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred: 
 0  0x0  0

echo "4 0x1e" > /sys/kernel/debug/dri/0/DP-1/link_settings

cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x1e  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred: 
 4  0x1e  0

Signed-off-by: Anson Jacob 
Reviewed-by: Harry Wentland 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c| 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 1b6b15708b96..08ff1166ffc8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -197,29 +197,29 @@ static ssize_t dp_link_settings_read(struct file *f, char 
__user *buf,
 
rd_buf_ptr = rd_buf;
 
-   str_len = strlen("Current:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Current:  %d  %d  %d  ",
+   str_len = strlen("Current:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Current:  %d  0x%x  %d  ",
link->cur_link_settings.lane_count,
link->cur_link_settings.link_rate,
link->cur_link_settings.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Verified:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Verified:  %d  %d  %d  ",
+   str_len = strlen("Verified:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Verified:  %d  0x%x  %d  ",
link->verified_link_cap.lane_count,
link->verified_link_cap.link_rate,
link->verified_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Reported:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Reported:  %d  %d  %d  ",
+   str_len = strlen("Reported:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Reported:  %d  0x%x  %d  ",
link->reported_link_cap.lane_count,
link->reported_link_cap.link_rate,
link->reported_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Preferred:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  %d  %d\n",
+   str_len = strlen("Preferred:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  0x%x  %d\n",
link->preferred_link_setting.lane_count,
link->preferred_link_setting.link_rate,
link->preferred_link_setting.link_spread);
-- 
2.30.2



[PATCH AUTOSEL 5.13 053/219] drm/amdgpu: Fix a printing message

2021-09-09 Thread Sasha Levin
From: Oak Zeng 

[ Upstream commit 95f71f12aa45d65b7f2ccab95569795edffd379a ]

The printing message "PSP loading VCN firmware" is misleading, because
people might think the driver is loading VCN firmware. Actually, when this
message is printed, the driver is just preparing some VCN ucode, not loading
VCN firmware yet. The actual VCN firmware loading happens in the PSP block
hw_init. Fix the printing message.

Signed-off-by: Oak Zeng 
Reviewed-by: Christian Konig 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 27b1ced145d2..14ae2bfad59d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -119,7 +119,7 @@ static int vcn_v1_0_sw_init(void *handle)
adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 8af567c546db..f4686e918e0d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -122,7 +122,7 @@ static int vcn_v2_0_sw_init(void *handle)
adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 888b17d84691..e0c0c3734432 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -152,7 +152,7 @@ static int vcn_v2_5_sw_init(void *handle)
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), 
PAGE_SIZE);
}
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 3b23de996db2..c2c5c4af51d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -152,7 +152,7 @@ static int vcn_v3_0_sw_init(void *handle)
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), 
PAGE_SIZE);
}
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
-- 
2.30.2



[PATCH AUTOSEL 5.13 044/219] drm/amd/display: Fix timer_per_pixel unit error

2021-09-09 Thread Sasha Levin
From: Oliver Logush 

[ Upstream commit 23e55639b87fb16a9f0f66032ecb57060df6c46c ]

[why]
The units of the time_per_pixel variable were incorrect; this had to be
changed for the code to function properly.

[how]
The change was straightforward: only one line of code had to be changed,
where the calculation is done.

Acked-by: Rodrigo Siqueira 
Signed-off-by: Oliver Logush 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 81f583733fa8..12e92f620483 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2461,7 +2461,7 @@ void dcn20_set_mcif_arb_params(
wb_arb_params->cli_watermark[k] = 
get_wm_writeback_urgent(>bw_ctx.dml, pipes, pipe_cnt) * 1000;
wb_arb_params->pstate_watermark[k] = 
get_wm_writeback_dram_clock_change(>bw_ctx.dml, pipes, pipe_cnt) * 
1000;
}
-   wb_arb_params->time_per_pixel = 16.0 / 
context->res_ctx.pipe_ctx[i].stream->phy_pix_clk; /* 4 bit fraction, ms */
+   wb_arb_params->time_per_pixel = 16.0 * 1000 / 
(context->res_ctx.pipe_ctx[i].stream->phy_pix_clk / 1000); /* 4 bit fraction, 
ms */
wb_arb_params->slice_lines = 32;
wb_arb_params->arbitration_slice = 2;
wb_arb_params->max_scaled_time = 
dcn20_calc_max_scaled_time(wb_arb_params->time_per_pixel,
-- 
2.30.2



[PATCH AUTOSEL 5.13 005/219] drm/amdgpu: Fix amdgpu_ras_eeprom_init()

2021-09-09 Thread Sasha Levin
From: Luben Tuikov 

[ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ]

No need to account for the 2 bytes of EEPROM
address; this is now well abstracted away by
the fixes in the lower layers.

Cc: Andrey Grodzovsky 
Cc: Alexander Deucher 
Signed-off-by: Luben Tuikov 
Acked-by: Alexander Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index f40c871da0c6..fb701c4fd5c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -321,7 +321,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control 
*control,
return ret;
}
 
-   __decode_table_header_from_buff(hdr, [2]);
+   __decode_table_header_from_buff(hdr, buff);
 
if (hdr->header == EEPROM_TABLE_HDR_VAL) {
control->num_recs = (hdr->tbl_size - EEPROM_TABLE_HEADER_SIZE) /
-- 
2.30.2



[PATCH AUTOSEL 5.14 225/252] drm/amdkfd: Account for SH/SE count when setting up cu masks.

2021-09-09 Thread Sasha Levin
From: Sean Keely 

[ Upstream commit 1ec06c2dee679e9f089e78ed20cb74ee90155f61 ]

On systems with multiple SH per SE compute_static_thread_mgmt_se#
is split into independent masks, one for each SH, in the upper and
lower 16 bits.  We need to detect this and apply cu masking to each
SH.  The cu mask bits are assigned first to each SE, then to
alternate SHs, then finally to higher CU id.  This ensures that
the maximum number of SPIs are engaged as early as possible while
balancing CU assignment to each SH.

v2: Use max SH/SE rather than max SH in cu_per_sh.

v3: Fix comment blocks, ensure se_mask is initially zero filled,
and correctly assign se.sh.cu positions to unset bits in cu_mask.

Signed-off-by: Sean Keely 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 84 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h |  1 +
 2 files changed, 64 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 88813dad731f..c021519af810 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -98,36 +98,78 @@ void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
uint32_t *se_mask)
 {
struct kfd_cu_info cu_info;
-   uint32_t cu_per_se[KFD_MAX_NUM_SE] = {0};
-   int i, se, sh, cu = 0;
-
+   uint32_t cu_per_sh[KFD_MAX_NUM_SE][KFD_MAX_NUM_SH_PER_SE] = {0};
+   int i, se, sh, cu;
amdgpu_amdkfd_get_cu_info(mm->dev->kgd, _info);
 
if (cu_mask_count > cu_info.cu_active_number)
cu_mask_count = cu_info.cu_active_number;
 
+   /* Exceeding these bounds corrupts the stack and indicates a coding 
error.
+* Returning with no CU's enabled will hang the queue, which should be
+* attention grabbing.
+*/
+   if (cu_info.num_shader_engines > KFD_MAX_NUM_SE) {
+   pr_err("Exceeded KFD_MAX_NUM_SE, chip reports %d\n", 
cu_info.num_shader_engines);
+   return;
+   }
+   if (cu_info.num_shader_arrays_per_engine > KFD_MAX_NUM_SH_PER_SE) {
+   pr_err("Exceeded KFD_MAX_NUM_SH, chip reports %d\n",
+   cu_info.num_shader_arrays_per_engine * 
cu_info.num_shader_engines);
+   return;
+   }
+   /* Count active CUs per SH.
+*
+* Some CUs in an SH may be disabled.   HW expects disabled CUs to be
+* represented in the high bits of each SH's enable mask (the upper and 
lower
+* 16 bits of se_mask) and will take care of the actual distribution of
+* disabled CUs within each SH automatically.
+* Each half of se_mask must be filled only on bits 
0-cu_per_sh[se][sh]-1.
+*
+* See note on Arcturus cu_bitmap layout in gfx_v9_0_get_cu_info.
+*/
for (se = 0; se < cu_info.num_shader_engines; se++)
for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++)
-   cu_per_se[se] += hweight32(cu_info.cu_bitmap[se % 4][sh 
+ (se / 4)]);
-
-   /* Symmetrically map cu_mask to all SEs:
-* cu_mask[0] bit0 -> se_mask[0] bit0;
-* cu_mask[0] bit1 -> se_mask[1] bit0;
-* ... (if # SE is 4)
-* cu_mask[0] bit4 -> se_mask[0] bit1;
+   cu_per_sh[se][sh] = hweight32(cu_info.cu_bitmap[se % 
4][sh + (se / 4)]);
+
+   /* Symmetrically map cu_mask to all SEs & SHs:
+* se_mask programs up to 2 SH in the upper and lower 16 bits.
+*
+* Examples
+* Assuming 1 SH/SE, 4 SEs:
+* cu_mask[0] bit0 -> se_mask[0] bit0
+* cu_mask[0] bit1 -> se_mask[1] bit0
+* ...
+* cu_mask[0] bit4 -> se_mask[0] bit1
+* ...
+*
+* Assuming 2 SH/SE, 4 SEs
+* cu_mask[0] bit0 -> se_mask[0] bit0 (SE0,SH0,CU0)
+* cu_mask[0] bit1 -> se_mask[1] bit0 (SE1,SH0,CU0)
+* ...
+* cu_mask[0] bit4 -> se_mask[0] bit16 (SE0,SH1,CU0)
+* cu_mask[0] bit5 -> se_mask[1] bit16 (SE1,SH1,CU0)
+* ...
+* cu_mask[0] bit8 -> se_mask[0] bit1 (SE0,SH0,CU1)
 * ...
+*
+* First ensure all CUs are disabled, then enable user specified CUs.
 */
-   se = 0;
-   for (i = 0; i < cu_mask_count; i++) {
-   if (cu_mask[i / 32] & (1 << (i % 32)))
-   se_mask[se] |= 1 << cu;
-
-   do {
-   se++;
-   if (se == cu_info.num_shader_engines) {
-   se = 0;
-   cu++;
+   for (i = 0; i < cu_info.num_shader_engines; i++)
+   se_mask[i] = 0;
+
+   i = 0;
+   for (cu = 0; cu < 16; cu++) {
+   for (sh = 0; sh < cu_info.num_shader_arrays_per_engine; sh++) {
+   for (se = 0; se < 

[PATCH AUTOSEL 5.14 148/252] drm/display: fix possible null-pointer dereference in dcn10_set_clock()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit 554594567b1fa3da74f88ec7b2dc83d000c58e98 ]

The variable dc->clk_mgr is checked in:
  if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)

This indicates dc->clk_mgr can be NULL.
However, it is dereferenced in:
if (!dc->clk_mgr->funcs->get_clock)

To fix this null-pointer dereference, check dc->clk_mgr and the function
pointer dc->clk_mgr->funcs->get_clock earlier, and return if one of them
is NULL.

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index dee1ce5f9609..75fa4adcf5f4 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -3628,13 +3628,12 @@ enum dc_status dcn10_set_clock(struct dc *dc,
struct dc_clock_config clock_cfg = {0};
struct dc_clocks *current_clocks = >bw_ctx.bw.dcn.clk;
 
-   if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)
-   dc->clk_mgr->funcs->get_clock(dc->clk_mgr,
-   context, clock_type, 
_cfg);
-
-   if (!dc->clk_mgr->funcs->get_clock)
+   if (!dc->clk_mgr || !dc->clk_mgr->funcs->get_clock)
return DC_FAIL_UNSUPPORTED_1;
 
+   dc->clk_mgr->funcs->get_clock(dc->clk_mgr,
+   context, clock_type, _cfg);
+
if (clk_khz > clock_cfg.max_clock_khz)
return DC_FAIL_CLK_EXCEED_MAX;
 
@@ -3652,7 +3651,7 @@ enum dc_status dcn10_set_clock(struct dc *dc,
else
return DC_ERROR_UNEXPECTED;
 
-   if (dc->clk_mgr && dc->clk_mgr->funcs->update_clocks)
+   if (dc->clk_mgr->funcs->update_clocks)
dc->clk_mgr->funcs->update_clocks(dc->clk_mgr,
context, true);
return DC_OK;
-- 
2.30.2



[PATCH AUTOSEL 5.14 147/252] gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()

2021-09-09 Thread Sasha Levin
From: Tuo Li 

[ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ]

The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
   addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index bca45a15..82608df43396 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -339,7 +339,7 @@ static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan 
*i2c_bus,
 void
 amdgpu_i2c_router_select_ddc_port(const struct amdgpu_connector 
*amdgpu_connector)
 {
-   u8 val;
+   u8 val = 0;
 
if (!amdgpu_connector->router.ddc_valid)
return;
-- 
2.30.2



[PATCH AUTOSEL 5.14 136/252] drm/amd/display: fix incorrect CM/TF programming sequence in dwb

2021-09-09 Thread Sasha Levin
From: Roy Chan 

[ Upstream commit 781e1e23131cce56fb557e6ec2260480a6bd08cc ]

[How]
The programming sequence was for an old ASIC.
The correct programming sequence should be similar to the one
used in MPC; the fix is copied from the MPC programming sequence.

Reviewed-by: Anthony Koo 
Acked-by: Anson Jacob 
Signed-off-by: Roy Chan 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../drm/amd/display/dc/dcn30/dcn30_dwb_cm.c   | 90 +--
 1 file changed, 64 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
index 3fe9e41e4dbd..6a3d3a0ec0a3 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c
@@ -49,6 +49,11 @@
 static void dwb3_get_reg_field_ogam(struct dcn30_dwbc *dwbc30,
struct dcn3_xfer_func_reg *reg)
 {
+   reg->shifts.field_region_start_base = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_BASE_B;
+   reg->masks.field_region_start_base = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_BASE_B;
+   reg->shifts.field_offset = dwbc30->dwbc_shift->DWB_OGAM_RAMA_OFFSET_B;
+   reg->masks.field_offset = dwbc30->dwbc_mask->DWB_OGAM_RAMA_OFFSET_B;
+
reg->shifts.exp_region0_lut_offset = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION0_LUT_OFFSET;
reg->masks.exp_region0_lut_offset = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION0_LUT_OFFSET;
reg->shifts.exp_region0_num_segments = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION0_NUM_SEGMENTS;
@@ -66,8 +71,6 @@ static void dwb3_get_reg_field_ogam(struct dcn30_dwbc *dwbc30,
reg->masks.field_region_end_base = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_END_BASE_B;
reg->shifts.field_region_linear_slope = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_SLOPE_B;
reg->masks.field_region_linear_slope = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_SLOPE_B;
-   reg->masks.field_offset = dwbc30->dwbc_mask->DWB_OGAM_RAMA_OFFSET_B;
-   reg->shifts.field_offset = dwbc30->dwbc_shift->DWB_OGAM_RAMA_OFFSET_B;
reg->shifts.exp_region_start = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_B;
reg->masks.exp_region_start = 
dwbc30->dwbc_mask->DWB_OGAM_RAMA_EXP_REGION_START_B;
reg->shifts.exp_resion_start_segment = 
dwbc30->dwbc_shift->DWB_OGAM_RAMA_EXP_REGION_START_SEGMENT_B;
@@ -147,18 +150,19 @@ static enum dc_lut_mode dwb3_get_ogam_current(
uint32_t state_mode;
uint32_t ram_select;
 
-   REG_GET(DWB_OGAM_CONTROL,
-   DWB_OGAM_MODE, _mode);
-   REG_GET(DWB_OGAM_CONTROL,
-   DWB_OGAM_SELECT, _select);
+   REG_GET_2(DWB_OGAM_CONTROL,
+   DWB_OGAM_MODE_CURRENT, _mode,
+   DWB_OGAM_SELECT_CURRENT, _select);
 
if (state_mode == 0) {
mode = LUT_BYPASS;
} else if (state_mode == 2) {
if (ram_select == 0)
mode = LUT_RAM_A;
-   else
+   else if (ram_select == 1)
mode = LUT_RAM_B;
+   else
+   mode = LUT_BYPASS;
} else {
// Reserved value
mode = LUT_BYPASS;
@@ -172,10 +176,10 @@ static void dwb3_configure_ogam_lut(
struct dcn30_dwbc *dwbc30,
bool is_ram_a)
 {
-   REG_UPDATE(DWB_OGAM_LUT_CONTROL,
-   DWB_OGAM_LUT_READ_COLOR_SEL, 7);
-   REG_UPDATE(DWB_OGAM_CONTROL,
-   DWB_OGAM_SELECT, is_ram_a == true ? 0 : 1);
+   REG_UPDATE_2(DWB_OGAM_LUT_CONTROL,
+   DWB_OGAM_LUT_WRITE_COLOR_MASK, 7,
+   DWB_OGAM_LUT_HOST_SEL, (is_ram_a == true) ? 0 : 1);
+
REG_SET(DWB_OGAM_LUT_INDEX, 0, DWB_OGAM_LUT_INDEX, 0);
 }
 
@@ -185,17 +189,45 @@ static void dwb3_program_ogam_pwl(struct dcn30_dwbc 
*dwbc30,
 {
uint32_t i;
 
-// triple base implementation
-   for (i = 0; i < num/2; i++) {
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+0].blue_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+1].blue_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].red_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].green_reg);
-   REG_SET(DWB_OGAM_LUT_DATA, 0, DWB_OGAM_LUT_DATA, 
rgb[2*i+2].blue_reg);
+   uint32_t last_base_value_red = rgb[num-1].red_reg + 
rgb[num-1].delta_red_reg;
+   uint32_t last_base_value_green = rgb[num-1].green_reg + 

[PATCH AUTOSEL 5.14 135/252] drm/amd/display: fix missing writeback disablement if plane is removed

2021-09-09 Thread Sasha Levin
From: Roy Chan 

[ Upstream commit 82367e7f22d085092728f45fd5fbb15e3fb997c0 ]

[Why]
If the plane has been removed, the writeback disablement logic
doesn't run.

[How]
Fix the logic order.

Acked-by: Anson Jacob 
Signed-off-by: Roy Chan 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 14 --
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 12 +++-
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 5c2853654cca..a47ba1d45be9 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1723,13 +1723,15 @@ void dcn20_program_front_end_for_ctx(
 
pipe = pipe->bottom_pipe;
}
-   /* Program secondary blending tree and writeback pipes 
*/
-   pipe = >res_ctx.pipe_ctx[i];
-   if (!pipe->prev_odm_pipe && pipe->stream->num_wb_info > 0
-   && (pipe->update_flags.raw || 
pipe->plane_state->update_flags.raw || pipe->stream->update_flags.raw)
-   && 
hws->funcs.program_all_writeback_pipes_in_tree)
-   
hws->funcs.program_all_writeback_pipes_in_tree(dc, pipe->stream, context);
}
+   /* Program secondary blending tree and writeback pipes */
+   pipe = >res_ctx.pipe_ctx[i];
+   if (!pipe->top_pipe && !pipe->prev_odm_pipe
+   && pipe->stream && pipe->stream->num_wb_info > 0
+   && (pipe->update_flags.raw || 
(pipe->plane_state && pipe->plane_state->update_flags.raw)
+   || pipe->stream->update_flags.raw)
+   && 
hws->funcs.program_all_writeback_pipes_in_tree)
+   hws->funcs.program_all_writeback_pipes_in_tree(dc, 
pipe->stream, context);
}
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index 2e8ab9775fa3..fafed1e4a998 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -398,12 +398,22 @@ void dcn30_program_all_writeback_pipes_in_tree(
for (i_pipe = 0; i_pipe < dc->res_pool->pipe_count; 
i_pipe++) {
struct pipe_ctx *pipe_ctx = 
>res_ctx.pipe_ctx[i_pipe];
 
+   if (!pipe_ctx->plane_state)
+   continue;
+
if (pipe_ctx->plane_state == 
wb_info.writeback_source_plane) {
wb_info.mpcc_inst = 
pipe_ctx->plane_res.mpcc_inst;
break;
}
}
-   ASSERT(wb_info.mpcc_inst != -1);
+
+   if (wb_info.mpcc_inst == -1) {
+   /* Disable writeback pipe and disconnect from 
MPCC
+* if source plane has been removed
+*/
+   dc->hwss.disable_writeback(dc, 
wb_info.dwb_pipe_inst);
+   continue;
+   }
 
ASSERT(wb_info.dwb_pipe_inst < 
dc->res_pool->res_cap->num_dwb);
dwb = dc->res_pool->dwbc[wb_info.dwb_pipe_inst];
-- 
2.30.2



[PATCH AUTOSEL 5.14 084/252] drm/amd/display: Fix PSR command version

2021-09-09 Thread Sasha Levin
From: Mikita Lipski 

[ Upstream commit af1f2b19fd7d404d299355cc95930efee5b3ed8b ]

[why]
For dual eDP, when setting the new settings, we need to set the
command version to DMUB_CMD_PSR_CONTROL_VERSION_1; otherwise
DMUB will not read the panel_inst parameter.

[how]
Instead of PSR_VERSION_1, pass DMUB_CMD_PSR_CONTROL_VERSION_1.

Reviewed-by: Wood Wyatt 
Acked-by: Solomon Chiu 
Signed-off-by: Mikita Lipski 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c 
b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
index 10d42ae0cffe..3428334c6c57 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
@@ -207,7 +207,7 @@ static void dmub_psr_set_level(struct dmub_psr *dmub, 
uint16_t psr_level, uint8_
cmd.psr_set_level.header.sub_type = DMUB_CMD__PSR_SET_LEVEL;
cmd.psr_set_level.header.payload_bytes = sizeof(struct 
dmub_cmd_psr_set_level_data);
cmd.psr_set_level.psr_set_level_data.psr_level = psr_level;
-   cmd.psr_set_level.psr_set_level_data.cmd_version = PSR_VERSION_1;
+   cmd.psr_set_level.psr_set_level_data.cmd_version = 
DMUB_CMD_PSR_CONTROL_VERSION_1;
cmd.psr_set_level.psr_set_level_data.panel_inst = panel_inst;
dc_dmub_srv_cmd_queue(dc->dmub_srv, );
dc_dmub_srv_cmd_execute(dc->dmub_srv);
@@ -293,7 +293,7 @@ static bool dmub_psr_copy_settings(struct dmub_psr *dmub,
copy_settings_data->debug.bitfields.use_hw_lock_mgr = 1;
copy_settings_data->fec_enable_status = (link->fec_state == 
dc_link_fec_enabled);
copy_settings_data->fec_enable_delay_in100us = 
link->dc->debug.fec_enable_delay_in100us;
-   copy_settings_data->cmd_version =  PSR_VERSION_1;
+   copy_settings_data->cmd_version =  DMUB_CMD_PSR_CONTROL_VERSION_1;
copy_settings_data->panel_inst = panel_inst;
 
dc_dmub_srv_cmd_queue(dc->dmub_srv, );
-- 
2.30.2



[PATCH AUTOSEL 5.14 063/252] drm/amd/amdgpu: Update debugfs link_settings output link_rate field in hex

2021-09-09 Thread Sasha Levin
From: Anson Jacob 

[ Upstream commit 1a394b3c3de2577f200cb623c52a5c2b82805cec ]

link_rate is updated via debugfs using hex values; set it to output
in hex as well.

eg: Resolution: 1920x1080@144Hz
cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x14  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred:  0  0x0  0

echo "4 0x1e" > /sys/kernel/debug/dri/0/DP-1/link_settings

cat /sys/kernel/debug/dri/0/DP-1/link_settings
Current:  4  0x1e  0  Verified:  4  0x1e  0  Reported:  4  0x1e  16  Preferred:  4  0x1e  0

Signed-off-by: Anson Jacob 
Reviewed-by: Harry Wentland 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c| 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index f1145086a468..1d15a9af9956 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -197,29 +197,29 @@ static ssize_t dp_link_settings_read(struct file *f, char 
__user *buf,
 
rd_buf_ptr = rd_buf;
 
-   str_len = strlen("Current:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Current:  %d  %d  %d  ",
+   str_len = strlen("Current:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Current:  %d  0x%x  %d  ",
link->cur_link_settings.lane_count,
link->cur_link_settings.link_rate,
link->cur_link_settings.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Verified:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Verified:  %d  %d  %d  ",
+   str_len = strlen("Verified:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Verified:  %d  0x%x  %d  ",
link->verified_link_cap.lane_count,
link->verified_link_cap.link_rate,
link->verified_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Reported:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Reported:  %d  %d  %d  ",
+   str_len = strlen("Reported:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Reported:  %d  0x%x  %d  ",
link->reported_link_cap.lane_count,
link->reported_link_cap.link_rate,
link->reported_link_cap.link_spread);
rd_buf_ptr += str_len;
 
-   str_len = strlen("Preferred:  %d  %d  %d  ");
-   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  %d  %d\n",
+   str_len = strlen("Preferred:  %d  0x%x  %d  ");
+   snprintf(rd_buf_ptr, str_len, "Preferred:  %d  0x%x  %d\n",
link->preferred_link_setting.lane_count,
link->preferred_link_setting.link_rate,
link->preferred_link_setting.link_spread);
-- 
2.30.2



[PATCH AUTOSEL 5.14 062/252] drm/amdgpu: Fix a printing message

2021-09-09 Thread Sasha Levin
From: Oak Zeng 

[ Upstream commit 95f71f12aa45d65b7f2ccab95569795edffd379a ]

The message "PSP loading VCN firmware" is misleading because people
might think the driver is loading VCN firmware. Actually, when this
message is printed, the driver is just preparing some VCN ucode, not
loading it yet. The actual VCN firmware loading happens in the PSP
block's hw_init. Fix the printing message accordingly.

Signed-off-by: Oak Zeng 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 284bb42d6c86..121ee9f2b8d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -119,7 +119,7 @@ static int vcn_v1_0_sw_init(void *handle)
adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 8af567c546db..f4686e918e0d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -122,7 +122,7 @@ static int vcn_v2_0_sw_init(void *handle)
adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 888b17d84691..e0c0c3734432 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -152,7 +152,7 @@ static int vcn_v2_5_sw_init(void *handle)
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), 
PAGE_SIZE);
}
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 47d4f04cbd69..2f017560948e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -160,7 +160,7 @@ static int vcn_v3_0_sw_init(void *handle)
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), 
PAGE_SIZE);
}
-   DRM_INFO("PSP loading VCN firmware\n");
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
}
 
r = amdgpu_vcn_resume(adev);
-- 
2.30.2



[PATCH AUTOSEL 5.14 061/252] drm/amd/display: Fixed hardware power down bypass during headless boot

2021-09-09 Thread Sasha Levin
From: Jake Wang 

[ Upstream commit 3addbde269f21ffc735f6d3d0c2237664923824e ]

[Why]
During headless boot, DIG may be on, which causes HW/SW discrepancies.
To avoid this we power down hardware on boot if DIG is turned on. With
the introduction of multiple eDP, hardware power down is being bypassed
under certain conditions.

[How]
Fixed hardware power down bypass, and ensured hardware will power down
if DIG is on and seamless boot is not enabled.

Reviewed-by: Nicholas Kazlauskas 
Acked-by: Rodrigo Siqueira 
Signed-off-by: Jake Wang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 27 +--
 .../drm/amd/display/dc/dcn30/dcn30_hwseq.c| 25 -
 .../drm/amd/display/dc/dcn31/dcn31_hwseq.c|  5 +++-
 3 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index c545eddabdcc..dee1ce5f9609 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -1502,25 +1502,22 @@ void dcn10_init_hw(struct dc *dc)
 void dcn10_power_down_on_boot(struct dc *dc)
 {
struct dc_link *edp_links[MAX_NUM_EDP];
-   struct dc_link *edp_link;
+   struct dc_link *edp_link = NULL;
int edp_num;
int i = 0;
 
get_edp_links(dc, edp_links, &edp_num);
-
-   if (edp_num) {
-   for (i = 0; i < edp_num; i++) {
-   edp_link = edp_links[i];
-   if (edp_link->link_enc->funcs->is_dig_enabled &&
-   
edp_link->link_enc->funcs->is_dig_enabled(edp_link->link_enc) &&
-   dc->hwseq->funcs.edp_backlight_control 
&&
-   dc->hwss.power_down &&
-   dc->hwss.edp_power_control) {
-   
dc->hwseq->funcs.edp_backlight_control(edp_link, false);
-   dc->hwss.power_down(dc);
-   dc->hwss.edp_power_control(edp_link, false);
-   }
-   }
+   if (edp_num)
+   edp_link = edp_links[0];
+
+   if (edp_link && edp_link->link_enc->funcs->is_dig_enabled &&
+   
edp_link->link_enc->funcs->is_dig_enabled(edp_link->link_enc) &&
+   dc->hwseq->funcs.edp_backlight_control &&
+   dc->hwss.power_down &&
+   dc->hwss.edp_power_control) {
+   dc->hwseq->funcs.edp_backlight_control(edp_link, false);
+   dc->hwss.power_down(dc);
+   dc->hwss.edp_power_control(edp_link, false);
} else {
for (i = 0; i < dc->link_count; i++) {
struct dc_link *link = dc->links[i];
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index c68e3a708a33..2e8ab9775fa3 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -580,22 +580,19 @@ void dcn30_init_hw(struct dc *dc)
 */
if (dc->config.power_down_display_on_boot) {
struct dc_link *edp_links[MAX_NUM_EDP];
-   struct dc_link *edp_link;
+   struct dc_link *edp_link = NULL;
 
get_edp_links(dc, edp_links, &edp_num);
-   if (edp_num) {
-   for (i = 0; i < edp_num; i++) {
-   edp_link = edp_links[i];
-   if (edp_link->link_enc->funcs->is_dig_enabled &&
-   
edp_link->link_enc->funcs->is_dig_enabled(edp_link->link_enc) &&
-   dc->hwss.edp_backlight_control 
&&
-   dc->hwss.power_down &&
-   dc->hwss.edp_power_control) {
-   
dc->hwss.edp_backlight_control(edp_link, false);
-   dc->hwss.power_down(dc);
-   dc->hwss.edp_power_control(edp_link, 
false);
-   }
-   }
+   if (edp_num)
+   edp_link = edp_links[0];
+   if (edp_link && edp_link->link_enc->funcs->is_dig_enabled &&
+   
edp_link->link_enc->funcs->is_dig_enabled(edp_link->link_enc) &&
+   dc->hwss.edp_backlight_control &&
+   dc->hwss.power_down &&
+   dc->hwss.edp_power_control) {
+   dc->hwss.edp_backlight_control(edp_link, false);
+   dc->hwss.power_down(dc);
+   

[PATCH AUTOSEL 5.14 052/252] drm/amd/display: Fix timer_per_pixel unit error

2021-09-09 Thread Sasha Levin
From: Oliver Logush 

[ Upstream commit 23e55639b87fb16a9f0f66032ecb57060df6c46c ]

[why]
The units of the time_per_pixel variable were incorrect; this had to be
changed for the code to function properly.

[how]
The change was straightforward: only one line of code, where the
calculation is done, needed to be changed.

Acked-by: Rodrigo Siqueira 
Signed-off-by: Oliver Logush 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index b173fa3653b5..c78933a9d31c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2462,7 +2462,7 @@ void dcn20_set_mcif_arb_params(
wb_arb_params->cli_watermark[k] = get_wm_writeback_urgent(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000;
wb_arb_params->pstate_watermark[k] = get_wm_writeback_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000;
}
-   wb_arb_params->time_per_pixel = 16.0 / context->res_ctx.pipe_ctx[i].stream->phy_pix_clk; /* 4 bit fraction, ms */
+   wb_arb_params->time_per_pixel = 16.0 * 1000 / (context->res_ctx.pipe_ctx[i].stream->phy_pix_clk / 1000); /* 4 bit fraction, ms */
wb_arb_params->slice_lines = 32;
wb_arb_params->arbitration_slice = 2;
wb_arb_params->max_scaled_time = 
dcn20_calc_max_scaled_time(wb_arb_params->time_per_pixel,
-- 
2.30.2



[PATCH AUTOSEL 5.14 008/252] drm/amdgpu: Fix koops when accessing RAS EEPROM

2021-09-09 Thread Sasha Levin
From: Luben Tuikov 

[ Upstream commit 1d9d2ca85b32605ac9c74c8fa42d0c1cfbe019d4 ]

Debugfs RAS EEPROM files are available when
the ASIC supports RAS, and when the debugfs is
enabled, and also when the "ras_enable" module
parameter is set to 0. However in this case,
we get a kernel oops when accessing some of
the "ras_..." controls in debugfs. The reason
for this is that struct amdgpu_ras::adev is
unset. This commit sets it, thus enabling access
to those facilities. Note that this facilitates
EEPROM access and not necessarily RAS features or
functionality.

Cc: Alexander Deucher 
Cc: John Clements 
Cc: Hawking Zhang 
Signed-off-by: Luben Tuikov 
Acked-by: Alexander Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index fc66aca28594..95d5842385b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1966,11 +1966,20 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
bool exc_err_limit = false;
int ret;
 
-   if (adev->ras_enabled && con)
-   data = &con->eh_data;
-   else
+   if (!con)
+   return 0;
+
+   /* Allow access to RAS EEPROM via debugfs, when the ASIC
+* supports RAS and debugfs is enabled, but when
+* adev->ras_enabled is unset, i.e. when "ras_enable"
+* module parameter is set to 0.
+*/
+   con->adev = adev;
+
+   if (!adev->ras_enabled)
return 0;
 
+   data = &con->eh_data;
*data = kmalloc(sizeof(**data), GFP_KERNEL | __GFP_ZERO);
if (!*data) {
ret = -ENOMEM;
@@ -1980,7 +1989,6 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
mutex_init(&con->recovery_lock);
INIT_WORK(&con->recovery_work, amdgpu_ras_do_recovery);
atomic_set(&con->in_recovery, 0);
-   con->adev = adev;
 
max_eeprom_records_len = amdgpu_ras_eeprom_get_record_max_length();
amdgpu_ras_validate_threshold(adev, max_eeprom_records_len);
-- 
2.30.2



[PATCH AUTOSEL 5.14 007/252] drm/amdgpu: Fix amdgpu_ras_eeprom_init()

2021-09-09 Thread Sasha Levin
From: Luben Tuikov 

[ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ]

No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes to the lower layers.

Cc: Andrey Grodzovsky 
Cc: Alexander Deucher 
Signed-off-by: Luben Tuikov 
Acked-by: Alexander Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 38222de921d1..8dd151c9e459 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -325,7 +325,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control 
*control,
return ret;
}
 
-   __decode_table_header_from_buff(hdr, &buff[2]);
+   __decode_table_header_from_buff(hdr, buff);
 
if (hdr->header == EEPROM_TABLE_HDR_VAL) {
control->num_recs = (hdr->tbl_size - EEPROM_TABLE_HEADER_SIZE) /
-- 
2.30.2



Re: [PATCH v2] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings(v2)

2021-09-09 Thread Huang Rui
On Thu, Sep 09, 2021 at 04:00:05PM +0800, Yu, Lang wrote:
> sysfs_emit and sysfs_emit_at require a page-boundary-aligned
> buf address. Make them happy!
> 
> v2: use an inline function
> 
> Warning Log:
> [  492.545174] invalid sysfs_emit_at: buf:f19bdfde at:0
> [  492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765 
> sysfs_emit_at+0x4a/0xa0
> [  492.654805] Call Trace:
> [  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu]
> [  492.656780]  vangogh_print_clk_levels+0x369/0x410 [amdgpu]
> [  492.658245]  vangogh_common_print_clk_levels+0x77/0x80 [amdgpu]
> [  492.659733]  ? preempt_schedule_common+0x18/0x30
> [  492.660713]  smu_print_ppclk_levels+0x65/0x90 [amdgpu]
> [  492.662107]  amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu]
> [  492.663620]  dev_attr_show+0x1d/0x40
> 
> Signed-off-by: Lang Yu 

Looks OK to me. Although it's not perfect, the legacy design impacts a lot
of ASICs and it's hard to change them one by one, so this solution is OK
with minimal impact at this moment.

Acked-by: Huang Rui 

> ---
>  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c|  8 ++--
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c  |  4 +++-
>  .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c  |  4 +++-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 16 ++--
>  drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c  |  2 ++
>  .../gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c   | 12 
>  .../gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c |  6 --
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h   | 13 +
>  8 files changed, 49 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index e343cc218990..2e5a362aa06b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -771,8 +771,12 @@ static int arcturus_print_clk_levels(struct smu_context 
> *smu,
>   struct smu_11_0_dpm_context *dpm_context = NULL;
>   uint32_t gen_speed, lane_width;
>  
> - if (amdgpu_ras_intr_triggered())
> - return sysfs_emit(buf, "unavailable\n");
> + smu_cmn_get_sysfs_buf(&buf, &size);
> +
> + if (amdgpu_ras_intr_triggered()) {
> + size += sysfs_emit_at(buf, size, "unavailable\n");
> + return size;
> + }
>  
>   dpm_context = smu_dpm->dpm_context;
>  
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 4c81989b8162..63e1f0db579c 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -1279,6 +1279,8 @@ static int navi10_print_clk_levels(struct smu_context 
> *smu,
>   struct smu_11_0_overdrive_table *od_settings = smu->od_settings;
>   uint32_t min_value, max_value;
>  
> + smu_cmn_get_sysfs_buf(&buf, &size);
> +
>   switch (clk_type) {
>   case SMU_GFXCLK:
>   case SMU_SCLK:
> @@ -1392,7 +1394,7 @@ static int navi10_print_clk_levels(struct smu_context 
> *smu,
>   case SMU_OD_RANGE:
>   if (!smu->od_enabled || !od_table || !od_settings)
>   break;
> - size = sysfs_emit(buf, "%s:\n", "OD_RANGE");
> + size += sysfs_emit_at(buf, size, "%s:\n", "OD_RANGE");
>  
>   if (navi10_od_feature_is_supported(od_settings, 
> SMU_11_0_ODCAP_GFXCLK_LIMITS)) {
>   navi10_od_setting_get_range(od_settings, 
> SMU_11_0_ODSETTING_GFXCLKFMIN,
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index 5e292c3f5050..d7519688065f 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -1058,6 +1058,8 @@ static int sienna_cichlid_print_clk_levels(struct 
> smu_context *smu,
>   uint32_t min_value, max_value;
>   uint32_t smu_version;
>  
> + smu_cmn_get_sysfs_buf(&buf, &size);
> +
>   switch (clk_type) {
>   case SMU_GFXCLK:
>   case SMU_SCLK:
> @@ -1180,7 +1182,7 @@ static int sienna_cichlid_print_clk_levels(struct 
> smu_context *smu,
>   if (!smu->od_enabled || !od_table || !od_settings)
>   break;
>  
> - size = sysfs_emit(buf, "%s:\n", "OD_RANGE");
> + size += sysfs_emit_at(buf, size, "%s:\n", "OD_RANGE");
>  
>   if (sienna_cichlid_is_od_feature_supported(od_settings, 
> SMU_11_0_7_ODCAP_GFXCLK_LIMITS)) {
>   sienna_cichlid_get_od_setting_range(od_settings, 
> SMU_11_0_7_ODSETTING_GFXCLKFMIN,
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> index 3a3421452e57..f6ef0ce6e9e2 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> 

Re: [PATCH] Enable '-Werror' by default for all kernel builds

2021-09-09 Thread Arnd Bergmann
On Thu, Sep 9, 2021 at 12:54 PM Marco Elver  wrote:
> On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig  wrote:
> > On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > > the past. Or at least make it dependent on the problematic architecture.
> > > For example if arm is a problem, something like this:
> >
> > I'm also seeing quite a few stack size warnings with KASAN on x86_64
> > without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
> > few warnings without KASAN, but with KASAN there are a lot more.
> > I'll try to find some time to dig into them.
>
> Right, this reminded me that we actually at least double the real
> stack size for KASAN builds, because it inherently requires more stack
> space. I think we need Wframe-larger-than to match that, otherwise
> we'll just keep having this problem:
>
> https://lkml.kernel.org/r/20210909104925.809674-1-el...@google.com

The problem with this is that it completely defeats the point of the
stack size warnings in allmodconfig kernels when they have KASAN
enabled and end up missing obvious code bugs in drivers that put
large structures on the stack. Let's not go there.

Arnd


Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

2021-09-09 Thread Christian König

Am 09.09.21 um 10:36 schrieb Lazar, Lijo:

On 9/9/2021 1:31 PM, Christian König wrote:

Am 09.09.21 um 05:28 schrieb Lazar, Lijo:

On 9/9/2021 8:13 AM, Yu, Lang wrote:

[AMD Official Use Only]

So the final decision is rollback to scnprintf().

If we can define our own helper functions like 
sysfs_emit/sysfs_emit_at


but without page boundary aligned limitation to make life easier?



No, we do want to make it clear that this function is used for sysfs 
files and make use of the extra checks provided by the sysfs_emit* 
functions. Looking at the origins of sysfs_emit_at() specifically, 
there are indeed some cases of printing more than one value per 
file and multi-line usage.


Correct, but those are rather limited and well documented special 
cases. E.g. for example if you need to grab a lock to get multiple 
values which are supposed to be coherent to each other.


I think that's the case here, so printing multiple values is probably 
ok in general. But we still need to get the implementation straight.


So I'm fine with your original patch. Maybe, you can make the 
intention explicit by keeping the offset and buf start calculations 
in a separate inline function.

smu_get_sysfs_buf()


Exactly that is what is not ok. So once more the intended use case of 
those functions is:


offs = sysfs_emit(page, ...);
offs += sysfs_emit_at(page, offs, ...);
offs += sysfs_emit_at(page, offs, ...);
...

Another possible alternative which I think should be allowed is:

offs = 0;
for_each_clock_in_my_device(..) {
 offs += sysfs_emit_at(page, offs, ...);
}

But when you are calculating the initial offset manually then there 
is certainly something wrong here and that is not the intended usage 
pattern.




Actually, the issue is not within one function invocation. The issue 
is at the caller side with multiple invocations -


 size = amdgpu_dpm_print_clock_levels(adev, OD_SCLK, 
buf);

 size += amdgpu_dpm_print_clock_levels(adev, OD_MCLK,
buf+size);

Having amdgpu_dpm_print_clock_levels() helped to consolidate sysfs 
calls in single function for different parameters and used for 
different nodes. However in this case, different parameters are 
presented as a single "logical entity" in a sysfs node and the 
function is called multiple times for different parameters 
(questionable as per sysfs guidelines, Alex needs to come back on this).


Yes, exactly that. But as I noted above multiple values are ok when you 
need to keep them coherent, e.g. returning SCLK and MCLK without a 
performance level switch in between.


Within one invocation of amdgpu_dpm_print_clock_levels(), the APIs are 
used correctly. For the second call, it needs to pass the page aligned 
buf pointer correctly to sysfs_emit* calls.


Presently, amdgpu_dpm_print_clock_levels() takes only buff pointer as 
argument and probably it is that way since the function existed before 
sysfs_emit* patches got added and was originally using sprintf.


Now, two possible options are -

1) Make a pre-requisite that this function is always going to print to 
sysfs files. For that use sysfs_emit*. Also, as with a sysfs buffer 
calculate the page aligned start address of buf and offset for use 
with sysfs_emit* in the beginning. At least for now, this assumption 
is inline with the buffer start address requirement in sysfs_emit* 
patches. This is what the original patch does. That said, if the 
buffer properties change in future this will not hold good.


2) Pass the offset along with the buff in API. That will be extensive 
since it affects older powerplay based HWs also.


I think that should be the way to go then.


There may be other ways, but those could be even more extensive than 2).


From a high-level view my feeling tells me that returning the values as 
numbers and printing them in the higher-level function is a better 
design, but there might well be reasons against that.


Regards,
Christian.



Thanks,
Lijo


Regards,
Christian.



Thanks,
Lijo


Regards,

Lang

*From:* Powell, Darren 
*Sent:* Thursday, September 9, 2021 6:18 AM
*To:* Christian König; Lazar, Lijo; Yu, Lang; amd-gfx@lists.freedesktop.org
*Cc:* Deucher, Alexander; Huang, Ray; Tian Tao
*Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at 
warnings


[AMD Official Use Only]

 



*From:* Christian König
*Sent:* Wednesday, September 8, 2021 8:43 AM
*To:* Lazar, Lijo; Yu, Lang; amd-gfx@lists.freedesktop.org
*Cc:* Deucher, Alexander; Huang, Ray; Tian Tao; Powell, Darren
*Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings


Am 08.09.21 

Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

2021-09-09 Thread Lazar, Lijo




On 9/9/2021 1:31 PM, Christian König wrote:

Am 09.09.21 um 05:28 schrieb Lazar, Lijo:

On 9/9/2021 8:13 AM, Yu, Lang wrote:

[AMD Official Use Only]

So the final decision is rollback to scnprintf().

If we can define our own helper functions like sysfs_emit/sysfs_emit_at

but without page boundary aligned limitation to make life easier?



No, we do want to make it clear that this function is used for sysfs 
files and make use of the extra checks provided by the sysfs_emit* 
functions. Looking at the origins of sysfs_emit_at() specifically, 
there are indeed some cases of printing more than one value per file 
and multi-line usage.


Correct, but those are rather limited and well documented special cases. 
E.g. for example if you need to grab a lock to get multiple values which 
are supposed to be coherent to each other.


I think that's the case here, so printing multiple values is probably ok 
in general. But we still need to get the implementation straight.


So I'm fine with your original patch. Maybe, you can make the 
intention explicit by keeping the offset and buf start calculations in 
a separate inline function.

smu_get_sysfs_buf()


Exactly that is what is not ok. So once more the intended use case of 
those functions is:


offs = sysfs_emit(page, ...);
offs += sysfs_emit_at(page, offs, ...);
offs += sysfs_emit_at(page, offs, ...);
...

Another possible alternative which I think should be allowed is:

offs = 0;
for_each_clock_in_my_device(..) {
     offs += sysfs_emit_at(page, offs, ...);
}

But when you are calculating the initial offset manually then there is 
certainly something wrong here and that is not the intended usage pattern.




Actually, the issue is not within one function invocation. The issue is 
at the caller side with multiple invocations -


 size = amdgpu_dpm_print_clock_levels(adev, OD_SCLK, buf);
 size += amdgpu_dpm_print_clock_levels(adev, OD_MCLK,
buf+size);

Having amdgpu_dpm_print_clock_levels() helped to consolidate sysfs calls 
in single function for different parameters and used for different 
nodes. However in this case, different parameters are presented as a 
single "logical entity" in a sysfs node and the function is called 
multiple times for different parameters (questionable as per sysfs 
guidelines, Alex needs to come back on this).


Within one invocation of amdgpu_dpm_print_clock_levels(), the APIs are 
used correctly. For the second call, it needs to pass the page aligned 
buf pointer correctly to sysfs_emit* calls.


Presently, amdgpu_dpm_print_clock_levels() takes only buff pointer as 
argument and probably it is that way since the function existed before 
sysfs_emit* patches got added and was originally using sprintf.


Now, two possible options are -

1) Make a pre-requisite that this function is always going to print to 
sysfs files. For that use sysfs_emit*. Also, as with a sysfs buffer 
calculate the page aligned start address of buf and offset for use with 
sysfs_emit* in the beginning. At least for now, this assumption is 
inline with the buffer start address requirement in sysfs_emit* patches. 
This is what the original patch does. That said, if the buffer 
properties change in future this will not hold good.


2) Pass the offset along with the buff in API. That will be extensive 
since it affects older powerplay based HWs also.


There may be other ways, but those could be even more extensive than 2).

Thanks,
Lijo


Regards,
Christian.



Thanks,
Lijo


Regards,

Lang

*From:* Powell, Darren 
*Sent:* Thursday, September 9, 2021 6:18 AM
*To:* Christian König; Lazar, Lijo; Yu, Lang; amd-gfx@lists.freedesktop.org
*Cc:* Deucher, Alexander; Huang, Ray; Tian Tao

*Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

[AMD Official Use Only]



*From:* Christian König
*Sent:* Wednesday, September 8, 2021 8:43 AM
*To:* Lazar, Lijo; Yu, Lang; amd-gfx@lists.freedesktop.org
*Cc:* Deucher, Alexander; Huang, Ray; Tian Tao; Powell, Darren

*Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

Am 08.09.21 um 12:22 schrieb Lazar, Lijo:
 > On 9/8/2021 3:08 PM, Christian König wrote:
 >> Am 08.09.21 um 11:29 schrieb Lazar, Lijo:
 >>> On 9/8/2021 2:32 PM, Yu, Lang wrote:
  [AMD Official Use Only]
 > -Original Message-
> From: Lazar, Lijo

 > Sent: Wednesday, September 8, 2021 4:55 PM
> To: Yu, Lang; Christian König; amd-gfx@lists.freedesktop.org

Re: [PATCH 2/2] drm/amdgpu: alloc IB extra msg from IB pool

2021-09-09 Thread Christian König
Yes, correct, but the key point is that I want the handling for old and 
new UVD blocks to be the same.


This makes it more likely that the code keeps working on all hardware 
generations when we change something.


See, that old hardware is rare to get these days, and I don't want to 
risk any bug reports from end users when we haven't tested that well on 
SI/CIK.


Christian.

Am 09.09.21 um 10:11 schrieb Pan, Xinhui:


[AMD Official Use Only]


Well, if the IB test fails because we use the GTT domain or VRAM above
256MB, then the failure is expected. Doesn't the IB test exist to
detect such issues?


*From:* Koenig, Christian 
*Sent:* Thursday, September 9, 2021 15:16
*To:* Pan, Xinhui; amd-gfx@lists.freedesktop.org
*Cc:* Deucher, Alexander
*Subject:* Re: [PATCH 2/2] drm/amdgpu: alloc IB extra msg from IB pool

Am 09.09.21 um 07:55 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> There is one dedicated IB pool for IB test. So lets use it for extra msg
> too.
>
> For UVD on older HW, use one reserved BO at specific range.
>
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 173 +++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h |   1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c |  18 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c |  99 ++
>   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   |  28 ++--
>   drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   |  28 ++--
>   6 files changed, 185 insertions(+), 162 deletions(-)

Please split that up into one patch for UVD, one for VCE and a third for
VCN.

>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c

> index d451c359606a..733cfc848c6c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -299,8 +299,36 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
>  }
>
>  /* from uvd v5.0 HW addressing capacity increased to 64 bits */
> -   if (!amdgpu_device_ip_block_version_cmp(adev, 
AMD_IP_BLOCK_TYPE_UVD, 5, 0))
> +   if (!amdgpu_device_ip_block_version_cmp(adev, 
AMD_IP_BLOCK_TYPE_UVD, 5, 0)) {

>  adev->uvd.address_64_bit = true;

Yeah, that's exactly what I'm trying to avoid.

We should use the BO approach both for old and new UVD blocks, just
making sure that we place it correctly for the old ones.

This way we have much lower chance of breaking the old stuff.

Thanks,
Christian.

> +   } else {
> +   struct amdgpu_bo *bo = NULL;
> +   void *addr;
> +
> +   r = amdgpu_bo_create_reserved(adev, PAGE_SIZE, PAGE_SIZE,

> + AMDGPU_GEM_DOMAIN_VRAM,
> +   , NULL, );
> +   if (r)
> +   return r;
> +   amdgpu_bo_kunmap(bo);
> +   amdgpu_bo_unpin(bo);
> +   r = amdgpu_bo_pin_restricted(bo, AMDGPU_GEM_DOMAIN_VRAM,
> +   0, 256 << 20);
> +   if (r) {
> +   amdgpu_bo_unreserve(bo);
> +   amdgpu_bo_unref();
> +   return r;
> +   }
> +   r = amdgpu_bo_kmap(bo, );
> +   if (r) {
> +   amdgpu_bo_unpin(bo);
> +   amdgpu_bo_unreserve(bo);
> +   amdgpu_bo_unref();
> +   return r;
> +   }
> +   adev->uvd.ib_bo = bo;
> +   amdgpu_bo_unreserve(bo);
> +   }
>
>  switch (adev->asic_type) {
>  case CHIP_TONGA:
> @@ -342,6 +370,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
>  for (i = 0; i < AMDGPU_MAX_UVD_ENC_RINGS; ++i)
> amdgpu_ring_fini(>uvd.inst[j].ring_enc[i]);
>  }
> + amdgpu_bo_free_kernel(>uvd.ib_bo, NULL, NULL);
>  release_firmware(adev->uvd.fw);
>
>  return 0;
> @@ -1066,7 +1095,7 @@ int amdgpu_uvd_ring_parse_cs(struct amdgpu_cs_parser *parser, uint32_t ib_idx)
>  return 0;
>   }
>
> -static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
> +static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, uint64_t addr,
> bool direct, struct dma_fence **fence)
>   {
>  struct amdgpu_device *adev = ring->adev;
> @@ -1074,29 +1103,15 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
>  struct amdgpu_job *job;
>  struct amdgpu_ib *ib;
>  uint32_t data[4];
> -   uint64_t addr;
>  long r;
>  int i;
>  unsigned offset_idx = 0;
>  unsigned offset[3] = { UVD_BASE_SI, 0, 0 };
>
> -   amdgpu_bo_kunmap(bo);
> -   amdgpu_bo_unpin(bo);
> -
> -   if (!ring->adev->uvd.address_64_bit) {
> -   struct ttm_operation_ctx ctx = { true, false };
> -
> -   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_VRAM);

> - 

RE: Subject: [PATCH 1/1] drm/amdgpu: Update RAS trigger error block support

2021-09-09 Thread Zhang, Hawking
[AMD Official Use Only]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Clements, John 
Sent: Thursday, September 9, 2021 15:59
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Li, 
Candice 
Subject: Subject: [PATCH 1/1] drm/amdgpu: Update RAS trigger error block support


[AMD Official Use Only]

Submitting patch to update RAS trigger error to support additional blocks


Re: [PATCH 1/2] drm/amdgpu: Increase direct IB pool size

2021-09-09 Thread Christian König
Ok, good to know. That's probably the reason why we didn't push that 
stuff into the IB in the first place.


And yes, using a fixed 256KiB sounds like a plan to me then, but please 
also double-check the AMDGPU_IB_POOL_SIZE define.


I also won't mind if you just open-code the two initializations, since 
there probably will never be any more than that.


Thanks,
Christian.

Am 09.09.21 um 09:57 schrieb Pan, Xinhui:


[AMD Official Use Only]


Yep, VCN needs 128kb of extra memory. I will make the pool size a 
constant 256kb.


*From:* Koenig, Christian 
*Sent:* Thursday, September 9, 2021 3:14:15 PM
*To:* Pan, Xinhui ; amd-gfx@lists.freedesktop.org 


*Cc:* Deucher, Alexander 
*Subject:* Re: [PATCH 1/2] drm/amdgpu: Increase direct IB pool size
Am 09.09.21 um 07:54 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> Direct IB pool is used for vce/uvd/vcn IB extra msg too. Increase its
> size to 64 pages.

Do you really run into issues with that? 64 pages are 256kiB on x86 and
the extra msg are maybe 2kiB.

In addition to that, we should probably make this a constant independent
of the CPU page size.

Christian.

>
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c

> index c076a6b9a5a2..cd2c7073fdd9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -308,7 +308,7 @@ int amdgpu_ib_pool_init(struct amdgpu_device *adev)
>
>  for (i = 0; i < AMDGPU_IB_POOL_MAX; i++) {
>  if (i == AMDGPU_IB_POOL_DIRECT)
> -   size = PAGE_SIZE * 6;
> +   size = PAGE_SIZE * 64;
>  else
>  size = AMDGPU_IB_POOL_SIZE;
>
> --
> 2.25.1
>





Re: [PATCH 2/2] drm/amdgpu: alloc IB extra msg from IB pool

2021-09-09 Thread Pan, Xinhui
[AMD Official Use Only]

Well, if the IB test fails because we use the GTT domain or
VRAM above 256MB, then the failure is expected.
Doesn't the IB test exist to detect exactly such issues?


From: Koenig, Christian 
Sent: Thursday, September 9, 2021 15:16
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: [PATCH 2/2] drm/amdgpu: alloc IB extra msg from IB pool

Am 09.09.21 um 07:55 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> There is one dedicated IB pool for IB test. So lets use it for extra msg
> too.
>
> For UVD on older HW, use one reserved BO at specific range.
>
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 173 +++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h |   1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c |  18 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c |  99 ++
>   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   |  28 ++--
>   drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   |  28 ++--
>   6 files changed, 185 insertions(+), 162 deletions(-)

Please split that up into one patch for UVD, one for VCE and a third for
VCN.

>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index d451c359606a..733cfc848c6c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -299,8 +299,36 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
>  }
>
>  /* from uvd v5.0 HW addressing capacity increased to 64 bits */
> -   if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 
> 5, 0))
> +   if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 
> 5, 0)) {
>  adev->uvd.address_64_bit = true;

Yeah, that's exactly what I'm trying to avoid.

We should use the BO approach both for old and new UVD blocks, just
making sure that we place it correctly for the old ones.

This way we have much lower chance of breaking the old stuff.

Thanks,
Christian.

> +   } else {
> +   struct amdgpu_bo *bo = NULL;
> +   void *addr;
> +
> +   r = amdgpu_bo_create_reserved(adev, PAGE_SIZE, PAGE_SIZE,
> +   AMDGPU_GEM_DOMAIN_VRAM,
> +   , NULL, );
> +   if (r)
> +   return r;
> +   amdgpu_bo_kunmap(bo);
> +   amdgpu_bo_unpin(bo);
> +   r = amdgpu_bo_pin_restricted(bo, AMDGPU_GEM_DOMAIN_VRAM,
> +   0, 256 << 20);
> +   if (r) {
> +   amdgpu_bo_unreserve(bo);
> +   amdgpu_bo_unref();
> +   return r;
> +   }
> +   r = amdgpu_bo_kmap(bo, );
> +   if (r) {
> +   amdgpu_bo_unpin(bo);
> +   amdgpu_bo_unreserve(bo);
> +   amdgpu_bo_unref();
> +   return r;
> +   }
> +   adev->uvd.ib_bo = bo;
> +   amdgpu_bo_unreserve(bo);
> +   }
>
>  switch (adev->asic_type) {
>  case CHIP_TONGA:
> @@ -342,6 +370,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
>  for (i = 0; i < AMDGPU_MAX_UVD_ENC_RINGS; ++i)
>  amdgpu_ring_fini(>uvd.inst[j].ring_enc[i]);
>  }
> +   amdgpu_bo_free_kernel(>uvd.ib_bo, NULL, NULL);
>  release_firmware(adev->uvd.fw);
>
>  return 0;
> @@ -1066,7 +1095,7 @@ int amdgpu_uvd_ring_parse_cs(struct amdgpu_cs_parser 
> *parser, uint32_t ib_idx)
>  return 0;
>   }
>
> -static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo 
> *bo,
> +static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, uint64_t addr,
> bool direct, struct dma_fence **fence)
>   {
>  struct amdgpu_device *adev = ring->adev;
> @@ -1074,29 +1103,15 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring 
> *ring, struct amdgpu_bo *bo,
>  struct amdgpu_job *job;
>  struct amdgpu_ib *ib;
>  uint32_t data[4];
> -   uint64_t addr;
>  long r;
>  int i;
>  unsigned offset_idx = 0;
>  unsigned offset[3] = { UVD_BASE_SI, 0, 0 };
>
> -   amdgpu_bo_kunmap(bo);
> -   amdgpu_bo_unpin(bo);
> -
> -   if (!ring->adev->uvd.address_64_bit) {
> -   struct ttm_operation_ctx ctx = { true, false };
> -
> -   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_VRAM);
> -   amdgpu_uvd_force_into_uvd_segment(bo);
> -   r = ttm_bo_validate(>tbo, >placement, );
> -   if (r)
> -   goto err;
> -   }
> -
>  r = amdgpu_job_alloc_with_ib(adev, 64, direct ? 
> AMDGPU_IB_POOL_DIRECT :
>   AMDGPU_IB_POOL_DELAYED, );
>  if (r)
> -   goto err;
> +   

RE: [PATCH] drm/amdgpu: Update RAS status print

2021-09-09 Thread Zhang, Hawking
[AMD Official Use Only]

Please fix the coding style of the brace placement below.

+ if (ras_cmd->ras_status)
+ {

Other than that, the patch is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Clements, John 
Sent: Thursday, September 9, 2021 16:00
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Li, 
Candice 
Subject: [PATCH] drm/amdgpu: Update RAS status print


[AMD Official Use Only]

Submitting a patch to remove the parser for RAS status response codes and just 
print the RAS status value when it is non-zero


Re: [PATCH 0/4] Fix stack usage of DML

2021-09-09 Thread Christian König
It's nice to see at least some of them addressed, feel free to add an 
Acked-by: Christian König 


Regards,
Christian.

Am 09.09.21 um 03:00 schrieb Harry Wentland:

With the '-Werror' enablement patch the amdgpu build was failing
on clang builds because a bunch of functions were blowing past
the 1024 byte stack frame default. Due to this we also noticed
that a lot of functions were passing large structs by value
instead of by pointer.

This series attempts to fix this.

There is still one remaining function that blows the 1024 limit by 40 bytes:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3397:6:
error: stack frame size of 1064 bytes in function
'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than=]

This will be a slightly more challenging fix but I'll see if we can get it
below 1024 by breaking it into smaller functions.

With this series I can build amdgpu with CC=clang and a stack frame limit of
1064.

This series boots on a Radeon RX 5500 XT.

Harry Wentland (4):
   drm/amd/display: Pass display_pipe_params_st as const in DML
   drm/amd/display: Pass all structs in display_rq_dlg_helpers by pointer
   drm/amd/display: Fix rest of pass-by-value structs in DML
   drm/amd/display: Allocate structs needed by dcn_bw_calc_rq_dlg_ttu in
 pipe_ctx

  .../gpu/drm/amd/display/dc/calcs/dcn_calcs.c  |  55 ++--
  .../drm/amd/display/dc/dcn20/dcn20_resource.c |   2 +-
  .../dc/dml/dcn20/display_rq_dlg_calc_20.c | 158 +--
  .../dc/dml/dcn20/display_rq_dlg_calc_20.h |   4 +-
  .../dc/dml/dcn20/display_rq_dlg_calc_20v2.c   | 156 +--
  .../dc/dml/dcn20/display_rq_dlg_calc_20v2.h   |   4 +-
  .../dc/dml/dcn21/display_rq_dlg_calc_21.c | 156 +--
  .../dc/dml/dcn21/display_rq_dlg_calc_21.h |   4 +-
  .../dc/dml/dcn30/display_rq_dlg_calc_30.c | 132 -
  .../dc/dml/dcn30/display_rq_dlg_calc_30.h |   4 +-
  .../dc/dml/dcn31/display_rq_dlg_calc_31.c | 166 ++--
  .../dc/dml/dcn31/display_rq_dlg_calc_31.h |   4 +-
  .../drm/amd/display/dc/dml/display_mode_lib.h |   4 +-
  .../display/dc/dml/display_rq_dlg_helpers.c   | 256 +-
  .../display/dc/dml/display_rq_dlg_helpers.h   |  20 +-
  .../display/dc/dml/dml1_display_rq_dlg_calc.c | 246 -
  .../display/dc/dml/dml1_display_rq_dlg_calc.h |  10 +-
  .../gpu/drm/amd/display/dc/inc/core_types.h   |   3 +
  18 files changed, 695 insertions(+), 689 deletions(-)





Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

2021-09-09 Thread Christian König

Am 09.09.21 um 05:28 schrieb Lazar, Lijo:

On 9/9/2021 8:13 AM, Yu, Lang wrote:

[AMD Official Use Only]

So the final decision is to roll back to scnprintf().

Could we define our own helper functions like sysfs_emit/sysfs_emit_at,

but without the page-boundary-alignment limitation, to make life easier?



No, we do want to make it clear that this function is used for sysfs 
files and make use of the extra checks provided by the sysfs_emit* 
functions. Looking at the origins of sysfs_emit_at() specifically, 
there are indeed some cases of printing more than one value per file 
and multi-line usage.


Correct, but those are rather limited and well documented special cases. 
E.g. for example if you need to grab a lock to get multiple values which 
are supposed to be coherent to each other.


I think that's the case here, so printing multiple values is probably ok 
in general. But we still need to get the implementation straight.


So I'm fine with your original patch. Maybe, you can make the 
intention explicit by keeping the offset and buf start calculations in 
a separate inline function.

smu_get_sysfs_buf()


Exactly that is what is not ok. So once more the intended use case of 
those functions is:


offs = sysfs_emit(page, ...);
offs += sysfs_emit_at(page, offs, );
offs += sysfs_emit_at(page, offs, );
...

Another possible alternative which I think should be allowed is:

offs = 0;
for_each_clock_in_my_device(..) {
    offs += sysfs_emit_at(page, offs, );
}

But when you are calculating the initial offset manually then there is 
certainly something wrong here and that is not the intended usage pattern.


Regards,
Christian.



Thanks,
Lijo


Regards,

Lang

*From:* Powell, Darren 
*Sent:* Thursday, September 9, 2021 6:18 AM
*To:* Christian König ; Lazar, Lijo 
; Yu, Lang ; 
amd-gfx@lists.freedesktop.org
*Cc:* Deucher, Alexander ; Huang, Ray 
; Tian Tao 

*Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

[AMD Official Use Only]



*From:* Christian König 
*Sent:* Wednesday, September 8, 2021 8:43 AM
*To:* Lazar, Lijo ; Yu, Lang ; amd-gfx@lists.freedesktop.org
*Cc:* Deucher, Alexander ; Huang, Ray ; Tian Tao ; Powell, Darren 

*Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

Am 08.09.21 um 12:22 schrieb Lazar, Lijo:
 > On 9/8/2021 3:08 PM, Christian König wrote:
 >> Am 08.09.21 um 11:29 schrieb Lazar, Lijo:
 >>> On 9/8/2021 2:32 PM, Yu, Lang wrote:
  [AMD Official Use Only]
 > -Original Message-
 > From: Lazar, Lijo 
 > Sent: Wednesday, September 8, 2021 4:55 PM
 > To: Yu, Lang ; Christian König ; amd-gfx@lists.freedesktop.org
 > Cc: Deucher, Alexander ; Huang, Ray ; Tian Tao 

 > Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
 > warnings
 >
 >
 >
 > On 9/8/2021 1:14 PM, Yu, Lang wrote:
 >> [AMD Official Use Only]
 >>
 >>
 >>
 >>> -Original Message-
 >>> From: Lazar, Lijo 
 >>> Sent: Wednesday, September 8, 2021 3:36 PM
 >>> To: Christian König ; Yu, Lang ; amd-gfx@lists.freedesktop.org
 >>> Cc: Deucher, Alexander ; Huang, Ray ; Tian Tao 

 >>> Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
 >>> warnings
 >>>
 >>>
 >>>
 >>> On 9/8/2021 12:07 PM, Christian König wrote:
  Am 08.09.21 um 07:56 schrieb Lang Yu:
 > sysfs_emit and sysfs_emit_at require a page boundary aligned buf
 > address. Make them happy!
 >
 > Warning Log:
 > [  492.545174] invalid sysfs_emit_at: buf:f19bdfde at:0
 > [  492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765 sysfs_emit_at+0x4a/0xa0
 > [  492.654805] Call Trace:
 > [  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 
[amdgpu] [

 > 492.656780] vangogh_print_clk_levels+0x369/0x410 [amdgpu] [
 > 492.658245] vangogh_common_print_clk_levels+0x77/0x80 
[amdgpu] [
 > 492.659733]  ? preempt_schedule_common+0x18/0x30 [ 
492.660713]

 > smu_print_ppclk_levels+0x65/0x90 [amdgpu] [ 492.662107]
 > 

[PATCH] drm/amdgpu: Update RAS status print

2021-09-09 Thread Clements, John
[AMD Official Use Only]

Submitting a patch to remove the parser for RAS status response codes and just 
print the RAS status value when it is non-zero


0001-drm-amdgpu-Update-RAS-status-print.patch
Description: 0001-drm-amdgpu-Update-RAS-status-print.patch


[PATCH v2] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings(v2)

2021-09-09 Thread Lang Yu
sysfs_emit and sysfs_emit_at require a page boundary
aligned buf address. Make them happy!

v2: use an inline function

Warning Log:
[  492.545174] invalid sysfs_emit_at: buf:f19bdfde at:0
[  492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765 sysfs_emit_at+0x4a/0xa0
[  492.654805] Call Trace:
[  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu]
[  492.656780]  vangogh_print_clk_levels+0x369/0x410 [amdgpu]
[  492.658245]  vangogh_common_print_clk_levels+0x77/0x80 [amdgpu]
[  492.659733]  ? preempt_schedule_common+0x18/0x30
[  492.660713]  smu_print_ppclk_levels+0x65/0x90 [amdgpu]
[  492.662107]  amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu]
[  492.663620]  dev_attr_show+0x1d/0x40

Signed-off-by: Lang Yu 
---
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c|  8 ++--
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c  |  4 +++-
 .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c  |  4 +++-
 drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 16 ++--
 drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c  |  2 ++
 .../gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c   | 12 
 .../gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c |  6 --
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h   | 13 +
 8 files changed, 49 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index e343cc218990..2e5a362aa06b 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -771,8 +771,12 @@ static int arcturus_print_clk_levels(struct smu_context *smu,
struct smu_11_0_dpm_context *dpm_context = NULL;
uint32_t gen_speed, lane_width;
 
-   if (amdgpu_ras_intr_triggered())
-   return sysfs_emit(buf, "unavailable\n");
+   smu_cmn_get_sysfs_buf(, size);
+
+   if (amdgpu_ras_intr_triggered()) {
+   size += sysfs_emit_at(buf, size, "unavailable\n");
+   return size;
+   }
 
dpm_context = smu_dpm->dpm_context;
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 4c81989b8162..63e1f0db579c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -1279,6 +1279,8 @@ static int navi10_print_clk_levels(struct smu_context *smu,
struct smu_11_0_overdrive_table *od_settings = smu->od_settings;
uint32_t min_value, max_value;
 
+   smu_cmn_get_sysfs_buf(, );
+
switch (clk_type) {
case SMU_GFXCLK:
case SMU_SCLK:
@@ -1392,7 +1394,7 @@ static int navi10_print_clk_levels(struct smu_context *smu,
case SMU_OD_RANGE:
if (!smu->od_enabled || !od_table || !od_settings)
break;
-   size = sysfs_emit(buf, "%s:\n", "OD_RANGE");
+   size += sysfs_emit_at(buf, size, "%s:\n", "OD_RANGE");
 
if (navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_GFXCLK_LIMITS)) {
navi10_od_setting_get_range(od_settings, SMU_11_0_ODSETTING_GFXCLKFMIN,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 5e292c3f5050..d7519688065f 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -1058,6 +1058,8 @@ static int sienna_cichlid_print_clk_levels(struct smu_context *smu,
uint32_t min_value, max_value;
uint32_t smu_version;
 
+   smu_cmn_get_sysfs_buf(, );
+
switch (clk_type) {
case SMU_GFXCLK:
case SMU_SCLK:
@@ -1180,7 +1182,7 @@ static int sienna_cichlid_print_clk_levels(struct smu_context *smu,
if (!smu->od_enabled || !od_table || !od_settings)
break;
 
-   size = sysfs_emit(buf, "%s:\n", "OD_RANGE");
+   size += sysfs_emit_at(buf, size, "%s:\n", "OD_RANGE");
 
if (sienna_cichlid_is_od_feature_supported(od_settings, SMU_11_0_7_ODCAP_GFXCLK_LIMITS)) {
sienna_cichlid_get_od_setting_range(od_settings, SMU_11_0_7_ODSETTING_GFXCLKFMIN,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index 3a3421452e57..f6ef0ce6e9e2 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -589,10 +589,12 @@ static int vangogh_print_legacy_clk_levels(struct smu_context *smu,
if (ret)
return ret;
 
+   smu_cmn_get_sysfs_buf(, );
+
switch (clk_type) {
case SMU_OD_SCLK:
if (smu_dpm_ctx->dpm_level == AMD_DPM_FORCED_LEVEL_MANUAL) {
-   size = sysfs_emit(buf, "%s:\n", "OD_SCLK");
+   size += 

Re: [PATCH 1/2] drm/amdgpu: Increase direct IB pool size

2021-09-09 Thread Pan, Xinhui
[AMD Official Use Only]

Yep, VCN needs 128kb of extra memory. I will make the pool size a constant 256kb.

From: Koenig, Christian 
Sent: Thursday, September 9, 2021 3:14:15 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 

Cc: Deucher, Alexander 
Subject: Re: [PATCH 1/2] drm/amdgpu: Increase direct IB pool size

Am 09.09.21 um 07:54 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> Direct IB pool is used for vce/uvd/vcn IB extra msg too. Increase its
> size to 64 pages.

Do you really run into issues with that? 64 pages are 256kiB on x86 and
the extra msg are maybe 2kiB.

In addition to that, we should probably make this a constant independent
of the CPU page size.

Christian.

>
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index c076a6b9a5a2..cd2c7073fdd9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -308,7 +308,7 @@ int amdgpu_ib_pool_init(struct amdgpu_device *adev)
>
>  for (i = 0; i < AMDGPU_IB_POOL_MAX; i++) {
>  if (i == AMDGPU_IB_POOL_DIRECT)
> -   size = PAGE_SIZE * 6;
> +   size = PAGE_SIZE * 64;
>  else
>  size = AMDGPU_IB_POOL_SIZE;
>
> --
> 2.25.1
>



RE: [PATCH] drm/amdgpu: refactor function to init no-psp fw

2021-09-09 Thread Zhang, Hawking
[AMD Official Use Only]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Gao, Likun  
Sent: Thursday, September 9, 2021 15:15
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Gao, Likun 
Subject: [PATCH] drm/amdgpu: refactor function to init no-psp fw

From: Likun Gao 

Refactor the code of amdgpu_ucode_init_single_fw to make it more readable, as 
too many ucodes currently need to be handled in this function.

Signed-off-by: Likun Gao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 160 ++
 1 file changed, 75 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index abd8469380e5..5f396936c6ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -572,6 +572,7 @@ static int amdgpu_ucode_init_single_fw(struct amdgpu_device *adev,
const struct dmcu_firmware_header_v1_0 *dmcu_hdr = NULL;
const struct dmcub_firmware_header_v1_0 *dmcub_hdr = NULL;
const struct mes_firmware_header_v1_0 *mes_hdr = NULL;
+   u8 *ucode_addr;
 
if (NULL == ucode->fw)
return 0;
@@ -588,94 +589,83 @@ static int amdgpu_ucode_init_single_fw(struct amdgpu_device *adev,
dmcub_hdr = (const struct dmcub_firmware_header_v1_0 *)ucode->fw->data;
mes_hdr = (const struct mes_firmware_header_v1_0 *)ucode->fw->data;
 
-   if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP ||
-   (ucode->ucode_id != AMDGPU_UCODE_ID_CP_MEC1 &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MEC2 &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MEC1_JT &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MEC2_JT &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MES &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MES_DATA &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_RESTORE_LIST_GPM_MEM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_RESTORE_LIST_SRM_MEM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_IRAM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_DRAM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_DMCU_ERAM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_DMCU_INTV &&
-ucode->ucode_id != AMDGPU_UCODE_ID_DMCUB)) {
-   ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes);
-
-   memcpy(ucode->kaddr, (void *)((uint8_t *)ucode->fw->data +
- le32_to_cpu(header->ucode_array_offset_bytes)),
-  ucode->ucode_size);
-   } else if (ucode->ucode_id == AMDGPU_UCODE_ID_CP_MEC1 ||
-  ucode->ucode_id == AMDGPU_UCODE_ID_CP_MEC2) {
-   ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes) -
-   le32_to_cpu(cp_hdr->jt_size) * 4;
-
-   memcpy(ucode->kaddr, (void *)((uint8_t *)ucode->fw->data +
- 
le32_to_cpu(header->ucode_array_offset_bytes)),
-  ucode->ucode_size);
-   } else if (ucode->ucode_id == AMDGPU_UCODE_ID_CP_MEC1_JT ||
-  ucode->ucode_id == AMDGPU_UCODE_ID_CP_MEC2_JT) {
-   ucode->ucode_size = le32_to_cpu(cp_hdr->jt_size) * 4;
-
-   memcpy(ucode->kaddr, (void *)((uint8_t *)ucode->fw->data +
- le32_to_cpu(header->ucode_array_offset_bytes) +
- le32_to_cpu(cp_hdr->jt_offset) * 4),
-  ucode->ucode_size);
-   } else if (ucode->ucode_id == AMDGPU_UCODE_ID_DMCU_ERAM) {
-   ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes) -
+   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+   switch (ucode->ucode_id) {
+   case AMDGPU_UCODE_ID_CP_MEC1:
+   case AMDGPU_UCODE_ID_CP_MEC2:
+   ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes) -
+   le32_to_cpu(cp_hdr->jt_size) * 4;
+   ucode_addr = (u8 *)ucode->fw->data +
+   le32_to_cpu(header->ucode_array_offset_bytes);
+   break;
+   case AMDGPU_UCODE_ID_CP_MEC1_JT:
+   case AMDGPU_UCODE_ID_CP_MEC2_JT:
+   ucode->ucode_size = le32_to_cpu(cp_hdr->jt_size) * 4;
+   ucode_addr = (u8 *)ucode->fw->data +
+   le32_to_cpu(header->ucode_array_offset_bytes) +
+   le32_to_cpu(cp_hdr->jt_offset) * 4;
+   break;
+   case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL:
+   ucode->ucode_size = adev->gfx.rlc.save_restore_list_cntl_size_bytes;
+   ucode_addr = adev->gfx.rlc.save_restore_list_cntl;
+  

Re: [PATCH] Enable '-Werror' by default for all kernel builds

2021-09-09 Thread Christian König

Am 09.09.21 um 08:07 schrieb Guenter Roeck:

On 9/8/21 10:58 PM, Christoph Hellwig wrote:

On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:

It'd be good to avoid. It has helped uncover build issues with KASAN in
the past. Or at least make it dependent on the problematic 
architecture.

For example if arm is a problem, something like this:


I'm also seeing quite a few stack size warnings with KASAN on x86_64
without COMPILE_TEST using gcc 10.2.1 from Debian.  In fact there are a
few warnings without KASAN, but with KASAN there are a lot more.
I'll try to find some time to dig into them.

While we're at it, with -Werror something like this is really futile:

drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function ‘amdgpu_bo_support_uswc’:
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Wcpp
   493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
   |  ^~~


Ah, yes good point!



I have been wondering if all those #warning "errors" should either
be removed or be replaced with "#pragma message".


Well, we started to add those warnings because people compiled their 
kernel without CONFIG_MTRR and CONFIG_X86_PAT and were then wondering why 
the performance of the display driver was so crappy.


When those warnings now generate an error which you have to disable 
explicitly, that might not be bad at all.


It at least points people to this setting and makes it really clear that 
they are doing something very unusual and need to keep in mind that it 
might not have the desired result.


Regards,
Christian.



Guenter




Re: [PATCH] Enable '-Werror' by default for all kernel builds

2021-09-09 Thread Christoph Hellwig
On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> It'd be good to avoid. It has helped uncover build issues with KASAN in
> the past. Or at least make it dependent on the problematic architecture.
> For example if arm is a problem, something like this:

I'm also seeing quite a few stack size warnings with KASAN on x86_64
without COMPILE_TEST using gcc 10.2.1 from Debian.  In fact there are a
few warnings without KASAN, but with KASAN there are a lot more.
I'll try to find some time to dig into them.

While we're at it, with -Werror something like this is really futile:

drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function ‘amdgpu_bo_support_uswc’:
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Wcpp
  493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
  |  ^~~


Re: [PATCH] Enable '-Werror' by default for all kernel builds

2021-09-09 Thread Guenter Roeck

On 9/8/21 10:58 PM, Christoph Hellwig wrote:

On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:

It'd be good to avoid. It has helped uncover build issues with KASAN in
the past. Or at least make it dependent on the problematic architecture.
For example if arm is a problem, something like this:


I'm also seeing quite a few stack size warnings with KASAN on x86_64
without COMPILE_TEST using gcc 10.2.1 from Debian.  In fact there are a
few warnings without KASAN, but with KASAN there are a lot more.
I'll try to find some time to dig into them.

While we're at it, with -Werror something like this is really futile:

drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function ‘amdgpu_bo_support_uswc’:
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Wcpp
   493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
   |  ^~~



I have been wondering if all those #warning "errors" should either
be removed or be replaced with "#pragma message".

Guenter


[PATCH] drm/amd/amdkfd: fix possible memory leak in svm_range_restore_pages

2021-09-09 Thread Xiyu Yang
The memory leak issue may take place in an error handling path. When
p->xnack_enabled is false, the function simply returns -EFAULT and
forgets to decrement the reference count of a kfd_process object bumped
by kfd_lookup_process_by_pasid, which may incur memory leaks.

Fix it by jumping to label "out", in which kfd_unref_process() decreases
the refcount.

Signed-off-by: Xiyu Yang 
Signed-off-by: Xin Xiong 
Signed-off-by: Xin Tan 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e883731c3f8f..0f7f1e5621ea 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2426,7 +2426,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, 
unsigned int pasid,
}
if (!p->xnack_enabled) {
pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
-   return -EFAULT;
+   r = -EFAULT;
+   goto out;
}
svms = >svms;
 
-- 
2.7.4



Re: [PATCH 2/2] drm/amdgpu: alloc IB extra msg from IB pool

2021-09-09 Thread Christian König

Am 09.09.21 um 07:55 schrieb Pan, Xinhui:

[AMD Official Use Only]

There is one dedicated IB pool for IB test. So lets use it for extra msg
too.

For UVD on older HW, use one reserved BO at specific range.

Signed-off-by: xinhui pan 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 173 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h |   1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c |  18 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c |  99 ++
  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   |  28 ++--
  drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   |  28 ++--
  6 files changed, 185 insertions(+), 162 deletions(-)


Please split that up into one patch for UVD, one for VCE and a third for 
VCN.




diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index d451c359606a..733cfc848c6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -299,8 +299,36 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
 }

 /* from uvd v5.0 HW addressing capacity increased to 64 bits */
-   if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 5, 
0))
+   if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 5, 
0)) {
 adev->uvd.address_64_bit = true;


Yeah, that's exactly what I'm trying to avoid.

We should use the BO approach both for old and new UVD blocks, just 
making sure that we place it correctly for the old ones.


This way we have much lower chance of breaking the old stuff.

Thanks,
Christian.


+   } else {
+   struct amdgpu_bo *bo = NULL;
+   void *addr;
+
+   r = amdgpu_bo_create_reserved(adev, PAGE_SIZE, PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_VRAM,
+		&bo, NULL, &addr);
+   if (r)
+   return r;
+   amdgpu_bo_kunmap(bo);
+   amdgpu_bo_unpin(bo);
+   r = amdgpu_bo_pin_restricted(bo, AMDGPU_GEM_DOMAIN_VRAM,
+   0, 256 << 20);
+   if (r) {
+   amdgpu_bo_unreserve(bo);
+			amdgpu_bo_unref(&bo);
+   return r;
+   }
+		r = amdgpu_bo_kmap(bo, &addr);
+   if (r) {
+   amdgpu_bo_unpin(bo);
+   amdgpu_bo_unreserve(bo);
+			amdgpu_bo_unref(&bo);
+   return r;
+   }
+   adev->uvd.ib_bo = bo;
+   amdgpu_bo_unreserve(bo);
+   }

 switch (adev->asic_type) {
 case CHIP_TONGA:
@@ -342,6 +370,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
 for (i = 0; i < AMDGPU_MAX_UVD_ENC_RINGS; ++i)
 		amdgpu_ring_fini(&adev->uvd.inst[j].ring_enc[i]);
 }
+	amdgpu_bo_free_kernel(&adev->uvd.ib_bo, NULL, NULL);
 release_firmware(adev->uvd.fw);

 return 0;
@@ -1066,7 +1095,7 @@ int amdgpu_uvd_ring_parse_cs(struct amdgpu_cs_parser 
*parser, uint32_t ib_idx)
 return 0;
  }

-static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
+static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, uint64_t addr,
bool direct, struct dma_fence **fence)
  {
 struct amdgpu_device *adev = ring->adev;
@@ -1074,29 +1103,15 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring 
*ring, struct amdgpu_bo *bo,
 struct amdgpu_job *job;
 struct amdgpu_ib *ib;
 uint32_t data[4];
-   uint64_t addr;
 long r;
 int i;
 unsigned offset_idx = 0;
 unsigned offset[3] = { UVD_BASE_SI, 0, 0 };

-   amdgpu_bo_kunmap(bo);
-   amdgpu_bo_unpin(bo);
-
-   if (!ring->adev->uvd.address_64_bit) {
-   struct ttm_operation_ctx ctx = { true, false };
-
-   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_VRAM);
-   amdgpu_uvd_force_into_uvd_segment(bo);
-		r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
-   if (r)
-   goto err;
-   }
-
 r = amdgpu_job_alloc_with_ib(adev, 64, direct ? AMDGPU_IB_POOL_DIRECT :
 				     AMDGPU_IB_POOL_DELAYED, &job);
 if (r)
-   goto err;
+   return r;

 if (adev->asic_type >= CHIP_VEGA10) {
 offset_idx = 1 + ring->me;
@@ -1110,7 +1125,6 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, 
struct amdgpu_bo *bo,
 data[3] = PACKET0(offset[offset_idx] + UVD_NO_OP, 0);

 	ib = &job->ibs[0];
-   addr = amdgpu_bo_gpu_offset(bo);
 ib->ptr[0] = data[0];
 ib->ptr[1] = addr;
 ib->ptr[2] = data[1];
@@ -1123,33 +1137,13 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring 
*ring, struct amdgpu_bo *bo,
 }
 ib->length_dw = 16;

-   if (direct) {
-

[PATCH] drm/amdgpu: refactor function to init no-psp fw

2021-09-09 Thread Likun Gao
From: Likun Gao 

Refactor amdgpu_ucode_init_single_fw() to make it more readable, as too
many ucode types are currently handled in this function.

Signed-off-by: Likun Gao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 160 ++
 1 file changed, 75 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index abd8469380e5..5f396936c6ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -572,6 +572,7 @@ static int amdgpu_ucode_init_single_fw(struct amdgpu_device 
*adev,
const struct dmcu_firmware_header_v1_0 *dmcu_hdr = NULL;
const struct dmcub_firmware_header_v1_0 *dmcub_hdr = NULL;
const struct mes_firmware_header_v1_0 *mes_hdr = NULL;
+   u8 *ucode_addr;
 
if (NULL == ucode->fw)
return 0;
@@ -588,94 +589,83 @@ static int amdgpu_ucode_init_single_fw(struct 
amdgpu_device *adev,
dmcub_hdr = (const struct dmcub_firmware_header_v1_0 *)ucode->fw->data;
mes_hdr = (const struct mes_firmware_header_v1_0 *)ucode->fw->data;
 
-   if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP ||
-   (ucode->ucode_id != AMDGPU_UCODE_ID_CP_MEC1 &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MEC2 &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MEC1_JT &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MEC2_JT &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MES &&
-ucode->ucode_id != AMDGPU_UCODE_ID_CP_MES_DATA &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_RESTORE_LIST_GPM_MEM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_RESTORE_LIST_SRM_MEM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_IRAM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_RLC_DRAM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_DMCU_ERAM &&
-ucode->ucode_id != AMDGPU_UCODE_ID_DMCU_INTV &&
-ucode->ucode_id != AMDGPU_UCODE_ID_DMCUB)) {
-   ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes);
-
-		memcpy(ucode->kaddr, (void *)((uint8_t *)ucode->fw->data +
-			le32_to_cpu(header->ucode_array_offset_bytes)),
-			ucode->ucode_size);
-	} else if (ucode->ucode_id == AMDGPU_UCODE_ID_CP_MEC1 ||
-		   ucode->ucode_id == AMDGPU_UCODE_ID_CP_MEC2) {
-		ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes) -
-			le32_to_cpu(cp_hdr->jt_size) * 4;
-
-		memcpy(ucode->kaddr, (void *)((uint8_t *)ucode->fw->data +
-			le32_to_cpu(header->ucode_array_offset_bytes)),
-			ucode->ucode_size);
-	} else if (ucode->ucode_id == AMDGPU_UCODE_ID_CP_MEC1_JT ||
-		   ucode->ucode_id == AMDGPU_UCODE_ID_CP_MEC2_JT) {
-		ucode->ucode_size = le32_to_cpu(cp_hdr->jt_size) * 4;
-
-		memcpy(ucode->kaddr, (void *)((uint8_t *)ucode->fw->data +
-			le32_to_cpu(header->ucode_array_offset_bytes) +
-			le32_to_cpu(cp_hdr->jt_offset) * 4),
-			ucode->ucode_size);
-   } else if (ucode->ucode_id == AMDGPU_UCODE_ID_DMCU_ERAM) {
-   ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes) -
+   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+   switch (ucode->ucode_id) {
+   case AMDGPU_UCODE_ID_CP_MEC1:
+   case AMDGPU_UCODE_ID_CP_MEC2:
+			ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes) -
+				le32_to_cpu(cp_hdr->jt_size) * 4;
+			ucode_addr = (u8 *)ucode->fw->data +
+				le32_to_cpu(header->ucode_array_offset_bytes);
+			break;
+		case AMDGPU_UCODE_ID_CP_MEC1_JT:
+		case AMDGPU_UCODE_ID_CP_MEC2_JT:
+			ucode->ucode_size = le32_to_cpu(cp_hdr->jt_size) * 4;
+			ucode_addr = (u8 *)ucode->fw->data +
+				le32_to_cpu(header->ucode_array_offset_bytes) +
+				le32_to_cpu(cp_hdr->jt_offset) * 4;
+			break;
+		case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL:
+			ucode->ucode_size = adev->gfx.rlc.save_restore_list_cntl_size_bytes;
+			ucode_addr = adev->gfx.rlc.save_restore_list_cntl;
+			break;
+		case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_GPM_MEM:
+			ucode->ucode_size = adev->gfx.rlc.save_restore_list_gpm_size_bytes;
+			ucode_addr = adev->gfx.rlc.save_restore_list_gpm;
+			break;
+ 

Re: [PATCH 1/2] drm/amdgpu: Increase direct IB pool size

2021-09-09 Thread Christian König

Am 09.09.21 um 07:54 schrieb Pan, Xinhui:

[AMD Official Use Only]

Direct IB pool is used for vce/uvd/vcn IB extra msg too. Increase its
size to 64 pages.


Do you really run into issues with that? 64 pages are 256kiB on x86 and
the extra msgs are maybe 2kiB.


Additional to that we should probably make this a constant independent 
of the CPU page size.


Christian.



Signed-off-by: xinhui pan 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index c076a6b9a5a2..cd2c7073fdd9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -308,7 +308,7 @@ int amdgpu_ib_pool_init(struct amdgpu_device *adev)

 for (i = 0; i < AMDGPU_IB_POOL_MAX; i++) {
 if (i == AMDGPU_IB_POOL_DIRECT)
-   size = PAGE_SIZE * 6;
+   size = PAGE_SIZE * 64;
 else
 size = AMDGPU_IB_POOL_SIZE;

--
2.25.1





Re: [resend PATCH] drm/ttm: Fix a deadlock if the target BO is not idle during swap

2021-09-09 Thread Christian König

Am 08.09.21 um 20:27 schrieb Daniel Vetter:

On Tue, Sep 07, 2021 at 11:28:23AM +0200, Christian König wrote:

Am 07.09.21 um 11:05 schrieb Daniel Vetter:

On Tue, Sep 07, 2021 at 08:22:20AM +0200, Christian König wrote:

Added a Fixes tag and pushed this to drm-misc-fixes.

We're in the merge window, this should have been drm-misc-next-fixes. I'll
poke misc maintainers so it's not lost.

Hui? It's a fix for a problem in stable and not in drm-misc-next.

Ah the flow chart is confusing. There is no current -rc, so it's always
-next-fixes. Or you're running the risk that it's lost until after -rc1.
Maybe we should clarify that "is the bug in current -rc?" only applies if
there is a current -rc.


Yeah, I've noticed this as well.

But when there is no current -rc because we are in the merge window then 
the question is how do I submit patches to the current stable?


In other words this patch here is really for 5.14 and should then be 
backported to 5.13 and maybe even 5.10 as well.


The code was restructured for 5.15 and I even need to double check if 
that still applies there as well.


Or should I send patches like those directly to Greg?

Regards,
Christian.



Anyway Thomas sent out a pr, so it's all good.
-Daniel


Christian.


-Daniel


It will take a while until it cycles back into the development branches, so
feel free to push some version to amd-staging-drm-next as well. Just ping
Alex when you do this.

Thanks,
Christian.

Am 07.09.21 um 06:08 schrieb xinhui pan:

The ret value might be -EBUSY; the caller would then think the lru lock is
still locked while it is actually NOT. So return -ENOSPC instead. Otherwise
we hit list corruption.

ttm_bo_cleanup_refs might fail too if the BO is not idle. If we return 0,
the caller (ttm_tt_populate -> ttm_global_swapout -> ttm_device_swapout)
will be stuck as we actually did not free any BO memory. This usually
happens when the fence is not signaled for a long time.

Signed-off-by: xinhui pan 
Reviewed-by: Christian König 
---
drivers/gpu/drm/ttm/ttm_bo.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 8d7fd65ccced..23f906941ac9 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1152,9 +1152,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct 
ttm_operation_ctx *ctx,
}
if (bo->deleted) {
-   ttm_bo_cleanup_refs(bo, false, false, locked);
+   ret = ttm_bo_cleanup_refs(bo, false, false, locked);
ttm_bo_put(bo);
-   return 0;
+   return ret == -EBUSY ? -ENOSPC : ret;
}
ttm_bo_del_from_lru(bo);
@@ -1208,7 +1208,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct 
ttm_operation_ctx *ctx,
if (locked)
dma_resv_unlock(bo->base.resv);
ttm_bo_put(bo);
-   return ret;
+   return ret == -EBUSY ? -ENOSPC : ret;
}
void ttm_bo_tt_destroy(struct ttm_buffer_object *bo)