[PATCH] drm/amdkfd: print address in hex format rather than decimal

2022-09-05 Thread Yifan Zhang
Addresses should be printed in hex format.
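For illustration, the difference in readability (a trivial userspace example, not part of the patch itself):

    #include <stdio.h>

    int main(void)
    {
            unsigned long long user_addr = 0x100000000ULL; /* made-up address */

            printf("user_addr = %llu\n", user_addr); /* 4294967296 - hard to read */
            printf("user_addr = %llx\n", user_addr); /* 100000000 - matches how
                                                      * addresses appear in other
                                                      * log output */
            return 0;
    }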

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index cbd593f7d553..2170db83e41d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1728,7 +1728,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
add_kgd_mem_to_kfd_bo_list(*mem, avm->process_info, user_addr);
 
if (user_addr) {
-   pr_debug("creating userptr BO for user_addr = %llu\n", 
user_addr);
+   pr_debug("creating userptr BO for user_addr = %llx\n", 
user_addr);
ret = init_user_pages(*mem, user_addr, criu_resume);
if (ret)
goto allocate_init_user_pages_failed;
-- 
2.37.1



[PATCH] drm/amdgpu: correct doorbell range/size value for CSDMA_DOORBELL_RANGE

2022-09-05 Thread Yifan Zhang
The current function mixes CSDMA_DOORBELL_RANGE and SDMA0_DOORBELL_RANGE
range/size manipulation, but these two registers have different size
field masks. Remove the range/size manipulation of SDMA0_DOORBELL_RANGE.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
index 1dc95ef21da6..f30bc826a878 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
@@ -68,12 +68,6 @@ static void nbio_v7_7_sdma_doorbell_range(struct 
amdgpu_device *adev, int instan
doorbell_range = REG_SET_FIELD(doorbell_range,
   GDC0_BIF_CSDMA_DOORBELL_RANGE,
   SIZE, doorbell_size);
-   doorbell_range = REG_SET_FIELD(doorbell_range,
-  GDC0_BIF_SDMA0_DOORBELL_RANGE,
-  OFFSET, doorbell_index);
-   doorbell_range = REG_SET_FIELD(doorbell_range,
-  GDC0_BIF_SDMA0_DOORBELL_RANGE,
-  SIZE, doorbell_size);
} else {
doorbell_range = REG_SET_FIELD(doorbell_range,
   GDC0_BIF_SDMA0_DOORBELL_RANGE,
-- 
2.37.1



RE: Gang submit

2022-09-05 Thread Liu, Monk
[AMD Official Use Only - General]

Hi Christian


> A gang submission guarantees that multiple IBs can run on different engines 
> at the same time.
> The effect is that as long as members of a gang are waiting to be submitted 
> no other gang can start pushing jobs to the hardware and so deadlocks are 
> effectively prevented.

Could you please help to explain or confirm:

1) If one gfx IB and one compute IB are in a gang, can they literally run in 
parallel on the GPU?
2) If one gfx IB and one compute IB belong to two different gangs, will they be 
put on the gfx and compute rings one by one (e.g. gang1's gfx IB scheduled and 
signaled, and only then gang2's compute IB scheduled)?

Thanks 
---
Monk Liu | Cloud GPU & Virtualization Solution | AMD
---
we are hiring software manager for CVS core team
---

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: 2022年3月3日 16:23
To: amd-gfx@lists.freedesktop.org; Olsak, Marek 
Subject: Gang submit

Hi guys,

this patch set implements the requirement for so-called gang submissions in 
the CS interface.

A gang submission guarantees that multiple IBs can run on different engines at 
the same time.

This is implemented by keeping a global per-device gang around represented by a 
dma_fence which signals as soon as all jobs in a gang are pushed to the 
hardware.

The effect is that as long as members of a gang are waiting to be submitted no 
other gang can start pushing jobs to the hardware and so deadlocks are 
effectively prevented.
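The core of the mechanism can be sketched as follows (simplified pseudo-C; the
real implementation is amdgpu_device_switch_gang() in patch 10 of this series):

    /* Simplified sketch: before any job of a new gang may run, the
     * previous gang's "all jobs pushed" fence must have signaled. */
    struct dma_fence *prev;

    prev = amdgpu_device_switch_gang(adev, my_gang_fence);
    if (prev) {
            /* the previous gang is not fully pushed yet; hand its
             * fence back as a scheduler dependency and wait for it */
            return prev;
    }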

The whole set is based on top of my dma_resv_usage work and a few patches 
merged over from amd-staging-drm-next, so it won't easily apply anywhere.

Please review and comment,
Christian.



Re: amd iommu configuration

2022-09-05 Thread Vasant Hegde
Steven,

[+Felix, amd-fgx list]


On 9/3/2022 4:29 AM, Steven J Abner wrote:
> Hi
> I was referred to you from linux-ker...@vger.kernel.org about the following 
> issue.
> Here is as was written:
> On 9/1/22 11:36, Steven J Abner wrote:
> Hi
> Building a kernel tailored for AMD 2400g on ASRock B450 using 5.18.12 as base.
> I stumbled across an odd situation which lacked Kconfig info and led to an
> oddity.
> drivers/iommu/amd/Kconfig states 'config AMD_IOMMU_V2' is 'tristate' but,
> unlike many other tristate configs, it doesn't mention that the module name
> is 'iommu_v2.ko' and that loading should be done by adding it to
> modules-load.d.
> 
> The oddity is that loading it as a module differs as follows:
> 
> builtin iommu_v2 version dmesg:
> amdgpu: HMM registered 2048MB device memory
> amdgpu: Topology: Add APU node [0x0:0x0]
> amdgpu: Topology: Add APU node [0x15dd:0x1002]
> AMD-Vi: AMD IOMMUv2 loaded and initialized
> kfd kfd: amdgpu: added device 1002:15dd
> kfd kfd: amdgpu: Allocated 3969056 bytes on gart
> memmap_init_zone_device initialised 524288 pages in 0ms

The IOMMU v2 module provides IOMMU features like attaching a device to a
process. I think amdgpu uses those features if available.
So in this case amdgpu is using those IOMMU features.

> 
> module not loaded due to missing iommu.conf dmesg:
> amdgpu: CRAT table disabled by module option
> amdgpu: Topology: Add CPU node
> amdgpu: Virtual CRAT table created for CPU
> kfd kfd: amdgpu: GC IP 090100 not supported in kfd
> 
> module load through iommu.conf dmesg:
> amdgpu: CRAT table disabled by module option
> amdgpu: Topology: Add CPU node
> amdgpu: Virtual CRAT table created for CPU
> AMD-Vi: AMD IOMMUv2 loaded and initialized
> kfd kfd: amdgpu: GC IP 090100 not supported in kfd
> 
> Note, the only difference with/without iommu.conf is:
> AMD-Vi: AMD IOMMUv2 loaded and initialized

I think in this case the iommu_v2.ko module got loaded after the GPU was
initialized. Hence amdgpu is not using the IOMMU v2 features.


> 
> So does this mean features are missing by not having it built in?
> If not, should Kconfig have a hint about the module name and loading?
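For reference, the loading hint being asked about would amount to a one-line
config file (a hypothetical example; no such file ships today):

    # /etc/modules-load.d/iommu_v2.conf
    iommu_v2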

@Felix,
  I see that drivers/gpu/drm/amd/amdkfd/Kconfig contains the line below:
imply AMD_IOMMU_V2 if X86_64


  Should we change `s/imply/select` ?
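For context, the semantic difference between the two keywords (a sketch; see
Documentation/kbuild/kconfig-language.rst for the authoritative description):

    config HSA_AMD
            # 'imply' only sets a default: AMD_IOMMU_V2 defaults to on,
            # but the user may still disable it or build it as a module.
            imply AMD_IOMMU_V2 if X86_64

    # versus the proposed:
    config HSA_AMD
            # 'select' forces AMD_IOMMU_V2 on whenever HSA_AMD is set,
            # without checking AMD_IOMMU_V2's own dependencies.
            select AMD_IOMMU_V2 if X86_64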

-Vasant


> 
> Steve
> 
> I wish to be personally CC'ed the answers/comments posted to the list
> in response to my posting, please:) Not a list member.
> 
> I hope you can assist the Linux people and myself. I assumed from dmesg that
> it must be built in. But I also wonder if it should be in amdgpu or tied to it.
> Steve
> 
> 



Re: [PATCH V2] drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled

2022-09-05 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Chai, Thomas 
Date: Monday, September 5, 2022 at 14:35
To: amd-gfx@lists.freedesktop.org 
Cc: Zhang, Hawking , Zhou1, Tao 
Subject: RE: [PATCH V2] drm/amdgpu: TA unload messages are not actually sent to 
psp when amdgpu is uninstalled
[AMD Official Use Only - General]

Ping


-
Best Regards,
Thomas

-Original Message-
From: Chai, Thomas 
Sent: Thursday, September 1, 2022 4:40 PM
To: amd-gfx@lists.freedesktop.org
Cc: Chai, Thomas ; Zhang, Hawking ; 
Zhou1, Tao ; Chai, Thomas 
Subject: [PATCH V2] drm/amdgpu: TA unload messages are not actually sent to psp 
when amdgpu is uninstalled

V1:
  The psp_cmd_submit_buf function is called by psp_hw_fini to send TA unload 
messages to the psp to terminate ras, asd and tmr. But when amdgpu is uninstalled, 
drm_dev_unplug is called earlier than psp_hw_fini in amdgpu_pci_remove; the 
calling order is as follows:
static void amdgpu_pci_remove(struct pci_dev *pdev) {
drm_dev_unplug
..
amdgpu_driver_unload_kms->amdgpu_device_fini_hw->...
->.hw_fini->psp_hw_fini->...
->psp_ta_unload->psp_cmd_submit_buf
..
}
The function returns early when the drm_dev_enter call in psp_cmd_submit_buf fails.

So the call to drm_dev_enter in psp_cmd_submit_buf should be removed, so that 
the TA unload messages can be sent to the psp when amdgpu is uninstalled.
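The failing pattern in V1's analysis looks roughly like this (a simplified
sketch, not the verbatim psp code; the argument list is abbreviated):

    static int psp_cmd_submit_buf(struct psp_context *psp /* , ... */)
    {
            int ret = 0;
            int idx;

            /* after drm_dev_unplug() this fails, so the TA unload
             * message never reaches the PSP */
            if (!drm_dev_enter(adev_to_drm(psp->adev), &idx))
                    return 0;

            /* ... build and submit the command to the PSP ... */

            drm_dev_exit(idx);
            return ret;
    }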

V2:
1. Restore psp_cmd_submit_buf to its original code.
2. Move drm_dev_unplug call after amdgpu_driver_unload_kms in
   amdgpu_pci_remove.
3. Since amdgpu_device_fini_hw is called by amdgpu_driver_unload_kms,
   remove the unplug check to release device mmio resource in
   amdgpu_device_fini_hw before calling drm_dev_unplug.

Signed-off-by: YiPeng Chai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index afaa1056e039..62b26f0e37b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3969,8 +3969,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)

 amdgpu_gart_dummy_page_fini(adev);

-   if (drm_dev_is_unplugged(adev_to_drm(adev)))
-   amdgpu_device_unmap_mmio(adev);
+   amdgpu_device_unmap_mmio(adev);

 }

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index de7144b06e93..728a0933ea6f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2181,8 +2181,6 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 struct drm_device *dev = pci_get_drvdata(pdev);
 struct amdgpu_device *adev = drm_to_adev(dev);

-   drm_dev_unplug(dev);
-
 if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) {
 pm_runtime_get_sync(dev->dev);
 pm_runtime_forbid(dev->dev);
@@ -2190,6 +2188,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)

 amdgpu_driver_unload_kms(dev);

+   drm_dev_unplug(dev);
+
 /*
  * Flush any in flight DMA operations from device.
  * Clear the Bus Master Enable bit and then wait on the PCIe Device
--
2.25.1


[PATCH 11/12] drm/amdgpu: add gang submit frontend v4

2022-09-05 Thread Christian König
Allows submitting jobs as a gang which needs to run on multiple engines at the
same time.

All members of the gang get the same implicit, explicit and VM dependencies. So
no gang member will start running until everything else is ready.

The last job is considered the gang leader (usually a submission to the GFX
ring) and used for signaling output dependencies.

Each job is remembered individually as a user of a buffer object, so there is no
joining of work at the end.
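From userspace's point of view, a gang is simply one CS ioctl carrying IB
chunks for more than one engine; a hedged illustration using the existing
uapi structs (the values are made up):

    /* two IB chunks in a single amdgpu CS become a gang of two jobs;
     * the last job submitted is the gang leader */
    struct drm_amdgpu_cs_chunk_ib compute_ib = {
            .ip_type  = AMDGPU_HW_IP_COMPUTE,
            .va_start = compute_va,    /* illustrative */
            .ib_bytes = compute_size,  /* illustrative */
    };
    struct drm_amdgpu_cs_chunk_ib gfx_ib = {
            .ip_type  = AMDGPU_HW_IP_GFX,
            .va_start = gfx_va,
            .ib_bytes = gfx_size,
    };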

v2: rebase and fix review comments from Andrey and Yogesh
v3: use READ instead of BOOKKEEP for now because of VM unmaps, set gang
leader only when necessary
v4: fix order of pushing jobs and adding fences found by Trigger.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 258 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.h|  10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h |  12 +-
 3 files changed, 184 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 72147032bda9..294dba095aad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -69,6 +69,7 @@ static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
   unsigned int *num_ibs)
 {
struct drm_sched_entity *entity;
+   unsigned int i;
int r;
 
r = amdgpu_ctx_get_entity(p->ctx, chunk_ib->ip_type,
@@ -77,17 +78,28 @@ static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
if (r)
return r;
 
-   /* Abort if there is no run queue associated with this entity.
-* Possibly because of disabled HW IP*/
+   /*
+* Abort if there is no run queue associated with this entity.
+* Possibly because of disabled HW IP.
+*/
if (entity->rq == NULL)
return -EINVAL;
 
-   /* Currently we don't support submitting to multiple entities */
-   if (p->entity && p->entity != entity)
+   /* Check if we can add this IB to some existing job */
+   for (i = 0; i < p->gang_size; ++i) {
+   if (p->entities[i] == entity)
+   goto found;
+   }
+
+   /* If not increase the gang size if possible */
+   if (i == AMDGPU_CS_GANG_SIZE)
return -EINVAL;
 
-   p->entity = entity;
-   ++(*num_ibs);
+   p->entities[i] = entity;
+   p->gang_size = i + 1;
+
+found:
+   ++(num_ibs[i]);
return 0;
 }
 
@@ -161,11 +173,12 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
   union drm_amdgpu_cs *cs)
 {
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
+   unsigned int num_ibs[AMDGPU_CS_GANG_SIZE] = { };
	struct amdgpu_vm *vm = &fpriv->vm;
uint64_t *chunk_array_user;
uint64_t *chunk_array;
-   unsigned size, num_ibs = 0;
uint32_t uf_offset = 0;
+   unsigned int size;
int ret;
int i;
 
@@ -228,7 +241,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
if (size < sizeof(struct drm_amdgpu_cs_chunk_ib))
goto free_partial_kdata;
 
-   ret = amdgpu_cs_p1_ib(p, p->chunks[i].kdata, &num_ibs);
+   ret = amdgpu_cs_p1_ib(p, p->chunks[i].kdata, num_ibs);
if (ret)
goto free_partial_kdata;
break;
@@ -265,21 +278,28 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
}
}
 
-   ret = amdgpu_job_alloc(p->adev, num_ibs, &p->job, vm);
-   if (ret)
-   goto free_all_kdata;
+   if (!p->gang_size)
+   return -EINVAL;
 
-   ret = drm_sched_job_init(&p->job->base, p->entity, &fpriv->vm);
-   if (ret)
-   goto free_all_kdata;
+   for (i = 0; i < p->gang_size; ++i) {
+   ret = amdgpu_job_alloc(p->adev, num_ibs[i], &p->jobs[i], vm);
+   if (ret)
+   goto free_all_kdata;
+
+   ret = drm_sched_job_init(&p->jobs[i]->base, p->entities[i],
+&fpriv->vm);
+   if (ret)
+   goto free_all_kdata;
+   }
+   p->gang_leader = p->jobs[p->gang_size - 1];
 
-   if (p->ctx->vram_lost_counter != p->job->vram_lost_counter) {
+   if (p->ctx->vram_lost_counter != p->gang_leader->vram_lost_counter) {
ret = -ECANCELED;
goto free_all_kdata;
}
 
if (p->uf_entry.tv.bo)
-   p->job->uf_addr = uf_offset;
+   p->gang_leader->uf_addr = uf_offset;
kvfree(chunk_array);
 
/* Use this opportunity to fill in task info for the vm */
@@ -301,21 +321,18 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
return ret;
 }
 
-static int amdgpu_cs_p2_ib(struct amdgpu_cs_parser *p,
-  struct amdgpu_cs_chunk 

[PATCH 12/12] drm/amdgpu: cleanup VCN3 and VCN4 instance limiting v2

2022-09-05 Thread Christian König
Check if the entity is already limited, not if it's assigned to the
first instance.

v2: only a cleanup, not a fix

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 5 ++---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 3cabceee5f57..5e64c3426728 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -1862,13 +1862,12 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct 
amdgpu_cs_parser *p,
   struct amdgpu_job *job,
   struct amdgpu_ib *ib)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
uint32_t msg_lo = 0, msg_hi = 0;
unsigned i;
int r;
 
-   /* The first instance can decode anything */
-   if (!ring->me)
+   /* Abort if it's already limited */
+   if (job->base.entity->num_sched_list <= 1)
return 0;
 
for (i = 0; i < ib->length_dw; i += 2) {
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
index 9338172eec8b..a8264fe2201d 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
@@ -1430,13 +1430,12 @@ static int vcn_v4_0_ring_patch_cs_in_place(struct 
amdgpu_cs_parser *p,
   struct amdgpu_job *job,
   struct amdgpu_ib *ib)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(job->base.entity->rq->sched);
struct amdgpu_vcn_decode_buffer *decode_buffer;
uint64_t addr;
uint32_t val;
 
-   /* The first instance can decode anything */
-   if (!ring->me)
+   /* Abort if it's already limited */
+   if (job->base.entity->num_sched_list <= 1)
return 0;
 
/* unified queue ib header has 8 double words. */
-- 
2.25.1



[PATCH 10/12] drm/amdgpu: add gang submit backend v2

2022-09-05 Thread Christian König
Allows submitting jobs as a gang which needs to run on multiple
engines at the same time.

Basic idea is that we have a global gang submit fence representing when the
gang leader is finally pushed to run on the hardware last.

Jobs submitted as a gang are never re-submitted in case of a GPU reset since this
won't work and will just deadlock the hardware immediately again.

v2: fix logic inversion, improve documentation, fix rcu

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 35 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 28 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h|  3 ++
 4 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 79bb6fd83094..ae9371b172e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -885,6 +885,7 @@ struct amdgpu_device {
u64 fence_context;
unsignednum_rings;
struct amdgpu_ring  *rings[AMDGPU_MAX_RINGS];
+   struct dma_fence __rcu  *gang_submit;
boolib_pool_ready;
struct amdgpu_sa_managerib_pools[AMDGPU_IB_POOL_MAX];
struct amdgpu_sched 
gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
@@ -1294,6 +1295,8 @@ u32 amdgpu_device_pcie_port_rreg(struct amdgpu_device 
*adev,
u32 reg);
 void amdgpu_device_pcie_port_wreg(struct amdgpu_device *adev,
u32 reg, u32 v);
+struct dma_fence *amdgpu_device_switch_gang(struct amdgpu_device *adev,
+   struct dma_fence *gang);
 
 /* atpx handler */
 #if defined(CONFIG_VGA_SWITCHEROO)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d7eb23b8d692..172095122cc1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3501,6 +3501,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
adev->gmc.gart_size = 512 * 1024 * 1024;
adev->accel_working = false;
adev->num_rings = 0;
+   RCU_INIT_POINTER(adev->gang_submit, dma_fence_get_stub());
adev->mman.buffer_funcs = NULL;
adev->mman.buffer_funcs_ring = NULL;
adev->vm_manager.vm_pte_funcs = NULL;
@@ -3983,6 +3984,7 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
release_firmware(adev->firmware.gpu_info_fw);
adev->firmware.gpu_info_fw = NULL;
adev->accel_working = false;
+   dma_fence_put(rcu_dereference_protected(adev->gang_submit, true));
 
amdgpu_reset_fini(adev);
 
@@ -5916,3 +5918,36 @@ void amdgpu_device_pcie_port_wreg(struct amdgpu_device 
*adev,
(void)RREG32(data);
	spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
 }
+
+/**
+ * amdgpu_device_switch_gang - switch to a new gang
+ * @adev: amdgpu_device pointer
+ * @gang: the gang to switch to
+ *
+ * Try to switch to a new gang.
+ * Returns: NULL if we switched to the new gang or a reference to the current
+ * gang leader.
+ */
+struct dma_fence *amdgpu_device_switch_gang(struct amdgpu_device *adev,
+   struct dma_fence *gang)
+{
+   struct dma_fence *old = NULL;
+
+   do {
+   dma_fence_put(old);
+   rcu_read_lock();
+   old = dma_fence_get_rcu_safe(&adev->gang_submit);
+   rcu_read_unlock();
+
+   if (old == gang)
+   break;
+
+   if (!dma_fence_is_signaled(old))
+   return old;
+
+   } while (cmpxchg((struct dma_fence __force **)&adev->gang_submit,
+old, gang) != old);
+
+   dma_fence_put(old);
+   return NULL;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 37dc5ee4153d..6f6708caf0e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -173,11 +173,29 @@ static void amdgpu_job_free_cb(struct drm_sched_job 
*s_job)
	dma_fence_put(&job->hw_fence);
 }
 
+void amdgpu_job_set_gang_leader(struct amdgpu_job *job,
+   struct amdgpu_job *leader)
+{
+   struct dma_fence *fence = &leader->base.s_fence->scheduled;
+
+   WARN_ON(job->gang_submit);
+
+   /*
+* Don't add a reference when we are the gang leader to avoid circle
+* dependency.
+*/
+   if (job != leader)
+   dma_fence_get(fence);
+   job->gang_submit = fence;
+}
+
 void amdgpu_job_free(struct amdgpu_job *job)
 {
amdgpu_job_free_resources(job);
	amdgpu_sync_free(&job->sync);
	amdgpu_sync_free(&job->sched_sync);
+   if (job->gang_submit != &job->base.s_fence->scheduled)
+   

[PATCH 07/12] drm/amdgpu: move entity selection and job init earlier during CS

2022-09-05 Thread Christian König
Initialize the entity for the CS and scheduler job much earlier.

v2: fix job initialisation order and use correct scheduler instance

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 54 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  5 +++
 2 files changed, 30 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 05df1727e348..72147032bda9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -68,6 +68,25 @@ static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
   struct drm_amdgpu_cs_chunk_ib *chunk_ib,
   unsigned int *num_ibs)
 {
+   struct drm_sched_entity *entity;
+   int r;
+
+   r = amdgpu_ctx_get_entity(p->ctx, chunk_ib->ip_type,
+ chunk_ib->ip_instance,
+ chunk_ib->ring, &entity);
+   if (r)
+   return r;
+
+   /* Abort if there is no run queue associated with this entity.
+* Possibly because of disabled HW IP*/
+   if (entity->rq == NULL)
+   return -EINVAL;
+
+   /* Currently we don't support submitting to multiple entities */
+   if (p->entity && p->entity != entity)
+   return -EINVAL;
+
+   p->entity = entity;
++(*num_ibs);
return 0;
 }
@@ -250,6 +269,10 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
if (ret)
goto free_all_kdata;
 
+   ret = drm_sched_job_init(&p->job->base, p->entity, &fpriv->vm);
+   if (ret)
+   goto free_all_kdata;
+
if (p->ctx->vram_lost_counter != p->job->vram_lost_counter) {
ret = -ECANCELED;
goto free_all_kdata;
@@ -286,32 +309,11 @@ static int amdgpu_cs_p2_ib(struct amdgpu_cs_parser *p,
 {
struct drm_amdgpu_cs_chunk_ib *chunk_ib = chunk->kdata;
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
+   struct amdgpu_ring *ring = amdgpu_job_ring(p->job);
	struct amdgpu_ib *ib = &p->job->ibs[*num_ibs];
	struct amdgpu_vm *vm = &fpriv->vm;
-   struct drm_sched_entity *entity;
-   struct amdgpu_ring *ring;
int r;
 
-   r = amdgpu_ctx_get_entity(p->ctx, chunk_ib->ip_type,
- chunk_ib->ip_instance,
- chunk_ib->ring, );
-   if (r)
-   return r;
-
-   /*
-* Abort if there is no run queue associated with this entity.
-* Possibly because of disabled HW IP.
-*/
-   if (entity->rq == NULL)
-   return -EINVAL;
-
-   /* Currently we don't support submitting to multiple entities */
-   if (p->entity && p->entity != entity)
-   return -EINVAL;
-
-   p->entity = entity;
-
-   ring = to_amdgpu_ring(entity->rq->sched);
/* MM engine doesn't support user fences */
if (p->job->uf_addr && ring->funcs->no_user_fence)
return -EINVAL;
@@ -982,8 +984,8 @@ static void trace_amdgpu_cs_ibs(struct amdgpu_cs_parser 
*parser)
 
 static int amdgpu_cs_patch_ibs(struct amdgpu_cs_parser *p)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
struct amdgpu_job *job = p->job;
+   struct amdgpu_ring *ring = amdgpu_job_ring(job);
unsigned int i;
int r;
 
@@ -1171,10 +1173,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
job = p->job;
p->job = NULL;
 
-   r = drm_sched_job_init(&job->base, p->entity, &fpriv->vm);
-   if (r)
-   goto error_unlock;
-
	drm_sched_job_arm(&job->base);
 
/* No memory allocation is allowed while holding the notifier lock.
@@ -1231,8 +1229,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 error_abort:
	drm_sched_job_cleanup(&job->base);
	mutex_unlock(&p->adev->notifier_lock);
-
-error_unlock:
amdgpu_job_free(job);
return r;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
index 2a1961bf1194..866d35bbe073 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
@@ -72,6 +72,11 @@ struct amdgpu_job {
struct amdgpu_ibibs[];
 };
 
+static inline struct amdgpu_ring *amdgpu_job_ring(struct amdgpu_job *job)
+{
+   return to_amdgpu_ring(job->base.entity->rq->sched);
+}
+
 int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
 struct amdgpu_job **job, struct amdgpu_vm *vm);
 int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size,
-- 
2.25.1



[PATCH 04/12] drm/amdgpu: revert "partial revert "remove ctx->lock" v2"

2022-09-05 Thread Christian König
This reverts commit 94f4c4965e5513ba624488f4b601d6b385635aec.

We found that the bo_list is missing a protection for its list entries.
Since that is fixed now this workaround can be removed again.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 21 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c |  2 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h |  1 -
 3 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index f7bf61d96be5..52ba6325944e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -128,8 +128,6 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
goto free_chunk;
}
 
-   mutex_lock(&p->ctx->lock);
-
/* skip guilty context job */
	if (atomic_read(&p->ctx->guilty) == 1) {
ret = -ECANCELED;
@@ -691,7 +689,6 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser 
*parser, int error,
dma_fence_put(parser->fence);
 
if (parser->ctx) {
-   mutex_unlock(&parser->ctx->lock);
amdgpu_ctx_put(parser->ctx);
}
if (parser->bo_list)
@@ -1138,9 +1135,6 @@ static int amdgpu_cs_dependencies(struct amdgpu_device 
*adev,
 {
int i, r;
 
-   /* TODO: Investigate why we still need the context lock */
-   mutex_unlock(&p->ctx->lock);
-
for (i = 0; i < p->nchunks; ++i) {
struct amdgpu_cs_chunk *chunk;
 
@@ -1151,34 +1145,32 @@ static int amdgpu_cs_dependencies(struct amdgpu_device 
*adev,
case AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES:
r = amdgpu_cs_process_fence_dep(p, chunk);
if (r)
-   goto out;
+   return r;
break;
case AMDGPU_CHUNK_ID_SYNCOBJ_IN:
r = amdgpu_cs_process_syncobj_in_dep(p, chunk);
if (r)
-   goto out;
+   return r;
break;
case AMDGPU_CHUNK_ID_SYNCOBJ_OUT:
r = amdgpu_cs_process_syncobj_out_dep(p, chunk);
if (r)
-   goto out;
+   return r;
break;
case AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT:
r = amdgpu_cs_process_syncobj_timeline_in_dep(p, chunk);
if (r)
-   goto out;
+   return r;
break;
case AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL:
r = amdgpu_cs_process_syncobj_timeline_out_dep(p, 
chunk);
if (r)
-   goto out;
+   return r;
break;
}
}
 
-out:
-   mutex_lock(&p->ctx->lock);
-   return r;
+   return 0;
 }
 
 static void amdgpu_cs_post_dependencies(struct amdgpu_cs_parser *p)
@@ -1340,7 +1332,6 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
goto out;
 
	r = amdgpu_cs_submit(&parser, cs);
-
 out:
	amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 8ee4e8491f39..168337d8d4cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -315,7 +315,6 @@ static int amdgpu_ctx_init(struct amdgpu_ctx_mgr *mgr, 
int32_t priority,
	kref_init(&ctx->refcount);
	ctx->mgr = mgr;
	spin_lock_init(&ctx->ring_lock);
-   mutex_init(&ctx->lock);
 
	ctx->reset_counter = atomic_read(&mgr->adev->gpu_reset_counter);
ctx->reset_counter_query = ctx->reset_counter;
@@ -407,7 +406,6 @@ static void amdgpu_ctx_fini(struct kref *ref)
drm_dev_exit(idx);
}
 
-   mutex_destroy(&ctx->lock);
kfree(ctx);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
index cc7c8afff414..0fa0e56daf67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
@@ -53,7 +53,6 @@ struct amdgpu_ctx {
boolpreamble_presented;
int32_t init_priority;
int32_t override_priority;
-   struct mutexlock;
atomic_tguilty;
unsigned long   ras_counter_ce;
unsigned long   ras_counter_ue;
-- 
2.25.1



[PATCH 08/12] drm/amdgpu: revert "fix limiting AV1 to the first instance on VCN3"

2022-09-05 Thread Christian König
This reverts commit 250195ff744f260c169f5427422b6f39c58cb883.

The job should now be initialized when we reach the parser functions.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 39405f0db824..3cabceee5f57 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -1761,21 +1761,23 @@ static const struct amdgpu_ring_funcs 
vcn_v3_0_dec_sw_ring_vm_funcs = {
.emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
 };
 
-static int vcn_v3_0_limit_sched(struct amdgpu_cs_parser *p)
+static int vcn_v3_0_limit_sched(struct amdgpu_cs_parser *p,
+   struct amdgpu_job *job)
 {
struct drm_gpu_scheduler **scheds;
 
/* The create msg must be in the first IB submitted */
-   if (atomic_read(&p->entity->fence_seq))
+   if (atomic_read(&job->base.entity->fence_seq))
return -EINVAL;
 
scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_DEC]
[AMDGPU_RING_PRIO_DEFAULT].sched;
-   drm_sched_entity_modify_sched(p->entity, scheds, 1);
+   drm_sched_entity_modify_sched(job->base.entity, scheds, 1);
return 0;
 }
 
-static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, uint64_t addr)
+static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
+   uint64_t addr)
 {
struct ttm_operation_ctx ctx = { false, false };
struct amdgpu_bo_va_mapping *map;
@@ -1846,7 +1848,7 @@ static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, 
uint64_t addr)
if (create[0] == 0x7 || create[0] == 0x10 || create[0] == 0x11)
continue;
 
-   r = vcn_v3_0_limit_sched(p);
+   r = vcn_v3_0_limit_sched(p, job);
if (r)
goto out;
}
@@ -1860,7 +1862,7 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct 
amdgpu_cs_parser *p,
   struct amdgpu_job *job,
   struct amdgpu_ib *ib)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
+   struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
uint32_t msg_lo = 0, msg_hi = 0;
unsigned i;
int r;
@@ -1879,7 +1881,8 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct 
amdgpu_cs_parser *p,
msg_hi = val;
} else if (reg == PACKET0(p->adev->vcn.internal.cmd, 0) &&
   val == 0) {
-   r = vcn_v3_0_dec_msg(p, ((u64)msg_hi) << 32 | msg_lo);
+   r = vcn_v3_0_dec_msg(p, job,
+((u64)msg_hi) << 32 | msg_lo);
if (r)
return r;
}
-- 
2.25.1



[PATCH 09/12] drm/amdgpu: cleanup instance limit on VCN4

2022-09-05 Thread Christian König
Similar to what we did for VCN3, use the job instead of the parser
entity. Clean up the coding style quite a bit as well.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 46 +++
 1 file changed, 25 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
index fb2d74f30448..9338172eec8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
@@ -1327,21 +1327,23 @@ static void vcn_v4_0_unified_ring_set_wptr(struct 
amdgpu_ring *ring)
}
 }
 
-static int vcn_v4_0_limit_sched(struct amdgpu_cs_parser *p)
+static int vcn_v4_0_limit_sched(struct amdgpu_cs_parser *p,
+   struct amdgpu_job *job)
 {
struct drm_gpu_scheduler **scheds;
 
/* The create msg must be in the first IB submitted */
-   if (atomic_read(&p->entity->fence_seq))
+   if (atomic_read(&job->base.entity->fence_seq))
return -EINVAL;
 
-   scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_ENC]
-   [AMDGPU_RING_PRIO_0].sched;
-   drm_sched_entity_modify_sched(p->entity, scheds, 1);
+   scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_DEC]
+   [AMDGPU_RING_PRIO_DEFAULT].sched;
+   drm_sched_entity_modify_sched(job->base.entity, scheds, 1);
return 0;
 }
 
-static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, uint64_t addr)
+static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
+   uint64_t addr)
 {
struct ttm_operation_ctx ctx = { false, false };
struct amdgpu_bo_va_mapping *map;
@@ -1412,7 +1414,7 @@ static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, 
uint64_t addr)
if (create[0] == 0x7 || create[0] == 0x10 || create[0] == 0x11)
continue;
 
-   r = vcn_v4_0_limit_sched(p);
+   r = vcn_v4_0_limit_sched(p, job);
if (r)
goto out;
}
@@ -1425,32 +1427,34 @@ static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, 
uint64_t addr)
 #define RADEON_VCN_ENGINE_TYPE_DECODE 
(0x0003)
 
 static int vcn_v4_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
-   struct amdgpu_job *job,
-   struct amdgpu_ib *ib)
+  struct amdgpu_job *job,
+  struct amdgpu_ib *ib)
 {
-   struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
-   struct amdgpu_vcn_decode_buffer *decode_buffer = NULL;
+   struct amdgpu_ring *ring = to_amdgpu_ring(job->base.entity->rq->sched);
+   struct amdgpu_vcn_decode_buffer *decode_buffer;
+   uint64_t addr;
uint32_t val;
-   int r = 0;
 
/* The first instance can decode anything */
if (!ring->me)
-   return r;
+   return 0;
 
/* unified queue ib header has 8 double words. */
if (ib->length_dw < 8)
-   return r;
+   return 0;
 
val = amdgpu_ib_get_value(ib, 6); //RADEON_VCN_ENGINE_TYPE
+   if (val != RADEON_VCN_ENGINE_TYPE_DECODE)
+   return 0;
 
-   if (val == RADEON_VCN_ENGINE_TYPE_DECODE) {
-   decode_buffer = (struct amdgpu_vcn_decode_buffer *)&ib->ptr[10];
+   decode_buffer = (struct amdgpu_vcn_decode_buffer *)&ib->ptr[10];
 
-   if (decode_buffer->valid_buf_flag  & 0x1)
-   r = vcn_v4_0_dec_msg(p, 
((u64)decode_buffer->msg_buffer_address_hi) << 32 |
-   
decode_buffer->msg_buffer_address_lo);
-   }
-   return r;
+   if (!(decode_buffer->valid_buf_flag  & 0x1))
+   return 0;
+
+   addr = ((u64)decode_buffer->msg_buffer_address_hi) << 32 |
+   decode_buffer->msg_buffer_address_lo;
+   return vcn_v4_0_dec_msg(p, job, addr);
 }
 
 static const struct amdgpu_ring_funcs vcn_v4_0_unified_ring_vm_funcs = {
-- 
2.25.1



[PATCH 06/12] drm/amdgpu: cleanup and reorder amdgpu_cs.c v3

2022-09-05 Thread Christian König
Sort the functions in the order they are called and clean up the coding
style and function names to represent the data they process.

Check the size of the IB chunk.

v2: fix job initialisation order and use correct scheduler instance
v3: try to move all functional changes into a separate patch.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1344 
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.h |2 +-
 2 files changed, 671 insertions(+), 675 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 52ba6325944e..05df1727e348 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -39,9 +39,42 @@
 #include "amdgpu_gem.h"
 #include "amdgpu_ras.h"
 
-static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
- struct drm_amdgpu_cs_chunk_fence *data,
- uint32_t *offset)
+static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p,
+struct amdgpu_device *adev,
+struct drm_file *filp,
+union drm_amdgpu_cs *cs)
+{
+   struct amdgpu_fpriv *fpriv = filp->driver_priv;
+
+   if (cs->in.num_chunks == 0)
+   return -EINVAL;
+
+   memset(p, 0, sizeof(*p));
+   p->adev = adev;
+   p->filp = filp;
+
+   p->ctx = amdgpu_ctx_get(fpriv, cs->in.ctx_id);
+   if (!p->ctx)
+   return -EINVAL;
+
+   if (atomic_read(&p->ctx->guilty)) {
+   amdgpu_ctx_put(p->ctx);
+   return -ECANCELED;
+   }
+   return 0;
+}
+
+static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
+  struct drm_amdgpu_cs_chunk_ib *chunk_ib,
+  unsigned int *num_ibs)
+{
+   ++(*num_ibs);
+   return 0;
+}
+
+static int amdgpu_cs_p1_user_fence(struct amdgpu_cs_parser *p,
+  struct drm_amdgpu_cs_chunk_fence *data,
+  uint32_t *offset)
 {
struct drm_gem_object *gobj;
struct amdgpu_bo *bo;
@@ -80,11 +113,11 @@ static int amdgpu_cs_user_fence_chunk(struct 
amdgpu_cs_parser *p,
return r;
 }
 
-static int amdgpu_cs_bo_handles_chunk(struct amdgpu_cs_parser *p,
- struct drm_amdgpu_bo_list_in *data)
+static int amdgpu_cs_p1_bo_handles(struct amdgpu_cs_parser *p,
+  struct drm_amdgpu_bo_list_in *data)
 {
+   struct drm_amdgpu_bo_list_entry *info;
int r;
-   struct drm_amdgpu_bo_list_entry *info = NULL;
 
	r = amdgpu_bo_create_list_entry_array(data, &info);
if (r)
@@ -104,7 +137,9 @@ static int amdgpu_cs_bo_handles_chunk(struct 
amdgpu_cs_parser *p,
return r;
 }
 
-static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union 
drm_amdgpu_cs *cs)
+/* Copy the data from userspace and go over it the first time */
+static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
+  union drm_amdgpu_cs *cs)
 {
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
	struct amdgpu_vm *vm = &fpriv->vm;
@@ -112,28 +147,14 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
uint64_t *chunk_array;
unsigned size, num_ibs = 0;
uint32_t uf_offset = 0;
-   int i;
int ret;
+   int i;
 
-   if (cs->in.num_chunks == 0)
-   return -EINVAL;
-
-   chunk_array = kvmalloc_array(cs->in.num_chunks, sizeof(uint64_t), 
GFP_KERNEL);
+   chunk_array = kvmalloc_array(cs->in.num_chunks, sizeof(uint64_t),
+GFP_KERNEL);
if (!chunk_array)
return -ENOMEM;
 
-   p->ctx = amdgpu_ctx_get(fpriv, cs->in.ctx_id);
-   if (!p->ctx) {
-   ret = -EINVAL;
-   goto free_chunk;
-   }
-
-   /* skip guilty context job */
-   if (atomic_read(>ctx->guilty) == 1) {
-   ret = -ECANCELED;
-   goto free_chunk;
-   }
-
/* get chunks */
chunk_array_user = u64_to_user_ptr(cs->in.chunks);
if (copy_from_user(chunk_array, chunk_array_user,
@@ -168,7 +189,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
size = p->chunks[i].length_dw;
cdata = u64_to_user_ptr(user_chunk.chunk_data);
 
-   p->chunks[i].kdata = kvmalloc_array(size, sizeof(uint32_t), 
GFP_KERNEL);
+   p->chunks[i].kdata = kvmalloc_array(size, sizeof(uint32_t),
+   GFP_KERNEL);
if (p->chunks[i].kdata == NULL) {
ret = -ENOMEM;
i--;
@@ -180,36 +202,35 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
goto 

[PATCH 03/12] drm/amdgpu: move setting the job resources

2022-09-05 Thread Christian König
Move setting the job resources into amdgpu_job.c

Signed-off-by: Christian König 
Reviewed-by: Andrey Grodzovsky 
Reviewed-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 21 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 17 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  2 ++
 3 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 6f80cf2ea9ae..f7bf61d96be5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -495,9 +495,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
struct amdgpu_vm *vm = >vm;
struct amdgpu_bo_list_entry *e;
struct list_head duplicates;
-   struct amdgpu_bo *gds;
-   struct amdgpu_bo *gws;
-   struct amdgpu_bo *oa;
int r;
 
	INIT_LIST_HEAD(&p->validated);
@@ -614,22 +611,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved,
 p->bytes_moved_vis);
 
-   gds = p->bo_list->gds_obj;
-   gws = p->bo_list->gws_obj;
-   oa = p->bo_list->oa_obj;
-
-   if (gds) {
-   p->job->gds_base = amdgpu_bo_gpu_offset(gds) >> PAGE_SHIFT;
-   p->job->gds_size = amdgpu_bo_size(gds) >> PAGE_SHIFT;
-   }
-   if (gws) {
-   p->job->gws_base = amdgpu_bo_gpu_offset(gws) >> PAGE_SHIFT;
-   p->job->gws_size = amdgpu_bo_size(gws) >> PAGE_SHIFT;
-   }
-   if (oa) {
-   p->job->oa_base = amdgpu_bo_gpu_offset(oa) >> PAGE_SHIFT;
-   p->job->oa_size = amdgpu_bo_size(oa) >> PAGE_SHIFT;
-   }
+   amdgpu_job_set_resources(p->job, p->bo_list->gds_obj,
+p->bo_list->gws_obj, p->bo_list->oa_obj);
 
if (!r && p->uf_entry.tv.bo) {
struct amdgpu_bo *uf = ttm_to_amdgpu_bo(p->uf_entry.tv.bo);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 8f51adf3b329..37dc5ee4153d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -132,6 +132,23 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, 
unsigned size,
return r;
 }
 
+void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds,
+ struct amdgpu_bo *gws, struct amdgpu_bo *oa)
+{
+   if (gds) {
+   job->gds_base = amdgpu_bo_gpu_offset(gds) >> PAGE_SHIFT;
+   job->gds_size = amdgpu_bo_size(gds) >> PAGE_SHIFT;
+   }
+   if (gws) {
+   job->gws_base = amdgpu_bo_gpu_offset(gws) >> PAGE_SHIFT;
+   job->gws_size = amdgpu_bo_size(gws) >> PAGE_SHIFT;
+   }
+   if (oa) {
+   job->oa_base = amdgpu_bo_gpu_offset(oa) >> PAGE_SHIFT;
+   job->oa_size = amdgpu_bo_size(oa) >> PAGE_SHIFT;
+   }
+}
+
 void amdgpu_job_free_resources(struct amdgpu_job *job)
 {
struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
index babc0af751c2..2a1961bf1194 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
@@ -76,6 +76,8 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned 
num_ibs,
 struct amdgpu_job **job, struct amdgpu_vm *vm);
 int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size,
enum amdgpu_ib_pool_type pool, struct amdgpu_job **job);
+void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds,
+ struct amdgpu_bo *gws, struct amdgpu_bo *oa);
 void amdgpu_job_free_resources(struct amdgpu_job *job);
 void amdgpu_job_free(struct amdgpu_job *job);
 int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
-- 
2.25.1



[PATCH 05/12] drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP v2

2022-09-05 Thread Christian König
Use DMA_RESV_USAGE_BOOKKEEP for VM page table updates and KFD preemption fence.

v2: actually update all usages for KFD
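For background, the usage classes are ordered from strongest to weakest, and
BOOKKEEP fences are invisible to implicit synchronization (summarized from
include/linux/dma-resv.h, not part of this patch):

    enum dma_resv_usage {
            DMA_RESV_USAGE_KERNEL,   /* kernel memory management work */
            DMA_RESV_USAGE_WRITE,    /* implicit sync writers */
            DMA_RESV_USAGE_READ,     /* implicit sync readers */
            DMA_RESV_USAGE_BOOKKEEP, /* driver internal; never blocks
                                      * implicit sync - the right class
                                      * for page table updates and the
                                      * KFD eviction fence */
    };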

Signed-off-by: Christian König 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 26 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c   |  3 ++-
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index cbd593f7d553..f1604f5cd7c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -297,7 +297,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
 */
replacement = dma_fence_get_stub();
dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context,
-   replacement, DMA_RESV_USAGE_READ);
+   replacement, DMA_RESV_USAGE_BOOKKEEP);
dma_fence_put(replacement);
return 0;
 }
@@ -1390,8 +1390,9 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
ret = dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1);
if (ret)
goto reserve_shared_fail;
-   amdgpu_bo_fence(vm->root.bo,
-   &vm->process_info->eviction_fence->base, true);
+   dma_resv_add_fence(vm->root.bo->tbo.base.resv,
+  &vm->process_info->eviction_fence->base,
+  DMA_RESV_USAGE_BOOKKEEP);
amdgpu_bo_unreserve(vm->root.bo);
 
/* Update process info */
@@ -1987,9 +1988,9 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
}
 
if (!amdgpu_ttm_tt_get_usermm(bo->tbo.ttm) && !bo->tbo.pin_count)
-   amdgpu_bo_fence(bo,
-   &avm->process_info->eviction_fence->base,
-   true);
+   dma_resv_add_fence(bo->tbo.base.resv,
+  &avm->process_info->eviction_fence->base,
+  DMA_RESV_USAGE_BOOKKEEP);
	ret = unreserve_bo_and_vms(&ctx, false, false);
 
goto out;
@@ -2758,15 +2759,18 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, 
struct dma_fence **ef)
if (mem->bo->tbo.pin_count)
continue;
 
-   amdgpu_bo_fence(mem->bo,
-   &process_info->eviction_fence->base, true);
+   dma_resv_add_fence(mem->bo->tbo.base.resv,
+  &process_info->eviction_fence->base,
+  DMA_RESV_USAGE_BOOKKEEP);
}
/* Attach eviction fence to PD / PT BOs */
	list_for_each_entry(peer_vm, &process_info->vm_list_head,
vm_list_node) {
struct amdgpu_bo *bo = peer_vm->root.bo;
 
-   amdgpu_bo_fence(bo, &process_info->eviction_fence->base, true);
+   dma_resv_add_fence(bo->tbo.base.resv,
+  &process_info->eviction_fence->base,
+  DMA_RESV_USAGE_BOOKKEEP);
}
 
 validate_map_fail:
@@ -2820,7 +2824,9 @@ int amdgpu_amdkfd_add_gws_to_process(void *info, void 
*gws, struct kgd_mem **mem
ret = dma_resv_reserve_fences(gws_bo->tbo.base.resv, 1);
if (ret)
goto reserve_shared_fail;
-   amdgpu_bo_fence(gws_bo, &process_info->eviction_fence->base, true);
+   dma_resv_add_fence(gws_bo->tbo.base.resv,
+  &process_info->eviction_fence->base,
+  DMA_RESV_USAGE_BOOKKEEP);
amdgpu_bo_unreserve(gws_bo);
mutex_unlock(&(*mem)->process_info->lock);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index 1fd3cbca20a2..03ec099d64e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -112,7 +112,8 @@ static int amdgpu_vm_sdma_commit(struct 
amdgpu_vm_update_params *p,
swap(p->vm->last_unlocked, tmp);
dma_fence_put(tmp);
} else {
-   amdgpu_bo_fence(p->vm->root.bo, f, true);
+   dma_resv_add_fence(p->vm->root.bo->tbo.base.resv, f,
+  DMA_RESV_USAGE_BOOKKEEP);
}
 
if (fence && !p->immediate)
-- 
2.25.1



[PATCH 01/12] drm/sched: move calling drm_sched_entity_select_rq

2022-09-05 Thread Christian König
We already discussed that the call to drm_sched_entity_select_rq() needs
to move to drm_sched_job_arm() to be able to set a new scheduler list
between _init() and _arm(). This was just not applied for some reason.
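A sketch of the window this opens up (simplified; the VCN scheduler-limiting
patches later in this series depend on it):

    r = drm_sched_job_init(&job->base, entity, owner);
    /* the entity's scheduler list may still be changed here, e.g. */
    drm_sched_entity_modify_sched(entity, scheds, 1);
    /* ...because the run queue is only selected at arm time now */
    drm_sched_job_arm(&job->base);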

Signed-off-by: Christian König 
Reviewed-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/scheduler/sched_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..e0ab14e0fb6b 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -592,7 +592,6 @@ int drm_sched_job_init(struct drm_sched_job *job,
   struct drm_sched_entity *entity,
   void *owner)
 {
-   drm_sched_entity_select_rq(entity);
if (!entity->rq)
return -ENOENT;
 
@@ -628,7 +627,7 @@ void drm_sched_job_arm(struct drm_sched_job *job)
struct drm_sched_entity *entity = job->entity;
 
BUG_ON(!entity);
-
+   drm_sched_entity_select_rq(entity);
sched = entity->rq->sched;
 
job->sched = sched;
-- 
2.25.1



[PATCH 02/12] drm/amdgpu: remove SRIOV and MCBP dependencies from the CS

2022-09-05 Thread Christian König
We should not have any different CS constraints based
on the execution environment.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index b7bae833c804..6f80cf2ea9ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -814,7 +814,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
if (r)
return r;
 
-   if (amdgpu_mcbp || amdgpu_sriov_vf(adev)) {
+   if (fpriv->csa_va) {
bo_va = fpriv->csa_va;
BUG_ON(!bo_va);
r = amdgpu_vm_bo_update(adev, bo_va, false);
@@ -898,13 +898,11 @@ static int amdgpu_cs_ib_fill(struct amdgpu_device *adev,
continue;
 
if (chunk_ib->ip_type == AMDGPU_HW_IP_GFX &&
-   (amdgpu_mcbp || amdgpu_sriov_vf(adev))) {
-   if (chunk_ib->flags & AMDGPU_IB_FLAG_PREEMPT) {
-   if (chunk_ib->flags & AMDGPU_IB_FLAG_CE)
-   ce_preempt++;
-   else
-   de_preempt++;
-   }
+   chunk_ib->flags & AMDGPU_IB_FLAG_PREEMPT) {
+   if (chunk_ib->flags & AMDGPU_IB_FLAG_CE)
+   ce_preempt++;
+   else
+   de_preempt++;
 
/* each GFX command submit allows 0 or 1 IB preemptible 
for CE & DE */
if (ce_preempt > 1 || de_preempt > 1)
-- 
2.25.1



Re: 回复: Re: [PATCH] drm:Fix the blank screen problem of some 1920x1080 75Hz monitors using R520 graphics card

2022-09-05 Thread Christian König

On 05.09.22 at 10:10, 钟沛 wrote:


Thanks for your reply!


We found that in the amdgpu_pll_compute function, when the 
target_clock is the value contained in the drm_dmt_modes defined in 
drm_edid.c, the diff is 0. When target_clock is some special value, we 
cannot find a diff value of 0, so we need to find the smallest diff 
value to fit the current target_clock. For the monitor that has the 
blank screen problem here, we found that when the ref_div_max is 128, 
the diff value is smaller and the blank screen problem can be solved. 
We tested some other monitors and added log printing to the code. We 
found that this change did not affect those monitors, and in the 
analysis of the logs, we found that the solution with a smaller diff 
value always displayed normally.



Changing the value of ref_div_max from 128 to 100 can solve the blank 
screen problem of some monitors, but it will also cause some normal 
monitors to go black, so is it a more reasonable solution to determine 
the value of ref_div_max according to the value of diff?
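For reference, the quantity being minimized in that loop is (from
amdgpu_pll_compute itself):

    clock = (pll->reference_freq * fb_div) / (ref_div * post_div)
    diff  = |target_clock - clock|

A larger ref_div_max widens the search space for ref_div, which is why some
target clocks only reach a small (or zero) diff with the 128 limit.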




Nope, exactly that is just utter nonsense.

What we could maybe do is to prefer a smaller ref_div over a larger 
ref_div, but I don't see how this will help you.


Regards,
Christian.



Thank you for taking the time to read my email.


Best Regards.









*Subject:* Re: [PATCH] drm:Fix the blank screen problem of some 1920x1080 
75Hz monitors using R520 graphics card

*Date:* 2022-09-05 14:05
*From:* Christian König
*To:* 钟沛, alexander.deuc...@amd.com, xinhui.pan@amd.com, airlied@linux.ie, daniel@ffwll.ch, isabba...@riseup.net




On 05.09.22 at 05:23, zhongpei wrote:
> We found that in the scenario of an AMD R520 graphics card
> and some 1920x1080 monitors, when we switch the refresh rate
> of the monitor to 75Hz, the monitor will have a blank screen problem,
> and a restart cannot restore it. After testing, it is found that
> when we limit the maximum value of ref_div_max to 128,
> the problem can be solved. In order to keep the previous modification
> compatible with other monitors, we added a judgment
> when finding the minimum diff value in the loop of the
> amdgpu_pll_compute/radeon_compute_pll_avivo functions.
> If no diff value of 0 is found when the maximum value of ref_div_max
> is limited to 100, the search continues with 128,
> and the parameters with the smallest diff value are taken.

Well that's at least better than what I've seen in previous tries to fix
this.

But as far as I can see this will certainly break some other monitors,
so that is pretty much a NAK.

Regards,
Christian.

>
> Signed-off-by: zhongpei 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c | 17 +
>   drivers/gpu/drm/radeon/radeon_display.c | 15 +++
>   2 files changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c

> index 0bb2466d539a..0c298faa0f94 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
> @@ -84,12 +84,13 @@ static void amdgpu_pll_reduce_ratio(unsigned 
*nom, unsigned *den,
>   static void amdgpu_pll_get_fb_ref_div(struct amdgpu_device *adev, 
unsigned int nom,

>        unsigned int den, unsigned int post_div,
>        unsigned int fb_div_max, unsigned int ref_div_max,
> -      unsigned int *fb_div, unsigned int *ref_div)
> +      unsigned int ref_div_limit, unsigned int *fb_div,
> +      unsigned int *ref_div)
>   {
>
>   /* limit reference * post divider to a maximum */
>   if (adev->family == AMDGPU_FAMILY_SI)
> - ref_div_max = min(100 / post_div, ref_div_max);
> + ref_div_max = min(ref_div_limit / post_div, ref_div_max);
>   else
>   ref_div_max = min(128 / post_div, ref_div_max);
>
> @@ -136,6 +137,7 @@ void amdgpu_pll_compute(struct amdgpu_device *adev,
>   unsigned ref_div_min, ref_div_max, ref_div;
>   unsigned post_div_best, diff_best;
>   unsigned nom, den;
> + unsigned ref_div_limit, ref_limit_best;
>
>   /* determine allowed feedback divider range */
>   fb_div_min = pll->min_feedback_div;
> @@ -204,11 +206,12 @@ void amdgpu_pll_compute(struct amdgpu_device 
*adev,

>   else
>   post_div_best = post_div_max;
>   diff_best = ~0;
> + ref_div_limit = ref_limit_best = 100;
>
>   for (post_div = post_div_min; post_div <= post_div_max; ++post_div) {
>   unsigned diff;
>   amdgpu_pll_get_fb_ref_div(adev, nom, den, post_div, fb_div_max,
> -  ref_div_max, &fb_div, &ref_div);
> +  ref_div_max, ref_div_limit, &fb_div, &ref_div);
>   diff = abs(target_clock - (pll->reference_freq * fb_div) /
>   (ref_div * post_div));
>
> @@ -217,13 +220,19 @@ void amdgpu_pll_compute(struct amdgpu_device 
*adev,

>
>   post_div_best = post_div;
>   diff_best = diff;
> + ref_limit_best = ref_div_limit;
>   }
> + if (post_div >= post_div_max && diff_best != 0 && ref_div_limit != 128) {
> + ref_div_limit = 128;
> + post_div = post_div_min - 1;
> + }
> +
>   }
>   post_div = post_div_best;
>
>   /* get the feedback and reference divider for the 

Re: [PATCH] drm/ttm: update bulk move object of ghost BO

2022-09-05 Thread Christian König

Yeah, I realized that as well after sending the first mail.

The problem is that we keep the bulk move around when there currently 
isn't any resource associated with the template.


So the correct code should look something like this:

if (fbo->base.resource) {
    ttm_resource_set_bo(fbo->base.resource, &fbo->base);
    bo->resource = NULL;
    ttm_bo_set_bulk_move(&fbo->base, NULL);
} else {
    fbo->bulk_move = NULL;
}

Regards,
Christian.

On 05.09.22 at 09:59, Yin, ZhenGuo (Chris) wrote:
Inside the function ttm_bo_set_bulk_move, it calls 
ttm_resource_del_bulk_move to remove the old resource from the 
bulk_move list.


If we set the bulk_move to NULL manually as suggested, the old 
resource attached to the ghost BO seems like it won't be removed from the 
bulk_move.


On 9/1/2022 7:13 PM, Christian König wrote:

On 01.09.22 at 13:11, Christian König wrote:

On 01.09.22 at 11:29, ZhenGuo Yin wrote:

[Why]
Ghost BO is released with non-empty bulk move object. There is a
warning trace:
WARNING: CPU: 19 PID: 1582 at ttm/ttm_bo.c:366 
ttm_bo_release+0x2e1/0x2f0 [amdttm]

Call Trace:
   amddma_resv_reserve_fences+0x10d/0x1f0 [amdkcl]
   amdttm_bo_put+0x28/0x30 [amdttm]
   amdttm_bo_move_accel_cleanup+0x126/0x200 [amdttm]
   amdgpu_bo_move+0x1a8/0x770 [amdgpu]
   ttm_bo_handle_move_mem+0xb0/0x140 [amdttm]
   amdttm_bo_validate+0xbf/0x100 [amdttm]

[How]
The resource of ghost BO should be moved to LRU directly, instead of
using bulk move. The bulk move object of ghost BO should set to NULL
before function ttm_bo_move_to_lru_tail_unlocked.

Fixes: 5b951e487fd6bf5f ("drm/ttm: fix bulk move handling v2")
Signed-off-by: ZhenGuo Yin 


Good catch, but the fix is not 100% correct. Please rather just NULL 
the member while initializing the BO structure.


E.g. something like this:

 
 fbo->base.pin_count = 0;
+fbo->base.bulk_move = NULL;
 if (bo->type != ttm_bo_type_sg)
 


On the other hand thinking about it that won't work either.

You need to set bulk_move to NULL manually in an else clause or 
something like this.


Regards,
Christian.



Thanks,
Christian.


---
  drivers/gpu/drm/ttm/ttm_bo_util.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c

index 1cbfb00c1d65..a90bbbd91910 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -238,6 +238,7 @@ static int ttm_buffer_object_transfer(struct 
ttm_buffer_object *bo,

    if (fbo->base.resource) {
  ttm_resource_set_bo(fbo->base.resource, &fbo->base);
+    ttm_bo_set_bulk_move(&fbo->base, NULL);
  bo->resource = NULL;
  }








Re: [PATCH] drm/ttm: update bulk move object of ghost BO

2022-09-05 Thread Yin, ZhenGuo (Chris)
Inside the function ttm_bo_set_bulk_move, it calls 
ttm_resource_del_bulk_move to remove the old resource from the bulk_move 
list.


If we set the bulk_move to NULL manually as suggested, the old resource 
attached to the ghost BO seems like it won't be removed from the bulk_move.


On 9/1/2022 7:13 PM, Christian König wrote:

On 01.09.22 at 13:11, Christian König wrote:

On 01.09.22 at 11:29, ZhenGuo Yin wrote:

[Why]
Ghost BO is released with non-empty bulk move object. There is a
warning trace:
WARNING: CPU: 19 PID: 1582 at ttm/ttm_bo.c:366 
ttm_bo_release+0x2e1/0x2f0 [amdttm]

Call Trace:
   amddma_resv_reserve_fences+0x10d/0x1f0 [amdkcl]
   amdttm_bo_put+0x28/0x30 [amdttm]
   amdttm_bo_move_accel_cleanup+0x126/0x200 [amdttm]
   amdgpu_bo_move+0x1a8/0x770 [amdgpu]
   ttm_bo_handle_move_mem+0xb0/0x140 [amdttm]
   amdttm_bo_validate+0xbf/0x100 [amdttm]

[How]
The resource of ghost BO should be moved to LRU directly, instead of
using bulk move. The bulk move object of ghost BO should set to NULL
before function ttm_bo_move_to_lru_tail_unlocked.

Fixes: 5b951e487fd6bf5f ("drm/ttm: fix bulk move handling v2")
Signed-off-by: ZhenGuo Yin 


Good catch, but the fix is not 100% correct. Please rather just NULL 
the member while initializing the BO structure.


E.g. something like this:

 
 fbo->base.pin_count = 0;
+fbo->base.bulk_move = NULL;
 if (bo->type != ttm_bo_type_sg)
 


On the other hand thinking about it that won't work either.

You need to set bulk_move to NULL manually in an else clause or 
something like this.


Regards,
Christian.



Thanks,
Christian.


---
  drivers/gpu/drm/ttm/ttm_bo_util.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c

index 1cbfb00c1d65..a90bbbd91910 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -238,6 +238,7 @@ static int ttm_buffer_object_transfer(struct 
ttm_buffer_object *bo,

    if (fbo->base.resource) {
  ttm_resource_set_bo(fbo->base.resource, &fbo->base);
+    ttm_bo_set_bulk_move(&fbo->base, NULL);
  bo->resource = NULL;
  }






Re: [PATCH v4 06/21] drm/i915: Prepare to dynamic dma-buf locking specification

2022-09-05 Thread Dmitry Osipenko
On 01.09.2022 17:02, Ruhl, Michael J wrote:
...
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>> @@ -331,7 +331,19 @@ static void __i915_gem_free_objects(struct
>> drm_i915_private *i915,
>>  continue;
>>  }
>>
>> +/*
>> + * dma_buf_unmap_attachment() requires reservation to be
>> + * locked. The imported GEM shouldn't share reservation lock,
>> + * so it's safe to take the lock.
>> + */
>> +if (obj->base.import_attach)
>> +i915_gem_object_lock(obj, NULL);
> 
> There is a lot of stuff going on here.  Taking the lock may be premature...
> 
>>  __i915_gem_object_pages_fini(obj);
> 
> The i915_gem_dmabuf.c:i915_gem_object_put_pages_dmabuf is where
> unmap_attachment is actually called, would it make more sense to
> do the locking there?

The __i915_gem_object_put_pages() is invoked with a held reservation
lock, while object freeing is a special time when we know that the GEM is
unused.

The __i915_gem_free_objects() was taking the lock two weeks ago until
the change made by Chris Wilson [1] reached linux-next.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=2826d447fbd60e6a05e53d5f918bceb8c04e315c

I don't think we can take the lock within
i915_gem_object_put_pages_dmabuf(), it may/should deadlock other code paths.


[PATCH] drm/amdgpu: getting fan speed pwm for vega10 properly

2022-09-05 Thread Yury Zhuravlev
Hello,

While setting up the fan manager https://github.com/markusressel/fan2go I
found that my Vega56 was not working correctly. The fan manager expects
that the PWM value it reads back is the same as the one it wrote before,
but that is not the case here. The PWM value was volatile and, more
critically, if I wrote 200, a subsequent read returned ~70-100, which is
very confusing.
After that, I started reading the amdgpu driver to see how the fan speed
handling works, and I found that the PWM value was calculated from the RPM
speed and was not correct in my case (different BIOS or fan configuration?).
Because it looked wrong, I looked into other implementations and found
that Vega20 uses the mmCG_FDO_CTRL1 and mmCG_THERMAL_STATUS registers to
calculate the PWM value.
I also checked how we set the PWM for Vega10 and found the same registers.
After that, I copied the function from Vega20 to Vega10, and it started
working much better. It still shows some fluctuation, but as I understand
it, that behavior is expected.
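
For illustration, the register based calculation boils down to this
scaling (standalone user space sketch; the register values are made up):

	#include <stdint.h>
	#include <stdio.h>

	/* pwm = duty * 255 / duty100, as in the Vega20 code (do_div() there) */
	static uint32_t duty_to_pwm(uint32_t duty, uint32_t duty100)
	{
		uint64_t tmp64;

		if (!duty100)
			return 0; /* the driver returns -EINVAL in this case */

		tmp64 = (uint64_t)duty * 255;
		tmp64 /= duty100;
		return tmp64 > 255 ? 255 : (uint32_t)tmp64;
	}

	int main(void)
	{
		/* e.g. FMAX_DUTY100 = 100, FDO_PWM_DUTY = 78 -> pwm = 198 */
		printf("pwm = %u\n", duty_to_pwm(78, 100));
		return 0;
	}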

I have no in-depth knowledge of amdgpu, and the original function may have
been written that way for a reason (maybe for some broken BIOS?), but I
suspect somebody simply forgot to port this code over after the prototype
implementation.

This would be my first patch here. Sorry if I skipped some procedures; I
would appreciate your help.

Regards,

---
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_thermal.c
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_thermal.c
index dad3e3741a4e..190af79f3236 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_thermal.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_thermal.c
@@ -67,22 +67,21 @@ int vega10_fan_ctrl_get_fan_speed_info(struct pp_hwmgr
*hwmgr,
 int vega10_fan_ctrl_get_fan_speed_pwm(struct pp_hwmgr *hwmgr,
uint32_t *speed)
 {
-   uint32_t current_rpm;
-   uint32_t percent = 0;
-
-   if (hwmgr->thermal_controller.fanInfo.bNoFan)
-   return 0;
+   struct amdgpu_device *adev = hwmgr->adev;
+   uint32_t duty100, duty;
+   uint64_t tmp64;

-   if (vega10_get_current_rpm(hwmgr, &current_rpm))
-   return -1;
+   duty100 = REG_GET_FIELD(RREG32_SOC15(THM, 0, mmCG_FDO_CTRL1),
+   CG_FDO_CTRL1, FMAX_DUTY100);
+   duty = REG_GET_FIELD(RREG32_SOC15(THM, 0, mmCG_THERMAL_STATUS),
+   CG_THERMAL_STATUS, FDO_PWM_DUTY);

-   if (hwmgr->thermal_controller.
-   advanceFanControlParameters.usMaxFanRPM != 0)
-   percent = current_rpm * 255 /
-   hwmgr->thermal_controller.
-   advanceFanControlParameters.usMaxFanRPM;
+   if (!duty100)
+   return -EINVAL;

-   *speed = MIN(percent, 255);
+   tmp64 = (uint64_t)duty * 255;
+   do_div(tmp64, duty100);
+   *speed = MIN((uint32_t)tmp64, 255);

return 0;
 }
--


[PATCH] drm/amdgpu: cleanup coding style in amdgpu_drv.c

2022-09-05 Thread Jingyu Wang
Fix something checkpatch.pl complained about in amdgpu_drv.c

Signed-off-by: Jingyu Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 31 +
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index de7144b06e93..b50fd27fb6aa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: MIT
 /*
  * Copyright 2000 VA Linux Systems, Inc., Sunnyvale, California.
  * All Rights Reserved.
@@ -140,8 +141,8 @@ uint amdgpu_pcie_lane_cap;
 u64 amdgpu_cg_mask = 0x;
 uint amdgpu_pg_mask = 0x;
 uint amdgpu_sdma_phase_quantum = 32;
-char *amdgpu_disable_cu = NULL;
-char *amdgpu_virtual_display = NULL;
+char *amdgpu_disable_cu;
+char *amdgpu_virtual_display;
 
 /*
  * OverDrive(bit 14) disabled by default
@@ -502,7 +503,7 @@ module_param_named(virtual_display, amdgpu_virtual_display, 
charp, 0444);
  * Set how much time allow a job hang and not drop it. The default is 0.
  */
 MODULE_PARM_DESC(job_hang_limit, "how much time allow a job hang and not drop 
it (default 0)");
-module_param_named(job_hang_limit, amdgpu_job_hang_limit, int ,0444);
+module_param_named(job_hang_limit, amdgpu_job_hang_limit, int, 0444);
 
 /**
  * DOC: lbpw (int)
@@ -565,8 +566,8 @@ module_param_named(timeout_period, 
amdgpu_watchdog_timer.period, uint, 0644);
  */
 #ifdef CONFIG_DRM_AMDGPU_SI
 
-#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
-int amdgpu_si_support = 0;
+#if IS_ENABLED(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
+int amdgpu_si_support;
 MODULE_PARM_DESC(si_support, "SI support (1 = enabled, 0 = disabled 
(default))");
 #else
 int amdgpu_si_support = 1;
@@ -584,8 +585,8 @@ module_param_named(si_support, amdgpu_si_support, int, 
0444);
  */
 #ifdef CONFIG_DRM_AMDGPU_CIK
 
-#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
-int amdgpu_cik_support = 0;
+#if IS_ENABLED(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
+int amdgpu_cik_support;
 MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled 
(default))");
 #else
 int amdgpu_cik_support = 1;
@@ -772,9 +773,9 @@ module_param(hws_gws_support, bool, 0444);
 MODULE_PARM_DESC(hws_gws_support, "Assume MEC2 FW supports GWS barriers (false 
= rely on FW version check (Default), true = force supported)");
 
 /**
-  * DOC: queue_preemption_timeout_ms (int)
-  * queue preemption timeout in ms (1 = Minimum, 9000 = default)
-  */
+ * DOC: queue_preemption_timeout_ms (int)
+ * queue preemption timeout in ms (1 = Minimum, 9000 = default)
+ */
 int queue_preemption_timeout_ms = 9000;
 module_param(queue_preemption_timeout_ms, int, 0644);
 MODULE_PARM_DESC(queue_preemption_timeout_ms, "queue preemption timeout in ms 
(1 = Minimum, 9000 = default)");
@@ -799,7 +800,7 @@ MODULE_PARM_DESC(no_system_mem_limit, "disable system 
memory limit (false = defa
  * DOC: no_queue_eviction_on_vm_fault (int)
  * If set, process queues will not be evicted on gpuvm fault. This is to keep 
the wavefront context for debugging (0 = queue eviction, 1 = no queue 
eviction). The default is 0 (queue eviction).
  */
-int amdgpu_no_queue_eviction_on_vm_fault = 0;
+int amdgpu_no_queue_eviction_on_vm_fault;
 MODULE_PARM_DESC(no_queue_eviction_on_vm_fault, "No queue eviction on VM fault 
(0 = queue eviction, 1 = no queue eviction)");
 module_param_named(no_queue_eviction_on_vm_fault, 
amdgpu_no_queue_eviction_on_vm_fault, int, 0444);
 #endif
@@ -1609,7 +1610,7 @@ static const u16 amdgpu_unsupported_pciidlist[] = {
 };
 
 static const struct pci_device_id pciidlist[] = {
-#ifdef  CONFIG_DRM_AMDGPU_SI
+#ifdef CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6784, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6788, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
@@ -2289,7 +2290,6 @@ static void amdgpu_drv_delayed_reset_work_handler(struct 
work_struct *work)
amdgpu_amdkfd_device_init(adev);
amdgpu_ttm_set_buffer_funcs_status(adev, true);
}
-   return;
 }
 
 static int amdgpu_pmops_prepare(struct device *dev)
@@ -2478,6 +2478,7 @@ static int amdgpu_pmops_runtime_suspend(struct device 
*dev)
/* wait for all rings to drain before suspending */
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
+
if (ring && ring->sched.ready) {
ret = amdgpu_fence_wait_empty(ring);
if (ret)
@@ -2600,6 +2601,7 @@ long amdgpu_drm_ioctl(struct file *filp,
struct drm_file *file_priv = filp->private_data;
struct drm_device *dev;
long ret;
+
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
if (ret < 0)
@@ -2664,9 +2666,8 @@ int 

Re: [PATCH v4 11/21] misc: fastrpc: Prepare to dynamic dma-buf locking specification

2022-09-05 Thread Srinivas Kandagatla




On 31/08/2022 16:37, Dmitry Osipenko wrote:

Prepare fastrpc to the common dynamic dma-buf locking convention by
starting to use the unlocked versions of dma-buf API functions.

Signed-off-by: Dmitry Osipenko 
---


LGTM,

In case you plan to take it via another tree.

Acked-by: Srinivas Kandagatla 


--srini

  drivers/misc/fastrpc.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 93ebd174d848..6fcfb2e9f7a7 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -310,8 +310,8 @@ static void fastrpc_free_map(struct kref *ref)
return;
}
}
-   dma_buf_unmap_attachment(map->attach, map->table,
-DMA_BIDIRECTIONAL);
+   dma_buf_unmap_attachment_unlocked(map->attach, map->table,
+ DMA_BIDIRECTIONAL);
dma_buf_detach(map->buf, map->attach);
dma_buf_put(map->buf);
}
@@ -726,7 +726,7 @@ static int fastrpc_map_create(struct fastrpc_user *fl, int 
fd,
goto attach_err;
}
  
-	map->table = dma_buf_map_attachment(map->attach, DMA_BIDIRECTIONAL);

+   map->table = dma_buf_map_attachment_unlocked(map->attach, 
DMA_BIDIRECTIONAL);
if (IS_ERR(map->table)) {
err = PTR_ERR(map->table);
goto map_err;
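
For reference, the _unlocked variants are essentially lock/call/unlock
wrappers around the locked core functions, roughly like this (sketch, see
dma-buf.c for the real implementation):

	struct sg_table *
	dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach,
					enum dma_data_direction direction)
	{
		struct sg_table *sg_table;

		/* take the reservation lock the locked variant expects */
		dma_resv_lock(attach->dmabuf->resv, NULL);
		sg_table = dma_buf_map_attachment(attach, direction);
		dma_resv_unlock(attach->dmabuf->resv);

		return sg_table;
	}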


[PATCH linux-next] drm/amd/display: Remove the unneeded result variable

2022-09-05 Thread cgel . zte
From: zhang songyi 

Return the enable_link_dp() directly instead of storing it in another
redundant variable.

Reported-by: Zeal Robot 
Signed-off-by: zhang songyi 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index f9b798b7933c..4ab27e231337 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -2077,11 +2077,7 @@ static enum dc_status enable_link_edp(
struct dc_state *state,
struct pipe_ctx *pipe_ctx)
 {
-   enum dc_status status;
-
-   status = enable_link_dp(state, pipe_ctx);
-
-   return status;
+   return enable_link_dp(state, pipe_ctx);
 }
 
 static enum dc_status enable_link_dp_mst(
-- 
2.25.1




[PATCH] drm/amdgpu: cleanup coding style in amdgpu_atpx_handler.c

2022-09-05 Thread Jingyu Wang
Fix everything checkpatch.pl complained about in amdgpu_atpx_handler.c

Signed-off-by: Jingyu Wang 
---
 .../gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c  | 27 +++
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
index d6d986be906a..911d6a130ec5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
@@ -74,24 +74,29 @@ struct atpx_mux {
u16 mux;
 } __packed;
 
-bool amdgpu_has_atpx(void) {
+bool amdgpu_has_atpx(void)
+{
return amdgpu_atpx_priv.atpx_detected;
 }
 
-bool amdgpu_has_atpx_dgpu_power_cntl(void) {
+bool amdgpu_has_atpx_dgpu_power_cntl(void)
+{
return amdgpu_atpx_priv.atpx.functions.power_cntl;
 }
 
-bool amdgpu_is_atpx_hybrid(void) {
+bool amdgpu_is_atpx_hybrid(void)
+{
return amdgpu_atpx_priv.atpx.is_hybrid;
 }
 
-bool amdgpu_atpx_dgpu_req_power_for_displays(void) {
+bool amdgpu_atpx_dgpu_req_power_for_displays(void)
+{
return amdgpu_atpx_priv.atpx.dgpu_req_power_for_displays;
 }
 
 #if defined(CONFIG_ACPI)
-void *amdgpu_atpx_get_dhandle(void) {
+void *amdgpu_atpx_get_dhandle(void)
+{
return amdgpu_atpx_priv.dhandle;
 }
 #endif
@@ -134,7 +139,7 @@ static union acpi_object *amdgpu_atpx_call(acpi_handle 
handle, int function,
 
/* Fail only if calling the method fails and ATPX is supported */
if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
-   printk("failed to evaluate ATPX got %s\n",
+   DRM_WARN("failed to evaluate ATPX got %s\n",
   acpi_format_exception(status));
kfree(buffer.pointer);
return NULL;
@@ -190,7 +195,7 @@ static int amdgpu_atpx_validate(struct amdgpu_atpx *atpx)
 
size = *(u16 *) info->buffer.pointer;
if (size < 10) {
-   printk("ATPX buffer is too small: %zu\n", size);
+   DRM_WARN("ATPX buffer is too small: %zu\n", size);
kfree(info);
return -EINVAL;
}
@@ -223,11 +228,11 @@ static int amdgpu_atpx_validate(struct amdgpu_atpx *atpx)
atpx->is_hybrid = false;
if (valid_bits & ATPX_MS_HYBRID_GFX_SUPPORTED) {
if (amdgpu_atpx_priv.quirks & AMDGPU_PX_QUIRK_FORCE_ATPX) {
-   printk("ATPX Hybrid Graphics, forcing to ATPX\n");
+   DRM_WARN("ATPX Hybrid Graphics, forcing to ATPX\n");
atpx->functions.power_cntl = true;
atpx->is_hybrid = false;
} else {
-   printk("ATPX Hybrid Graphics\n");
+   DRM_WARN("ATPX Hybrid Graphics\n");
/*
 * Disable legacy PM methods only when pcie port PM is 
usable,
 * otherwise the device might fail to power off or 
power on.
@@ -269,7 +274,7 @@ static int amdgpu_atpx_verify_interface(struct amdgpu_atpx 
*atpx)
 
size = *(u16 *) info->buffer.pointer;
if (size < 8) {
-   printk("ATPX buffer is too small: %zu\n", size);
+   DRM_WARN("ATPX buffer is too small: %zu\n", size);
err = -EINVAL;
goto out;
}
@@ -278,7 +283,7 @@ static int amdgpu_atpx_verify_interface(struct amdgpu_atpx 
*atpx)
memcpy(&output, info->buffer.pointer, size);
 
/* TODO: check version? */
-   printk("ATPX version %u, functions 0x%08x\n",
+   DRM_WARN("ATPX version %u, functions 0x%08x\n",
   output.version, output.function_bits);
 
amdgpu_atpx_parse_functions(>functions, output.function_bits);

base-commit: e47eb90a0a9ae20b82635b9b99a8d0979b757ad8
-- 
2.34.1



[PATCH] drm/amd/display: fix memory leak when using debugfs_lookup()

2022-09-05 Thread Greg Kroah-Hartman
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  Fix this up by properly
calling dput().
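
The general pattern is (sketch; in crtc_debugfs_init() the dir comes from
a debugfs_lookup() of the "crc" directory earlier in the function):

	struct dentry *dir;

	dir = debugfs_lookup("crc", crtc->debugfs_entry);
	if (dir) {
		/* ... create the crc_win_* files below dir ... */
		dput(dir);	/* drop the reference debugfs_lookup() took */
	}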

Cc: Harry Wentland 
Cc: Leo Li 
Cc: Rodrigo Siqueira 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Wayne Lin 
Cc: hersen wu 
Cc: Wenjing Liu 
Cc: Patrik Jakobsson 
Cc: Thelford Williams 
Cc: Fangzhi Zuo 
Cc: Yongzhi Liu 
Cc: Mikita Lipski 
Cc: Jiapeng Chong 
Cc: Bhanuprakash Modem 
Cc: Sean Paul 
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 0e48824f55e3..ee242d9d8b06 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -3288,6 +3288,7 @@ void crtc_debugfs_init(struct drm_crtc *crtc)
   &crc_win_y_end_fops);
debugfs_create_file_unsafe("crc_win_update", 0644, dir, crtc,
   &crc_win_update_fops);
+   dput(dir);
 #endif
debugfs_create_file("amdgpu_current_bpc", 0644, crtc->debugfs_entry,
crtc, &amdgpu_current_bpc_fops);
-- 
2.37.3



[PATCH] drm/amdgpu: cleanup coding style in amdgpu_sync.c file

2022-09-05 Thread Jingyu Wang
This is a patch to the amdgpu_sync.c file that fixes some warnings found by the 
checkpatch.pl tool

Signed-off-by: Jingyu Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 504af1b93bfa..090e66a1b284 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: MIT
 /*
  * Copyright 2014 Advanced Micro Devices, Inc.
  * All Rights Reserved.
@@ -315,6 +316,7 @@ struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync 
*sync)
struct hlist_node *tmp;
struct dma_fence *f;
int i;
+
hash_for_each_safe(sync->fences, i, tmp, e, node) {
 
f = e->fence;
@@ -392,7 +394,7 @@ void amdgpu_sync_free(struct amdgpu_sync *sync)
 {
struct amdgpu_sync_entry *e;
struct hlist_node *tmp;
-   unsigned i;
+   unsigned int i;
 
hash_for_each_safe(sync->fences, i, tmp, e, node) {
hash_del(&e->node);

base-commit: e47eb90a0a9ae20b82635b9b99a8d0979b757ad8
prerequisite-patch-id: fefd0009b468430bb223fc92e4abe9710518b1ea
-- 
2.34.1



[PATCH linux-next] drm/amdgpu: Remove the unneeded result variable

2022-09-05 Thread cgel . zte
From: zhang songyi 

Return the sdma_v6_0_start() directly instead of storing it in another
redundant variable.

Reported-by: Zeal Robot 
Signed-off-by: zhang songyi 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
index 2bc1407e885e..2cc2d851b4eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
@@ -1373,12 +1373,9 @@ static int sdma_v6_0_sw_fini(void *handle)
 
 static int sdma_v6_0_hw_init(void *handle)
 {
-   int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-   r = sdma_v6_0_start(adev);
-
-   return r;
+   return sdma_v6_0_start(adev);
 }
 
 static int sdma_v6_0_hw_fini(void *handle)
-- 
2.25.1




Re: [PATCH] drm/amd/display: fix memory leak when using debugfs_lookup()

2022-09-05 Thread Greg Kroah-Hartman
On Fri, Sep 02, 2022 at 03:01:05PM +0200, Greg Kroah-Hartman wrote:
> When calling debugfs_lookup() the result must have dput() called on it,
> otherwise the memory will leak over time.  Fix this up by properly
> calling dput().
> 
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Rodrigo Siqueira 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: "Pan, Xinhui" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Wayne Lin 
> Cc: hersen wu 
> Cc: Wenjing Liu 
> Cc: Patrik Jakobsson 
> Cc: Thelford Williams 
> Cc: Fangzhi Zuo 
> Cc: Yongzhi Liu 
> Cc: Mikita Lipski 
> Cc: Jiapeng Chong 
> Cc: Bhanuprakash Modem 
> Cc: Sean Paul 
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org
> Signed-off-by: Greg Kroah-Hartman 
> ---

Despite a zillion cc: items, I forgot to cc: stable on this.  Can the
maintainer add that here, or do you all want me to resend it with that
item added?

thanks,

greg k-h


[PATCH] drm/amdgpu: cleanup coding style in amdgpu_fence.c

2022-09-05 Thread Jingyu Wang
Fix everything checkpatch.pl complained about in amdgpu_fence.c

Signed-off-by: Jingyu Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 8adeb7469f1e..ae9daf653ad3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: MIT
 /*
  * Copyright 2009 Jerome Glisse.
  * All Rights Reserved.
@@ -42,7 +43,6 @@
 #include "amdgpu_reset.h"
 
 /*
- * Fences
  * Fences mark an event in the GPUs pipeline and are used
  * for GPU/CPU synchronization.  When the fence is written,
  * it is expected that all buffers associated with that fence
@@ -139,7 +139,7 @@ static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
  * Returns 0 on success, -ENOMEM on failure.
  */
 int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f, struct 
amdgpu_job *job,
- unsigned flags)
+ unsigned int flags)
 {
struct amdgpu_device *adev = ring->adev;
struct dma_fence *fence;
@@ -173,8 +173,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct 
dma_fence **f, struct amd
   adev->fence_context + ring->idx, seq);
/* Against remove in amdgpu_job_{free, free_cb} */
dma_fence_get(fence);
-   }
-   else
+   } else
dma_fence_init(fence, &amdgpu_fence_ops,
   &ring->fence_drv.lock,
   adev->fence_context + ring->idx, seq);
@@ -393,7 +392,7 @@ signed long amdgpu_fence_wait_polling(struct amdgpu_ring 
*ring,
  * Returns the number of emitted fences on the ring.  Used by the
  * dynpm code to ring track activity.
  */
-unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring)
+unsigned int amdgpu_fence_count_emitted(struct amdgpu_ring *ring)
 {
uint64_t emitted;
 
@@ -422,7 +421,7 @@ unsigned amdgpu_fence_count_emitted(struct amdgpu_ring 
*ring)
  */
 int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
   struct amdgpu_irq_src *irq_src,
-  unsigned irq_type)
+  unsigned int irq_type)
 {
struct amdgpu_device *adev = ring->adev;
uint64_t index;
@@ -594,6 +593,7 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev)
 
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
+
if (!ring || !ring->fence_drv.initialized)
continue;
 
@@ -772,6 +772,7 @@ static int amdgpu_debugfs_fence_info_show(struct seq_file 
*m, void *unused)
 
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = adev->rings[i];
+
if (!ring || !ring->fence_drv.initialized)
continue;
 
@@ -845,6 +846,7 @@ static void amdgpu_debugfs_reset_work(struct work_struct 
*work)
  reset_work);
 
struct amdgpu_reset_context reset_context;
+
memset(&reset_context, 0, sizeof(reset_context));
 
reset_context.method = AMD_RESET_METHOD_NONE;

base-commit: e47eb90a0a9ae20b82635b9b99a8d0979b757ad8
prerequisite-patch-id: f039528bc88876d6e0f64e843da089e85f6d3f58
-- 
2.34.1



[PATCH] drm: Fix the blank screen problem of some 1920x1080 75Hz monitors using R520 graphics card

2022-09-05 Thread zhongpei
We found that with an AMD R520 graphics card and some 1920x1080 monitors,
switching the monitor's refresh rate to 75Hz makes the screen go blank,
and a restart does not recover it. After testing, we found that the
problem can be solved by limiting the maximum value of ref_div_max to 128.
To keep the previous modification compatible with other monitors, we added
a check to the loop that searches for the minimum diff value in the
amdgpu_pll_compute/radeon_compute_pll_avivo functions: if no diff value of
0 is found while the maximum of ref_div_max is limited to 100, the search
continues with the limit raised to 128, and the parameters with the
smallest diff value are taken.
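
To make the search concrete, here is a standalone sketch of the divider
scan these functions perform (the numbers are made up; the real code also
iterates post_div and reduces the ratio first):

	#include <stdio.h>

	/* f_out = f_ref * fb_div / (ref_div * post_div) */
	int main(void)
	{
		const unsigned int ref_freq = 2700;	/* 27 MHz in 10 kHz units */
		const unsigned int target = 17455;	/* ~174.55 MHz pixel clock */
		const unsigned int post_div = 2;
		unsigned int best_diff = ~0u, best_fb = 0, best_ref = 0;
		unsigned int ref_div;

		for (ref_div = 1; ref_div <= 128 / post_div; ++ref_div) {
			unsigned int fb_div = (target * ref_div * post_div +
					       ref_freq / 2) / ref_freq;
			unsigned int clk = ref_freq * fb_div /
					   (ref_div * post_div);
			unsigned int diff = clk > target ? clk - target
							 : target - clk;

			if (diff < best_diff) {
				best_diff = diff;
				best_fb = fb_div;
				best_ref = ref_div;
			}
		}
		printf("fb_div=%u ref_div=%u diff=%u\n",
		       best_fb, best_ref, best_diff);
		return 0;
	}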

Signed-off-by: zhongpei 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c | 17 +
 drivers/gpu/drm/radeon/radeon_display.c | 15 +++
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
index 0bb2466d539a..0c298faa0f94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
@@ -84,12 +84,13 @@ static void amdgpu_pll_reduce_ratio(unsigned *nom, unsigned 
*den,
 static void amdgpu_pll_get_fb_ref_div(struct amdgpu_device *adev, unsigned int 
nom,
  unsigned int den, unsigned int post_div,
  unsigned int fb_div_max, unsigned int 
ref_div_max,
- unsigned int *fb_div, unsigned int 
*ref_div)
+ unsigned int ref_div_limit, unsigned int 
*fb_div,
+ unsigned int *ref_div)
 {
 
/* limit reference * post divider to a maximum */
if (adev->family == AMDGPU_FAMILY_SI)
-   ref_div_max = min(100 / post_div, ref_div_max);
+   ref_div_max = min(ref_div_limit / post_div, ref_div_max);
else
ref_div_max = min(128 / post_div, ref_div_max);
 
@@ -136,6 +137,7 @@ void amdgpu_pll_compute(struct amdgpu_device *adev,
unsigned ref_div_min, ref_div_max, ref_div;
unsigned post_div_best, diff_best;
unsigned nom, den;
+   unsigned ref_div_limit, ref_limit_best;
 
/* determine allowed feedback divider range */
fb_div_min = pll->min_feedback_div;
@@ -204,11 +206,12 @@ void amdgpu_pll_compute(struct amdgpu_device *adev,
else
post_div_best = post_div_max;
diff_best = ~0;
+   ref_div_limit = ref_limit_best = 100;
 
for (post_div = post_div_min; post_div <= post_div_max; ++post_div) {
unsigned diff;
amdgpu_pll_get_fb_ref_div(adev, nom, den, post_div, fb_div_max,
- ref_div_max, &fb_div, &ref_div);
+ ref_div_max, ref_div_limit, &fb_div, 
&ref_div);
diff = abs(target_clock - (pll->reference_freq * fb_div) /
(ref_div * post_div));
 
@@ -217,13 +220,19 @@ void amdgpu_pll_compute(struct amdgpu_device *adev,
 
post_div_best = post_div;
diff_best = diff;
+   ref_limit_best = ref_div_limit;
}
+   if (post_div >= post_div_max && diff_best != 0 && ref_div_limit 
!= 128) {
+   ref_div_limit = 128;
+   post_div = post_div_min - 1;
+   }
+
}
post_div = post_div_best;
 
/* get the feedback and reference divider for the optimal value */
amdgpu_pll_get_fb_ref_div(adev, nom, den, post_div, fb_div_max, 
ref_div_max,
- &fb_div, &ref_div);
+ ref_limit_best, &fb_div, &ref_div);
 
/* reduce the numbers to a simpler ratio once more */
/* this also makes sure that the reference divider is large enough */
diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index f12675e3d261..0fcbf45a68db 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -925,10 +925,10 @@ static void avivo_reduce_ratio(unsigned *nom, unsigned 
*den,
  */
 static void avivo_get_fb_ref_div(unsigned nom, unsigned den, unsigned post_div,
 unsigned fb_div_max, unsigned ref_div_max,
-unsigned *fb_div, unsigned *ref_div)
+unsigned ref_div_limit, unsigned *fb_div, 
unsigned *ref_div)
 {
/* limit reference * post divider to a maximum */
-   ref_div_max = max(min(100 / post_div, ref_div_max), 1u);
+   ref_div_max = max(min(ref_div_limit / post_div, ref_div_max), 1u);
 
/* get matching reference and feedback divider */
*ref_div = min(max(den/post_div, 1u), ref_div_max);
@@ -971,6 +971,7 @@ void radeon_compute_pll_avivo(struct 

Re: [PATCH v4 06/21] drm/i915: Prepare to dynamic dma-buf locking specification

2022-09-05 Thread Dmitry Osipenko
On 02.09.2022 13:31, Dmitry Osipenko wrote:
> On 01.09.2022 17:02, Ruhl, Michael J wrote:
> ...
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> @@ -331,7 +331,19 @@ static void __i915_gem_free_objects(struct
>>> drm_i915_private *i915,
>>> continue;
>>> }
>>>
>>> +   /*
>>> +* dma_buf_unmap_attachment() requires reservation to be
>>> +* locked. The imported GEM shouldn't share reservation lock,
>>> +* so it's safe to take the lock.
>>> +*/
>>> +   if (obj->base.import_attach)
>>> +   i915_gem_object_lock(obj, NULL);
>>
>> There is a lot of stuff going on here.  Taking the lock may be premature...
>>
>>> __i915_gem_object_pages_fini(obj);
>>
>> The i915_gem_dmabuf.c:i915_gem_object_put_pages_dmabuf is where
>> unmap_attachment is actually called, would it make more sense to
>> do the locking there?
> 
> The __i915_gem_object_put_pages() is invoked with a held reservation
> lock, while object freeing is a special time when we know that the GEM is
> unused.
> 
> The __i915_gem_free_objects() was taking the lock two weeks ago until
> the change made by Chris Wilson [1] reached linux-next.
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=2826d447fbd60e6a05e53d5f918bceb8c04e315c
> 
> I don't think we can take the lock within
> i915_gem_object_put_pages_dmabuf(), it may/should deadlock other code paths.

On the other hand, we can check whether the GEM's refcount number is
zero in i915_gem_object_put_pages_dmabuf() and then take the lock if
it's zero.

Also, it seems it should be possible to just bail out from
i915_gem_object_put_pages_dmabuf() if refcount=0. The later
drm_prime_gem_destroy() will take care of the unmapping. Perhaps this could
be the best option; I'll give it a test.
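
Roughly like this (rough sketch, untested; the real function also has to
release the pages themselves):

	static void i915_gem_object_put_pages_dmabuf(struct drm_i915_gem_object *obj,
						     struct sg_table *sgt)
	{
		/* Freed path: drm_prime_gem_destroy() will unmap for us and
		 * we can't take the reservation lock here anyway. */
		if (!kref_read(&obj->base.refcount))
			return;

		dma_buf_unmap_attachment(obj->base.import_attach, sgt,
					 DMA_BIDIRECTIONAL);
	}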


[PATCH] amd: amdgpu: fix coding style issue

2022-09-05 Thread Jingyu Wang
This is a patch to the amdgpu_sync.c file that fixes some warnings found by the 
checkpatch.pl tool

Signed-off-by: Jingyu Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 504af1b93bfa..dfc787b749b2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -1,5 +1,6 @@
-/*
- * Copyright 2014 Advanced Micro Devices, Inc.
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright 2014 Advanced Micro Devices, Inc.
  * All Rights Reserved.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
@@ -315,6 +316,7 @@ struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync 
*sync)
struct hlist_node *tmp;
struct dma_fence *f;
int i;
+
hash_for_each_safe(sync->fences, i, tmp, e, node) {
 
f = e->fence;
@@ -392,7 +394,7 @@ void amdgpu_sync_free(struct amdgpu_sync *sync)
 {
struct amdgpu_sync_entry *e;
struct hlist_node *tmp;
-   unsigned i;
+   unsigned int i;
 
hash_for_each_safe(sync->fences, i, tmp, e, node) {
hash_del(&e->node);
-- 
2.34.1



[PATCH] drm: amd: This is a patch to the amdgpu_drv.c file that fixes some warnings and errors found by the checkpatch.pl tool

2022-09-05 Thread Jingyu Wang
Signed-off-by: Jingyu Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 40 -
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index de7144b06e93..5c2ac8123450 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -140,8 +140,8 @@ uint amdgpu_pcie_lane_cap;
 u64 amdgpu_cg_mask = 0x;
 uint amdgpu_pg_mask = 0x;
 uint amdgpu_sdma_phase_quantum = 32;
-char *amdgpu_disable_cu = NULL;
-char *amdgpu_virtual_display = NULL;
+char *amdgpu_disable_cu;
+char *amdgpu_virtual_display;
 
 /*
  * OverDrive(bit 14) disabled by default
@@ -287,9 +287,9 @@ module_param_named(msi, amdgpu_msi, int, 0444);
  * jobs is 1. The timeout for compute is 6.
  */
 MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: for bare 
metal 1 for non-compute jobs and 6 for compute jobs; "
-   "for passthrough or sriov, 1 for all jobs."
-   " 0: keep default value. negative: infinity timeout), "
-   "format: for bare metal [Non-Compute] or 
[GFX,Compute,SDMA,Video]; "
+   "for passthrough or sriov, 1 for all jobs.
+   0: keep default value. negative: infinity timeout),
+   format: for bare metal [Non-Compute] or 
[GFX,Compute,SDMA,Video]; "
"for passthrough or sriov [all jobs] or 
[GFX,Compute,SDMA,Video].");
 module_param_string(lockup_timeout, amdgpu_lockup_timeout, 
sizeof(amdgpu_lockup_timeout), 0444);
 
@@ -502,7 +502,7 @@ module_param_named(virtual_display, amdgpu_virtual_display, 
charp, 0444);
  * Set how much time allow a job hang and not drop it. The default is 0.
  */
 MODULE_PARM_DESC(job_hang_limit, "how much time allow a job hang and not drop 
it (default 0)");
-module_param_named(job_hang_limit, amdgpu_job_hang_limit, int ,0444);
+module_param_named(job_hang_limit, amdgpu_job_hang_limit, int, 0444);
 
 /**
  * DOC: lbpw (int)
@@ -565,8 +565,8 @@ module_param_named(timeout_period, 
amdgpu_watchdog_timer.period, uint, 0644);
  */
 #ifdef CONFIG_DRM_AMDGPU_SI
 
-#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
-int amdgpu_si_support = 0;
+#if IS_ENABLED(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
+int amdgpu_si_support;
 MODULE_PARM_DESC(si_support, "SI support (1 = enabled, 0 = disabled 
(default))");
 #else
 int amdgpu_si_support = 1;
@@ -584,8 +584,8 @@ module_param_named(si_support, amdgpu_si_support, int, 
0444);
  */
 #ifdef CONFIG_DRM_AMDGPU_CIK
 
-#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
-int amdgpu_cik_support = 0;
+#if IS_ENABLED(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
+int amdgpu_cik_support;
 MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled 
(default))");
 #else
 int amdgpu_cik_support = 1;
@@ -601,8 +601,8 @@ module_param_named(cik_support, amdgpu_cik_support, int, 
0444);
  * E.g. 0x1 = 256Mbyte, 0x2 = 512Mbyte, 0x4 = 1 Gbyte, 0x8 = 2GByte. The 
default is 0 (disabled).
  */
 MODULE_PARM_DESC(smu_memory_pool_size,
-   "reserve gtt for smu debug usage, 0 = disable,"
-   "0x1 = 256Mbyte, 0x2 = 512Mbyte, 0x4 = 1 Gbyte, 0x8 = 2GByte");
+   "reserve gtt for smu debug usage, 0 = disable,
+   0x1 = 256Mbyte, 0x2 = 512Mbyte, 0x4 = 1 Gbyte, 0x8 = 2GByte");
 module_param_named(smu_memory_pool_size, amdgpu_smu_memory_pool_size, uint, 
0444);
 
 /**
@@ -772,9 +772,9 @@ module_param(hws_gws_support, bool, 0444);
 MODULE_PARM_DESC(hws_gws_support, "Assume MEC2 FW supports GWS barriers (false 
= rely on FW version check (Default), true = force supported)");
 
 /**
-  * DOC: queue_preemption_timeout_ms (int)
-  * queue preemption timeout in ms (1 = Minimum, 9000 = default)
-  */
+ * DOC: queue_preemption_timeout_ms (int)
+ * queue preemption timeout in ms (1 = Minimum, 9000 = default)
+ */
 int queue_preemption_timeout_ms = 9000;
 module_param(queue_preemption_timeout_ms, int, 0644);
 MODULE_PARM_DESC(queue_preemption_timeout_ms, "queue preemption timeout in ms 
(1 = Minimum, 9000 = default)");
@@ -799,7 +799,7 @@ MODULE_PARM_DESC(no_system_mem_limit, "disable system 
memory limit (false = defa
  * DOC: no_queue_eviction_on_vm_fault (int)
  * If set, process queues will not be evicted on gpuvm fault. This is to keep 
the wavefront context for debugging (0 = queue eviction, 1 = no queue 
eviction). The default is 0 (queue eviction).
  */
-int amdgpu_no_queue_eviction_on_vm_fault = 0;
+int amdgpu_no_queue_eviction_on_vm_fault;
 MODULE_PARM_DESC(no_queue_eviction_on_vm_fault, "No queue eviction on VM fault 
(0 = queue eviction, 1 = no queue eviction)");
 module_param_named(no_queue_eviction_on_vm_fault, 
amdgpu_no_queue_eviction_on_vm_fault, int, 0444);
 #endif
@@ -1609,7 +1609,7 @@ static const u16 amdgpu_unsupported_pciidlist[] = {
 };
 
 static const struct pci_device_id pciidlist[] = {

[PATCH] drm/amdgpu: cleanup coding style in amdgpu_acpi.c

2022-09-05 Thread Jingyu Wang
Fix everything checkpatch.pl complained about in amdgpu_acpi.c

Signed-off-by: Jingyu Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index 55402d238919..3da27436922c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: MIT
 /*
  * Copyright 2012 Advanced Micro Devices, Inc.
  *
@@ -849,6 +850,7 @@ int amdgpu_acpi_init(struct amdgpu_device *adev)
if (amdgpu_device_has_dc_support(adev)) {
 #if defined(CONFIG_DRM_AMD_DC)
struct amdgpu_display_manager *dm = >dm;
+
if (dm->backlight_dev[0])
atif->bd = dm->backlight_dev[0];
 #endif
@@ -863,6 +865,7 @@ int amdgpu_acpi_init(struct amdgpu_device *adev)
if ((enc->devices & (ATOM_DEVICE_LCD_SUPPORT)) 
&&
enc->enc_priv) {
struct amdgpu_encoder_atom_dig *dig = 
enc->enc_priv;
+
if (dig->bl_dev) {
atif->bd = dig->bl_dev;
break;
@@ -919,9 +922,9 @@ static bool amdgpu_atif_pci_probe_handle(struct pci_dev 
*pdev)
return false;
 
status = acpi_get_handle(dhandle, "ATIF", _handle);
-   if (ACPI_FAILURE(status)) {
+   if (ACPI_FAILURE(status))
return false;
-   }
+
amdgpu_acpi_priv.atif.handle = atif_handle;
acpi_get_name(amdgpu_acpi_priv.atif.handle, ACPI_FULL_PATHNAME, 
&buffer);
DRM_DEBUG_DRIVER("Found ATIF handle %s\n", acpi_method_name);
@@ -954,9 +957,9 @@ static bool amdgpu_atcs_pci_probe_handle(struct pci_dev 
*pdev)
return false;
 
status = acpi_get_handle(dhandle, "ATCS", _handle);
-   if (ACPI_FAILURE(status)) {
+   if (ACPI_FAILURE(status))
return false;
-   }
+
amdgpu_acpi_priv.atcs.handle = atcs_handle;
acpi_get_name(amdgpu_acpi_priv.atcs.handle, ACPI_FULL_PATHNAME, 
&buffer);
DRM_DEBUG_DRIVER("Found ATCS handle %s\n", acpi_method_name);

base-commit: e47eb90a0a9ae20b82635b9b99a8d0979b757ad8
-- 
2.34.1



Re: [PATCH v2 3/5] Makefile.compiler: replace cc-ifversion with compiler-specific macros

2022-09-05 Thread Masahiro Yamada
On Thu, Sep 1, 2022 at 3:44 AM Nick Desaulniers  wrote:
>
> cc-ifversion is GCC specific. Replace it with compiler specific
> variants. Update the users of cc-ifversion to use these new macros.
> Provide a helper for checking compiler versions for GCC and Clang
> simultaneously, that will be used in a follow up patch.
>
> Cc: Jonathan Corbet 
> Cc: linux-...@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org
> Link: https://github.com/ClangBuiltLinux/linux/issues/350
> Link: 
> https://lore.kernel.org/llvm/CAGG=3QWSAUakO42kubrCap8fp-gm1ERJJAYXTnP1iHk_wrH=b...@mail.gmail.com/
> Suggested-by: Bill Wendling 
> Signed-off-by: Nick Desaulniers 
> ---
> Changes v1 -> v2:
> * New patch.
>
>  Documentation/kbuild/makefiles.rst  | 44 +++--
>  Makefile|  4 +-
>  drivers/gpu/drm/amd/display/dc/dml/Makefile | 12 ++
>  scripts/Makefile.compiler   | 15 +--
>  4 files changed, 49 insertions(+), 26 deletions(-)
>
> diff --git a/Documentation/kbuild/makefiles.rst 
> b/Documentation/kbuild/makefiles.rst
> index 11a296e52d68..e46f5b45c422 100644
> --- a/Documentation/kbuild/makefiles.rst
> +++ b/Documentation/kbuild/makefiles.rst
> @@ -682,22 +682,42 @@ more details, with real examples.
> In the above example, -Wno-unused-but-set-variable will be added to
> KBUILD_CFLAGS only if gcc really accepts it.
>
> -cc-ifversion
> -   cc-ifversion tests the version of $(CC) and equals the fourth 
> parameter
> -   if version expression is true, or the fifth (if given) if the version
> -   expression is false.
> +gcc-min-version
> +   gcc-min-version tests if the value of $(CONFIG_GCC_VERSION) is 
> greater than
> +   or equal to the provided value and evaluates to y if so.
>
> Example::
>
> -   #fs/reiserfs/Makefile
> -   ccflags-y := $(call cc-ifversion, -lt, 0402, -O1)
> +   cflags-$(call gcc-min-version, 70100) := -foo
>
> -   In this example, ccflags-y will be assigned the value -O1 if the
> -   $(CC) version is less than 4.2.
> -   cc-ifversion takes all the shell operators:
> -   -eq, -ne, -lt, -le, -gt, and -ge
> -   The third parameter may be a text as in this example, but it may also
> -   be an expanded variable or a macro.
> +   In this example, cflags-y will be assigned the value -foo if $(CC) is 
> gcc and
> +   $(CONFIG_GCC_VERSION) is >= 7.1.
> +
> +clang-min-version
> +   clang-min-version tests if the value of $(CONFIG_CLANG_VERSION) is 
> greater
> +   than or equal to the provided value and evaluates to y if so.
> +
> +   Example::
> +
> +   cflags-$(call clang-min-version, 11) := -foo
> +
> +   In this example, cflags-y will be assigned the value -foo if $(CC) is 
> clang
> +   and $(CONFIG_CLANG_VERSION) is >= 11.0.0.
> +
> +cc-min-version
> +   cc-min-version tests if the value of $(CONFIG_GCC_VERSION) is greater
> +   than or equal to the first value provided, or if the value of
> +   $(CONFIG_CLANG_VERSION) is greater than or equal to the second value
> +   provided, and evaluates
> +   to y if so.
> +
> +   Example::
> +
> +   cflags-$(call cc-min-version, 70100, 11) := -foo
> +
> +   In this example, cflags-y will be assigned the value -foo if $(CC) is 
> gcc and
> +   $(CONFIG_GCC_VERSION) is >= 7.1, or if $(CC) is clang and
> +   $(CONFIG_CLANG_VERSION) is >= 11.0.0.
>
>  cc-cross-prefix
> cc-cross-prefix is used to check if there exists a $(CC) in path with
> diff --git a/Makefile b/Makefile
> index 952d354069a4..caa39ecb1136 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -972,7 +972,7 @@ ifdef CONFIG_CC_IS_GCC
>  KBUILD_CFLAGS += -Wno-maybe-uninitialized
>  endif
>
> -ifdef CONFIG_CC_IS_GCC
> +ifeq ($(call gcc-min-version, 90100),y)
>  # The allocators already balk at large sizes, so silence the compiler
>  # warnings for bounds checks involving those possible values. While
>  # -Wno-alloc-size-larger-than would normally be used here, earlier versions
> @@ -984,7 +984,7 @@ ifdef CONFIG_CC_IS_GCC
>  # ignored, continuing to default to PTRDIFF_MAX. So, left with no other
>  # choice, we must perform a versioned check to disable this warning.
>  # https://lore.kernel.org/lkml/20210824115859.187f2...@canb.auug.org.au
> -KBUILD_CFLAGS += $(call cc-ifversion, -ge, 0901, -Wno-alloc-size-larger-than)
> +KBUILD_CFLAGS += -Wno-alloc-size-larger-than
>  endif
>
>  # disable invalid "can't wrap" optimizations for signed / pointers
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> index 86a3b5bfd699..d8ee4743b2e3 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> @@ -33,20 +33,14 @@ ifdef CONFIG_PPC64
>  dml_ccflags := -mhard-float -maltivec
>  endif
>
> 

RE: [PATCH V2] drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled

2022-09-05 Thread Chai, Thomas
[AMD Official Use Only - General]

Ping


-
Best Regards,
Thomas

-Original Message-
From: Chai, Thomas  
Sent: Thursday, September 1, 2022 4:40 PM
To: amd-gfx@lists.freedesktop.org
Cc: Chai, Thomas ; Zhang, Hawking ; 
Zhou1, Tao ; Chai, Thomas 
Subject: [PATCH V2] drm/amdgpu: TA unload messages are not actually sent to psp 
when amdgpu is uninstalled

V1:
  The psp_cmd_submit_buf function is called by psp_hw_fini to send TA unload 
messages to psp to terminate ras, asd and tmr. But when amdgpu is uninstalled, 
drm_dev_unplug is called earlier than psp_hw_fini in amdgpu_pci_remove; the 
calling order is as follows:
static void amdgpu_pci_remove(struct pci_dev *pdev) {
drm_dev_unplug
..
amdgpu_driver_unload_kms->amdgpu_device_fini_hw->...
->.hw_fini->psp_hw_fini->...
->psp_ta_unload->psp_cmd_submit_buf
..
}
psp_cmd_submit_buf will then return early because its drm_dev_enter call fails.
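
The guard in question looks roughly like this (simplified sketch of the
relevant part of psp_cmd_submit_buf):

	int idx;

	if (!drm_dev_enter(adev_to_drm(psp->adev), &idx))
		return 0;	/* unplugged: the TA unload is silently dropped */

	/* ... build and ring the PSP command buffer ... */

	drm_dev_exit(idx);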

So the call to drm_dev_enter in psp_cmd_submit_buf should be removed, so that 
the TA unload messages can be sent to the psp when amdgpu is uninstalled.

V2:
1. Restore psp_cmd_submit_buf to its original code.
2. Move drm_dev_unplug call after amdgpu_driver_unload_kms in
   amdgpu_pci_remove.
3. Since amdgpu_device_fini_hw is called by amdgpu_driver_unload_kms,
   remove the unplug check to release device mmio resource in
   amdgpu_device_fini_hw before calling drm_dev_unplug.

Signed-off-by: YiPeng Chai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index afaa1056e039..62b26f0e37b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3969,8 +3969,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 
amdgpu_gart_dummy_page_fini(adev);
 
-   if (drm_dev_is_unplugged(adev_to_drm(adev)))
-   amdgpu_device_unmap_mmio(adev);
+   amdgpu_device_unmap_mmio(adev);
 
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index de7144b06e93..728a0933ea6f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2181,8 +2181,6 @@ amdgpu_pci_remove(struct pci_dev *pdev)
struct drm_device *dev = pci_get_drvdata(pdev);
struct amdgpu_device *adev = drm_to_adev(dev);
 
-   drm_dev_unplug(dev);
-
if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) {
pm_runtime_get_sync(dev->dev);
pm_runtime_forbid(dev->dev);
@@ -2190,6 +2188,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 
amdgpu_driver_unload_kms(dev);
 
+   drm_dev_unplug(dev);
+
/*
 * Flush any in flight DMA operations from device.
 * Clear the Bus Master Enable bit and then wait on the PCIe Device
--
2.25.1


Re: [PATCH 10/11] drm/amdgpu: add gang submit frontend v4

2022-09-05 Thread Christian König
When we have it fully tested and ready I'm going to increase the DRM 
minor number to indicate support for it.
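
Until then user space can only key off the amdgpu DRM version once that
bump lands, e.g. (sketch; the minor number 49 is an assumption here, use
whatever the final bump ends up being):

	#include <xf86drm.h>

	static int supports_gang_submit(int fd)
	{
		drmVersionPtr v = drmGetVersion(fd);
		int ok;

		if (!v)
			return 0;
		/* the amdgpu DRM interface is major version 3 */
		ok = v->version_major > 3 ||
		     (v->version_major == 3 && v->version_minor >= 49);
		drmFreeVersion(v);
		return ok;
	}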


Regards,
Christian.

On 05.09.22 at 04:30, Huang, Trigger wrote:

[AMD Official Use Only - General]

Before we finally add the gang submission frontend, is there any interface/flag 
for the user mode driver to detect whether gang submission is supported by the kernel?

Regards,
Trigger

-Original Message-
From: amd-gfx  On Behalf Of Christian König
Sent: August 29, 2022 21:18
To: Dong, Ruijing ; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian 
Subject: [PATCH 10/11] drm/amdgpu: add gang submit frontend v4

Allows submitting jobs as a gang which needs to run on multiple engines at the 
same time.

All members of the gang get the same implicit, explicit and VM dependencies. So 
no gang member will start running until everything else is ready.

The last job is considered the gang leader (usually a submission to the GFX
ring) and used for signaling output dependencies.

Each job is remembered individually as user of a buffer object, so there is no 
joining of work at the end.

v2: rebase and fix review comments from Andrey and Yogesh
v3: use READ instead of BOOKKEEP for now because of VM unmaps, set gang
 leader only when necessary
v4: fix order of pushing jobs and adding fences found by Trigger.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 259 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.h|  10 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h |  12 +-
  3 files changed, 184 insertions(+), 97 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 9821299dfb49..a6e50ad5e306 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -69,6 +69,7 @@ static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
unsigned int *num_ibs)
  {
 struct drm_sched_entity *entity;
+   unsigned int i;
 int r;

 	r = amdgpu_ctx_get_entity(p->ctx, chunk_ib->ip_type,
@@ -77,17 +78,28 @@ static int amdgpu_cs_p1_ib(struct amdgpu_cs_parser *p,
 if (r)
 return r;

-   /* Abort if there is no run queue associated with this entity.
-* Possibly because of disabled HW IP*/
+   /*
+* Abort if there is no run queue associated with this entity.
+* Possibly because of disabled HW IP.
+*/
 if (entity->rq == NULL)
 return -EINVAL;

-   /* Currently we don't support submitting to multiple entities */
-   if (p->entity && p->entity != entity)
+   /* Check if we can add this IB to some existing job */
+   for (i = 0; i < p->gang_size; ++i) {
+   if (p->entities[i] == entity)
+   goto found;
+   }
+
+   /* If not increase the gang size if possible */
+   if (i == AMDGPU_CS_GANG_SIZE)
 return -EINVAL;

-   p->entity = entity;
-   ++(*num_ibs);
+   p->entities[i] = entity;
+   p->gang_size = i + 1;
+
+found:
+   ++(num_ibs[i]);
 return 0;
  }

@@ -161,11 +173,12 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
union drm_amdgpu_cs *cs)
  {
 struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
+   unsigned int num_ibs[AMDGPU_CS_GANG_SIZE] = { };
 	struct amdgpu_vm *vm = &fpriv->vm;
 uint64_t *chunk_array_user;
 uint64_t *chunk_array;
-   unsigned size, num_ibs = 0;
 uint32_t uf_offset = 0;
+   unsigned int size;
 int ret;
 int i;

@@ -231,7 +244,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
 if (size < sizeof(struct drm_amdgpu_cs_chunk_ib))
 goto free_partial_kdata;

-   ret = amdgpu_cs_p1_ib(p, p->chunks[i].kdata, &num_ibs);
+   ret = amdgpu_cs_p1_ib(p, p->chunks[i].kdata, num_ibs);
 if (ret)
 goto free_partial_kdata;
 break;
@@ -268,21 +281,28 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
 }
 }

-   ret = amdgpu_job_alloc(p->adev, num_ibs, &p->job, vm);
-   if (ret)
-   goto free_all_kdata;
+   if (!p->gang_size)
+   return -EINVAL;

-   ret = drm_sched_job_init(&p->job->base, p->entity, &fpriv->vm);
-   if (ret)
-   goto free_all_kdata;
+   for (i = 0; i < p->gang_size; ++i) {
+   ret = amdgpu_job_alloc(p->adev, num_ibs[i], &p->jobs[i], vm);
+   if (ret)
+   goto free_all_kdata;
+
+   ret = drm_sched_job_init(&p->jobs[i]->base, p->entities[i],
+&fpriv->vm);
+   if (ret)
+   goto free_all_kdata;
+   }
+   p->gang_leader = p->jobs[p->gang_size - 1];

-   if 

Re: [PATCH] drm/amdgpu: cleanup coding style in amdgpu_fence.c

2022-09-05 Thread Christian König

On 04.09.22 at 21:31, Jingyu Wang wrote:

Fix everything checkpatch.pl complained about in amdgpu_fence.c

Signed-off-by: Jingyu Wang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 8adeb7469f1e..ae9daf653ad3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: MIT
  /*
   * Copyright 2009 Jerome Glisse.
   * All Rights Reserved.
@@ -42,7 +43,6 @@
  #include "amdgpu_reset.h"
  
  /*

- * Fences
   * Fences mark an event in the GPUs pipeline and are used
   * for GPU/CPU synchronization.  When the fence is written,
   * it is expected that all buffers associated with that fence
@@ -139,7 +139,7 @@ static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
   * Returns 0 on success, -ENOMEM on failure.
   */
  int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f, struct 
amdgpu_job *job,
- unsigned flags)
+ unsigned int flags)
  {
struct amdgpu_device *adev = ring->adev;
struct dma_fence *fence;
@@ -173,8 +173,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct 
dma_fence **f, struct amd
   adev->fence_context + ring->idx, seq);
/* Against remove in amdgpu_job_{free, free_cb} */
dma_fence_get(fence);
-   }
-   else
+   } else


That will still be complained about. This should be "} else {".
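
I.e. once one branch needs braces, both branches get them:

	} else {
		dma_fence_init(fence, &amdgpu_fence_ops,
			       &ring->fence_drv.lock,
			       adev->fence_context + ring->idx, seq);
	}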

Apart from this nitpick your patches look good to me now.

Thanks,
Christian.


dma_fence_init(fence, &amdgpu_fence_ops,
   &ring->fence_drv.lock,
   adev->fence_context + ring->idx, seq);
@@ -393,7 +392,7 @@ signed long amdgpu_fence_wait_polling(struct amdgpu_ring 
*ring,
   * Returns the number of emitted fences on the ring.  Used by the
   * dynpm code to ring track activity.
   */
-unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring)
+unsigned int amdgpu_fence_count_emitted(struct amdgpu_ring *ring)
  {
uint64_t emitted;
  
@@ -422,7 +421,7 @@ unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring)

   */
  int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
   struct amdgpu_irq_src *irq_src,
-  unsigned irq_type)
+  unsigned int irq_type)
  {
struct amdgpu_device *adev = ring->adev;
uint64_t index;
@@ -594,6 +593,7 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev)
  
  	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {

struct amdgpu_ring *ring = adev->rings[i];
+
if (!ring || !ring->fence_drv.initialized)
continue;
  
@@ -772,6 +772,7 @@ static int amdgpu_debugfs_fence_info_show(struct seq_file *m, void *unused)
  
  	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {

struct amdgpu_ring *ring = adev->rings[i];
+
if (!ring || !ring->fence_drv.initialized)
continue;
  
@@ -845,6 +846,7 @@ static void amdgpu_debugfs_reset_work(struct work_struct *work)

  reset_work);
  
  	struct amdgpu_reset_context reset_context;

+
memset(&reset_context, 0, sizeof(reset_context));
  
  	reset_context.method = AMD_RESET_METHOD_NONE;


base-commit: e47eb90a0a9ae20b82635b9b99a8d0979b757ad8
prerequisite-patch-id: f039528bc88876d6e0f64e843da089e85f6d3f58




Re: [PATCH] drm: Fix the blank screen problem of some 1920x1080 75Hz monitors using R520 graphics card

2022-09-05 Thread Christian König

On 05.09.22 at 05:23, zhongpei wrote:

We found that with an AMD R520 graphics card and some 1920x1080 monitors,
switching the monitor's refresh rate to 75Hz makes the screen go blank,
and a restart does not recover it. After testing, we found that the
problem can be solved by limiting the maximum value of ref_div_max to 128.
To keep the previous modification compatible with other monitors, we added
a check to the loop that searches for the minimum diff value in the
amdgpu_pll_compute/radeon_compute_pll_avivo functions: if no diff value of
0 is found while the maximum of ref_div_max is limited to 100, the search
continues with the limit raised to 128, and the parameters with the
smallest diff value are taken.


Well that's at least better than what I've seen in previous tries to fix 
this.


But as far as I can see this will certainly break some other monitors, 
so that is pretty much a NAK.


Regards,
Christian.



Signed-off-by: zhongpei 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c | 17 +
  drivers/gpu/drm/radeon/radeon_display.c | 15 +++
  2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
index 0bb2466d539a..0c298faa0f94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
@@ -84,12 +84,13 @@ static void amdgpu_pll_reduce_ratio(unsigned *nom, unsigned 
*den,
  static void amdgpu_pll_get_fb_ref_div(struct amdgpu_device *adev, unsigned 
int nom,
  unsigned int den, unsigned int post_div,
  unsigned int fb_div_max, unsigned int 
ref_div_max,
- unsigned int *fb_div, unsigned int 
*ref_div)
+ unsigned int ref_div_limit, unsigned int 
*fb_div,
+ unsigned int *ref_div)
  {
  
  	/* limit reference * post divider to a maximum */

if (adev->family == AMDGPU_FAMILY_SI)
-   ref_div_max = min(100 / post_div, ref_div_max);
+   ref_div_max = min(ref_div_limit / post_div, ref_div_max);
else
ref_div_max = min(128 / post_div, ref_div_max);
  
@@ -136,6 +137,7 @@ void amdgpu_pll_compute(struct amdgpu_device *adev,

unsigned ref_div_min, ref_div_max, ref_div;
unsigned post_div_best, diff_best;
unsigned nom, den;
+   unsigned ref_div_limit, ref_limit_best;
  
  	/* determine allowed feedback divider range */

fb_div_min = pll->min_feedback_div;
@@ -204,11 +206,12 @@ void amdgpu_pll_compute(struct amdgpu_device *adev,
else
post_div_best = post_div_max;
diff_best = ~0;
+   ref_div_limit = ref_limit_best = 100;
  
  	for (post_div = post_div_min; post_div <= post_div_max; ++post_div) {

unsigned diff;
amdgpu_pll_get_fb_ref_div(adev, nom, den, post_div, fb_div_max,
- ref_div_max, &fb_div, &ref_div);
+ ref_div_max, ref_div_limit, &fb_div, 
&ref_div);
diff = abs(target_clock - (pll->reference_freq * fb_div) /
(ref_div * post_div));
  
@@ -217,13 +220,19 @@ void amdgpu_pll_compute(struct amdgpu_device *adev,
  
  			post_div_best = post_div;

diff_best = diff;
+   ref_limit_best = ref_div_limit;
}
+   if (post_div >= post_div_max && diff_best != 0 && ref_div_limit 
!= 128) {
+   ref_div_limit = 128;
+   post_div = post_div_min - 1;
+   }
+
}
post_div = post_div_best;
  
  	/* get the feedback and reference divider for the optimal value */

amdgpu_pll_get_fb_ref_div(adev, nom, den, post_div, fb_div_max, 
ref_div_max,
- &fb_div, &ref_div);
+ ref_limit_best, &fb_div, &ref_div);
  
  	/* reduce the numbers to a simpler ratio once more */

/* this also makes sure that the reference divider is large enough */
diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index f12675e3d261..0fcbf45a68db 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -925,10 +925,10 @@ static void avivo_reduce_ratio(unsigned *nom, unsigned 
*den,
   */
  static void avivo_get_fb_ref_div(unsigned nom, unsigned den, unsigned 
post_div,
 unsigned fb_div_max, unsigned ref_div_max,
-unsigned *fb_div, unsigned *ref_div)
+unsigned ref_div_limit, unsigned *fb_div, 
unsigned *ref_div)
  {
/* limit reference * post divider to a maximum */
-   ref_div_max = max(min(100 / post_div, ref_div_max), 1u);
+   ref_div_max =