RE: [PATCH v2] drm/amd/amdgpu implement tdr advanced mode

2021-03-08 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only] Christian, what is feasible and practical now is: 1) we implement the advanced TDR mode in upstream first (so we can copy the same scheme in our LTS kernel) -- if you want, we can avoid changing the drm/scheduler code, but that one is already

RE: [PATCH v2] drm/amd/amdgpu implement tdr advanced mode

2021-03-08 Thread Zhang, Jack (Jian)
[AMD Official Use Only - Internal Distribution Only] Hi, Christian, Since this change is a bit critical to our project, we would be grateful if we could get your review. Is there anything that is not clear enough that I could help explain? Again, thanks for your huge help with our problem.

RE: [PATCH 4/7] drm/amdgpu: track what pmops flow we are in

2021-03-08 Thread Lazar, Lijo
[AMD Public Use] This seems a duplicate of dev_pm_info states. Can't we reuse that? Thanks, Lijo -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Tuesday, March 9, 2021 9:40 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH 4/7] drm/amdgpu:

Re: [PATCH] drm/amdgpu: fix the hibernation suspend with s0ix

2021-03-08 Thread Huang Rui
On Tue, Mar 09, 2021 at 12:45:44PM +0800, Liang, Prike wrote: > > > > -Original Message- > > From: Alex Deucher > > Sent: Tuesday, March 9, 2021 12:07 PM > > To: Liang, Prike > > Cc: amd-gfx list ; Deucher, Alexander > > ; Huang, Ray > > Subject: Re: [PATCH] drm/amdgpu: fix the

RE: [PATCH] drm/amdgpu: remove ECO_BITS programing on gmc9

2021-03-08 Thread Xu, Feifei
[AMD Public Use] Thanks Anna. Result is good on SRIOV guest driver as well. Will push with Reviewed-by: Hawking Zhang Tested-by Anna Jin < anna@amd.com> Thanks, Feifei -Original Message- From: Zhang, Hawking Sent: March 5, 2021 8:51 PM To: Xu, Feifei ; amd-gfx@lists.freedesktop.org

RE: [PATCH] drm/amdgpu: fix the hibernation suspend with s0ix

2021-03-08 Thread Liang, Prike
> -Original Message- > From: Alex Deucher > Sent: Tuesday, March 9, 2021 12:07 PM > To: Liang, Prike > Cc: amd-gfx list ; Deucher, Alexander > ; Huang, Ray > Subject: Re: [PATCH] drm/amdgpu: fix the hibernation suspend with s0ix > > On Mon, Mar 8, 2021 at 10:52 PM Prike Liang

[PATCH 4/7] drm/amdgpu: track what pmops flow we are in

2021-03-08 Thread Alex Deucher
We reuse the same suspend and resume functions for all of the pmops states, so flag what state we are in so that we can alter behavior deeper in the driver depending on the current flow. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 20 +++-

[PATCH 3/7] drm/amdgpu: disentangle HG systems from vgaswitcheroo

2021-03-08 Thread Alex Deucher
There's no need to keep vgaswitcheroo around for HG systems. They don't use muxes and their power control is handled via ACPI. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 38 +-

[PATCH 6/7] drm/amdgpu: clean up S0ix logic

2021-03-08 Thread Alex Deucher
We only need special handling for the S0ix suspend and resume cases; legacy S3/S4/shutdown/reboot/reset should use the standard code paths. This should fix systems with S0ix plus legacy S4. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 6 --

[PATCH 5/7] drm/amdgpu: don't evict vram on APUs for suspend to ram

2021-03-08 Thread Alex Deucher
Vram is system memory, so no need to evict. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index

[PATCH 7/7] drm/amdgpu: clean up non-DC suspend/resume handling

2021-03-08 Thread Alex Deucher
Move the non-DC specific code into the DCE IP blocks similar to how we handle DC. This cleans up the common suspend and resume paths. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 82 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 88

[PATCH 2/7] drm/amdgpu: enable DPM_FLAG_MAY_SKIP_RESUME and DPM_FLAG_SMART_SUSPEND flags (v2)

2021-03-08 Thread Alex Deucher
Once the device has runtime suspended, we don't need to power it back up again for system suspend. Likewise for resume, we don't need to power up the device again on resume only to power it back off again via runtime pm because it's still idle. v2: add DPM_FLAG_SMART_PREPARE as well Acked-by:

[PATCH 1/7] drm/amdgpu: add a dev_pm_ops prepare callback (v2)

2021-03-08 Thread Alex Deucher
as per: https://www.kernel.org/doc/html/latest/driver-api/pm/devices.html The prepare callback is required to support the DPM_FLAG_SMART_SUSPEND driver flag. This allows runtime pm to auto complete when the system goes into suspend avoiding a wake up on suspend and on resume. Apply this for

Re: [PATCH] drm/amdgpu: fix the hibernation suspend with s0ix

2021-03-08 Thread Alex Deucher
On Mon, Mar 8, 2021 at 10:52 PM Prike Liang wrote: > > During system hibernation suspend still need un-gate gfx CG/PG firstly to > handle HW > status check before HW resource destroy. > > Signed-off-by: Prike Liang This is fine for stable, but we should work on cleaning this up. I have a

[PATCH] drm/amdgpu: fix the hibernation suspend with s0ix

2021-03-08 Thread Prike Liang
During system hibernation suspend, we still need to un-gate gfx CG/PG first to handle the HW status check before HW resources are destroyed. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git

Re: [PATCH v2] drm/amdgpu: Verify bo size can fit framebuffer size on init.

2021-03-08 Thread Alex Deucher
On Mon, Mar 8, 2021 at 4:36 PM Mark Yacoub wrote: > > From: Mark Yacoub > > To initialize the framebuffer, call drm_gem_fb_init_with_funcs which > verifies that the BO size can fit the FB size by calculating the minimum > expected size of each plane. > > The bug was caught using igt-gpu-tools

Re: [PATCH 5/5] drm/amdgpu: use metadata members of struct amdgpu_bo_user

2021-03-08 Thread Felix Kuehling
On 2021-03-05 at 10:06 a.m., Nirmoy Das wrote: > These members are only needed for BOs created by > amdgpu_gem_object_create(), so we can remove these from the > base class. > > CC: felix.kuehl...@amd.com > Signed-off-by: Nirmoy Das Acked-by: Felix Kuehling > --- >

[PATCH 1/1] drm/amdkfd: fix build error with AMD_IOMMU_V2=m

2021-03-08 Thread Felix Kuehling
Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link against the exported functions. If the GPU driver is built-in but the IOMMU driver is a loadable module, the kfd_iommu.c file is indeed built but does not work: x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in

Re: [PATCH] [variant b] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Felix Kuehling
On 2021-03-08 at 3:45 p.m., Arnd Bergmann wrote: > From: Arnd Bergmann > > Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link > against the exported functions. If the GPU driver is built-in but the > IOMMU driver is a loadable module, the kfd_iommu.c file is indeed > built

[PATCH] drm/amdgpu: update secure display TA header

2021-03-08 Thread Jinzhou Su
update secure display TA header file. Signed-off-by: Jinzhou Su --- drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 3 +++ drivers/gpu/drm/amd/amdgpu/ta_secureDisplay_if.h | 1 + 2 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c

RE: [PATCH] drm/amdgpu: capture invalid hardware access v2

2021-03-08 Thread Li, Dennis
[AMD Official Use Only - Internal Distribution Only] Hi, Christian, amdgpu_device_skip_hw_access will always assert in the reset thread, which does not seem like a good idea. Best Regards Dennis Li -Original Message- From: Christian König Sent: Tuesday, March 9, 2021 2:07 AM To:

2021 X.Org Foundation Membership renewal ENDS on THURSDAY Mar 11

2021-03-08 Thread Harry Wentland
The nomination period for the 2021 X.Org Foundation Board of Directors Election closed yesterday and the election is rapidly approaching. We currently only see membership renewals for 59 people. If you have not renewed your membership please do so by Thursday, Mar 11 at https://members.x.org.

Re: [PATCH v3] drm/amd/amdgpu implement tdr advanced mode

2021-03-08 Thread Andrey Grodzovsky
On 2021-03-08 7:33 a.m., Jack Zhang wrote: [Why] Previous tdr design treats the first job in job_timeout as the bad job. But sometimes a later bad compute job can block a good gfx job and cause an unexpected gfx job timeout because gfx and compute ring share internal GC HW mutually. [How]

Re: [PATCH] drm/amdgpu: Remove unnecessary conversion to bool

2021-03-08 Thread Alex Deucher
On Sun, Mar 7, 2021 at 10:14 PM Jiapeng Chong wrote: > > Fix the following coccicheck warnings: > > ./drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:1600:40-45: WARNING: conversion > to bool not needed here. > > ./drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:1598:40-45: WARNING: conversion > to bool not needed

Re: [PATCH] drm/amd/display: Remove unnecessary conversion to bool

2021-03-08 Thread Alex Deucher
On Sun, Mar 7, 2021 at 10:00 PM Jiapeng Chong wrote: > > Fix the following coccicheck warnings: > > ./drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c:561:34-39: WARNING: > conversion to bool not needed here. > > Reported-by: Abaci Robot > Signed-off-by: Jiapeng Chong This patch was already

[PATCH v2] drm/amdgpu: Verify bo size can fit framebuffer size on init.

2021-03-08 Thread Mark Yacoub
From: Mark Yacoub To initialize the framebuffer, call drm_gem_fb_init_with_funcs which verifies that the BO size can fit the FB size by calculating the minimum expected size of each plane. The bug was caught using igt-gpu-tools test: kms_addfb_basic.too-high and kms_addfb_basic.bo-too-small
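The size check described in this patch can be sketched in plain C. This is a user-space illustration only, not the code from the patch or from drm_gem_fb_init_with_funcs: struct demo_plane, its fields, and the demo_* functions are hypothetical names, and the formula (every row but the last at full pitch, plus one row of pixels, plus the plane offset) is an assumption about what "minimum expected size of each plane" means.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one framebuffer plane (not a DRM type). */
struct demo_plane {
	uint32_t width;   /* visible width in pixels */
	uint32_t height;  /* visible height in rows */
	uint32_t cpp;     /* bytes per pixel */
	uint32_t pitch;   /* bytes between the starts of two rows */
	uint32_t offset;  /* byte offset of the plane inside the BO */
};

/*
 * Assumed rule: the last row only needs width * cpp bytes, every earlier
 * row needs a full pitch, and the plane starts at 'offset'.
 */
static uint64_t demo_plane_min_size(const struct demo_plane *p)
{
	if (p->height == 0)
		return p->offset;
	return (uint64_t)(p->height - 1) * p->pitch +
	       (uint64_t)p->width * p->cpp + p->offset;
}

/* Returns true when a BO of bo_size bytes can back the plane. */
static bool demo_bo_fits_plane(uint64_t bo_size, const struct demo_plane *p)
{
	return bo_size >= demo_plane_min_size(p);
}
```

With a check of this shape, a request whose buffer object is even one byte smaller than the computed minimum is rejected at framebuffer creation, which is the kind of case the bo-too-small test named above appears to exercise.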

Re: [PATCH] drm/amd/display: remove duplicate include in dcn21 and gpio

2021-03-08 Thread Alex Deucher
Applied. Thanks! Alex On Sat, Mar 6, 2021 at 6:05 AM wrote: > > From: Zhang Yunkai > > 'dce110_resource.h' included in 'dcn21_resource.c' is duplicated. > 'hw_gpio.h' included in 'hw_factory_dce110.c' is duplicated. > > Signed-off-by: Zhang Yunkai > --- >

Re: [PATCH] drm/amd/display: remove duplicate include in amdgpu_dm.c

2021-03-08 Thread Alex Deucher
Applied. Thanks! Alex On Sat, Mar 6, 2021 at 5:48 AM wrote: > > From: Zhang Yunkai > > 'drm/drm_hdcp.h' included in 'amdgpu_dm.c' is duplicated. > It is also included in the 79th line. > > Signed-off-by: Zhang Yunkai > --- > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 - > 1 file

Re: [PATCH] drm/amdgpu: Verify bo size can fit framebuffer size on init.

2021-03-08 Thread Alex Deucher
On Thu, Mar 4, 2021 at 2:15 PM Mark Yacoub wrote: > > From: Mark Yacoub > > To initialize the framebuffer, use drm_gem_fb_init_with_funcs which > verifies that the BO size can fit the FB size by calculating the minimum > expected size of each plane. > > The bug was caught using igt-gpu-tools

Re: [PATCH] drm/amd/pm: correct the watermark settings for Polaris

2021-03-08 Thread Alex Deucher
On Fri, Mar 5, 2021 at 1:25 AM Evan Quan wrote: > > The "/ 10" should be applied to the right-hand operand instead of > the left-hand one. > > Change-Id: Ie730a1981aa5dee45cd6c3efccc7fb0f088cd679 > Signed-off-by: Evan Quan > Noticed-by: Georgios Toptsidis Reviewed-by: Alex Deucher > --- >
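Why the placement of the "/ 10" matters can be shown with a small sketch. It assumes, purely for illustration, that the left-hand value is already stored in units of 10 kHz while the right-hand limit is in kHz; the function names and the unit assumption are hypothetical and not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Buggy form: scales the operand that is already in 10 kHz units. */
static bool demo_reaches_limit_wrong(uint32_t clk_10khz, uint32_t limit_khz)
{
	return clk_10khz / 10 >= limit_khz;
}

/* Fixed form: converts the kHz limit to 10 kHz units before comparing. */
static bool demo_reaches_limit_right(uint32_t clk_10khz, uint32_t limit_khz)
{
	return clk_10khz >= limit_khz / 10;
}
```

Under these assumptions, an 800 MHz clock (80000 in 10 kHz units) compared against a 600 MHz limit (600000 kHz) fails the first comparison and passes the second, so dividing the wrong operand silently changes which levels match.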

Re: [PATCH] gpu: drm: swsmu: fix error return code of smu_v11_0_set_allowed_mask()

2021-03-08 Thread Alex Deucher
Applied. Thanks! Alex On Thu, Mar 4, 2021 at 11:02 PM Quan, Evan wrote: > > [AMD Public Use] > > Thanks. Reviewed-by: Evan Quan > > -Original Message- > From: Jia-Ju Bai > Sent: Friday, March 5, 2021 11:54 AM > To: Deucher, Alexander ; Koenig, Christian > ; airl...@linux.ie;

RE: [PATCH] drm/amdgpu: add ih waiter on process until checkpoint

2021-03-08 Thread Kim, Jonathan
[AMD Official Use Only - Internal Distribution Only] > -Original Message- > From: Koenig, Christian > Sent: Saturday, March 6, 2021 4:12 AM > To: Kim, Jonathan ; Christian König > ; amd-gfx@lists.freedesktop.org > Cc: Yang, Philip ; Kuehling, Felix > > Subject: Re: [PATCH] drm/amdgpu:

[PATCH] [variant b] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Arnd Bergmann
From: Arnd Bergmann Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link against the exported functions. If the GPU driver is built-in but the IOMMU driver is a loadable module, the kfd_iommu.c file is indeed built but does not work: x86_64-linux-ld:

Re: [PATCH 3/6] amd/display: fail on cursor plane without an underlying plane

2021-03-08 Thread Kazlauskas, Nicholas
On 2021-03-08 3:18 p.m., Daniel Vetter wrote: On Fri, Mar 5, 2021 at 10:24 AM Michel Dänzer wrote: On 2021-03-04 7:26 p.m., Kazlauskas, Nicholas wrote: On 2021-03-04 10:35 a.m., Michel Dänzer wrote: On 2021-03-04 4:09 p.m., Kazlauskas, Nicholas wrote: On 2021-03-04 4:05 a.m., Michel Dänzer

Re: [PATCH 3/5] drm/amdgpu: fb BO should be ttm_bo_type_device

2021-03-08 Thread Christian König
On 08.03.21 at 21:34, Alex Deucher wrote: On Mon, Mar 8, 2021 at 3:20 PM Christian König wrote: On 08.03.21 at 16:37, Nirmoy Das wrote: FB BO should not be ttm_bo_type_kernel type and amdgpufb_create_pinned_object() pins the FB BO anyway. Mhm, why the heck was that a kernel object? Maybe

Re: [PATCH] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Arnd Bergmann
On Mon, Mar 8, 2021 at 9:12 PM Christian König wrote: > On 08.03.21 at 21:02, Felix Kuehling wrote: > > On 2021-03-08 at 2:33 p.m., Arnd Bergmann wrote: > > I don't want to create a hard dependency on AMD_IOMMU_V2 if I can avoid > > it, because it is only really needed for a small number of

Re: [PATCH 3/5] drm/amdgpu: fb BO should be ttm_bo_type_device

2021-03-08 Thread Alex Deucher
On Mon, Mar 8, 2021 at 3:20 PM Christian König wrote: > > On 08.03.21 at 16:37, Nirmoy Das wrote: > > FB BO should not be ttm_bo_type_kernel type and > > amdgpufb_create_pinned_object() pins the FB BO anyway. > > Mhm, why the heck was that a kernel object? Maybe because the fbcon was the main

Re: [PATCH 2/5] drm/amdgpu: introduce struct amdgpu_bo_user

2021-03-08 Thread Christian König
On 08.03.21 at 16:37, Nirmoy Das wrote: Implement a new struct amdgpu_bo_user as subclass of struct amdgpu_bo and a function to create amdgpu_bo_user bo with a flag to identify the owner. Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 ++

Re: [PATCH 5/5] drm/amdgpu: use amdgpu_bo_user bo for metadata and tiling flag

2021-03-08 Thread Christian König
On 08.03.21 at 16:37, Nirmoy Das wrote: Tiling flag and metadata are only needed for BOs created by amdgpu_gem_object_create(), so we can remove those from the base class. CC: felix.kuehl...@amd.com Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 -

Re: [PATCH 3/5] drm/amdgpu: fb BO should be ttm_bo_type_device

2021-03-08 Thread Christian König
On 08.03.21 at 16:37, Nirmoy Das wrote: FB BO should not be ttm_bo_type_kernel type and amdgpufb_create_pinned_object() pins the FB BO anyway. Mhm, why the heck was that a kernel object? Signed-off-by: Nirmoy Das Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c

Re: [PATCH 3/6] amd/display: fail on cursor plane without an underlying plane

2021-03-08 Thread Daniel Vetter
On Fri, Mar 5, 2021 at 10:24 AM Michel Dänzer wrote: > > On 2021-03-04 7:26 p.m., Kazlauskas, Nicholas wrote: > > On 2021-03-04 10:35 a.m., Michel Dänzer wrote: > >> On 2021-03-04 4:09 p.m., Kazlauskas, Nicholas wrote: > >>> On 2021-03-04 4:05 a.m., Michel Dänzer wrote: > On 2021-03-03 8:17

Re: [PATCH] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Christian König
On 08.03.21 at 21:02, Felix Kuehling wrote: On 2021-03-08 at 2:33 p.m., Arnd Bergmann wrote: On Mon, Mar 8, 2021 at 8:11 PM Felix Kuehling wrote: On 2021-03-08 at 2:05 p.m., Arnd Bergmann wrote: On Mon, Mar 8, 2021 at 5:24 PM Felix Kuehling wrote: The driver build should work without

Re: [PATCH] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Felix Kuehling
On 2021-03-08 at 2:33 p.m., Arnd Bergmann wrote: > On Mon, Mar 8, 2021 at 8:11 PM Felix Kuehling wrote: >> On 2021-03-08 at 2:05 p.m., Arnd Bergmann wrote: >>> On Mon, Mar 8, 2021 at 5:24 PM Felix Kuehling >>> wrote: The driver build should work without IOMMUv2. In amdkfd/Makefile, we

Re: [PATCH] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Arnd Bergmann
On Mon, Mar 8, 2021 at 8:11 PM Felix Kuehling wrote: > > On 2021-03-08 at 2:05 p.m., Arnd Bergmann wrote: > > On Mon, Mar 8, 2021 at 5:24 PM Felix Kuehling > > wrote: > >> The driver build should work without IOMMUv2. In amdkfd/Makefile, we > >> have this condition: > >> > >> ifneq

Re: [PATCH] drm/gem: add checks of drm_gem_object->funcs

2021-03-08 Thread Alex Deucher
On Mon, Mar 1, 2021 at 5:25 AM Christian König wrote: > > > > On 01.03.21 at 11:04, Daniel Vetter wrote: > > On Mon, Mar 1, 2021 at 10:56 AM Thomas Zimmermann > > wrote: > >> (cc'ing amd devs) > >> > >> Hi > >> > >> On 28.02.21 at 17:10, Pavel Turinský wrote: > >>> The checks were removed in

Re: [PATCH] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Felix Kuehling
On 2021-03-08 at 2:05 p.m., Arnd Bergmann wrote: > On Mon, Mar 8, 2021 at 5:24 PM Felix Kuehling wrote: >> The driver build should work without IOMMUv2. In amdkfd/Makefile, we >> have this condition: >> >> ifneq ($(CONFIG_AMD_IOMMU_V2),) >> AMDKFD_FILES += $(AMDKFD_PATH)/kfd_iommu.o >> endif >>

Re: [PATCH] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Arnd Bergmann
On Mon, Mar 8, 2021 at 5:24 PM Felix Kuehling wrote: > > The driver build should work without IOMMUv2. In amdkfd/Makefile, we > have this condition: > > ifneq ($(CONFIG_AMD_IOMMU_V2),) > AMDKFD_FILES += $(AMDKFD_PATH)/kfd_iommu.o > endif > > In amdkfd/kfd_iommu.h we define inline stubs of the

Re: [PATCH 3/3] drm/radeon: keep __user during cast

2021-03-08 Thread Alex Deucher
Series is: Reviewed-by: Alex Deucher On Mon, Mar 8, 2021 at 1:36 PM Christian König wrote: > > Silence static checker warning. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git

[PATCH 3/3] drm/radeon: keep __user during cast

2021-03-08 Thread Christian König
Silence static checker warning. Signed-off-by: Christian König --- drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index 5ea647f454d3..808941e31d34 100644 ---

[PATCH 2/3] drm/radeon: fix AGP dependency

2021-03-08 Thread Christian König
When AGP is compiled as module radeon must be compiled as module as well. Signed-off-by: Christian König --- drivers/gpu/drm/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index e392a90ca687..85b79a7fee63 100644 ---

[PATCH 1/3] drm/radeon: also init GEM funcs in radeon_gem_prime_import_sg_table

2021-03-08 Thread Christian König
Otherwise we will run into a NULL ptr deref. Signed-off-by: Christian König Bug: https://bugzilla.kernel.org/show_bug.cgi?id=212137 --- drivers/gpu/drm/radeon/radeon.h | 2 ++ drivers/gpu/drm/radeon/radeon_gem.c | 4 ++-- drivers/gpu/drm/radeon/radeon_prime.c | 2 ++ 3 files changed, 6

Re: [PATCH 1/5] drm/amdgpu: allow variable BO struct creation

2021-03-08 Thread Christian König
On 08.03.21 at 16:37, Nirmoy Das wrote: Allow allocating BO structures with different structure size than struct amdgpu_bo. Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 9 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 + 2 files changed, 8

[PATCH] drm/amdgpu: capture invalid hardware access v2

2021-03-08 Thread Christian König
From: Dennis Li Once the recovery thread has begun a GPU reset, no other thread should access the hardware; otherwise the system hangs randomly. v2 (chk): rewritten from scratch, use trylock and lockdep instead of hand wiring the logic. Signed-off-by: Dennis Li Signed-off-by:

Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-08 Thread Andrey Grodzovsky
Sure, patch 4 Reviewed-by: Andrey Grodzovsky andrey.grodzov...@amd.com and patch 5 Acked-by: Andrey Grodzovsky andrey.grodzov...@amd.com since I am not sure about the KFD bits. Andrey On 2021-03-08 11:10 a.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] Hi,

Re: [PATCH] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Felix Kuehling
The driver build should work without IOMMUv2. In amdkfd/Makefile, we have this condition: ifneq ($(CONFIG_AMD_IOMMU_V2),) AMDKFD_FILES += $(AMDKFD_PATH)/kfd_iommu.o endif In amdkfd/kfd_iommu.h we define inline stubs of the functions that are causing your link-failures if IOMMU_V2 is not enabled:
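The pattern described in this message, real functions when the feature is built and inline stubs otherwise, can be sketched in user-space C. This is a generic illustration, not the contents of kfd_iommu.h: DEMO_HAVE_IOMMU_V2 and every demo_* name are hypothetical, and the return values of the stubs are assumptions.

```c
#include <errno.h>

/*
 * DEMO_HAVE_IOMMU_V2 is a hypothetical stand-in for the real Kconfig
 * symbol; define it only when demo_iommu.c is part of the build.
 */
#ifdef DEMO_HAVE_IOMMU_V2
/* Real implementations live in demo_iommu.c and are linked in. */
int demo_iommu_check_device(int device_id);
int demo_iommu_bind(int device_id);
#else
/* Inline stubs keep callers compiling and linking without the feature. */
static inline int demo_iommu_check_device(int device_id)
{
	(void)device_id;
	return -ENODEV;
}

static inline int demo_iommu_bind(int device_id)
{
	(void)device_id;
	return 0; /* nothing to bind when the feature is compiled out */
}
#endif

/* Caller code is identical in both configurations. */
static int demo_device_init(int device_id)
{
	if (demo_iommu_check_device(device_id) != 0)
		return 0; /* no IOMMU support: continue without it */
	return demo_iommu_bind(device_id);
}
```

The scheme only holds if the condition that selects the declarations agrees with what the linker can actually reach. The build error discussed in this thread is reported for the case where the GPU driver is built-in but the IOMMU driver is a loadable module.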

RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-08 Thread Liu, Shaoyun
[AMD Official Use Only - Internal Distribution Only] Hi, Andrey. The first 3 patches in this series have already been acked by Alex D. Can you help review the remaining two? Thanks Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Monday, March 8, 2021 10:53 AM To: Liu, Shaoyun

Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-08 Thread Andrey Grodzovsky
I see, thanks for explaining. Andrey On 2021-03-08 10:27 a.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] Check the function amdgpu_xgmi_add_device: when the PSP XGMI TA is not available, the driver will assign a fake hive ID 0x10 to all GPUs; it means all GPU

[PATCH 4/5] drm/amdgpu: use amdgpu_bo_create_user() for when possible

2021-03-08 Thread Nirmoy Das
Use amdgpu_bo_create_user() for all the BO allocations for ttm_bo_type_device type. CC: felix.kuehl...@amd.com Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 4 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-)

[PATCH 5/5] drm/amdgpu: use amdgpu_bo_user bo for metadata and tiling flag

2021-03-08 Thread Nirmoy Das
Tiling flag and metadata are only needed for BOs created by amdgpu_gem_object_create(), so we can remove those from the base class. CC: felix.kuehl...@amd.com Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 - drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 59

[PATCH 3/5] drm/amdgpu: fb BO should be ttm_bo_type_device

2021-03-08 Thread Nirmoy Das
FB BO should not be ttm_bo_type_kernel type and amdgpufb_create_pinned_object() pins the FB BO anyway. Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c

[PATCH 1/5] drm/amdgpu: allow variable BO struct creation

2021-03-08 Thread Nirmoy Das
Allow allocating BO structures with different structure size than struct amdgpu_bo. Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 9 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 + 2 files changed, 8 insertions(+), 2 deletions(-) diff --git

[PATCH 2/5] drm/amdgpu: introduce struct amdgpu_bo_user

2021-03-08 Thread Nirmoy Das
Implement a new struct amdgpu_bo_user as subclass of struct amdgpu_bo and a function to create amdgpu_bo_user bo with a flag to identify the owner. Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |

[PATCH] drm/amdkfd: fix build error with missing AMD_IOMMU_V2

2021-03-08 Thread Arnd Bergmann
From: Arnd Bergmann Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link against the exported functions. If the GPU driver is built-in but the IOMMU driver is a loadable module, the kfd_iommu.c file is indeed built but does not work: x86_64-linux-ld:

RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-08 Thread Liu, Shaoyun
[AMD Official Use Only - Internal Distribution Only] Check the function amdgpu_xgmi_add_device: when the PSP XGMI TA is not available, the driver will assign a fake hive ID 0x10 to all GPUs, which means all GPUs will belong to the same hive. So I can still use hive->tb to sync the reset on

Re: [PATCH v1 12/15] powerpc/uaccess: Refactor get/put_user() and __get/put_user()

2021-03-08 Thread Christian König
The radeon warning is trivial to fix, going to send out a patch in a few moments. Regards, Christian. On 08.03.21 at 13:14, Christophe Leroy wrote: +Evgeniy for W1 Dallas +Alex & Christian for RADEON On 07/03/2021 at 11:23, kernel test robot wrote: Hi Christophe, I love your patch!

Re: [PATCH v1 12/15] powerpc/uaccess: Refactor get/put_user() and __get/put_user()

2021-03-08 Thread Christophe Leroy
+Evgeniy for W1 Dallas +Alex & Christian for RADEON On 07/03/2021 at 11:23, kernel test robot wrote: Hi Christophe, I love your patch! Perhaps something to improve: [auto build test WARNING on powerpc/next] [also build test WARNING on v5.12-rc2 next-20210305] [If your patch is applied to

Re: [RESEND 00/53] Rid GPU from W=1 warnings

2021-03-08 Thread Lee Jones
On Fri, 05 Mar 2021, Roland Scheidegger wrote: > The vmwgfx ones look all good to me, so for > 23-53: Reviewed-by: Roland Scheidegger > That said, they were already signed off by Zack, so not sure what > happened here. Yes, they were accepted at one point, then dropped without a reason. Since

Re: [PATCH 3/5] drm/amdgpu: use amdgpu_bo_create_user() for gem object

2021-03-08 Thread Nirmoy
On 3/8/21 2:58 PM, Christian König wrote: On 08.03.21 at 14:56, Nirmoy wrote: On 3/5/21 4:11 PM, Christian König wrote: We might need to use this for the KFD as well. Do you mean for amdgpu_amdkfd_alloc_gws() ? For example, yes. Basically all places where KFD allocated a BO with the

Re: [PATCH 3/5] drm/amdgpu: use amdgpu_bo_create_user() for gem object

2021-03-08 Thread Christian König
On 08.03.21 at 14:56, Nirmoy wrote: On 3/5/21 4:11 PM, Christian König wrote: We might need to use this for the KFD as well. Do you mean for amdgpu_amdkfd_alloc_gws() ? For example, yes. Basically all places where KFD allocated a BO with the TTM type device/user. Regards, Christian.

Re: [PATCH 3/5] drm/amdgpu: use amdgpu_bo_create_user() for gem object

2021-03-08 Thread Nirmoy
On 3/5/21 4:11 PM, Christian König wrote: We might need to use this for the KFD as well. Do you mean for amdgpu_amdkfd_alloc_gws() ? Regards, Nirmoy Christian. On 05.03.21 at 15:35, Nirmoy Das wrote: GEM objects encapsulate amdgpu_bo for userspace applications. Now that we have a

[PATCH 5/8] drm/amdgpu: use the new cursor in amdgpu_ttm_access_memory

2021-03-08 Thread Christian König
Separate the drm_mm_node walking from the actual handling. Signed-off-by: Christian König Acked-by: Oak Zeng Tested-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 67 +++-- 1 file changed, 18 insertions(+), 49 deletions(-) diff --git

[PATCH 8/8] drm/amdgpu: use the new cursor in the VM code

2021-03-08 Thread Christian König
Separate the drm_mm_node walking from the actual handling. Signed-off-by: Christian König Acked-by: Oak Zeng Tested-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 54 +- 1 file changed, 18 insertions(+), 36 deletions(-) diff --git

[PATCH 7/8] drm/amdgpu: use the new cursor in amdgpu_ttm_bo_eviction_valuable

2021-03-08 Thread Christian König
Separate the drm_mm_node walking from the actual handling. Signed-off-by: Christian König Acked-by: Oak Zeng Tested-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git

[PATCH 3/8] drm/amdgpu: use the new cursor in amdgpu_fill_buffer

2021-03-08 Thread Christian König
Separate the drm_mm_node walking from the actual handling. Signed-off-by: Christian König Acked-by: Oak Zeng Tested-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 65 ++--- 1 file changed, 15 insertions(+), 50 deletions(-) diff --git

[PATCH 4/8] drm/amdgpu: use new cursor in amdgpu_ttm_io_mem_pfn

2021-03-08 Thread Christian König
Separate the drm_mm_node walking from the actual handling. Signed-off-by: Christian König Acked-by: Oak Zeng Tested-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

[PATCH 6/8] drm/amdgpu: use new cursor in amdgpu_mem_visible

2021-03-08 Thread Christian König
Separate the drm_mm_node walking from the actual handling. Signed-off-by: Christian König Acked-by: Oak Zeng Tested-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

[PATCH 1/8] drm/amdgpu: new resource cursor

2021-03-08 Thread Christian König
Allows walking over the drm_mm nodes in a TTM resource object. Signed-off-by: Christian König Acked-by: Oak Zeng Tested-by: Nirmoy Das --- .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 105 ++ 1 file changed, 105 insertions(+) create mode 100644
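The idea of a cursor that separates node walking from the per-chunk work can be sketched in user-space C. This is not the code from amdgpu_res_cursor.h, whose contents are not shown here: struct demo_node, struct demo_cursor, and the demo_cursor_* functions are hypothetical, and the sketch assumes the nodes sit in a plain array and that the requested range lies entirely inside them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one contiguous allocation node. */
struct demo_node {
	uint64_t start; /* first address covered by this node */
	uint64_t size;  /* number of bytes in this node */
};

/* Walk state: where we are and how much is left. */
struct demo_cursor {
	const struct demo_node *node; /* node the cursor points into */
	uint64_t start;     /* current address inside that node */
	uint64_t size;      /* contiguous bytes available at start */
	uint64_t remaining; /* bytes left in the whole walk */
};

static uint64_t demo_min(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Position the cursor 'offset' bytes into the nodes, to walk 'size' bytes. */
static void demo_cursor_first(const struct demo_node *nodes, uint64_t offset,
			      uint64_t size, struct demo_cursor *cur)
{
	/* Skip whole nodes that lie before the requested offset. */
	while (offset >= nodes->size) {
		offset -= nodes->size;
		nodes++;
	}
	cur->node = nodes;
	cur->start = nodes->start + offset;
	cur->size = demo_min(nodes->size - offset, size);
	cur->remaining = size;
}

/* Advance by 'step' bytes, moving to the next node when one is used up. */
static void demo_cursor_next(struct demo_cursor *cur, uint64_t step)
{
	assert(step <= cur->size);
	cur->remaining -= step;
	if (cur->remaining == 0) {
		cur->size = 0; /* walk finished */
		return;
	}
	cur->size -= step;
	if (cur->size != 0) {
		cur->start += step; /* still inside the same node */
		return;
	}
	cur->node++;
	cur->start = cur->node->start;
	cur->size = demo_min(cur->node->size, cur->remaining);
}
```

A caller would loop while `cur.remaining` is non-zero, handle the chunk described by `cur.start` and `cur.size`, and then call `demo_cursor_next()`, so the chunk handling never has to know about node boundaries.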

[PATCH 2/8] drm/amdgpu: use the new cursor in amdgpu_ttm_copy_mem_to_mem

2021-03-08 Thread Christian König
Separate the drm_mm_node walking from the actual handling. Signed-off-by: Christian König Acked-by: Oak Zeng Tested-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 87 - 1 file changed, 26 insertions(+), 61 deletions(-) diff --git

RE: [PATCH] drm/amdgpu: Check if FB BAR is enabled for ROM read

2021-03-08 Thread Zhang, Hawking
[AMD Public Use] Reviewed-by: Hawking Zhang Regards, Hawking From: Lazar, Lijo Sent: Monday, March 8, 2021 21:16 To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Xu, Feifei ; Zhang, Hawking Subject: RE: [PATCH] drm/amdgpu: Check if FB BAR is enabled for ROM read

RE: [PATCH] drm/amdgpu: Check if FB BAR is enabled for ROM read

2021-03-08 Thread Lazar, Lijo
[AMD Public Use] From: amd-gfx On Behalf Of Lazar, Lijo Sent: Wednesday, March 3, 2021 10:15 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Xu, Feifei ; Zhang, Hawking Subject: [PATCH] drm/amdgpu: Check if FB BAR is enabled for ROM read [AMD Public Use] Some configurations

Re: [PATCH v2] drm/amd/amdgpu implement tdr advanced mode

2021-03-08 Thread Christian König
Hi Jack, yes that comes pretty close. I'm going over the patch right now. Some things still look a bit complicated to me, but I need to wrap my head around how and why we are doing it this way once more. Christian. On 08.03.21 at 13:43, Zhang, Jack (Jian) wrote: [AMD Public Use] Hi,

RE: [PATCH v2] drm/amd/amdgpu implement tdr advanced mode

2021-03-08 Thread Zhang, Jack (Jian)
[AMD Public Use] Hi, Christian, I made a change in the V3 patch that inserts a dma_fence_wait for the first jobs after resubmitting the jobs. It seems simpler than the V2 patch. Is this what you first had in mind? Thanks, Jack -Original Message- From: Koenig, Christian Sent:

[PATCH v3] drm/amd/amdgpu implement tdr advanced mode

2021-03-08 Thread Jack Zhang
[Why] Previous tdr design treats the first job in job_timeout as the bad job. But sometimes a later bad compute job can block a good gfx job and cause an unexpected gfx job timeout because gfx and compute ring share internal GC HW mutually. [How] This patch implements an advanced tdr mode. 1.It

[PATCH] drm/amdgpu: Remove unnecessary conversion to bool

2021-03-08 Thread Jiapeng Chong
Fix the following coccicheck warnings: ./drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:1600:40-45: WARNING: conversion to bool not needed here. ./drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:1598:40-45: WARNING: conversion to bool not needed here. Reported-by: Abaci Robot Signed-off-by: Jiapeng Chong ---
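The kind of expression this coccicheck warning targets can be sketched as follows. The functions are hypothetical and are not the lines from sdma_v5_2.c, which are not shown here; the sketch only assumes that the flagged code wraps an already-boolean condition in a `? true : false` ternary.

```c
#include <stdbool.h>

/* Flagged form: the ternary only restates a value that is already bool. */
static bool demo_is_enabled_verbose(unsigned int flags, unsigned int mask)
{
	return (flags & mask) != 0 ? true : false;
}

/* Equivalent form without the redundant conversion. */
static bool demo_is_enabled(unsigned int flags, unsigned int mask)
{
	return (flags & mask) != 0;
}
```

Both functions return the same value for every input, so a cleanup of this kind is expected to leave behavior unchanged.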

[PATCH] gpu: drm: amd: amdgpu: fix error return code of amdgpu_acpi_init()

2021-03-08 Thread Jia-Ju Bai
Add an error return code in the error handling code of amdgpu_acpi_init(). Reported-by: TOTE Robot Signed-off-by: Jia-Ju Bai --- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c

[PATCH] drm/amd/display: Remove unnecessary conversion to bool

2021-03-08 Thread Jiapeng Chong
Fix the following coccicheck warnings: ./drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c:561:34-39: WARNING: conversion to bool not needed here. Reported-by: Abaci Robot Signed-off-by: Jiapeng Chong --- drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c | 2 +- 1 file changed, 1

[PATCH] drm/amd/display: remove duplicate include in amdgpu_dm.c

2021-03-08 Thread menglong8 . dong
From: Zhang Yunkai 'drm/drm_hdcp.h' included in 'amdgpu_dm.c' is duplicated. It is also included in the 79th line. Signed-off-by: Zhang Yunkai --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 - 1 file changed, 1 deletion(-) diff --git

[PATCH] drm/amd/display: remove duplicate include in dcn21 and gpio

2021-03-08 Thread menglong8 . dong
From: Zhang Yunkai 'dce110_resource.h' included in 'dcn21_resource.c' is duplicated. 'hw_gpio.h' included in 'hw_factory_dce110.c' is duplicated. Signed-off-by: Zhang Yunkai --- drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 1 -