Re: [Intel-gfx] [PATCH v2 00/17] drm: cleanup: Use DRM_MODESET_LOCK_ALL_* helpers where possible

2021-10-01 Thread Ville Syrjälä
On Sat, Oct 02, 2021 at 01:05:47AM +0300, Ville Syrjälä wrote:
> On Fri, Oct 01, 2021 at 04:48:15PM -0400, Sean Paul wrote:
> > On Fri, Oct 01, 2021 at 10:00:50PM +0300, Ville Syrjälä wrote:
> > > On Fri, Oct 01, 2021 at 02:36:55PM -0400, Sean Paul wrote:
> > > > On Fri, Sep 24, 2021 at 08:43:07AM +0200, Fernando Ramos wrote:
> > > > > Hi all,
> > > > > 
> > > > > One of the things in the DRM TODO list ("Documentation/gpu/todo.rst")
> > > > > was to "use DRM_MODESET_LOCK_ALL_* helpers instead of boilerplate".
> > > > > That's what this patch series is about.
> > > > > 
> > > > > You will find two types of changes here:
> > > > > 
> > > > >   - Replacing "drm_modeset_lock_all_ctx()" (and surrounding boilerplate)
> > > > >     with "DRM_MODESET_LOCK_ALL_BEGIN()/END()" in the remaining places
> > > > >     (as has already been done in previous commits such as b7ea04d2)
> > > > > 
> > > > >   - Replacing "drm_modeset_lock_all()" with
> > > > >     "DRM_MODESET_LOCK_ALL_BEGIN()/END()" in the remaining places (as has
> > > > >     already been done in previous commits such as 57037094)
> > > > > 
> > > > > Most of the changes are straightforward, except for a few cases in the
> > > > > "amd" and "i915" drivers, where some extra dancing was needed to work
> > > > > around the limitation that the DRM_MODESET_LOCK_ALL_BEGIN()/END() macros
> > > > > can only be used once inside the same function (the reason being that
> > > > > the macro expansion includes *labels*, and you cannot have two
> > > > > identically named labels inside one function).
> > > > > 
> > > > > Notice that, even after this patch series, some places remain where
> > > > > "drm_modeset_lock_all()" and "drm_modeset_lock_all_ctx()" are still
> > > > > present, all inside the drm core (which makes sense), except for two (in
> > > > > "amd" and "i915") which cannot be replaced due to the way they are used.
> > > > > 
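[Editor's note: the label limitation mentioned above can be illustrated with a
simplified, self-contained imitation of the pattern. This is a toy model, not
the real DRM macros; `try_lock_all`, `backoff`, and `TOY_EAGAIN` are stand-ins
for the contended-lock machinery. The point is that each BEGIN/END expansion
emits the same fixed label names, so a second expansion in one function would
define duplicate labels and fail to compile.]

```c
/* Toy model -- not the real DRM macros -- of the retry/label pattern behind
 * DRM_MODESET_LOCK_ALL_BEGIN()/END(). */
enum { TOY_EAGAIN = 11 };

static int attempts;

static int try_lock_all(void)
{
	/* Fail once with -EAGAIN to exercise the backoff path. */
	return (attempts++ == 0) ? -TOY_EAGAIN : 0;
}

static void backoff(void)
{
	/* The real helper drops all locks and waits for the contended one. */
}

#define MODESET_LOCK_ALL_BEGIN(ret)		\
modeset_lock_retry:				\
	(ret) = try_lock_all();			\
	if ((ret) == -TOY_EAGAIN)		\
		goto modeset_lock_backoff;

#define MODESET_LOCK_ALL_END(ret)		\
	goto modeset_lock_done;			\
modeset_lock_backoff:				\
	backoff();				\
	goto modeset_lock_retry;		\
modeset_lock_done: ;

int do_locked_work(void)
{
	int ret;

	MODESET_LOCK_ALL_BEGIN(ret);
	/* ... critical section touching modeset state ... */
	MODESET_LOCK_ALL_END(ret);
	return ret;
}
```

Expanding the pair a second time inside `do_locked_work()` would redefine
`modeset_lock_retry` et al., which C forbids; hence the extra function-splitting
"dancing" described in the cover letter.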
> > > > > Changes in v2:
> > > > > 
> > > > >   - Fix commit message typo
> > > > >   - Use the value returned by DRM_MODESET_LOCK_ALL_END when possible
> > > > >   - Split drm/i915 patch into two simpler ones
> > > > >   - Remove drm_modeset_(un)lock_all()
> > > > >   - Fix build problems in non-x86 platforms
> > > > > 
> > > > > Fernando Ramos (17):
> > > > >   drm: cleanup: drm_modeset_lock_all_ctx() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/i915: cleanup: drm_modeset_lock_all_ctx() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/msm: cleanup: drm_modeset_lock_all_ctx() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/vmwgfx: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/tegra: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/shmobile: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/radeon: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/omapdrm: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/nouveau: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/msm: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/i915: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/i915: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN() part 2
> > > > >   drm/gma500: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm/amd: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
> > > > >   drm: cleanup: remove drm_modeset_(un)lock_all()
> > > > >   doc: drm: remove TODO entry regarding DRM_MODSET_LOCK_ALL cleanup
> > > > > 
> > > > 
> > > > Thank you for revising, Fernando! I've pushed the set to drm-misc-next
> > > > (along with the necessary drm-tip conflict resolutions).
> > > 
> > > Ugh. Did anyone actually review the locking changes this does?
> > > I shot the previous i915 stuff down because the commit messages
> > > did not address any of it.
> > 
> > I reviewed the set on 9/17, I didn't see your feedback on that thread.
> 
> It was much earlier than that.
> https://lists.freedesktop.org/archives/dri-devel/2021-June/313193.html
> 
> And I think I might have also shot down a similar thing earlier.
> 
> I was actually half considering sending a patch to nuke that
> misleading TODO item. I don't think anything which changes
which locks are taken should be considered a starter-level task.
> And the commit messages here don't seem to address any of it.

And i915 is now broken :(

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10680/fi-bwr-2160/boot.html

-- 

Re: [PATCH v2 00/17] drm: cleanup: Use DRM_MODESET_LOCK_ALL_* helpers where possible

2021-10-01 Thread Ville Syrjälä
On Fri, Oct 01, 2021 at 04:48:15PM -0400, Sean Paul wrote:
> On Fri, Oct 01, 2021 at 10:00:50PM +0300, Ville Syrjälä wrote:
> > On Fri, Oct 01, 2021 at 02:36:55PM -0400, Sean Paul wrote:
> > > On Fri, Sep 24, 2021 at 08:43:07AM +0200, Fernando Ramos wrote:
> > > > [cover letter and patch list snipped; quoted in full earlier in the thread]
> > > 
> > > Thank you for revising, Fernando! I've pushed the set to drm-misc-next
> > > (along with the necessary drm-tip conflict resolutions).
> > 
> > Ugh. Did anyone actually review the locking changes this does?
> > I shot the previous i915 stuff down because the commit messages
> > did not address any of it.
> 
> I reviewed the set on 9/17, I didn't see your feedback on that thread.

It was much earlier than that.
https://lists.freedesktop.org/archives/dri-devel/2021-June/313193.html

And I think I might have also shot down a similar thing earlier.

I was actually half considering sending a patch to nuke that
misleading TODO item. I don't think anything which changes
which locks are taken should be considered a starter-level task.
And the commit messages here don't seem to address any of it.

-- 
Ville Syrjälä
Intel


Re: [PATCH v2 00/17] drm: cleanup: Use DRM_MODESET_LOCK_ALL_* helpers where possible

2021-10-01 Thread Sean Paul
On Fri, Oct 01, 2021 at 10:00:50PM +0300, Ville Syrjälä wrote:
> On Fri, Oct 01, 2021 at 02:36:55PM -0400, Sean Paul wrote:
> > On Fri, Sep 24, 2021 at 08:43:07AM +0200, Fernando Ramos wrote:
> > > [cover letter and patch list snipped; quoted in full earlier in the thread]
> > 
> > Thank you for revising, Fernando! I've pushed the set to drm-misc-next
> > (along with the necessary drm-tip conflict resolutions).
> 
> Ugh. Did anyone actually review the locking changes this does?
> I shot the previous i915 stuff down because the commit messages
> did not address any of it.

I reviewed the set on 9/17, I didn't see your feedback on that thread.

Sean

> 
> -- 
> Ville Syrjälä
> Intel

-- 
Sean Paul, Software Engineer, Google / Chromium OS


Re: [PATCH 1/2] drm: Add Gamma and Degamma LUT sizes props to drm_crtc to validate.

2021-10-01 Thread Sean Paul
On Wed, Sep 29, 2021 at 03:39:25PM -0400, Mark Yacoub wrote:
> From: Mark Yacoub 
> 
> [Why]
> 1. drm_atomic_helper_check doesn't check for the LUT sizes of either Gamma
> or Degamma props in the new CRTC state, allowing any invalid size to
> be passed on.
> 2. Each driver has its own LUT size, which could also be different for
> legacy users.
> 
> [How]
> 1. Create |degamma_lut_size| and |gamma_lut_size| to save the LUT sizes
> assigned by the driver when it's initializing its color and CTM
> management.
> 2. Create drm_atomic_helper_check_crtc which is called by
> drm_atomic_helper_check to check the LUT sizes saved in drm_crtc that
> they match the sizes in the new CRTC state.
> 

Did you consider extending drm_color_lut_check() with the size checks?

> Fixes: igt@kms_color@pipe-A-invalid-gamma-lut-sizes on MTK
> Tested on Zork(amdgpu) and Jacuzzi(mediatek)
> 
> Signed-off-by: Mark Yacoub

nit: missing a space between name and email


> ---
>  drivers/gpu/drm/drm_atomic_helper.c | 56 +
>  drivers/gpu/drm/drm_color_mgmt.c|  2 ++
>  include/drm/drm_atomic_helper.h |  1 +
>  include/drm/drm_crtc.h  | 11 ++
>  4 files changed, 70 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> b/drivers/gpu/drm/drm_atomic_helper.c
> index 2c0c6ec928200..265b9747250d1 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -930,6 +930,58 @@ drm_atomic_helper_check_planes(struct drm_device *dev,
>  }
>  EXPORT_SYMBOL(drm_atomic_helper_check_planes);
>  
> +/**
> + * drm_atomic_helper_check_planes - validate state object for CRTC changes

Ctrl+c/Ctrl+v error here

> + * @state: the driver state object
> + *
> + * Check the CRTC state object such as the Gamma/Degamma LUT sizes if the new

Are there missing words between "object" and "such"?

> + * state holds them.
> + *
> + * RETURNS:
> + * Zero for success or -errno
> + */
> +int drm_atomic_helper_check_crtc(struct drm_atomic_state *state)

drm_atomic_helper_check_crtcs to be consistent with
drm_atomic_helper_check_planes

> +{
> + struct drm_crtc *crtc;
> + struct drm_crtc_state *new_crtc_state;
> + int i;
> +
> + for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) {

no space before (

> + if (new_crtc_state->gamma_lut) {

Perhaps gate these with a check of state->color_mgmt_changed first?

> + uint64_t supported_lut_size = crtc->gamma_lut_size;
> + uint32_t supported_legacy_lut_size = crtc->gamma_size;
> + uint32_t new_state_lut_size =
> + drm_color_lut_size(new_crtc_state->gamma_lut);

nit: new_state_lut_size and supported_lut_size can be pulled out to top-level
scope to avoid re-instantiation on each iteration

> +
> + if (new_state_lut_size != supported_lut_size &&
> + new_state_lut_size != supported_legacy_lut_size) {

According to the docbook, "If drivers support multiple LUT sizes then they
should publish the largest size, and sub-sample smaller sized LUTs". So
should this check be > instead of != ?
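[Editor's note: the relaxed comparison suggested here can be sketched as
follows. The helper name and parameters are illustrative, not from the patch:
if drivers publish their largest supported LUT size and sub-sample smaller
LUTs, an incoming LUT would be valid when it is non-empty and no larger than
the published size, or exactly the legacy size.]

```c
/* Hypothetical sketch of the suggested check (illustrative names only). */
static int lut_size_ok(unsigned int new_size,
		       unsigned int published_max,
		       unsigned int legacy_size)
{
	if (new_size == 0)
		return 0;
	/* Accept anything up to the published maximum, or the legacy size. */
	return new_size <= published_max || new_size == legacy_size;
}
```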

> + DRM_DEBUG_DRIVER(

drm_dbg_state() is probably more appropriate

> + "Invalid Gamma LUT size. Should be %u 
> (or %u for legacy) but got %u.\n",
> + supported_lut_size,
> + supported_legacy_lut_size,
> + new_state_lut_size);
> + return -EINVAL;
> + }
> + }
> +
> + if (new_crtc_state->degamma_lut) {
> + uint32_t new_state_lut_size =
> + drm_color_lut_size(new_crtc_state->degamma_lut);
> + uint64_t supported_lut_size = crtc->degamma_lut_size;
> +
> + if (new_state_lut_size != supported_lut_size) {
> + DRM_DEBUG_DRIVER(

drm_dbg_state()

> + "Invalid Degamma LUT size. Should be %u 
> but got %u.\n",
> + supported_lut_size, new_state_lut_size);
> + return -EINVAL;
> + }
> + }
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_atomic_helper_check_crtc);
> +
>  /**
>   * drm_atomic_helper_check - validate state object
>   * @dev: DRM device
> @@ -975,6 +1027,10 @@ int drm_atomic_helper_check(struct drm_device *dev,
>   if (ret)
>   return ret;
>  
> + ret = drm_atomic_helper_check_crtc(state);
> + if (ret)
> + return ret;
> +
>   if (state->legacy_cursor_update)
>   state->async_update = !drm_atomic_helper_async_check(dev, 
> state);
>  
> diff --git a/drivers/gpu/drm/drm_color_mgmt.c 
> b/drivers/gpu/drm/drm_color_mgmt.c
> index 

Re: [PATCH] drm/amdgpu/nv: add missing CONFIG_DRM_AMD_DC check

2021-10-01 Thread Randy Dunlap

On 10/1/21 12:48 PM, Alex Deucher wrote:

Check was missing for cyan skillfish.

Fixes: 82d96c34b0d49b ("drm/amd/display: add cyan_skillfish display support")
Reported-by: Randy Dunlap 
Signed-off-by: Alex Deucher 


Acked-by: Randy Dunlap  # build-tested

Thanks.


---
  drivers/gpu/drm/amd/amdgpu/nv.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 96975c8cc026..898e688be63c 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -891,8 +891,10 @@ int nv_set_ip_blocks(struct amdgpu_device *adev)
 	}
 	if (adev->enable_virtual_display || amdgpu_sriov_vf(adev))
 		amdgpu_device_ip_block_add(adev, &amdgpu_vkms_ip_block);
+#if defined(CONFIG_DRM_AMD_DC)
 	else if (amdgpu_device_has_dc_support(adev))
 		amdgpu_device_ip_block_add(adev, &dm_ip_block);
+#endif
 	amdgpu_device_ip_block_add(adev, &gfx_v10_0_ip_block);
 	amdgpu_device_ip_block_add(adev, &sdma_v5_0_ip_block);
 	break;




--
~Randy


Re: [PATCH 2/2] amd/amdgpu_dm: Verify Gamma and Degamma LUT sizes using DRM Core check

2021-10-01 Thread Sean Paul
On Wed, Sep 29, 2021 at 03:39:26PM -0400, Mark Yacoub wrote:
> From: Mark Yacoub 
> 
> [Why]
> drm_atomic_helper_check_crtc now verifies both legacy and non-legacy LUT
> sizes. There is no need to check it within amdgpu_dm_atomic_check.
> 
> [How]
> Remove the local call to verify LUT sizes and use DRM Core function
> instead.
> 
> Tested on ChromeOS Zork.
> 
> Signed-off-by: Mark Yacoub 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 07adac1a8c42b..96a1d006b777e 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -10683,6 +10683,10 @@ static int amdgpu_dm_atomic_check(struct drm_device 
> *dev,
>   }
>   }
>  #endif
> + ret = drm_atomic_helper_check_crtc(state);
> + if (ret)
> + return ret;
> +
>   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
> new_crtc_state, i) {
>   dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
>  
> @@ -10692,10 +10696,6 @@ static int amdgpu_dm_atomic_check(struct drm_device 
> *dev,
>   dm_old_crtc_state->dsc_force_changed == false)
>   continue;
>  
> - ret = amdgpu_dm_verify_lut_sizes(new_crtc_state);

From a quick glance, I think you can now delete this function. It's called from
amdgpu_dm_update_crtc_color_mgmt() which is part of the commit, so the lut sizes
should have already been checked.

If the call from amdgpu_dm_update_crtc_color_mgmt() is not possible to remove,
you could replace it with a call to the new helper function. And if _that_ is
not possible, please make amdgpu_dm_verify_lut_sizes() static :-)

Sean

> - if (ret)
> - goto fail;
> -
>   if (!new_crtc_state->enable)
>   continue;
>  
> -- 
> 2.33.0.685.g46640cef36-goog
> 

-- 
Sean Paul, Software Engineer, Google / Chromium OS


[PATCH] drm/amdgpu/gmc9: convert to IP version checking

2021-10-01 Thread Alex Deucher
Use IP versions rather than asic_type to differentiate
IP version specific features.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 136 ++
 1 file changed, 71 insertions(+), 65 deletions(-)
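[Editor's note: the ordered comparisons in this patch (e.g.
`< IP_VERSION(9, 4, 2)`) work because the macro packs major/minor/revision
high-to-low into one integer, so numeric order matches version order. The
sketch below assumes that packing; the authoritative definition lives in the
amdgpu headers.]

```c
/* Assumed shape of amdgpu's IP_VERSION packing (see amdgpu headers). */
#define IP_VERSION(maj, min, rev) (((maj) << 16) | ((min) << 8) | (rev))

/* Example predicate in the style of the patch: GC blocks older than 9.4.2
 * (Aldebaran) still need the extra GRBM status read. */
static int needs_grbm_workaround(unsigned int gc_ver)
{
	return gc_ver < IP_VERSION(9, 4, 2);
}
```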

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 5551359d5dfd..cb82404df534 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -579,7 +579,7 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device 
*adev,
 * the new fast GRBM interface.
 */
if ((entry->vmid_src == AMDGPU_GFXHUB_0) &&
-   (adev->asic_type < CHIP_ALDEBARAN))
+   (adev->ip_versions[GC_HWIP][0] < IP_VERSION(9, 4, 2)))
RREG32(hub->vm_l2_pro_fault_status);
 
status = RREG32(hub->vm_l2_pro_fault_status);
@@ -597,26 +597,28 @@ static int gmc_v9_0_process_interrupt(struct 
amdgpu_device *adev,
gfxhub_client_ids[cid],
cid);
} else {
-   switch (adev->asic_type) {
-   case CHIP_VEGA10:
+   switch (adev->ip_versions[MMHUB_HWIP][0]) {
+   case IP_VERSION(9, 0, 0):
mmhub_cid = mmhub_client_ids_vega10[cid][rw];
break;
-   case CHIP_VEGA12:
+   case IP_VERSION(9, 3, 0):
mmhub_cid = mmhub_client_ids_vega12[cid][rw];
break;
-   case CHIP_VEGA20:
+   case IP_VERSION(9, 4, 0):
mmhub_cid = mmhub_client_ids_vega20[cid][rw];
break;
-   case CHIP_ARCTURUS:
+   case IP_VERSION(9, 4, 1):
mmhub_cid = mmhub_client_ids_arcturus[cid][rw];
break;
-   case CHIP_RAVEN:
+   case IP_VERSION(9, 1, 0):
+   case IP_VERSION(9, 2, 0):
mmhub_cid = mmhub_client_ids_raven[cid][rw];
break;
-   case CHIP_RENOIR:
+   case IP_VERSION(1, 5, 0):
+   case IP_VERSION(2, 4, 0):
mmhub_cid = mmhub_client_ids_renoir[cid][rw];
break;
-   case CHIP_ALDEBARAN:
+   case IP_VERSION(9, 4, 2):
mmhub_cid = mmhub_client_ids_aldebaran[cid][rw];
break;
default:
@@ -694,7 +696,7 @@ static uint32_t gmc_v9_0_get_invalidate_req(unsigned int 
vmid,
 static bool gmc_v9_0_use_invalidate_semaphore(struct amdgpu_device *adev,
   uint32_t vmhub)
 {
-   if (adev->asic_type == CHIP_ALDEBARAN)
+   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 2))
return false;
 
return ((vmhub == AMDGPU_MMHUB_0 ||
@@ -745,7 +747,7 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
 
	hub = &adev->vmhub[vmhub];
if (adev->gmc.xgmi.num_physical_nodes &&
-   adev->asic_type == CHIP_VEGA20) {
+   adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 0)) {
/* Vega20+XGMI caches PTEs in TC and TLB. Add a
 * heavy-weight TLB flush (type 2), which flushes
 * both. Due to a race condition with concurrent
@@ -808,7 +810,7 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
 * GRBM interface.
 */
if ((vmhub == AMDGPU_GFXHUB_0) &&
-   (adev->asic_type < CHIP_ALDEBARAN))
+   (adev->ip_versions[GC_HWIP][0] < IP_VERSION(9, 4, 2)))
RREG32_NO_KIQ(hub->vm_inv_eng0_req +
  hub->eng_distance * eng);
 
@@ -874,7 +876,7 @@ static int gmc_v9_0_flush_gpu_tlb_pasid(struct 
amdgpu_device *adev,
 * still need a second TLB flush after this.
 */
bool vega20_xgmi_wa = (adev->gmc.xgmi.num_physical_nodes &&
-  adev->asic_type == CHIP_VEGA20);
+  adev->ip_versions[GC_HWIP][0] == 
IP_VERSION(9, 4, 0));
/* 2 dwords flush + 8 dwords fence */
unsigned int ndw = kiq->pmf->invalidate_tlbs_size + 8;
 
@@ -1088,13 +1090,13 @@ static void gmc_v9_0_get_vm_pte(struct amdgpu_device 
*adev,
*flags &= ~AMDGPU_PTE_VALID;
}
 
-   if ((adev->asic_type == CHIP_ARCTURUS ||
-   adev->asic_type == CHIP_ALDEBARAN) &&
+   if ((adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 1) ||
+adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 2)) &&
!(*flags & AMDGPU_PTE_SYSTEM) &&
mapping->bo_va->is_xgmi)
*flags |= AMDGPU_PTE_SNOOPED;
 
-   if (adev->asic_type == CHIP_ALDEBARAN)
+   if 

[PATCH] drm/amdgpu/nv: add missing CONFIG_DRM_AMD_DC check

2021-10-01 Thread Alex Deucher
Check was missing for cyan skillfish.

Fixes: 82d96c34b0d49b ("drm/amd/display: add cyan_skillfish display support")
Reported-by: Randy Dunlap 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 96975c8cc026..898e688be63c 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -891,8 +891,10 @@ int nv_set_ip_blocks(struct amdgpu_device *adev)
 	}
 	if (adev->enable_virtual_display || amdgpu_sriov_vf(adev))
 		amdgpu_device_ip_block_add(adev, &amdgpu_vkms_ip_block);
+#if defined(CONFIG_DRM_AMD_DC)
 	else if (amdgpu_device_has_dc_support(adev))
 		amdgpu_device_ip_block_add(adev, &dm_ip_block);
+#endif
 	amdgpu_device_ip_block_add(adev, &gfx_v10_0_ip_block);
 	amdgpu_device_ip_block_add(adev, &sdma_v5_0_ip_block);
 	break;
-- 
2.31.1



[PATCH] drm/amdgpu/display: fix dependencies for DRM_AMD_DC_SI

2021-10-01 Thread Alex Deucher
Depends on DRM_AMDGPU_SI and DRM_AMD_DC

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/Kconfig 
b/drivers/gpu/drm/amd/display/Kconfig
index 7dffc04a557e..127667e549c1 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -25,6 +25,8 @@ config DRM_AMD_DC_HDCP
 
 config DRM_AMD_DC_SI
bool "AMD DC support for Southern Islands ASICs"
+   depends on DRM_AMDGPU_SI
+   depends on DRM_AMD_DC
default n
help
  Choose this option to enable new AMD DC support for SI asics
-- 
2.31.1



Re: linux-next: Tree for Oct 1 [drivers/gpu/drm/amd/amdgpu/amdgpu.ko]

2021-10-01 Thread Randy Dunlap

On 10/1/21 12:09 AM, Stephen Rothwell wrote:

Hi all,

News: there will be no linux-next release on Monday.

Changes since 20210930:



on i386:

ERROR: modpost: "dm_ip_block" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

Full randconfig file is attached.

--
~Randy




Re: [PATCH v2 00/17] drm: cleanup: Use DRM_MODESET_LOCK_ALL_* helpers where possible

2021-10-01 Thread Ville Syrjälä
On Fri, Oct 01, 2021 at 02:36:55PM -0400, Sean Paul wrote:
> On Fri, Sep 24, 2021 at 08:43:07AM +0200, Fernando Ramos wrote:
> > [cover letter and patch list snipped; quoted in full earlier in the thread]
> 
> Thank you for revising, Fernando! I've pushed the set to drm-misc-next (along
> with the necessary drm-tip conflict resolutions).

Ugh. Did anyone actually review the locking changes this does?
I shot the previous i915 stuff down because the commit messages
did not address any of it.

-- 
Ville Syrjälä
Intel


Re: [PATCH v2 00/17] drm: cleanup: Use DRM_MODESET_LOCK_ALL_* helpers where possible

2021-10-01 Thread Sean Paul
On Fri, Sep 24, 2021 at 08:43:07AM +0200, Fernando Ramos wrote:
> Hi all,
> 
> One of the things in the DRM TODO list ("Documentation/gpu/todo.rst") was to
> "use DRM_MODESET_LOCAL_ALL_* helpers instead of boilerplate". That's what this
> patch series is about.
> 
> You will find two types of changes here:
> 
>   - Replacing "drm_modeset_lock_all_ctx()" (and surrounding boilerplate) with
> "DRM_MODESET_LOCK_ALL_BEGIN()/END()" in the remaining places (as it has
> already been done in previous commits such as b7ea04d2)
> 
>   - Replacing "drm_modeset_lock_all()" with 
> "DRM_MODESET_LOCK_ALL_BEGIN()/END()"
> in the remaining places (as it has already been done in previous commits
> such as 57037094)
> 
> Most of the changes are straightforward, except for a few cases in the "amd"
> and "i915" drivers where some extra dancing was needed to overcome the
> limitation that the DRM_MODESET_LOCK_ALL_BEGIN()/END() macros can only be used
> once inside the same function (the reason being that the macro expansion
> includes *labels*, and you cannot have two labels with the same name inside
> one function)
> 
> Notice that, even after this patch series, some places remain where
> "drm_modeset_lock_all()" and "drm_modeset_lock_all_ctx()" are still present,
> all inside drm core (which makes sense), except for two (in "amd" and "i915")
> which cannot be replaced due to the way they are being used.
> 
> Changes in v2:
> 
>   - Fix commit message typo
>   - Use the value returned by DRM_MODESET_LOCK_ALL_END when possible
>   - Split drm/i915 patch into two simpler ones
>   - Remove drm_modeset_(un)lock_all()
>   - Fix build problems in non-x86 platforms
> 
> Fernando Ramos (17):
>   drm: cleanup: drm_modeset_lock_all_ctx() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/i915: cleanup: drm_modeset_lock_all_ctx() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/msm: cleanup: drm_modeset_lock_all_ctx() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/vmwgfx: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/tegra: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/shmobile: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/radeon: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/omapdrm: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/nouveau: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/msm: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/i915: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/i915: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN() part 2
>   drm/gma500: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm/amd: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
>   drm: cleanup: remove drm_modeset_(un)lock_all()
>   doc: drm: remove TODO entry regarding DRM_MODSET_LOCK_ALL cleanup
> 

Thank you for revising, Fernando! I've pushed the set to drm-misc-next (along
with the necessary drm-tip conflict resolutions).

Sean

>  Documentation/gpu/todo.rst| 17 
>  Documentation/locking/ww-mutex-design.rst |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 21 +++--
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 50 +-
>  .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 25 ++---
>  drivers/gpu/drm/drm_client_modeset.c  | 14 ++-
>  drivers/gpu/drm/drm_crtc_helper.c | 18 ++--
>  drivers/gpu/drm/drm_fb_helper.c   | 10 +-
>  drivers/gpu/drm/drm_framebuffer.c |  6 +-
>  drivers/gpu/drm/drm_modeset_lock.c| 94 +--
>  drivers/gpu/drm/gma500/psb_device.c   | 18 ++--
>  drivers/gpu/drm/i915/display/intel_audio.c| 16 ++--
>  drivers/gpu/drm/i915/display/intel_display.c  | 23 ++---
>  .../drm/i915/display/intel_display_debugfs.c  | 46 +
>  drivers/gpu/drm/i915/display/intel_overlay.c  | 46 -
>  drivers/gpu/drm/i915/display/intel_pipe_crc.c |  7 +-
>  drivers/gpu/drm/i915/i915_drv.c   | 13 ++-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c  | 10 +-
>  .../gpu/drm/msm/disp/msm_disp_snapshot_util.c | 12 +--
>  drivers/gpu/drm/nouveau/dispnv50/disp.c   | 15 ++-
>  drivers/gpu/drm/omapdrm/omap_fb.c |  9 +-
>  drivers/gpu/drm/radeon/radeon_device.c| 21 +++--
>  drivers/gpu/drm/radeon/radeon_dp_mst.c| 10 +-
>  drivers/gpu/drm/shmobile/shmob_drm_drv.c  |  6 +-
>  drivers/gpu/drm/tegra/dsi.c   |  6 +-
>  drivers/gpu/drm/tegra/hdmi.c  |  6 +-
>  drivers/gpu/drm/tegra/sor.c   | 11 ++-
>  drivers/gpu/drm/vmwgfx/vmwgfx_ioctl.c | 11 ++-
>  drivers/gpu/drm/vmwgfx/vmwgfx_kms.c   | 12 ++-
>  include/drm/drm_modeset_lock.h|  2 -
>  30 files changed, 265 

Re: [PATCH] drm/amdgpu: add missing case for HDP for renoir

2021-10-01 Thread Rodrigo Siqueira
Hi,

It looks like Nicholas Choi tested your patch, and it fixed the issue.

Reviewed-by: Rodrigo Siqueira 

Thanks

On 10/01, Alex Deucher wrote:
> Missing 4.1.2.
> 
> Fixes: 30bce8c7213829 ("drm/amdgpu: add initial IP discovery support for vega 
> based parts")
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index 291a47f7992a..eca5515c34b4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -1151,6 +1151,7 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
> *adev)
>   case IP_VERSION(4, 0, 1):
>   case IP_VERSION(4, 1, 0):
>   case IP_VERSION(4, 1, 1):
> + case IP_VERSION(4, 1, 2):
>   case IP_VERSION(4, 2, 0):
>   case IP_VERSION(4, 2, 1):
>   case IP_VERSION(4, 4, 0):
> -- 
> 2.31.1
> 

-- 
Rodrigo Siqueira
https://siqueira.tech


[PATCH] drm/amdgpu: add missing case for HDP for renoir

2021-10-01 Thread Alex Deucher
Missing 4.1.2.

Fixes: 30bce8c7213829 ("drm/amdgpu: add initial IP discovery support for vega 
based parts")
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 291a47f7992a..eca5515c34b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -1151,6 +1151,7 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
*adev)
case IP_VERSION(4, 0, 1):
case IP_VERSION(4, 1, 0):
case IP_VERSION(4, 1, 1):
+   case IP_VERSION(4, 1, 2):
case IP_VERSION(4, 2, 0):
case IP_VERSION(4, 2, 1):
case IP_VERSION(4, 4, 0):
-- 
2.31.1



RE: [PATCH] drm/amdgpu/display: fold DRM_AMD_DC_DCN201 into DRM_AMD_DC_DCN

2021-10-01 Thread Liu, Zhan
[Public]

> -Original Message-
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: 2021/October/01, Friday 10:31 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH] drm/amdgpu/display: fold DRM_AMD_DC_DCN201 into
> DRM_AMD_DC_DCN
>
> No need for a separate kconfig option at this point.
>
> Signed-off-by: Alex Deucher 

Reviewed-by: Zhan Liu 

> ---
>  drivers/gpu/drm/amd/display/Kconfig   | 9 -
>  drivers/gpu/drm/amd/display/dc/Makefile   | 2 --
>  drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile   | 2 --
>  drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c  | 2 --
> drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 2 --
>  drivers/gpu/drm/amd/display/dc/irq/Makefile   | 2 --
>  6 files changed, 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/Kconfig
> b/drivers/gpu/drm/amd/display/Kconfig
> index fb074a6067b2..7dffc04a557e 100644
> --- a/drivers/gpu/drm/amd/display/Kconfig
> +++ b/drivers/gpu/drm/amd/display/Kconfig
> @@ -17,15 +17,6 @@ config DRM_AMD_DC_DCN
>   help
> Raven, Navi, and newer family support for display engine
>
> -config DRM_AMD_DC_DCN201
> - bool "Enable DCN201 support in DC"
> - default y
> - depends on DRM_AMD_DC && X86
> - depends on DRM_AMD_DC_DCN
> - help
> -   Choose this option if you want to have
> -   201 support for display engine
> -
>  config DRM_AMD_DC_HDCP
>   bool "Enable HDCP support in DC"
>   depends on DRM_AMD_DC
> diff --git a/drivers/gpu/drm/amd/display/dc/Makefile
> b/drivers/gpu/drm/amd/display/dc/Makefile
> index 520f58538364..b5482980e995 100644
> --- a/drivers/gpu/drm/amd/display/dc/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/Makefile
> @@ -30,9 +30,7 @@ DC_LIBS += dcn20
>  DC_LIBS += dsc
>  DC_LIBS += dcn10 dml
>  DC_LIBS += dcn21
> -ifdef CONFIG_DRM_AMD_DC_DCN201
>  DC_LIBS += dcn201
> -endif
>  DC_LIBS += dcn30
>  DC_LIBS += dcn301
>  DC_LIBS += dcn302
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> index 7f70985b7a1b..6bd73e49a6d2 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> @@ -93,7 +93,6 @@ AMD_DAL_CLK_MGR_DCN20 = $(addprefix
> $(AMDDALPATH)/dc/clk_mgr/dcn20/,$(CLK_MGR_DC
>
>  AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN20)
>
> -ifdef CONFIG_DRM_AMD_DC_DCN201
>
> #
> ##
>  # DCN201
>
> #
> ##
> @@ -102,7 +101,6 @@ CLK_MGR_DCN201 = dcn201_clk_mgr.o
>  AMD_DAL_CLK_MGR_DCN201 = $(addprefix
> $(AMDDALPATH)/dc/clk_mgr/dcn201/,$(CLK_MGR_DCN201))
>
>  AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN201)
> -endif
>
>
> #
> ##
>  # DCN21
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
> b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
> index 421f5135b701..1548b2a3fe03 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
> @@ -257,12 +257,10 @@ struct clk_mgr *dc_clk_mgr_create(struct dc_context
> *ctx, struct pp_smu_funcs *p
>   dcn3_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg);
>   return &clk_mgr->base;
>   }
> -#if defined(CONFIG_DRM_AMD_DC_DCN201)
>   if (asic_id.chip_id == DEVICE_ID_NV_13FE) {
>   dcn201_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg);
>   return &clk_mgr->base;
>   }
> -#endif
>   dcn20_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg);
>   return &clk_mgr->base;
>   }
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> index fc490b77f47d..561c10a92bb5 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> @@ -224,11 +224,9 @@ struct resource_pool
> *dc_create_resource_pool(struct dc  *dc,
>   case DCN_VERSION_2_1:
>   res_pool = dcn21_create_resource_pool(init_data, dc);
>   break;
> -#if defined(CONFIG_DRM_AMD_DC_DCN201)
>   case DCN_VERSION_2_01:
>   res_pool = dcn201_create_resource_pool(init_data, dc);
>   break;
> -#endif
>   case DCN_VERSION_3_0:
>   res_pool = dcn30_create_resource_pool(init_data, dc);
>   break;
> diff --git a/drivers/gpu/drm/amd/display/dc/irq/Makefile
> b/drivers/gpu/drm/amd/display/dc/irq/Makefile
> index 8a182772eed2..fd739aecf104 100644
> --- a/drivers/gpu/drm/amd/display/dc/irq/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/irq/Makefile
> @@ -94,7 +94,6 @@ AMD_DAL_IRQ_DCN21= $(addprefix
> $(AMDDALPATH)/dc/irq/dcn21/,$(IRQ_DCN21))
>
>  AMD_DISPLAY_FILES += $(AMD_DAL_IRQ_DCN21)
>
> -ifdef 

Re: [PATCH] drm/amdgpu: remove some repeated includings

2021-10-01 Thread Alex Deucher
Applied.  Thanks!

Alex

On Fri, Oct 1, 2021 at 6:16 AM Christian König  wrote:
>
> Am 01.10.21 um 12:13 schrieb Guo Zhengkui:
> > Remove two repeated includings in line 46 and 47.
> >
> > Signed-off-by: Guo Zhengkui 
>
> Acked-by: Christian König 
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 --
> >   1 file changed, 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > index 291a47f7992a..daa798c5b882 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > @@ -30,34 +30,32 @@
> >
> >   #include "soc15.h"
> >   #include "gfx_v9_0.h"
> >   #include "gmc_v9_0.h"
> >   #include "df_v1_7.h"
> >   #include "df_v3_6.h"
> >   #include "nbio_v6_1.h"
> >   #include "nbio_v7_0.h"
> >   #include "nbio_v7_4.h"
> >   #include "hdp_v4_0.h"
> >   #include "vega10_ih.h"
> >   #include "vega20_ih.h"
> >   #include "sdma_v4_0.h"
> >   #include "uvd_v7_0.h"
> >   #include "vce_v4_0.h"
> >   #include "vcn_v1_0.h"
> > -#include "vcn_v2_0.h"
> > -#include "jpeg_v2_0.h"
> >   #include "vcn_v2_5.h"
> >   #include "jpeg_v2_5.h"
> >   #include "smuio_v9_0.h"
> >   #include "gmc_v10_0.h"
> >   #include "gfxhub_v2_0.h"
> >   #include "mmhub_v2_0.h"
> >   #include "nbio_v2_3.h"
> >   #include "nbio_v7_2.h"
> >   #include "hdp_v5_0.h"
> >   #include "nv.h"
> >   #include "navi10_ih.h"
> >   #include "gfx_v10_0.h"
> >   #include "sdma_v5_0.h"
> >   #include "sdma_v5_2.h"
> >   #include "vcn_v2_0.h"
> >   #include "jpeg_v2_0.h"
>


RE: [PATCH] drm/amdgpu: add missed write lock for pci detected state pci_channel_io_normal

2021-10-01 Thread Chen, Guchun
[Public]

Got your point. Will send a new patch to address this.

Regards,
Guchun

-Original Message-
From: Grodzovsky, Andrey  
Sent: Friday, October 1, 2021 10:29 PM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; Koenig, 
Christian ; Pan, Xinhui ; 
Deucher, Alexander 
Subject: Re: [PATCH] drm/amdgpu: add missed write lock for pci detected state 
pci_channel_io_normal

No, scheduler restart and device unlock must take place in amdgpu_pci_resume 
(see struct pci_error_handlers for the various states of PCI recovery). So just 
add a flag (probably in amdgpu_device) so we can remember what 
pci_channel_state_t we came from (unfortunately it's not passed to us in 
amdgpu_pci_resume) and, unless it's set, don't do anything in amdgpu_pci_resume.

Andrey

On 2021-10-01 4:21 a.m., Chen, Guchun wrote:
> [Public]
>
> Hi Andrey,
>
> Do you mean to move the code of drm_sched_resubmit_jobs and drm_sched_start 
> in amdgpu_pci_resume to amdgpu_pci_error_detected, under the case 
> pci_channel_io_frozen?
> Then leave amdgpu_pci_resume as a null function, and in this way, we can drop 
> the acquire/lock write lock for case of pci_channel_io_normal as well?
>
> Regards,
> Guchun
>
> -Original Message-
> From: Grodzovsky, Andrey 
> Sent: Friday, October 1, 2021 10:22 AM
> To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; 
> Koenig, Christian ; Pan, Xinhui 
> ; Deucher, Alexander 
> Subject: Re: [PATCH] drm/amdgpu: add missed write lock for pci 
> detected state pci_channel_io_normal
>
> On 2021-09-30 10:00 p.m., Guchun Chen wrote:
>
>> When the PCI error state pci_channel_io_normal is detected, 
>> PCI_ERS_RESULT_CAN_RECOVER status is reported to the PCI driver, and the PCI 
>> driver will continue with the execution of the PCI resume callback 
>> report_resume via pci_walk_bridge. The callback will finally go into 
>> amdgpu_pci_resume, where the write lock is released 
>> unconditionally without ever having been acquired.
>
Good catch, but the issue is even wider in scope: what about 
drm_sched_resubmit_jobs and drm_sched_start being called without the scheduler 
having been stopped before? Better to put the entire scope of code in this 
function under a flag that is set only in pci_channel_io_frozen. As far as I 
remember we don't need to do anything in the case of pci_channel_io_normal.
>
> Andrey
>
>
>> Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery")
>> Signed-off-by: Guchun Chen 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
>>1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index bb5ad2b6ca13..12f822d51de2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -5370,6 +5370,7 @@ pci_ers_result_t 
>> amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
>>
>>  switch (state) {
>>  case pci_channel_io_normal:
>> +amdgpu_device_lock_adev(adev, NULL);
>>  return PCI_ERS_RESULT_CAN_RECOVER;
>>  /* Fatal error, prepare for slot reset */
>>  case pci_channel_io_frozen:


Re: Lockdep splat on killing a process

2021-10-01 Thread Andrey Grodzovsky
From what I see here you are supposed to get an actual deadlock and not only 
a warning: sched_fence->finished is first signaled from within the
hw fence done callback (drm_sched_job_done_cb) but then again from 
within its own callback (drm_sched_entity_kill_jobs_cb), so it 
looks like the same fence object is recursively signaled twice. This leads 
to an attempt to take fence->lock a second time while it's already
held. I don't see a need to call drm_sched_fence_finished() from within 
drm_sched_entity_kill_jobs_cb(), as this callback is already registered
on the sched_fence->finished fence (entity->last_scheduled == 
s_fence->finished) and hence the signaling has already taken place.


Andrey

On 2021-10-01 6:50 a.m., Christian König wrote:

Hey, Andrey.

while investigating some memory management problems I've got the 
lockdep splat below.


Looks like something is wrong with drm_sched_entity_kill_jobs_cb(), 
can you investigate?


Thanks,
Christian.

[11176.741052] 
[11176.741056] WARNING: possible recursive locking detected
[11176.741060] 5.15.0-rc1-00031-g9d546d600800 #171 Not tainted
[11176.741066] 
[11176.741070] swapper/12/0 is trying to acquire lock:
[11176.741074] 9c337ed175a8 (>lock){-.-.}-{3:3}, at: 
dma_fence_signal+0x28/0x80

[11176.741088]
   but task is already holding lock:
[11176.741092] 9c337ed172a8 (>lock){-.-.}-{3:3}, at: 
dma_fence_signal+0x28/0x80

[11176.741100]
   other info that might help us debug this:
[11176.741104]  Possible unsafe locking scenario:

[11176.741108]    CPU0
[11176.741110]    
[11176.741113]   lock(>lock);
[11176.741118]   lock(>lock);
[11176.741122]
    *** DEADLOCK ***

[11176.741125]  May be due to missing lock nesting notation

[11176.741128] 2 locks held by swapper/12/0:
[11176.741133]  #0: 9c339c30f768 
(>fence_drv.lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80
[11176.741142]  #1: 9c337ed172a8 (>lock){-.-.}-{3:3}, at: 
dma_fence_signal+0x28/0x80

[11176.741151]
   stack backtrace:
[11176.741155] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 
5.15.0-rc1-00031-g9d546d600800 #171
[11176.741160] Hardware name: System manufacturer System Product 
Name/PRIME X399-A, BIOS 0808 10/12/2018

[11176.741165] Call Trace:
[11176.741169]  
[11176.741173]  dump_stack_lvl+0x5b/0x74
[11176.741181]  dump_stack+0x10/0x12
[11176.741186]  __lock_acquire.cold+0x208/0x2df
[11176.741197]  lock_acquire+0xc6/0x2d0
[11176.741204]  ? dma_fence_signal+0x28/0x80
[11176.741212]  _raw_spin_lock_irqsave+0x4d/0x70
[11176.741219]  ? dma_fence_signal+0x28/0x80
[11176.741225]  dma_fence_signal+0x28/0x80
[11176.741230]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[11176.741240]  drm_sched_entity_kill_jobs_cb+0x1c/0x50 [gpu_sched]
[11176.741248]  dma_fence_signal_timestamp_locked+0xac/0x1a0
[11176.741254]  dma_fence_signal+0x3b/0x80
[11176.741260]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[11176.741268]  drm_sched_job_done.isra.0+0x7f/0x1a0 [gpu_sched]
[11176.741277]  drm_sched_job_done_cb+0x12/0x20 [gpu_sched]
[11176.741284]  dma_fence_signal_timestamp_locked+0xac/0x1a0
[11176.741290]  dma_fence_signal+0x3b/0x80
[11176.741296]  amdgpu_fence_process+0xd1/0x140 [amdgpu]
[11176.741504]  sdma_v4_0_process_trap_irq+0x8c/0xb0 [amdgpu]
[11176.741731]  amdgpu_irq_dispatch+0xce/0x250 [amdgpu]
[11176.741954]  amdgpu_ih_process+0x81/0x100 [amdgpu]
[11176.742174]  amdgpu_irq_handler+0x26/0xa0 [amdgpu]
[11176.742393]  __handle_irq_event_percpu+0x4f/0x2c0
[11176.742402]  handle_irq_event_percpu+0x33/0x80
[11176.742408]  handle_irq_event+0x39/0x60
[11176.742414]  handle_edge_irq+0x93/0x1d0
[11176.742419]  __common_interrupt+0x50/0xe0
[11176.742426]  common_interrupt+0x80/0x90
[11176.742431]  
[11176.742436]  asm_common_interrupt+0x1e/0x40
[11176.742442] RIP: 0010:cpuidle_enter_state+0xff/0x470
[11176.742449] Code: 0f a3 05 04 54 24 01 0f 82 70 02 00 00 31 ff e8 
37 5d 6f ff 80 7d d7 00 0f 85 e9 01 00 00 e8 58 a2 7f ff fb 66 0f 1f 
44 00 00 <45> 85 ff 0f 88 01 01 00 00 49 63 c7 4c 2b 75 c8 48 8d 14 40 
48 8d

[11176.742455] RSP: 0018:b6970021fe48 EFLAGS: 0202
[11176.742461] RAX: 0059be25 RBX: 0002 RCX: 

[11176.742465] RDX:  RSI:  RDI: 
9efeed78
[11176.742470] RBP: b6970021fe80 R08: 0001 R09: 
0001
[11176.742473] R10: 0001 R11: 0001 R12: 
9c3350b0e800
[11176.742477] R13: a00e9680 R14: 0a2a49ada060 R15: 
0002

[11176.742483]  ? cpuidle_enter_state+0xf8/0x470
[11176.742489]  ? cpuidle_enter_state+0xf8/0x470
[11176.742495]  cpuidle_enter+0x2e/0x40
[11176.742500]  call_cpuidle+0x23/0x40
[11176.742506]  do_idle+0x201/0x280
[11176.742512]  cpu_startup_entry+0x20/0x30
[11176.742517]  start_secondary+0x11f/0x160
[11176.742523]  secondary_startup_64_no_verify+0xb0/0xbb



Re: [pull] amdgpu drm-fixes-5.15

2021-10-01 Thread Daniel Vetter
On Wed, Sep 29, 2021 at 10:30:13PM -0400, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> Fixes for 5.15.
> 
> The following changes since commit 05812b971c6d605c00987750f422918589aa4486:
> 
>   Merge tag 'drm/tegra/for-5.15-rc3' of 
> ssh://git.freedesktop.org/git/tegra/linux into drm-fixes (2021-09-28 17:08:44 
> +1000)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-5.15-2021-09-29
> 
> for you to fetch changes up to 26db706a6d77b9e184feb11725e97e53b7a89519:
> 
>   drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix (2021-09-28 
> 14:40:27 -0400)

Pulled into drm-fixes, thanks.
-Daniel

> 
> 
> amd-drm-fixes-5.15-2021-09-29:
> 
> amdgpu:
> - gart pin count fix
> - eDP flicker fix
> - GFX9 MQD fix
> - Display fixes
> - Tiling flags fix for pre-GFX9
> - SDMA resume fix for S0ix
> 
> 
> Charlene Liu (1):
>   drm/amd/display: Pass PCI deviceid into DC
> 
> Hawking Zhang (1):
>   drm/amdgpu: correct initial cp_hqd_quantum for gfx9
> 
> Josip Pavic (1):
>   drm/amd/display: initialize backlight_ramping_override to false
> 
> Leslie Shi (1):
>   drm/amdgpu: fix gart.bo pin_count leak
> 
> Praful Swarnakar (1):
>   drm/amd/display: Fix Display Flicker on embedded panels
> 
> Prike Liang (1):
>   drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
> 
> Simon Ser (1):
>   drm/amdgpu: check tiling flags when creating FB on GFX8-
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 31 
> +++
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  2 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  3 ++-
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c|  8 ++
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 ++
>  drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c  | 15 +--
>  7 files changed, 53 insertions(+), 11 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Lockdep splat on killing a process

2021-10-01 Thread Daniel Vetter
On Fri, Oct 01, 2021 at 12:50:35PM +0200, Christian König wrote:
> Hey, Andrey.
> 
> while investigating some memory management problems I've got the lockdep
> splat below.
> 
> Looks like something is wrong with drm_sched_entity_kill_jobs_cb(), can you
> investigate?

Probably needs more irq_work so that we get rid of fence completion
signalling recursions.
-Daniel

> 
> Thanks,
> Christian.
> 
> [11176.741052] 
> [11176.741056] WARNING: possible recursive locking detected
> [11176.741060] 5.15.0-rc1-00031-g9d546d600800 #171 Not tainted
> [11176.741066] 
> [11176.741070] swapper/12/0 is trying to acquire lock:
> [11176.741074] 9c337ed175a8 (>lock){-.-.}-{3:3}, at:
> dma_fence_signal+0x28/0x80
> [11176.741088]
>    but task is already holding lock:
> [11176.741092] 9c337ed172a8 (>lock){-.-.}-{3:3}, at:
> dma_fence_signal+0x28/0x80
> [11176.741100]
>    other info that might help us debug this:
> [11176.741104]  Possible unsafe locking scenario:
> 
> [11176.741108]    CPU0
> [11176.741110]    
> [11176.741113]   lock(>lock);
> [11176.741118]   lock(>lock);
> [11176.741122]
>     *** DEADLOCK ***
> 
> [11176.741125]  May be due to missing lock nesting notation
> 
> [11176.741128] 2 locks held by swapper/12/0:
> [11176.741133]  #0: 9c339c30f768 (>fence_drv.lock){-.-.}-{3:3},
> at: dma_fence_signal+0x28/0x80
> [11176.741142]  #1: 9c337ed172a8 (>lock){-.-.}-{3:3}, at:
> dma_fence_signal+0x28/0x80
> [11176.741151]
>    stack backtrace:
> [11176.741155] CPU: 12 PID: 0 Comm: swapper/12 Not tainted
> 5.15.0-rc1-00031-g9d546d600800 #171
> [11176.741160] Hardware name: System manufacturer System Product Name/PRIME
> X399-A, BIOS 0808 10/12/2018
> [11176.741165] Call Trace:
> [11176.741169]  
> [11176.741173]  dump_stack_lvl+0x5b/0x74
> [11176.741181]  dump_stack+0x10/0x12
> [11176.741186]  __lock_acquire.cold+0x208/0x2df
> [11176.741197]  lock_acquire+0xc6/0x2d0
> [11176.741204]  ? dma_fence_signal+0x28/0x80
> [11176.741212]  _raw_spin_lock_irqsave+0x4d/0x70
> [11176.741219]  ? dma_fence_signal+0x28/0x80
> [11176.741225]  dma_fence_signal+0x28/0x80
> [11176.741230]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
> [11176.741240]  drm_sched_entity_kill_jobs_cb+0x1c/0x50 [gpu_sched]
> [11176.741248]  dma_fence_signal_timestamp_locked+0xac/0x1a0
> [11176.741254]  dma_fence_signal+0x3b/0x80
> [11176.741260]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
> [11176.741268]  drm_sched_job_done.isra.0+0x7f/0x1a0 [gpu_sched]
> [11176.741277]  drm_sched_job_done_cb+0x12/0x20 [gpu_sched]
> [11176.741284]  dma_fence_signal_timestamp_locked+0xac/0x1a0
> [11176.741290]  dma_fence_signal+0x3b/0x80
> [11176.741296]  amdgpu_fence_process+0xd1/0x140 [amdgpu]
> [11176.741504]  sdma_v4_0_process_trap_irq+0x8c/0xb0 [amdgpu]
> [11176.741731]  amdgpu_irq_dispatch+0xce/0x250 [amdgpu]
> [11176.741954]  amdgpu_ih_process+0x81/0x100 [amdgpu]
> [11176.742174]  amdgpu_irq_handler+0x26/0xa0 [amdgpu]
> [11176.742393]  __handle_irq_event_percpu+0x4f/0x2c0
> [11176.742402]  handle_irq_event_percpu+0x33/0x80
> [11176.742408]  handle_irq_event+0x39/0x60
> [11176.742414]  handle_edge_irq+0x93/0x1d0
> [11176.742419]  __common_interrupt+0x50/0xe0
> [11176.742426]  common_interrupt+0x80/0x90
> [11176.742431]  
> [11176.742436]  asm_common_interrupt+0x1e/0x40
> [11176.742442] RIP: 0010:cpuidle_enter_state+0xff/0x470
> [11176.742449] Code: 0f a3 05 04 54 24 01 0f 82 70 02 00 00 31 ff e8 37 5d
> 6f ff 80 7d d7 00 0f 85 e9 01 00 00 e8 58 a2 7f ff fb 66 0f 1f 44 00 00 <45>
> 85 ff 0f 88 01 01 00 00 49 63 c7 4c 2b 75 c8 48 8d 14 40 48 8d
> [11176.742455] RSP: 0018:b6970021fe48 EFLAGS: 0202
> [11176.742461] RAX: 0059be25 RBX: 0002 RCX:
> 
> [11176.742465] RDX:  RSI:  RDI:
> 9efeed78
> [11176.742470] RBP: b6970021fe80 R08: 0001 R09:
> 0001
> [11176.742473] R10: 0001 R11: 0001 R12:
> 9c3350b0e800
> [11176.742477] R13: a00e9680 R14: 0a2a49ada060 R15:
> 0002
> [11176.742483]  ? cpuidle_enter_state+0xf8/0x470
> [11176.742489]  ? cpuidle_enter_state+0xf8/0x470
> [11176.742495]  cpuidle_enter+0x2e/0x40
> [11176.742500]  call_cpuidle+0x23/0x40
> [11176.742506]  do_idle+0x201/0x280
> [11176.742512]  cpu_startup_entry+0x20/0x30
> [11176.742517]  start_secondary+0x11f/0x160
> [11176.742523]  secondary_startup_64_no_verify+0xb0/0xbb
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 2/2] habanalabs: add support for dma-buf exporter

2021-10-01 Thread Jason Gunthorpe
On Wed, Sep 29, 2021 at 12:17:35AM +0300, Oded Gabbay wrote:
> On Tue, Sep 28, 2021 at 8:36 PM Jason Gunthorpe  wrote:
> >
> > On Sun, Sep 12, 2021 at 07:53:09PM +0300, Oded Gabbay wrote:
> > > From: Tomer Tayar 
> > >
> > > Implement the calls to the dma-buf kernel api to create a dma-buf
> > > object backed by FD.
> > >
> > > We block the option to mmap the DMA-BUF object because we don't support
> > > DIRECT_IO and implicit P2P.
> >
> > This statement doesn't make sense, you can mmap your dmabuf if you
> > like. All dmabuf mmaps are supposed to set the special bit/etc to
> > exclude them from get_user_pages() anyhow - and since this is BAR
> > memory not struct page memory this driver would be doing it anyhow.
> >
> But we block mmap() on the dmabuf fd from user space.
> If you try to do it, you will get MAP_FAILED.

You do, I'm saying the above paragraph explaining *why* that was done
is not correct.

> > > We check the p2p distance using pci_p2pdma_distance_many() and refusing
> > > to map dmabuf in case the distance doesn't allow p2p.
> >
> > Does this actually allow the p2p transfer for your intended use cases?
>
> It depends on the system. If we are working bare-metal, then yes, it allows.
> If inside a VM, then no. The virtualized root complex is not
> white-listed and the kernel can't know the distance.
> But I remember you asked me to add this check, in v3 of the review IIRC.
> I don't mind removing this check if you don't object.

Yes, it is the right code; I was curious how far along things have
gotten

> > Don't write to the kernel log from user space triggered actions
> at all ?

At all.

> It's the first time I hear about this limitation...

Oh? It is a security issue; we don't want to allow userspace to DoS
the kernel logging.

> How do you tell the user it has done something wrong ?

dev_dbg is the usual way and then users doing debugging can opt in to
the logging.


> > Why doesn't this return a sg_table * and an ERR_PTR?
> Basically I modeled this function after amdgpu_vram_mgr_alloc_sgt()
> And in that function they also return int and pass the sg_table as **
> 
> If it's critical I can change.

Please follow the normal kernel style

Jason


Re: [PATCH v6 2/2] habanalabs: add support for dma-buf exporter

2021-10-01 Thread Jason Gunthorpe
On Thu, Sep 30, 2021 at 03:46:35PM +0300, Oded Gabbay wrote:

> After reading the kernel iommu code, I think this is not relevant
> here, and I'll add a comment appropriately but I'll also write it
> here, and please correct me if my understanding is wrong.
> 
> The memory behind this specific dma-buf has *always* resided on the
> device itself, i.e. it lives only in the 'device' domain (after all,
> it maps a PCI bar address which points to the device memory).
> Therefore, it was never in the 'CPU' domain and hence, there is no
> need to perform a sync of the memory to the CPU's cache, as it was
> never inside that cache to begin with.
> 
> This is not the same case as with regular memory which is dma-mapped
> and then copied into the device using a dma engine. In that case,
> the memory started in the 'CPU' domain and moved to the 'device'
> domain. When it is unmapped it will indeed be recycled to be used
> for another purpose and therefore we need to sync the CPU cache.
> 
> Is my understanding correct ?

It makes sense to me

Jason


[PATCH 14/14] drm/amd/display: Fix error in dmesg at boot

2021-10-01 Thread Solomon Chiu
From: "Leo (Hanghong) Ma" 

[Why]
During DQE's promotion test, an error appears in dmesg at boot
on dcn3.1.

[How]
Add a NULL pointer check for the pointer to the amdgpu_dm_connector.

Reviewed-by: Nicholas Kazlauskas 
Acked-by: Solomon Chiu 
Signed-off-by: Leo (Hanghong) Ma 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 81bf1e5a64c8..64b9c493dce2 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -1917,7 +1917,7 @@ void blank_all_dp_displays(struct dc *dc, bool hw_init)
 
	if ((signal == SIGNAL_TYPE_EDP) ||
			(signal == SIGNAL_TYPE_DISPLAY_PORT)) {
-		if (hw_init && signal != SIGNAL_TYPE_EDP) {
+		if (hw_init && signal != SIGNAL_TYPE_EDP && dc->links[i]->priv != NULL) {
			/* DP 2.0 spec requires that we read LTTPR caps first */
			dp_retrieve_lttpr_cap(dc->links[i]);
			/* if any of the displays are lit up turn them off */
@@ -1943,7 +1943,7 @@ void blank_all_dp_displays(struct dc *dc, bool hw_init)
		}
 
		if (!dc->links[i]->wa_flags.dp_keep_receiver_powered ||
-				(hw_init && signal != SIGNAL_TYPE_EDP))
+				(hw_init && signal != SIGNAL_TYPE_EDP && dc->links[i]->priv != NULL))
			dp_receiver_power_ctrl(dc->links[i], false);
		}
	}
-- 
2.25.1



[PATCH 13/14] drm/amd/display: Fix concurrent dynamic encoder assignment.

2021-10-01 Thread Solomon Chiu
From: Jimmy Kizito 

[Why]
Trying to enable multiple displays simultaneously exposed shortcomings
with the algorithm for dynamic link encoder assignment.

The main problems were:
- Assuming stream order remained constant across states would
sometimes lead to invalid DIG encoder assignment.
- Incorrect logic for deciding whether or not a DIG could support a
stream would also sometimes lead to invalid DIG encoder assignment.
- Changes in encoder assignment were wholesale while updating of the
pipe backend is incremental. This would lead to the hardware state
not matching the software state even with valid encoder assignments.

[How]

The following changes fix the identified problems.
- Use stream pointer rather than stream index to track streams across
states.
- Fix DIG compatibility check by examining the link signal type
rather than the stream signal type.
- Modify assignment algorithm to make incremental updates so software
and hardware states remain coherent.

Additionally:
- Add assertions and an encoder assignment validation
function link_enc_cfg_validate() to detect potential problems with
encoder assignment closer to their root cause.
- Reduce the frequency with which the assignment algorithm is
executed. It should not be necessary for fast state validation.
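As a rough illustration of the pointer-keyed, incremental approach described above (hypothetical, heavily simplified structures — the real link_enc_cfg code tracks considerably more state):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PIPES 6

struct stream { int id; };

/* One assignment slot, keyed by stream pointer rather than stream
 * index, so reordering streams between states cannot invalidate it. */
struct enc_assignment {
	const struct stream *stream;	/* NULL means the DIG is free */
	int dig_id;
};

/* Incremental update: keep assignments whose stream survives, release
 * the rest, then hand a free DIG only to streams that lack one. */
static void update_assignments(struct enc_assignment table[MAX_PIPES],
			       const struct stream *streams[], int count)
{
	int i, j, found;

	/* Release encoders whose stream disappeared from the new state. */
	for (i = 0; i < MAX_PIPES; i++) {
		for (found = 0, j = 0; j < count; j++)
			if (table[i].stream == streams[j])
				found = 1;
		if (!found)
			table[i].stream = NULL;
	}

	/* Assign a free DIG to each stream that does not have one yet. */
	for (j = 0; j < count; j++) {
		for (found = 0, i = 0; i < MAX_PIPES; i++)
			if (table[i].stream == streams[j])
				found = 1;
		for (i = 0; !found && i < MAX_PIPES; i++) {
			if (table[i].stream == NULL) {
				table[i].stream = streams[j];
				table[i].dig_id = i;
				found = 1;
			}
		}
	}
}
```

Because surviving streams keep their slots, hardware reprogramming is limited to the entries that actually changed, which is what keeps the software and hardware states coherent.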

Reviewed-by: Jun Lei 
Acked-by: Solomon Chiu 
Signed-off-by: Jimmy Kizito 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 74da226efffe..81bf1e5a64c8 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -1926,9 +1926,9 @@ void blank_all_dp_displays(struct dc *dc, bool hw_init)
}
 
if ((signal != SIGNAL_TYPE_EDP && status == DC_OK && 
dpcd_power_state == DP_POWER_STATE_D0) ||
-   (!hw_init && 
dc->links[i]->link_enc->funcs->is_dig_enabled(dc->links[i]->link_enc))) {
-   if (dc->links[i]->ep_type == 
DISPLAY_ENDPOINT_PHY &&
-   
dc->links[i]->link_enc->funcs->get_dig_frontend) {
+   (!hw_init && dc->links[i]->link_enc &&
+   
dc->links[i]->link_enc->funcs->is_dig_enabled(dc->links[i]->link_enc))) {
+   if 
(dc->links[i]->link_enc->funcs->get_dig_frontend) {
fe = 
dc->links[i]->link_enc->funcs->get_dig_frontend(dc->links[i]->link_enc);
if (fe == ENGINE_ID_UNKNOWN)
continue;
-- 
2.25.1



[PATCH 12/14] drm/amd/display: Add helper for blanking all dp displays

2021-10-01 Thread Solomon Chiu
From: "Leo (Hanghong) Ma" 

[Why & How]
The code to blank all DP displays is called in many places,
so add a helper in dc_link to make it more concise.

Reviewed-by: Aric Cyr 
Acked-by: Solomon Chiu 
Signed-off-by: Leo (Hanghong) Ma 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 45 +++
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  1 +
 .../display/dc/dce110/dce110_hw_sequencer.c   | 24 ++
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 41 ++---
 .../drm/amd/display/dc/dcn30/dcn30_hwseq.c| 39 ++--
 .../drm/amd/display/dc/dcn31/dcn31_hwseq.c| 39 ++--
 6 files changed, 59 insertions(+), 130 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 02c7a18c095f..74da226efffe 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -1906,6 +1906,51 @@ static enum dc_status enable_link_dp_mst(
return enable_link_dp(state, pipe_ctx);
 }
 
+void blank_all_dp_displays(struct dc *dc, bool hw_init)
+{
+   unsigned int i, j, fe;
+   uint8_t dpcd_power_state = '\0';
+   enum dc_status status = DC_ERROR_UNEXPECTED;
+
+   for (i = 0; i < dc->link_count; i++) {
+   enum signal_type signal = dc->links[i]->connector_signal;
+
+   if ((signal == SIGNAL_TYPE_EDP) ||
+   (signal == SIGNAL_TYPE_DISPLAY_PORT)) {
+   if (hw_init && signal != SIGNAL_TYPE_EDP) {
+   /* DP 2.0 spec requires that we read LTTPR caps 
first */
+   dp_retrieve_lttpr_cap(dc->links[i]);
+   /* if any of the displays are lit up turn them 
off */
+   status = core_link_read_dpcd(dc->links[i], 
DP_SET_POWER,
+   &dpcd_power_state, sizeof(dpcd_power_state));
+   }
+
+   if ((signal != SIGNAL_TYPE_EDP && status == DC_OK && 
dpcd_power_state == DP_POWER_STATE_D0) ||
+   (!hw_init && 
dc->links[i]->link_enc->funcs->is_dig_enabled(dc->links[i]->link_enc))) {
+   if (dc->links[i]->ep_type == 
DISPLAY_ENDPOINT_PHY &&
+   
dc->links[i]->link_enc->funcs->get_dig_frontend) {
+   fe = 
dc->links[i]->link_enc->funcs->get_dig_frontend(dc->links[i]->link_enc);
+   if (fe == ENGINE_ID_UNKNOWN)
+   continue;
+
+   for (j = 0; j < 
dc->res_pool->stream_enc_count; j++) {
+   if (fe == 
dc->res_pool->stream_enc[j]->id) {
+   
dc->res_pool->stream_enc[j]->funcs->dp_blank(dc->links[i],
+   
dc->res_pool->stream_enc[j]);
+   break;
+   }
+   }
+   }
+
+   if 
(!dc->links[i]->wa_flags.dp_keep_receiver_powered ||
+   (hw_init && signal != SIGNAL_TYPE_EDP))
+   dp_receiver_power_ctrl(dc->links[i], 
false);
+   }
+   }
+   }
+
+}
+
 static bool get_ext_hdmi_settings(struct pipe_ctx *pipe_ctx,
enum engine_id eng_id,
struct ext_hdmi_settings *settings)
diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h 
b/drivers/gpu/drm/amd/display/dc/dc_link.h
index a73d64b1fd33..69b008bafbbc 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_link.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
@@ -277,6 +277,7 @@ bool dc_link_setup_psr(struct dc_link *dc_link,
struct psr_context *psr_context);
 
 void dc_link_get_psr_residency(const struct dc_link *link, uint32_t 
*residency);
+void blank_all_dp_displays(struct dc *dc, bool hw_init);
 
 /* Request DC to detect if there is a Panel connected.
  * boot - If this call is during initial boot.
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index af3e68d3e747..8108f9ae2638 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1649,31 +1649,13 @@ static enum dc_status apply_single_controller_ctx_to_hw(
 
 static void power_down_encoders(struct dc *dc)
 {
-   int i, j;
+   int i;
+
+   blank_all_dp_displays(dc, false);
 
for (i = 0; i < dc->link_count; i++) {
enum signal_type signal = 

[PATCH 11/14] drm/amd/display: 3.2.156

2021-10-01 Thread Solomon Chiu
From: Aric Cyr 

This version brings along the following fixes:
- New firmware version
- Fix DMUB problems on stress test
- Improve link training by skipping the override for preferred link settings
- Refine FPU code structure for DCN2
- Fix 3DLUT skipped programming
- Fix detection of 4 lanes for DPALT
- Fix dcn3 failure due to dmcub_abm not created
- Limit display scaling to up to 4k for DCN 3.1
- Add helper for blanking all DP displays

Acked-by: Solomon Chiu 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 5ffe2a41258f..204bab6f82ef 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -45,7 +45,7 @@
 /* forward declaration */
 struct aux_payload;
 
-#define DC_VER "3.2.155"
+#define DC_VER "3.2.156"
 
 #define MAX_SURFACES 3
 #define MAX_PLANES 6
-- 
2.25.1



[PATCH 10/14] drm/amd/display: [FW Promotion] Release 0.0.87

2021-10-01 Thread Solomon Chiu
From: Anthony Koo 

Acked-by: Solomon Chiu 
Signed-off-by: Anthony Koo 
---
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index f5974562aa23..42956dd398f3 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -47,10 +47,10 @@
 
 /* Firmware versioning. */
 #ifdef DMUB_EXPOSE_VERSION
-#define DMUB_FW_VERSION_GIT_HASH 0x42c0e74b
+#define DMUB_FW_VERSION_GIT_HASH 0xf0c64c97
 #define DMUB_FW_VERSION_MAJOR 0
 #define DMUB_FW_VERSION_MINOR 0
-#define DMUB_FW_VERSION_REVISION 86
+#define DMUB_FW_VERSION_REVISION 87
 #define DMUB_FW_VERSION_TEST 0
 #define DMUB_FW_VERSION_VBIOS 0
 #define DMUB_FW_VERSION_HOTFIX 0
-- 
2.25.1



[PATCH 09/14] drm/amd/display: Fix detection of 4 lane for DPALT

2021-10-01 Thread Solomon Chiu
From: Hansen 

[Why]
DPALT detection for B0 PHY has its own set of RDPCSPIPE registers

[How]
Use RDPCSPIPE registers to detect if DPALT lane is 4 lane
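The functional consequence of the detection is a lane-count clamp: in USB-C DP alternate mode without the DP4 pin assignment, only two lanes are routed. A minimal sketch of that clamp (hypothetical helper name; the real code reads RDPCS_PHY_DPALT_DP4 from the RDPCSPIPE/RDPCSTX PHY_CNTL6 registers via REG_GET):

```c
#include <assert.h>

enum { LANE_COUNT_TWO = 2, LANE_COUNT_FOUR = 4 };

/* In USB-C DP alt mode, unless the PHY reports the DP4 (4-lane) pin
 * assignment, only two lanes are physically available, regardless of
 * what the encoder could otherwise drive. */
static int dpalt_max_lane_count(int enc_max_lanes, int in_alt_mode,
				int in_dp4_mode)
{
	if (in_alt_mode && !in_dp4_mode)
		return enc_max_lanes < LANE_COUNT_TWO ?
				enc_max_lanes : LANE_COUNT_TWO;
	return enc_max_lanes;
}
```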

Reviewed-by: Charlene Liu 
Acked-by: Solomon Chiu 
Signed-off-by: Hansen 
---
 .../display/dc/dcn31/dcn31_dio_link_encoder.c | 33 ++-
 .../display/dc/dcn31/dcn31_dio_link_encoder.h |  3 ++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.c
index 4f0a0803db6c..616a48d72afa 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.c
@@ -63,6 +63,10 @@
 #define AUX_REG_WRITE(reg_name, val) \
dm_write_reg(CTX, AUX_REG(reg_name), val)
 
+#ifndef MIN
+#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
+#endif
+
 void dcn31_link_encoder_set_dio_phy_mux(
struct link_encoder *enc,
enum encoder_type_select sel,
@@ -217,7 +221,7 @@ static const struct link_encoder_funcs dcn31_link_enc_funcs 
= {
.get_dig_frontend = dcn10_get_dig_frontend,
.get_dig_mode = dcn10_get_dig_mode,
.is_in_alt_mode = dcn31_link_encoder_is_in_alt_mode,
-   .get_max_link_cap = dcn20_link_encoder_get_max_link_cap,
+   .get_max_link_cap = dcn31_link_encoder_get_max_link_cap,
.set_dio_phy_mux = dcn31_link_encoder_set_dio_phy_mux,
 };
 
@@ -439,3 +443,30 @@ bool dcn31_link_encoder_is_in_alt_mode(struct link_encoder 
*enc)
 
return is_usb_c_alt_mode;
 }
+
+void dcn31_link_encoder_get_max_link_cap(struct link_encoder *enc,
+   
 struct dc_link_settings *link_settings)
+{
+   struct dcn10_link_encoder *enc10 = TO_DCN10_LINK_ENC(enc);
+   uint32_t is_in_usb_c_dp4_mode = 0;
+
+   dcn10_link_encoder_get_max_link_cap(enc, link_settings);
+
+   /* in usb c dp2 mode, max lane count is 2 */
+   if (enc->funcs->is_in_alt_mode && enc->funcs->is_in_alt_mode(enc)) {
+   if (enc->ctx->asic_id.hw_internal_rev != YELLOW_CARP_B0) {
+   // [Note] no need to check hw_internal_rev once phy mux 
selection is ready
+   REG_GET(RDPCSTX_PHY_CNTL6, RDPCS_PHY_DPALT_DP4, 
&is_in_usb_c_dp4_mode);
+   } else {
+   if ((enc10->base.transmitter == TRANSMITTER_UNIPHY_A)
+   || (enc10->base.transmitter == 
TRANSMITTER_UNIPHY_B)
+   || (enc10->base.transmitter == 
TRANSMITTER_UNIPHY_E)) {
+   REG_GET(RDPCSTX_PHY_CNTL6, RDPCS_PHY_DPALT_DP4, 
&is_in_usb_c_dp4_mode);
+   } else {
+   REG_GET(RDPCSPIPE_PHY_CNTL6, 
RDPCS_PHY_DPALT_DP4, &is_in_usb_c_dp4_mode);
+   }
+   }
+   if (!is_in_usb_c_dp4_mode)
+   link_settings->lane_count = MIN(LANE_COUNT_TWO, 
link_settings->lane_count);
+   }
+}
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.h 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.h
index bec50e4402ff..3454f1e7c1f1 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.h
@@ -252,4 +252,7 @@ void dcn31_link_encoder_disable_output(
 bool dcn31_link_encoder_is_in_alt_mode(
struct link_encoder *enc);
 
+void dcn31_link_encoder_get_max_link_cap(struct link_encoder *enc,
+   struct dc_link_settings *link_settings);
+
 #endif /* __DC_LINK_ENCODER__DCN31_H__ */
-- 
2.25.1



[PATCH 08/14] drm/amd/display: Limit display scaling to up to 4k for DCN 3.1

2021-10-01 Thread Solomon Chiu
From: Nikola Cornij 

[why]
The existing limit was mistakenly bigger than 4k for DCN 3.1

Reviewed-by: Zhan Liu 
Acked-by: Solomon Chiu 
Signed-off-by: Nikola Cornij 
---
 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
index 613d34bde7dd..d5b58025f0cc 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
@@ -998,7 +998,7 @@ static const struct dc_debug_options debug_defaults_drv = {
.disable_dcc = DCC_ENABLE,
.vsr_support = true,
.performance_trace = false,
-   .max_downscale_src_width = 7680,/*upto 8K*/
+   .max_downscale_src_width = 3840,/*upto 4K*/
.disable_pplib_wm_range = false,
.scl_reset_length10 = true,
.sanity_checks = false,
-- 
2.25.1



[PATCH 07/14] drm/amd/display: Added root clock optimization flags

2021-10-01 Thread Solomon Chiu
From: Jake Wang 

[Why & How]
Added root clock optimization debug flags for future debugging.
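The flags are carried in a bitfield union so a single clock gate can be toggled by name while the whole word can be read or written at once (e.g. from a debug interface). A stand-alone sketch of the pattern (hypothetical type name, using uint32_t bitfields for portability outside the kernel):

```c
#include <assert.h>
#include <stdint.h>

union root_clock_opt_sketch {
	struct {
		uint32_t dpp : 1;
		uint32_t dsc : 1;
		uint32_t hdmistream : 1;
		uint32_t reserved : 29;
	} bits;
	uint32_t u32All;	/* whole-word view of the same flags */
};
```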

Reviewed-by: Nicholas Kazlauskas 
Acked-by: Solomon Chiu 
Signed-off-by: Jake Wang 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index c5a091d0bbfc..5ffe2a41258f 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -477,6 +477,23 @@ union mem_low_power_enable_options {
uint32_t u32All;
 };
 
+union root_clock_optimization_options {
+   struct {
+   bool dpp: 1;
+   bool dsc: 1;
+   bool hdmistream: 1;
+   bool hdmichar: 1;
+   bool dpstream: 1;
+   bool symclk32_se: 1;
+   bool symclk32_le: 1;
+   bool symclk_fe: 1;
+   bool physymclk: 1;
+   bool dpiasymclk: 1;
+   uint32_t reserved: 22;
+   } bits;
+   uint32_t u32All;
+};
+
 struct dc_debug_data {
uint32_t ltFailCount;
uint32_t i2cErrorCount;
@@ -637,6 +654,7 @@ struct dc_debug_options {
bool legacy_dp2_lt;
 #endif
union mem_low_power_enable_options enable_mem_low_power;
+   union root_clock_optimization_options root_clock_optimization;
bool force_vblank_alignment;
 
/* Enable dmub aux for legacy ddc */
-- 
2.25.1



[PATCH 06/14] drm/amd/display: dcn3 failed due to dmcub_abm not created

2021-10-01 Thread Solomon Chiu
From: Charlene Liu 

[why]
dc->config.disable_dmcu is set to true, but the DMCUB-based ABM still
needs to be created.

[how]
Check dc->caps.dmcub_support instead.


Reviewed-by: Aric Cyr 
Acked-by: Solomon Chiu 
Signed-off-by: Charlene Liu 
---
 drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c 
b/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
index 0f273ac0c83f..6ab81d609c97 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
@@ -256,7 +256,7 @@ struct abm *dmub_abm_create(
const struct dce_abm_shift *abm_shift,
const struct dce_abm_mask *abm_mask)
 {
-   if (!ctx->dc->config.disable_dmcu) {
+   if (ctx->dc->caps.dmcub_support) {
struct dce_abm *abm_dce = kzalloc(sizeof(*abm_dce), GFP_KERNEL);
 
if (abm_dce == NULL) {
-- 
2.25.1



[PATCH 05/14] drm/amd/display: Fix 3DLUT skipped programming

2021-10-01 Thread Solomon Chiu
From: Aric Cyr 

[Why]
The 3DLUT was not updated due to a missing condition in the surface
update check.

[How]
Check whether a 3DLUT update is needed and force a full update when it is.
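The fix follows the same pattern as the other color-state checks in det_surface_update(): raise a dedicated flag when the update carries a 3D LUT or shaper function, and include that flag among those that force a full update. A simplified sketch (hypothetical types, not DC's real ones):

```c
#include <assert.h>
#include <stddef.h>

struct update_flags_sketch {
	unsigned gamma_change : 1;
	unsigned lut_3d : 1;
};

enum update_type { UPDATE_TYPE_FAST, UPDATE_TYPE_FULL };

/* An update that swaps the 3D LUT or shaper must be a full update,
 * otherwise the LUT programming is silently skipped. */
static enum update_type det_update(const void *lut3d_func,
				   const void *func_shaper,
				   struct update_flags_sketch *flags)
{
	if (lut3d_func || func_shaper)
		flags->lut_3d = 1;

	return (flags->gamma_change || flags->lut_3d) ?
			UPDATE_TYPE_FULL : UPDATE_TYPE_FAST;
}
```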


Reviewed-by: Anthony Koo 
Acked-by: Solomon Chiu 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 0f0440408a16..8e0bcd4fd000 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2187,6 +2187,9 @@ static enum surface_update_type det_surface_update(const 
struct dc *dc,
update_flags->bits.gamma_change = 1;
}
 
+   if (u->lut3d_func || u->func_shaper)
+   update_flags->bits.lut_3d = 1;
+
if (u->hdr_mult.value)
if (u->hdr_mult.value != u->surface->hdr_mult.value) {
update_flags->bits.hdr_mult = 1;
@@ -2200,6 +2203,7 @@ static enum surface_update_type det_surface_update(const 
struct dc *dc,
 
if (update_flags->bits.input_csc_change
|| update_flags->bits.coeff_reduction_change
+   || update_flags->bits.lut_3d
|| update_flags->bits.gamma_change
|| update_flags->bits.gamut_remap_change) {
type = UPDATE_TYPE_FULL;
-- 
2.25.1



[PATCH 04/14] drm/amd/display: Re-arrange FPU code structure for dcn2x

2021-10-01 Thread Solomon Chiu
From: Qingqing Zhuo 

[Why]
Current FPU code for DCN2x is located under dml/dcn2x.
This is not aligned with DC's general source tree
structure.

[How]
Move FPU code for DCN2x to dml/dcn20.
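For context, the code being moved is the FPU-protected DML glue: callers must bracket it with DC_FP_START()/DC_FP_END(), and dc_assert_fp_enabled() checks a per-CPU recursion depth. A user-space sketch of that depth-counting pattern (hypothetical `_sketch` names; the kernel_fpu_begin/end calls are stubbed out as comments):

```c
#include <assert.h>

static int fpu_recursion_depth;

/* Save FPU state only on the outermost START; nesting just counts. */
static void dc_fp_start_sketch(void)
{
	if (fpu_recursion_depth++ == 0) {
		/* kernel_fpu_begin() would run here in the kernel */
	}
}

static void dc_fp_end_sketch(void)
{
	if (--fpu_recursion_depth == 0) {
		/* kernel_fpu_end() would run here in the kernel */
	}
}

/* What dc_assert_fp_enabled() checks before doing float math. */
static int dc_fp_enabled_sketch(void)
{
	return fpu_recursion_depth > 0;
}
```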

Reviewed-by: Rodrigo Siqueira 
Acked-by: Solomon Chiu 
Signed-off-by: Qingqing Zhuo 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c  | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c   | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c   | 2 +-
 drivers/gpu/drm/amd/display/dc/dml/Makefile | 4 ++--
 .../amd/display/dc/dml/{dcn2x/dcn2x.c => dcn20/dcn20_fpu.c} | 2 +-
 .../amd/display/dc/dml/{dcn2x/dcn2x.h => dcn20/dcn20_fpu.h} | 6 +++---
 7 files changed, 10 insertions(+), 10 deletions(-)
 rename drivers/gpu/drm/amd/display/dc/dml/{dcn2x/dcn2x.c => dcn20/dcn20_fpu.c} 
(99%)
 rename drivers/gpu/drm/amd/display/dc/dml/{dcn2x/dcn2x.h => dcn20/dcn20_fpu.h} 
(94%)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
index b1bf80da3a55..ab0c6d191038 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -52,7 +52,7 @@ static DEFINE_PER_CPU(int, fpu_recursion_depth);
  * This function tells if the code is already under FPU protection or not. A
  * function that works as an API for a set of FPU operations can use this
  * function for checking if the caller invoked it after DC_FP_START(). For
- * example, take a look at dcn2x.c file.
+ * example, take a look at dcn20_fpu.c file.
  */
 inline void dc_assert_fp_enabled(void)
 {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 899d0086ffbe..756f5d411d9a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -35,7 +35,7 @@
 #include "include/irq_service_interface.h"
 #include "dcn20/dcn20_resource.h"
 
-#include "dml/dcn2x/dcn2x.h"
+#include "dml/dcn20/dcn20_fpu.h"
 
 #include "dcn10/dcn10_hubp.h"
 #include "dcn10/dcn10_ipp.h"
diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
index aec276e1db65..5881dc49f7c5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
@@ -27,7 +27,7 @@
 #include "dc.h"
 
 #include "dcn201_init.h"
-#include "dml/dcn2x/dcn2x.h"
+#include "dml/dcn20/dcn20_fpu.h"
 #include "resource.h"
 #include "include/irq_service_interface.h"
 #include "dcn201_resource.h"
diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
index fbbdf9976183..d452a0d1777e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
@@ -35,7 +35,7 @@
 #include "include/irq_service_interface.h"
 #include "dcn20/dcn20_resource.h"
 
-#include "dml/dcn2x/dcn2x.h"
+#include "dml/dcn20/dcn20_fpu.h"
 
 #include "clk_mgr.h"
 #include "dcn10/dcn10_hubp.h"
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 56055df2e8d2..169a4e68f86e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -58,7 +58,7 @@ CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o := 
$(dml_ccflags)
 
 ifdef CONFIG_DRM_AMD_DC_DCN
 CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_ccflags)
-CFLAGS_$(AMDDALPATH)/dc/dml/dcn2x/dcn2x.o := $(dml_ccflags)
+CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := $(dml_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := $(dml_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_rq_dlg_calc_20.o := $(dml_ccflags)
@@ -93,8 +93,8 @@ CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/display_rq_dlg_helpers.o 
:= $(dml_rcflags)
 DML = display_mode_lib.o display_rq_dlg_helpers.o dml1_display_rq_dlg_calc.o \
 
 ifdef CONFIG_DRM_AMD_DC_DCN
+DML += dcn20/dcn20_fpu.o
 DML += display_mode_vba.o dcn20/display_rq_dlg_calc_20.o 
dcn20/display_mode_vba_20.o
-DML += dcn2x/dcn2x.o
 DML += dcn20/display_rq_dlg_calc_20v2.o dcn20/display_mode_vba_20v2.o
 DML += dcn21/display_rq_dlg_calc_21.o dcn21/display_mode_vba_21.o
 DML += dcn30/display_mode_vba_30.o dcn30/display_rq_dlg_calc_30.o
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn2x/dcn2x.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
similarity index 99%
rename from drivers/gpu/drm/amd/display/dc/dml/dcn2x/dcn2x.c
rename to drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
index c58522436291..d590dc917363 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn2x/dcn2x.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
@@ -26,7 +26,7 @@
 
 #include "resource.h"
 

[PATCH 03/14] drm/amd/display: Skip override for preferred link settings during link training

2021-10-01 Thread Solomon Chiu
From: George Shen 

[Why]
Overriding the link settings inside override_training_settings
results in the fallback link settings being ignored. This can
potentially cause link training to always fail and consequently
result in an infinite link training loop in
dp_verify_link_cap during detection.

[How]
Since preferred link settings are already considered inside
decide_link_settings, skip the check in override_training_settings
to avoid infinite link training loops.
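The failure mode is easiest to see as a loop: dp_verify_link_cap falls back to lower settings after each failed attempt, but if override_training_settings re-applies the preferred (higher) settings on every attempt, the fallback never takes effect. A toy model of the two behaviours (hypothetical rates, not the real state machine):

```c
#include <assert.h>

/* Training "succeeds" once the rate falls to what the sink supports.
 * With the removed per-attempt override, fallback is undone each time
 * and the loop never converges, so we cap attempts for demonstration. */
static int train_sketch(int preferred_rate, int max_supported,
			int override_each_attempt)
{
	int rate = preferred_rate;
	int attempts;

	for (attempts = 0; attempts < 10; attempts++) {
		if (override_each_attempt)
			rate = preferred_rate;	/* the removed behaviour */
		if (rate <= max_supported)
			return 1;	/* link trained */
		rate--;			/* fall back for the next attempt */
	}
	return 0;	/* would spin forever in the real code */
}
```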

Reviewed-by: Wenjing Liu 
Acked-by: Solomon Chiu 
Signed-off-by: George Shen 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 029cc78bc9e9..649a9da338a7 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -1645,12 +1645,6 @@ static void override_training_settings(
 {
uint32_t lane;
 
-   /* Override link settings */
-   if (link->preferred_link_setting.link_rate != LINK_RATE_UNKNOWN)
-   lt_settings->link_settings.link_rate = 
link->preferred_link_setting.link_rate;
-   if (link->preferred_link_setting.lane_count != LANE_COUNT_UNKNOWN)
-   lt_settings->link_settings.lane_count = 
link->preferred_link_setting.lane_count;
-
/* Override link spread */
if (!link->dp_ss_off && overrides->downspread != NULL)
lt_settings->link_settings.link_spread = *overrides->downspread 
?
-- 
2.25.1



[PATCH 02/14] drm/amd/display: update irq_service and other required changes, part 2

2021-10-01 Thread Solomon Chiu
From: Charlene Liu 

[why]
Fix a NULL pointer dereference in irq_service_dcn201.

[how]
Initialize the proper number of IRQ sources for Linux.

Reviewed-by: Sung joon Kim 
Acked-by: Solomon Chiu 
Signed-off-by: Charlene Liu 
---
 drivers/gpu/drm/amd/display/dc/dc.h|  1 +
 drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h |  9 +
 drivers/gpu/drm/amd/display/dc/dce/dce_opp.h   |  1 +
 drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c  | 17 ++---
 4 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 8cc9626fc111..c5a091d0bbfc 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -899,6 +899,7 @@ union surface_update_flags {
uint32_t bandwidth_change:1;
uint32_t clock_change:1;
uint32_t stereo_format_change:1;
+   uint32_t lut_3d:1;
uint32_t full_update:1;
} bits;
 
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h 
b/drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h
index 296b2f80a1ec..307369b52b42 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h
@@ -663,14 +663,15 @@ struct dce_hwseq_registers {
uint32_t MC_VM_XGMI_LFB_CNTL;
uint32_t AZALIA_AUDIO_DTO;
uint32_t AZALIA_CONTROLLER_CLOCK_GATING;
+   /* MMHUB VM */
+   uint32_t MC_VM_FB_LOCATION_BASE;
+   uint32_t MC_VM_FB_LOCATION_TOP;
+   uint32_t MC_VM_FB_OFFSET;
+   uint32_t MMHUBBUB_MEM_PWR_CNTL;
uint32_t HPO_TOP_CLOCK_CONTROL;
uint32_t ODM_MEM_PWR_CTRL3;
uint32_t DMU_MEM_PWR_CNTL;
-   uint32_t MMHUBBUB_MEM_PWR_CNTL;
uint32_t DCHUBBUB_ARB_HOSTVM_CNTL;
-   uint32_t MC_VM_FB_LOCATION_BASE;
-   uint32_t MC_VM_FB_LOCATION_TOP;
-   uint32_t MC_VM_FB_OFFSET;
 };
  /* set field name */
 #define HWS_SF(blk_name, reg_name, field_name, post_fix)\
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_opp.h 
b/drivers/gpu/drm/amd/display/dc/dce/dce_opp.h
index bf1ffc3629c7..3d9be87aae45 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_opp.h
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_opp.h
@@ -111,6 +111,7 @@ enum dce110_opp_reg_type {
OPP_SF(FMT_DITHER_RAND_R_SEED, FMT_RAND_R_SEED, mask_sh),\
OPP_SF(FMT_DITHER_RAND_G_SEED, FMT_RAND_G_SEED, mask_sh),\
OPP_SF(FMT_DITHER_RAND_B_SEED, FMT_RAND_B_SEED, mask_sh),\
+   OPP_SF(FMT_BIT_DEPTH_CONTROL, FMT_TEMPORAL_DITHER_EN, mask_sh),\
OPP_SF(FMT_BIT_DEPTH_CONTROL, FMT_TEMPORAL_DITHER_RESET, mask_sh),\
OPP_SF(FMT_BIT_DEPTH_CONTROL, FMT_TEMPORAL_DITHER_OFFSET, mask_sh),\
OPP_SF(FMT_BIT_DEPTH_CONTROL, FMT_TEMPORAL_DITHER_DEPTH, mask_sh),\
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c 
b/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
index fb0dec4ed3a6..0f273ac0c83f 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
@@ -256,16 +256,19 @@ struct abm *dmub_abm_create(
const struct dce_abm_shift *abm_shift,
const struct dce_abm_mask *abm_mask)
 {
-   struct dce_abm *abm_dce = kzalloc(sizeof(*abm_dce), GFP_KERNEL);
+   if (!ctx->dc->config.disable_dmcu) {
+   struct dce_abm *abm_dce = kzalloc(sizeof(*abm_dce), GFP_KERNEL);
 
-   if (abm_dce == NULL) {
-   BREAK_TO_DEBUGGER();
-   return NULL;
-   }
+   if (abm_dce == NULL) {
+   BREAK_TO_DEBUGGER();
+   return NULL;
+   }
 
-   dmub_abm_construct(abm_dce, ctx, regs, abm_shift, abm_mask);
+   dmub_abm_construct(abm_dce, ctx, regs, abm_shift, abm_mask);
 
-   return &abm_dce->base;
+   return &abm_dce->base;
+   }
+   return NULL;
 }
 
 void dmub_abm_destroy(struct abm **abm)
-- 
2.25.1



[PATCH 01/14] drm/amd/display: Prevent using DMUB rptr that is out-of-bounds

2021-10-01 Thread Solomon Chiu
From: Wyatt Wood 

[Why]
Running into bugchecks during stress test where rptr is 0xFFFFFFFF.
Typically this is caused by a hard hang, and can come from HW outside
of DCN.

[How]
To prevent bugchecks when writing the DMUB rptr, first check that the
rptr is valid.
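The guard itself is small: read the hardware read pointer and refuse to adopt it when it lies beyond the ring's capacity (an all-ones readback after a hard hang always fails this check). A condensed sketch of the dmub_srv change (simplified, hypothetical `_sketch` types):

```c
#include <assert.h>
#include <stdint.h>

enum dmub_status_sketch { DMUB_OK, DMUB_HW_FAILURE };

struct ringbuf_sketch {
	uint32_t capacity;	/* size of the inbox1 ring in bytes */
	uint32_t rptr;
};

/* Adopt a hardware-reported read pointer only if it is in bounds. */
static enum dmub_status_sketch update_rptr(struct ringbuf_sketch *rb,
					   uint32_t hw_rptr)
{
	if (hw_rptr > rb->capacity)
		return DMUB_HW_FAILURE;	/* likely a hung/absent device */
	rb->rptr = hw_rptr;
	return DMUB_OK;
}
```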

Reviewed-by: Nicholas Kazlauskas 
Acked-by: Solomon Chiu 
Signed-off-by: Wyatt Wood 
---
 drivers/gpu/drm/amd/display/dmub/dmub_srv.h |  1 +
 drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 10 --
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h 
b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
index ef324fc39315..efb667cf6c98 100644
--- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
+++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
@@ -84,6 +84,7 @@ enum dmub_status {
DMUB_STATUS_QUEUE_FULL,
DMUB_STATUS_TIMEOUT,
DMUB_STATUS_INVALID,
+   DMUB_STATUS_HW_FAILURE,
 };
 
 /* enum dmub_asic - dmub asic identifier */
diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c 
b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
index a6188d067d65..77c67222cabd 100644
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
+++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
@@ -655,13 +655,19 @@ enum dmub_status dmub_srv_wait_for_phy_init(struct 
dmub_srv *dmub,
 enum dmub_status dmub_srv_wait_for_idle(struct dmub_srv *dmub,
uint32_t timeout_us)
 {
-   uint32_t i;
+   uint32_t i, rptr;
 
if (!dmub->hw_init)
return DMUB_STATUS_INVALID;
 
for (i = 0; i <= timeout_us; ++i) {
-   dmub->inbox1_rb.rptr = 
dmub->hw_funcs.get_inbox1_rptr(dmub);
+   rptr = dmub->hw_funcs.get_inbox1_rptr(dmub);
+
+   if (rptr > dmub->inbox1_rb.capacity)
+   return DMUB_STATUS_HW_FAILURE;
+
+   dmub->inbox1_rb.rptr = rptr;
+
if (dmub_rb_empty(&dmub->inbox1_rb))
return DMUB_STATUS_OK;
 
-- 
2.25.1



[PATCH 00/14] DC Patches October 1, 2021

2021-10-01 Thread Solomon Chiu


This DC patchset brings improvements in multiple areas. In summary, we
highlight:

- New firmware version
- Fix DMUB problems on stress test.
- Improve link training by skipping the override for preferred link settings
- Refinement of FPU code structure for DCN2
- Fix 3DLUT skipped programming
- Fix detection of 4 lane for DPALT
- Fix dcn3 failure due to dmcub_abm not created
- Limit display scaling to up to 4k for DCN 3.1
- Add helper for blanking all dp displays



Anthony Koo (1):
  drm/amd/display: [FW Promotion] Release 0.0.87

Aric Cyr (2):
  drm/amd/display: Fix 3DLUT skipped programming
  drm/amd/display: 3.2.156

Charlene Liu (2):
  drm/amd/display: update irq_service and other required change part 2.
  drm/amd/display: dcn3 failed due to dmcbu_abm not created

George Shen (1):
  drm/amd/display: Skip override for preferred link settings during link
training

Hansen (1):
  drm/amd/display: Fix detection of 4 lane for DPALT

Jake Wang (1):
  drm/amd/display: Added root clock optimization flags

Jimmy Kizito (1):
  drm/amd/display: Fix concurrent dynamic encoder assignment.

Leo (Hanghong) Ma (2):
  drm/amd/display: Add helper for blanking all dp displays
  drm/amd/display: Fix error in dmesg at boot

Nikola Cornij (1):
  drm/amd/display: Limit display scaling to up to 4k for DCN 3.1

Qingqing Zhuo (1):
  drm/amd/display: Re-arrange FPU code structure for dcn2x

Wyatt Wood (1):
  drm/amd/display: Prevent using DMUB rptr that is out-of-bounds

 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c|  2 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c  |  4 ++
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 45 +++
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  6 ---
 drivers/gpu/drm/amd/display/dc/dc.h   | 21 -
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  1 +
 .../gpu/drm/amd/display/dc/dce/dce_hwseq.h|  9 ++--
 drivers/gpu/drm/amd/display/dc/dce/dce_opp.h  |  1 +
 drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c | 17 ---
 .../display/dc/dce110/dce110_hw_sequencer.c   | 24 ++
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 41 ++---
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |  2 +-
 .../amd/display/dc/dcn201/dcn201_resource.c   |  2 +-
 .../drm/amd/display/dc/dcn21/dcn21_resource.c |  2 +-
 .../drm/amd/display/dc/dcn30/dcn30_hwseq.c| 39 ++--
 .../display/dc/dcn31/dcn31_dio_link_encoder.c | 33 +-
 .../display/dc/dcn31/dcn31_dio_link_encoder.h |  3 ++
 .../drm/amd/display/dc/dcn31/dcn31_hwseq.c| 39 ++--
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  2 +-
 drivers/gpu/drm/amd/display/dc/dml/Makefile   |  4 +-
 .../dml/{dcn2x/dcn2x.c => dcn20/dcn20_fpu.c}  |  2 +-
 .../dml/{dcn2x/dcn2x.h => dcn20/dcn20_fpu.h}  |  6 +--
 drivers/gpu/drm/amd/display/dmub/dmub_srv.h   |  1 +
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   |  4 +-
 .../gpu/drm/amd/display/dmub/src/dmub_srv.c   | 10 -
 25 files changed, 156 insertions(+), 164 deletions(-)
 rename drivers/gpu/drm/amd/display/dc/dml/{dcn2x/dcn2x.c => dcn20/dcn20_fpu.c} 
(99%)
 rename drivers/gpu/drm/amd/display/dc/dml/{dcn2x/dcn2x.h => dcn20/dcn20_fpu.h} 
(94%)

-- 
2.25.1



[PATCH] drm/amdgpu/display: fold DRM_AMD_DC_DCN201 into DRM_AMD_DC_DCN

2021-10-01 Thread Alex Deucher
No need for a separate kconfig option at this point.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/Kconfig   | 9 -
 drivers/gpu/drm/amd/display/dc/Makefile   | 2 --
 drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile   | 2 --
 drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c  | 2 --
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 2 --
 drivers/gpu/drm/amd/display/dc/irq/Makefile   | 2 --
 6 files changed, 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/Kconfig 
b/drivers/gpu/drm/amd/display/Kconfig
index fb074a6067b2..7dffc04a557e 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -17,15 +17,6 @@ config DRM_AMD_DC_DCN
help
  Raven, Navi, and newer family support for display engine
 
-config DRM_AMD_DC_DCN201
-   bool "Enable DCN201 support in DC"
-   default y
-   depends on DRM_AMD_DC && X86
-   depends on DRM_AMD_DC_DCN
-   help
- Choose this option if you want to have
- 201 support for display engine
-
 config DRM_AMD_DC_HDCP
bool "Enable HDCP support in DC"
depends on DRM_AMD_DC
diff --git a/drivers/gpu/drm/amd/display/dc/Makefile 
b/drivers/gpu/drm/amd/display/dc/Makefile
index 520f58538364..b5482980e995 100644
--- a/drivers/gpu/drm/amd/display/dc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/Makefile
@@ -30,9 +30,7 @@ DC_LIBS += dcn20
 DC_LIBS += dsc
 DC_LIBS += dcn10 dml
 DC_LIBS += dcn21
-ifdef CONFIG_DRM_AMD_DC_DCN201
 DC_LIBS += dcn201
-endif
 DC_LIBS += dcn30
 DC_LIBS += dcn301
 DC_LIBS += dcn302
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
index 7f70985b7a1b..6bd73e49a6d2 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
@@ -93,7 +93,6 @@ AMD_DAL_CLK_MGR_DCN20 = $(addprefix 
$(AMDDALPATH)/dc/clk_mgr/dcn20/,$(CLK_MGR_DC
 
 AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN20)
 
-ifdef CONFIG_DRM_AMD_DC_DCN201
 ###
 # DCN201
 ###
@@ -102,7 +101,6 @@ CLK_MGR_DCN201 = dcn201_clk_mgr.o
 AMD_DAL_CLK_MGR_DCN201 = $(addprefix 
$(AMDDALPATH)/dc/clk_mgr/dcn201/,$(CLK_MGR_DCN201))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN201)
-endif
 
 ###
 # DCN21
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
index 421f5135b701..1548b2a3fe03 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
@@ -257,12 +257,10 @@ struct clk_mgr *dc_clk_mgr_create(struct dc_context *ctx, 
struct pp_smu_funcs *p
dcn3_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg);
		return &clk_mgr->base;
}
-#if defined(CONFIG_DRM_AMD_DC_DCN201)
if (asic_id.chip_id == DEVICE_ID_NV_13FE) {
dcn201_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg);
		return &clk_mgr->base;
}
-#endif
dcn20_clk_mgr_construct(ctx, clk_mgr, pp_smu, dccg);
	return &clk_mgr->base;
}
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index fc490b77f47d..561c10a92bb5 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -224,11 +224,9 @@ struct resource_pool *dc_create_resource_pool(struct dc *dc,
case DCN_VERSION_2_1:
res_pool = dcn21_create_resource_pool(init_data, dc);
break;
-#if defined(CONFIG_DRM_AMD_DC_DCN201)
case DCN_VERSION_2_01:
res_pool = dcn201_create_resource_pool(init_data, dc);
break;
-#endif
case DCN_VERSION_3_0:
res_pool = dcn30_create_resource_pool(init_data, dc);
break;
diff --git a/drivers/gpu/drm/amd/display/dc/irq/Makefile 
b/drivers/gpu/drm/amd/display/dc/irq/Makefile
index 8a182772eed2..fd739aecf104 100644
--- a/drivers/gpu/drm/amd/display/dc/irq/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/irq/Makefile
@@ -94,7 +94,6 @@ AMD_DAL_IRQ_DCN21= $(addprefix 
$(AMDDALPATH)/dc/irq/dcn21/,$(IRQ_DCN21))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_IRQ_DCN21)
 
-ifdef CONFIG_DRM_AMD_DC_DCN201
 ###
 # DCN 201
 ###
@@ -103,7 +102,6 @@ IRQ_DCN201 = irq_service_dcn201.o
 AMD_DAL_IRQ_DCN201 = $(addprefix $(AMDDALPATH)/dc/irq/dcn201/,$(IRQ_DCN201))
 
 AMD_DISPLAY_FILES += $(AMD_DAL_IRQ_DCN201)
-endif
 
 

Re: [PATCH] drm/amdgpu: add missed write lock for pci detected state pci_channel_io_normal

2021-10-01 Thread Andrey Grodzovsky
No, scheduler restart and device unlock must take place in 
amdgpu_pci_resume (see struct pci_error_handlers for the various 
states of PCI recovery). So just add a flag (probably in amdgpu_device) 
so we can remember what pci_channel_state_t we came from (unfortunately 
it's not passed to us in amdgpu_pci_resume), and unless it's set, don't 
do anything in amdgpu_pci_resume.


Andrey

On 2021-10-01 4:21 a.m., Chen, Guchun wrote:

[Public]

Hi Andrey,

Do you mean to move the code of drm_sched_resubmit_jobs and drm_sched_start in 
amdgpu_pci_resume to amdgpu_pci_error_detected, under the case 
pci_channel_io_frozen?
Then leave amdgpu_pci_resume as a null function, and in this way, we can drop 
acquiring/releasing the write lock for the pci_channel_io_normal case as well?

Regards,
Guchun

-Original Message-
From: Grodzovsky, Andrey 
Sent: Friday, October 1, 2021 10:22 AM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; Koenig, Christian 
; Pan, Xinhui ; Deucher, Alexander 

Subject: Re: [PATCH] drm/amdgpu: add missed write lock for pci detected state 
pci_channel_io_normal

On 2021-09-30 10:00 p.m., Guchun Chen wrote:


When a PCI error state pci_channel_io_normal is detected, it will
report PCI_ERS_RESULT_CAN_RECOVER status to the PCI driver, and the PCI driver
will continue the execution of the PCI resume callback report_resume by
pci_walk_bridge, and the callback will finally go into amdgpu_pci_resume,
where the write lock is released unconditionally without having been
acquired.


Good catch, but the issue is even wider in scope: what about 
drm_sched_resubmit_jobs and drm_sched_start being called without the scheduler 
having been stopped first? Better to put the entire body of this function under 
a flag that is set only for pci_channel_io_frozen. As far as I remember, we 
don't need to do anything in the pci_channel_io_normal case.

Andrey



Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery")
Signed-off-by: Guchun Chen 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
   1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bb5ad2b6ca13..12f822d51de2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5370,6 +5370,7 @@ pci_ers_result_t
amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
   
   	switch (state) {

case pci_channel_io_normal:
+   amdgpu_device_lock_adev(adev, NULL);
return PCI_ERS_RESULT_CAN_RECOVER;
/* Fatal error, prepare for slot reset */
case pci_channel_io_frozen:


RE: [PATCH] drm/amdgpu: During s0ix don't wait to signal GFXOFF

2021-10-01 Thread Limonciello, Mario
[Public]



> -Original Message-
> From: Alex Deucher 
> Sent: Friday, October 1, 2021 08:26
> To: Lazar, Lijo 
> Cc: amd-gfx list ; Deucher, Alexander
> ; Limonciello, Mario
> ; Zhang, Hawking ;
> Wang, Yang(Kevin) ; Quan, Evan
> 
> Subject: Re: [PATCH] drm/amdgpu: During s0ix don't wait to signal GFXOFF
> 
> On Fri, Oct 1, 2021 at 6:16 AM Lijo Lazar  wrote:
> >
> > In the rare event when GFX IP suspend coincides with an s0ix entry, don't
> > schedule delayed work; instead, signal PMFW immediately to allow GFXOFF
> > entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
> > signaled about GFXOFF status before the amd-pmc module passes the OS HINT
> > to PMFW telling it that everything is ready for a safe s0ix entry.
> >
> > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712
> >
> > Signed-off-by: Lijo Lazar 
> 
> Reviewed-by: Alex Deucher 

Reviewed-by: Mario Limonciello 

> 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 --
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > index e7f06bd0f0cd..1916ec84dd71 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > @@ -31,6 +31,8 @@
> >  /* delay 0.1 second to enable gfx off feature */
> >  #define GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100)
> >
> > +#define GFX_OFF_NO_DELAY 0
> > +
> >  /*
> >   * GPU GFX IP block helpers function.
> >   */
> > @@ -558,6 +560,8 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device
> *adev)
> >
> >  void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
> >  {
> > +   unsigned long delay = GFX_OFF_DELAY_ENABLE;
> > +
> > if (!(adev->pm.pp_feature & PP_GFXOFF_MASK))
> > return;
> >
> > @@ -573,8 +577,14 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
> >
> > adev->gfx.gfx_off_req_count--;
> >
> > -   if (adev->gfx.gfx_off_req_count == 0 && 
> > !adev->gfx.gfx_off_state)
> > -   schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> GFX_OFF_DELAY_ENABLE);
> > +   if (adev->gfx.gfx_off_req_count == 0 &&
> > +   !adev->gfx.gfx_off_state) {
> > +   /* If going to s2idle, no need to wait */
> > +   if (adev->in_s0ix)
> > +   delay = GFX_OFF_NO_DELAY;
> > +   schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> > + delay);
> > +   }
> > } else {
> > if (adev->gfx.gfx_off_req_count == 0) {
> > 
> > cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
> > --
> > 2.17.1
> >


Re: [PATCH] drm/amdgpu: During s0ix don't wait to signal GFXOFF

2021-10-01 Thread Alex Deucher
On Fri, Oct 1, 2021 at 6:16 AM Lijo Lazar  wrote:
>
> In the rare event when GFX IP suspend coincides with an s0ix entry, don't
> schedule delayed work; instead, signal PMFW immediately to allow GFXOFF
> entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
> signaled about GFXOFF status before the amd-pmc module passes the OS HINT
> to PMFW telling it that everything is ready for a safe s0ix entry.
>
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712
>
> Signed-off-by: Lijo Lazar 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index e7f06bd0f0cd..1916ec84dd71 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -31,6 +31,8 @@
>  /* delay 0.1 second to enable gfx off feature */
>  #define GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100)
>
> +#define GFX_OFF_NO_DELAY 0
> +
>  /*
>   * GPU GFX IP block helpers function.
>   */
> @@ -558,6 +560,8 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev)
>
>  void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
>  {
> +   unsigned long delay = GFX_OFF_DELAY_ENABLE;
> +
> if (!(adev->pm.pp_feature & PP_GFXOFF_MASK))
> return;
>
> @@ -573,8 +577,14 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, 
> bool enable)
>
> adev->gfx.gfx_off_req_count--;
>
> -   if (adev->gfx.gfx_off_req_count == 0 && 
> !adev->gfx.gfx_off_state)
> -   schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
> GFX_OFF_DELAY_ENABLE);
> +   if (adev->gfx.gfx_off_req_count == 0 &&
> +   !adev->gfx.gfx_off_state) {
> +   /* If going to s2idle, no need to wait */
> +   if (adev->in_s0ix)
> +   delay = GFX_OFF_NO_DELAY;
> +   schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
> + delay);
> +   }
> } else {
> if (adev->gfx.gfx_off_req_count == 0) {
> 
> cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
> --
> 2.17.1
>


[PATCH] drm/amdgpu: remove some repeated includes

2021-10-01 Thread Guo Zhengkui
Remove two repeated includes in lines 46 and 47.

Signed-off-by: Guo Zhengkui 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 291a47f7992a..daa798c5b882 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -30,34 +30,32 @@
 
 #include "soc15.h"
 #include "gfx_v9_0.h"
 #include "gmc_v9_0.h"
 #include "df_v1_7.h"
 #include "df_v3_6.h"
 #include "nbio_v6_1.h"
 #include "nbio_v7_0.h"
 #include "nbio_v7_4.h"
 #include "hdp_v4_0.h"
 #include "vega10_ih.h"
 #include "vega20_ih.h"
 #include "sdma_v4_0.h"
 #include "uvd_v7_0.h"
 #include "vce_v4_0.h"
 #include "vcn_v1_0.h"
-#include "vcn_v2_0.h"
-#include "jpeg_v2_0.h"
 #include "vcn_v2_5.h"
 #include "jpeg_v2_5.h"
 #include "smuio_v9_0.h"
 #include "gmc_v10_0.h"
 #include "gfxhub_v2_0.h"
 #include "mmhub_v2_0.h"
 #include "nbio_v2_3.h"
 #include "nbio_v7_2.h"
 #include "hdp_v5_0.h"
 #include "nv.h"
 #include "navi10_ih.h"
 #include "gfx_v10_0.h"
 #include "sdma_v5_0.h"
 #include "sdma_v5_2.h"
 #include "vcn_v2_0.h"
 #include "jpeg_v2_0.h"
-- 
2.20.1



Re: [PATCH] drm/amdkfd: match the signatures of the real and stub kgd2kfd_probe()

2021-10-01 Thread kernel test robot
Hi,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.15-rc3 next-20210922]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/trix-redhat-com/drm-amdkfd-match-the-signatures-of-the-real-and-stub-kgd2kfd_probe/20211001-043648
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
02d5e016800d082058b3d3b7c3ede136cdc6ddcb
config: riscv-randconfig-r042-20211001 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 
962e503cc8bc411f7523cc393acae8aae425b1c4)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install riscv cross compiling tool for clang build
# apt-get install binutils-riscv64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/f110a72ee42592cc2f841f5c345ffe7e0b831124
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
trix-redhat-com/drm-amdkfd-match-the-signatures-of-the-real-and-stub-kgd2kfd_probe/20211001-043648
git checkout f110a72ee42592cc2f841f5c345ffe7e0b831124
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=riscv SHELL=/bin/bash drivers/gpu/drm/amd/amdgpu/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:74:23: error: too many arguments 
>> to function call, expected 2, have 4
 adev->pdev, adev->asic_type, vf);
 ^~~
   drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_amdkfd.h:349:17: note: 
'kgd2kfd_probe' declared here
   struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
   ^
   1 error generated.


vim +74 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c

5c33f214db87b3 Felix Kuehling 2017-07-28  65  
5c33f214db87b3 Felix Kuehling 2017-07-28  66  void 
amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
5c33f214db87b3 Felix Kuehling 2017-07-28  67  {
050091ab6e832f Yong Zhao  2019-09-03  68bool vf = amdgpu_sriov_vf(adev);
5c33f214db87b3 Felix Kuehling 2017-07-28  69  
c7651b73586600 Felix Kuehling 2020-09-16  70if (!kfd_initialized)
c7651b73586600 Felix Kuehling 2020-09-16  71return;
c7651b73586600 Felix Kuehling 2020-09-16  72  
8e07e2676a42e7 Amber Lin  2018-12-13  73adev->kfd.dev = 
kgd2kfd_probe((struct kgd_dev *)adev,
e392c887df9791 Yong Zhao  2019-09-27 @74  
adev->pdev, adev->asic_type, vf);
611736d8447c0c Felix Kuehling 2018-11-19  75  
611736d8447c0c Felix Kuehling 2018-11-19  76if (adev->kfd.dev)
611736d8447c0c Felix Kuehling 2018-11-19  77
amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
130e0371b7d454 Oded Gabbay2015-06-12  78  }
130e0371b7d454 Oded Gabbay2015-06-12  79  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Lockdep splat on killing a process

2021-10-01 Thread Christian König

Hey, Andrey.

While investigating some memory management problems I got the lockdep 
splat below.


Looks like something is wrong with drm_sched_entity_kill_jobs_cb(), can 
you investigate?


Thanks,
Christian.

[11176.741052] 
[11176.741056] WARNING: possible recursive locking detected
[11176.741060] 5.15.0-rc1-00031-g9d546d600800 #171 Not tainted
[11176.741066] 
[11176.741070] swapper/12/0 is trying to acquire lock:
[11176.741074] 9c337ed175a8 (>lock){-.-.}-{3:3}, at: 
dma_fence_signal+0x28/0x80

[11176.741088]
   but task is already holding lock:
[11176.741092] 9c337ed172a8 (>lock){-.-.}-{3:3}, at: 
dma_fence_signal+0x28/0x80

[11176.741100]
   other info that might help us debug this:
[11176.741104]  Possible unsafe locking scenario:

[11176.741108]    CPU0
[11176.741110]    
[11176.741113]   lock(>lock);
[11176.741118]   lock(>lock);
[11176.741122]
    *** DEADLOCK ***

[11176.741125]  May be due to missing lock nesting notation

[11176.741128] 2 locks held by swapper/12/0:
[11176.741133]  #0: 9c339c30f768 
(>fence_drv.lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80
[11176.741142]  #1: 9c337ed172a8 (>lock){-.-.}-{3:3}, at: 
dma_fence_signal+0x28/0x80

[11176.741151]
   stack backtrace:
[11176.741155] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 
5.15.0-rc1-00031-g9d546d600800 #171
[11176.741160] Hardware name: System manufacturer System Product 
Name/PRIME X399-A, BIOS 0808 10/12/2018

[11176.741165] Call Trace:
[11176.741169]  
[11176.741173]  dump_stack_lvl+0x5b/0x74
[11176.741181]  dump_stack+0x10/0x12
[11176.741186]  __lock_acquire.cold+0x208/0x2df
[11176.741197]  lock_acquire+0xc6/0x2d0
[11176.741204]  ? dma_fence_signal+0x28/0x80
[11176.741212]  _raw_spin_lock_irqsave+0x4d/0x70
[11176.741219]  ? dma_fence_signal+0x28/0x80
[11176.741225]  dma_fence_signal+0x28/0x80
[11176.741230]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[11176.741240]  drm_sched_entity_kill_jobs_cb+0x1c/0x50 [gpu_sched]
[11176.741248]  dma_fence_signal_timestamp_locked+0xac/0x1a0
[11176.741254]  dma_fence_signal+0x3b/0x80
[11176.741260]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[11176.741268]  drm_sched_job_done.isra.0+0x7f/0x1a0 [gpu_sched]
[11176.741277]  drm_sched_job_done_cb+0x12/0x20 [gpu_sched]
[11176.741284]  dma_fence_signal_timestamp_locked+0xac/0x1a0
[11176.741290]  dma_fence_signal+0x3b/0x80
[11176.741296]  amdgpu_fence_process+0xd1/0x140 [amdgpu]
[11176.741504]  sdma_v4_0_process_trap_irq+0x8c/0xb0 [amdgpu]
[11176.741731]  amdgpu_irq_dispatch+0xce/0x250 [amdgpu]
[11176.741954]  amdgpu_ih_process+0x81/0x100 [amdgpu]
[11176.742174]  amdgpu_irq_handler+0x26/0xa0 [amdgpu]
[11176.742393]  __handle_irq_event_percpu+0x4f/0x2c0
[11176.742402]  handle_irq_event_percpu+0x33/0x80
[11176.742408]  handle_irq_event+0x39/0x60
[11176.742414]  handle_edge_irq+0x93/0x1d0
[11176.742419]  __common_interrupt+0x50/0xe0
[11176.742426]  common_interrupt+0x80/0x90
[11176.742431]  
[11176.742436]  asm_common_interrupt+0x1e/0x40
[11176.742442] RIP: 0010:cpuidle_enter_state+0xff/0x470
[11176.742449] Code: 0f a3 05 04 54 24 01 0f 82 70 02 00 00 31 ff e8 37 
5d 6f ff 80 7d d7 00 0f 85 e9 01 00 00 e8 58 a2 7f ff fb 66 0f 1f 44 00 
00 <45> 85 ff 0f 88 01 01 00 00 49 63 c7 4c 2b 75 c8 48 8d 14 40 48 8d

[11176.742455] RSP: 0018:b6970021fe48 EFLAGS: 0202
[11176.742461] RAX: 0059be25 RBX: 0002 RCX: 

[11176.742465] RDX:  RSI:  RDI: 
9efeed78
[11176.742470] RBP: b6970021fe80 R08: 0001 R09: 
0001
[11176.742473] R10: 0001 R11: 0001 R12: 
9c3350b0e800
[11176.742477] R13: a00e9680 R14: 0a2a49ada060 R15: 
0002

[11176.742483]  ? cpuidle_enter_state+0xf8/0x470
[11176.742489]  ? cpuidle_enter_state+0xf8/0x470
[11176.742495]  cpuidle_enter+0x2e/0x40
[11176.742500]  call_cpuidle+0x23/0x40
[11176.742506]  do_idle+0x201/0x280
[11176.742512]  cpu_startup_entry+0x20/0x30
[11176.742517]  start_secondary+0x11f/0x160
[11176.742523]  secondary_startup_64_no_verify+0xb0/0xbb



[PATCH] drm/amdgpu: During s0ix don't wait to signal GFXOFF

2021-10-01 Thread Lijo Lazar
In the rare event when GFX IP suspend coincides with an s0ix entry, don't
schedule delayed work; instead, signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about GFXOFF status before the amd-pmc module passes the OS HINT
to PMFW telling it that everything is ready for a safe s0ix entry.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index e7f06bd0f0cd..1916ec84dd71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -31,6 +31,8 @@
 /* delay 0.1 second to enable gfx off feature */
 #define GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100)
 
+#define GFX_OFF_NO_DELAY 0
+
 /*
  * GPU GFX IP block helpers function.
  */
@@ -558,6 +560,8 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev)
 
 void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
 {
+   unsigned long delay = GFX_OFF_DELAY_ENABLE;
+
if (!(adev->pm.pp_feature & PP_GFXOFF_MASK))
return;
 
@@ -573,8 +577,14 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)
 
adev->gfx.gfx_off_req_count--;
 
-   if (adev->gfx.gfx_off_req_count == 0 && 
!adev->gfx.gfx_off_state)
-   schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);
+   if (adev->gfx.gfx_off_req_count == 0 &&
+   !adev->gfx.gfx_off_state) {
+   /* If going to s2idle, no need to wait */
+   if (adev->in_s0ix)
+   delay = GFX_OFF_NO_DELAY;
+   schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
+ delay);
+   }
} else {
if (adev->gfx.gfx_off_req_count == 0) {
cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
-- 
2.17.1



Re: [PATCH] drm/amdgpu: remove some repeated includes

2021-10-01 Thread Christian König

Am 01.10.21 um 12:13 schrieb Guo Zhengkui:

Remove two repeated includes in lines 46 and 47.

Signed-off-by: Guo Zhengkui 


Acked-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 --
  1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 291a47f7992a..daa798c5b882 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -30,34 +30,32 @@
  
  #include "soc15.h"

  #include "gfx_v9_0.h"
  #include "gmc_v9_0.h"
  #include "df_v1_7.h"
  #include "df_v3_6.h"
  #include "nbio_v6_1.h"
  #include "nbio_v7_0.h"
  #include "nbio_v7_4.h"
  #include "hdp_v4_0.h"
  #include "vega10_ih.h"
  #include "vega20_ih.h"
  #include "sdma_v4_0.h"
  #include "uvd_v7_0.h"
  #include "vce_v4_0.h"
  #include "vcn_v1_0.h"
-#include "vcn_v2_0.h"
-#include "jpeg_v2_0.h"
  #include "vcn_v2_5.h"
  #include "jpeg_v2_5.h"
  #include "smuio_v9_0.h"
  #include "gmc_v10_0.h"
  #include "gfxhub_v2_0.h"
  #include "mmhub_v2_0.h"
  #include "nbio_v2_3.h"
  #include "nbio_v7_2.h"
  #include "hdp_v5_0.h"
  #include "nv.h"
  #include "navi10_ih.h"
  #include "gfx_v10_0.h"
  #include "sdma_v5_0.h"
  #include "sdma_v5_2.h"
  #include "vcn_v2_0.h"
  #include "jpeg_v2_0.h"




RE: [PATCH] drm/amdgpu: add missed write lock for pci detected state pci_channel_io_normal

2021-10-01 Thread Chen, Guchun
[Public]

Hi Andrey,

Do you mean to move the code of drm_sched_resubmit_jobs and drm_sched_start in 
amdgpu_pci_resume to amdgpu_pci_error_detected, under the case 
pci_channel_io_frozen?
Then leave amdgpu_pci_resume as a null function, and in this way, we can drop 
the acquire/lock write lock for case of pci_channel_io_normal as well?

Regards,
Guchun

-Original Message-
From: Grodzovsky, Andrey  
Sent: Friday, October 1, 2021 10:22 AM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; Koenig, 
Christian ; Pan, Xinhui ; 
Deucher, Alexander 
Subject: Re: [PATCH] drm/amdgpu: add missed write lock for pci detected state 
pci_channel_io_normal

On 2021-09-30 10:00 p.m., Guchun Chen wrote:

> When a PCI error state pci_channel_io_normal is detected, it will 
> report PCI_ERS_RESULT_CAN_RECOVER status to the PCI driver, and the PCI driver 
> will continue the execution of the PCI resume callback report_resume by 
> pci_walk_bridge, and the callback will finally go into amdgpu_pci_resume, 
> where the write lock is released unconditionally without having been 
> acquired.


Good catch, but the issue is even wider in scope: what about 
drm_sched_resubmit_jobs and drm_sched_start being called without the scheduler 
having been stopped first? Better to put the entire body of this function under 
a flag that is set only for pci_channel_io_frozen. As far as I remember, we 
don't need to do anything in the pci_channel_io_normal case.

Andrey


>
> Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery")
> Signed-off-by: Guchun Chen 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index bb5ad2b6ca13..12f822d51de2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5370,6 +5370,7 @@ pci_ers_result_t 
> amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
>   
>   switch (state) {
>   case pci_channel_io_normal:
> + amdgpu_device_lock_adev(adev, NULL);
>   return PCI_ERS_RESULT_CAN_RECOVER;
>   /* Fatal error, prepare for slot reset */
>   case pci_channel_io_frozen:


Re: [PATCH] drm/amdgpu: fix some repeated includes

2021-10-01 Thread Christian König

It means that you should modify your patch, yes.

Regards,
Christian.

Am 01.10.21 um 05:02 schrieb 郭正奎:


So, it means I need to make another commit?

Zhengkui

*From:*guozheng...@vivo.com  *On Behalf Of 
*Christian König

*Sent:* Thursday, September 30, 2021 7:56 PM
*To:* Guo Zhengkui ; Simon Ser 
*Cc:* Deucher, Alexander ; Pan, Xinhui 
; David Airlie ; Daniel Vetter 
; Chen, Guchun ; Zhou, Peng Ju 
; Zhang, Bokun ; Gao, Likun 
; amd-gfx@lists.freedesktop.org; 
dri-de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; kernel 


*Subject:* Re: [PATCH] drm/amdgpu: fix some repeated includings

Ah, that makes more sense. Then please remove the duplicates in lines 
46 and 47 instead since the other ones are more correctly grouped 
together with their blocks.


Christian.

Am 30.09.21 um 13:54 schrieb 郭正奎:

Actually the duplicates take place in line 46, 47 and 62, 63.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c

index 291a47f7992a..94fca56583a0 100644

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c

+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c

@@ -46,34 +46,32 @@

#include "vcn_v2_0.h"

#include "jpeg_v2_0.h"

#include "vcn_v2_5.h"

#include "jpeg_v2_5.h"

#include "smuio_v9_0.h"

#include "gmc_v10_0.h"

#include "gfxhub_v2_0.h"

#include "mmhub_v2_0.h"

#include "nbio_v2_3.h"

#include "nbio_v7_2.h"

#include "hdp_v5_0.h"

#include "nv.h"

#include "navi10_ih.h"

#include "gfx_v10_0.h"

#include "sdma_v5_0.h"

#include "sdma_v5_2.h"

-#include "vcn_v2_0.h"

-#include "jpeg_v2_0.h"

#include "vcn_v3_0.h"

#include "jpeg_v3_0.h"

#include "amdgpu_vkms.h"

#include "mes_v10_1.h"

#include "smuio_v11_0.h"

#include "smuio_v11_0_6.h"

#include "smuio_v13_0.h"

MODULE_FIRMWARE("amdgpu/ip_discovery.bin");

#define mmRCC_CONFIG_MEMSIZE   0xde3

#define mmMM_INDEX 0x0

#define mmMM_INDEX_HI  0x6

#define mmMM_DATA  0x1

static const char *hw_id_names[HW_ID_MAX] = {





Re: [PATCH v2 10/12] lib: add support for device public type in test_hmm

2021-10-01 Thread Alistair Popple
On Tuesday, 14 September 2021 2:16:02 AM AEST Alex Sierra wrote:
> Device Public type uses device memory that is coherently accessible by
> the CPU. This could be shown as an SP (special purpose) memory range
> in the BIOS-e820 memory enumeration. If no SP memory is supported in
> the system, this can be faked by setting CONFIG_EFI_FAKE_MEMMAP.
> 
> Currently, test_hmm only supports two different SP ranges of at least
> 256MB size. This could be specified in the kernel parameter variable
> efi_fake_mem. Ex. Two SP ranges of 1GB starting at 0x1 &
> 0x14000 physical address. Ex.
> efi_fake_mem=1G@0x1:0x4,1G@0x14000:0x4
> 
> Signed-off-by: Alex Sierra 
> ---
>  lib/test_hmm.c  | 166 +++-
>  lib/test_hmm_uapi.h |  10 ++-
>  2 files changed, 113 insertions(+), 63 deletions(-)
> 
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index ef27e355738a..e346a48e2509 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -469,6 +469,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
> *mdevice,
>   unsigned long pfn_first;
>   unsigned long pfn_last;
>   void *ptr;
> + int ret = -ENOMEM;
>  
>   devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
>   if (!devmem)
> @@ -551,7 +552,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
> *mdevice,
>   }
>   spin_unlock(&mdevice->lock);
>  
> - return true;
> + return 0;
>  
>  err_release:
>   mutex_unlock(&mdevice->devmem_lock);
> @@ -560,7 +561,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
> *mdevice,
>  err_devmem:
>   kfree(devmem);
>  
> - return false;
> + return ret;
>  }
>  
>  static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
> @@ -569,8 +570,10 @@ static struct page *dmirror_devmem_alloc_page(struct 
> dmirror_device *mdevice)
>   struct page *rpage;
>  
>   /*
> -  * This is a fake device so we alloc real system memory to store
> -  * our device memory.
> +  * For ZONE_DEVICE private type, this is a fake device so we alloc real
> +  * system memory to store our device memory.
> +  * For ZONE_DEVICE public type we use the actual dpage to store the data
> +  * and ignore rpage.
>*/
>   rpage = alloc_page(GFP_HIGHUSER);
>   if (!rpage)
> @@ -603,7 +606,7 @@ static void dmirror_migrate_alloc_and_copy(struct 
> migrate_vma *args,
>  struct dmirror *dmirror)
>  {
>   struct dmirror_device *mdevice = dmirror->mdevice;
> - const unsigned long *src = args->src;
> + unsigned long *src = args->src;
>   unsigned long *dst = args->dst;
>   unsigned long addr;
>  
> @@ -621,12 +624,18 @@ static void dmirror_migrate_alloc_and_copy(struct 
> migrate_vma *args,
>* unallocated pte_none() or read-only zero page.
>*/
>   spage = migrate_pfn_to_page(*src);
> -
> + if (spage && is_zone_device_page(spage)) {
> + pr_debug("page already in device spage pfn: 0x%lx\n",
> +   page_to_pfn(spage));
> + *src &= ~MIGRATE_PFN_MIGRATE;

I don't think this is quite correct; callers shouldn't modify the src array. To
mark a page as not migrating, callers need to set *dst = 0.

However, I think this should be considered a test failure anyway. If we are
migrating from system to device memory we would have set
MIGRATE_VMA_SELECT_SYSTEM, meaning no device private pages should be returned.
Therefore I don't think we can reach this unless there is a bug, right?

> + continue;
> + }
>   dpage = dmirror_devmem_alloc_page(mdevice);
>   if (!dpage)
>   continue;
>  
> - rpage = dpage->zone_device_data;
> + rpage = is_device_private_page(dpage) ? dpage->zone_device_data 
> :
> + dpage;
>   if (spage)
>   copy_highpage(rpage, spage);
>   else
> @@ -638,8 +647,10 @@ static void dmirror_migrate_alloc_and_copy(struct 
> migrate_vma *args,
>* the simulated device memory and that page holds the pointer
>* to the mirror.
>*/
> + rpage = dpage->zone_device_data;
>   rpage->zone_device_data = dmirror;
> -
> + pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 
> 0x%lx\n",
> +  page_to_pfn(spage), page_to_pfn(dpage));
>   *dst = migrate_pfn(page_to_pfn(dpage)) |
>   MIGRATE_PFN_LOCKED;
>   if ((*src & MIGRATE_PFN_WRITE) ||
> @@ -673,10 +684,13 @@ static int dmirror_migrate_finalize_and_map(struct 
> migrate_vma *args,
>   continue;
>  
>   /*
> -  * Store the page that holds the data so the page table
> -  * doesn't have 

Re: [PATCH v2 09/12] lib: test_hmm add module param for zone device type

2021-10-01 Thread Alistair Popple
On Friday, 24 September 2021 1:52:47 AM AEST Sierra Guiza, Alejandro (Alex) 
wrote:
> 
> On 9/21/2021 12:14 AM, Alistair Popple wrote:
> > On Tuesday, 21 September 2021 6:05:30 AM AEST Sierra Guiza, Alejandro 
> > (Alex) wrote:
> >> On 9/20/2021 3:53 AM, Alistair Popple wrote:
> >>> On Tuesday, 14 September 2021 2:16:01 AM AEST Alex Sierra wrote:
>  In order to configure the device public type in test_hmm, two module
>  parameters should be passed, corresponding to the SP start address of
>  each of the two devices: spm_addr_dev0 & spm_addr_dev1. If no parameters
>  are passed, the private device type is configured.
> >>> It's a pity that this seems to require some amount of special setup to
> >>> test. Is there a way it could be made to work on a more standard setup,
> >>> similar to how DEVICE_PRIVATE is tested?
> >> Hi Alistair,
> >> We tried to make it as simple as possible. Unfortunately, there are two
> >> main requirements for registering device memory as the DEVICE_PUBLIC type:
> >> this memory must NOT be accessed by any memory allocator (SLAB, SLOB,
> >> SLUB), and it has to be coherently accessible from the CPU. We also want
> >> to avoid aliasing the same PFNs for different page types (regular system
> >> memory and DEVICE_PUBLIC), so we don't want the reserved memory to be part
> >> of the kernel's memory map before we call memremap_pages. A transparent
> >> way of doing this, without any special HW, was setting a portion of system
> >> memory as SPM (Special Purpose Memory) and using it as our fake device
> >> memory.
> > Ok, I think it's great that we can test this without special HW but the boot
> > time configuration is still a bit annoying. Would it be possible to allocate
> > memory fitting the above requirements by hot unplugging it with something 
> > like
> > offline_and_remove_memory()?
> > I also don't see why the DEVICE_PRIVATE and DEVICE_PUBLIC testing should be
> > mutually exclusive - why can't we test both without reloading the module?
> You could run both the DEVICE_PRIVATE and DEVICE_PUBLIC tests by running the
> test_hmm.sh script twice, passing the proper parameters each time, even when
> you booted with fake EFI SP regions. If the spm_addr_dev0/1 parameters are
> passed, the module gets configured with the DEVICE_PUBLIC type; otherwise
> the DEVICE_PRIVATE type is set. Technically, the only complication in
> testing DEVICE_PUBLIC is the fake SPM boot parameter.
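For reference, a sketch of the two module configurations being discussed. The SPM addresses below are placeholders (use your actual fake EFI SP ranges), and the commands are only echoed here rather than executed:

```shell
# DEVICE_PRIVATE: no parameters needed.
load_private="modprobe test_hmm"

# DEVICE_PUBLIC/generic: both SPM start addresses must be given
# (placeholder values, not real reserved ranges).
load_public="modprobe test_hmm spm_addr_dev0=0x100000000 spm_addr_dev1=0x140000000"

echo "$load_private"
echo "$load_public"
```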

Or you could just have the test specify what sort of memory it wants to use
(DEVICE_PRIVATE or DEVICE_GENERIC). That seems preferable to requiring a module
reload. A module reload also makes it impossible to test interactions between
DEVICE_PRIVATE and DEVICE_GENERIC memory.

 - Alistair

> Alex Sierra
> >
> >   - Alistair
> >
> >> Regards,
> >> Alex Sierra
> >>
>  Signed-off-by: Alex Sierra 
>  ---
>  v5:
>  Remove devmem->pagemap.type = MEMORY_DEVICE_PRIVATE at
>  dmirror_allocate_chunk that was forcing to configure pagemap.type
>  to MEMORY_DEVICE_PRIVATE
> 
>  v6:
>  Check for null pointers for resource and memremap references
>  at dmirror_allocate_chunk
> 
>  v7:
>  Because the patch "kernel: resource: lookup_resource as exported
>  symbol" was dropped from this patch series, lookup_resource is no
>  longer a callable function. It was used in the public device
>  configuration to get the start and end addresses for the pgmap->range
>  struct. This information is now taken directly from the spm_addr_devX
>  parameters and the fixed size DEVMEM_CHUNK_SIZE.
>  ---
> lib/test_hmm.c  | 66 +++--
> lib/test_hmm_uapi.h |  1 +
> 2 files changed, 47 insertions(+), 20 deletions(-)
> 
>  diff --git a/lib/test_hmm.c b/lib/test_hmm.c
>  index 3cd91ca31dd7..ef27e355738a 100644
>  --- a/lib/test_hmm.c
>  +++ b/lib/test_hmm.c
>  @@ -33,6 +33,16 @@
> #define DEVMEM_CHUNK_SIZE (256 * 1024 * 1024U)
> #define DEVMEM_CHUNKS_RESERVE 16
> 
>  +static unsigned long spm_addr_dev0;
>  +module_param(spm_addr_dev0, long, 0644);
>  +MODULE_PARM_DESC(spm_addr_dev0,
>  +"Specify start address for SPM (special purpose memory) 
>  used for device 0. By setting this Generic device type will be used. 
>  Make sure spm_addr_dev1 is set too");
>  +
>  +static unsigned long spm_addr_dev1;
>  +module_param(spm_addr_dev1, long, 0644);
>  +MODULE_PARM_DESC(spm_addr_dev1,
>  +"Specify start address for SPM (special purpose memory) 
>  used for device 1. By setting this Generic device type will be used. 
>  Make sure spm_addr_dev0 is set too");
>  +
> static const struct dev_pagemap_ops dmirror_devmem_ops;
> static const struct mmu_interval_notifier_ops dmirror_min_ops;
> static dev_t dmirror_dev;
>  @@ -450,11 +460,11 @@ static int 

RE: [PATCH] drm/amdgpu: fix some repeated includings

2021-10-01 Thread 郭正奎
So, does this mean I need to make another commit?

Zhengkui
From: guozheng...@vivo.com  On Behalf Of Christian König
Sent: Thursday, September 30, 2021 7:56 PM
To: Guo Zhengkui ; Simon Ser 
Cc: Deucher, Alexander ; Pan, Xinhui 
; David Airlie ; Daniel Vetter 
; Chen, Guchun ; Zhou, Peng Ju 
; Zhang, Bokun ; Gao, Likun 
; amd-gfx@lists.freedesktop.org; 
dri-de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; kernel 

Subject: Re: [PATCH] drm/amdgpu: fix some repeated includings

Ah, that makes more sense. Then please remove the duplicates in lines 46 and 47 
instead since the other ones are more correctly grouped together with their 
blocks.
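Duplicates like these can also be spotted mechanically. A throwaway demo (a scratch file standing in for amdgpu_discovery.c, not the real source):

```shell
# Build a small demo file containing duplicated #include directives.
cat > /tmp/demo_includes.c <<'EOF'
#include "vcn_v2_0.h"
#include "jpeg_v2_0.h"
#include "smuio_v9_0.h"
#include "vcn_v2_0.h"
#include "jpeg_v2_0.h"
EOF

# Print each #include line that appears more than once.
grep '^#include' /tmp/demo_includes.c | sort | uniq -d
```

Run against the real file, this prints exactly the headers that need de-duplicating.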

Christian.
On 30.09.21 at 13:54, 郭正奎 wrote:
Actually, the duplicates occur on lines 46, 47 and 62, 63.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 291a47f7992a..94fca56583a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -46,34 +46,32 @@
#include "vcn_v2_0.h"
#include "jpeg_v2_0.h"
#include "vcn_v2_5.h"
#include "jpeg_v2_5.h"
#include "smuio_v9_0.h"
#include "gmc_v10_0.h"
#include "gfxhub_v2_0.h"
#include "mmhub_v2_0.h"
#include "nbio_v2_3.h"
#include "nbio_v7_2.h"
#include "hdp_v5_0.h"
#include "nv.h"
#include "navi10_ih.h"
#include "gfx_v10_0.h"
#include "sdma_v5_0.h"
#include "sdma_v5_2.h"
-#include "vcn_v2_0.h"
-#include "jpeg_v2_0.h"
#include "vcn_v3_0.h"
#include "jpeg_v3_0.h"
#include "amdgpu_vkms.h"
#include "mes_v10_1.h"
#include "smuio_v11_0.h"
#include "smuio_v11_0_6.h"
#include "smuio_v13_0.h"

MODULE_FIRMWARE("amdgpu/ip_discovery.bin");

#define mmRCC_CONFIG_MEMSIZE   0xde3
#define mmMM_INDEX 0x0
#define mmMM_INDEX_HI  0x6
#define mmMM_DATA  0x1

static const char *hw_id_names[HW_ID_MAX] = {