Re: [PATCH] drm/amd/display: fix hw rotated modes when PSR-SU is enabled

2023-12-07 Thread Kai-Heng Feng
On Thu, Dec 7, 2023 at 9:57 AM Mario Limonciello
 wrote:
>
> On 12/6/2023 19:23, Kai-Heng Feng wrote:
> > On Wed, Dec 6, 2023 at 4:29 AM Mario Limonciello
> >  wrote:
> >>
> >> On 12/5/2023 14:17, Hamza Mahfooz wrote:
> >>> We currently don't support dirty rectangles on hardware rotated modes.
> >>> So, if a user is using hardware rotated modes with PSR-SU enabled,
> >>> use PSR-SU FFU for all rotated planes (including cursor planes).
> >>>
> >>
> >> Here is the email for the original reporter to give an attribution tag.
> >>
> >> Reported-by: Kai-Heng Feng 
> >
> > For this particular issue,
> > Tested-by: Kai-Heng Feng 
>
> Can you confirm what kernel base you tested the issue against?
>
> I ask because Bin Li (+CC) also tested it against a 6.1-based LTS kernel
> but ran into problems.

The patch was tested against ADSN.

>
> I wonder if it's because of other dependency patches.  If that's the
> case it would be good to call them out in the Cc: @stable as
> dependencies so when Greg or Sasha backport this 6.1 doesn't get broken.

Probably. I haven't really tested any older kernel series.

Kai-Heng

>
> Bin,
>
> Could you run ./scripts/decode_stacktrace.sh on your kernel trace to
> give us a specific line number on the issue you hit?
>
> Thanks!
> >
> >>
> >>> Cc: sta...@vger.kernel.org
> >>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2952
> >>> Fixes: 30ebe41582d1 ("drm/amd/display: add FB_DAMAGE_CLIPS support")
> >>> Signed-off-by: Hamza Mahfooz 
> >>> ---
> >>>drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c|  4 
> >>>drivers/gpu/drm/amd/display/dc/dc_hw_types.h |  1 +
> >>>drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c| 12 ++--
> >>>.../gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c  |  3 ++-
> >>>4 files changed, 17 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> >>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>> index c146dc9cba92..79f8102d2601 100644
> >>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>> @@ -5208,6 +5208,7 @@ static void fill_dc_dirty_rects(struct drm_plane 
> >>> *plane,
> >>>bool bb_changed;
> >>>bool fb_changed;
> >>>u32 i = 0;
> >>> +
> >>
> >> Looks like a spurious newline here.
> >>
> >>>*dirty_regions_changed = false;
> >>>
> >>>/*
> >>> @@ -5217,6 +5218,9 @@ static void fill_dc_dirty_rects(struct drm_plane 
> >>> *plane,
> >>>if (plane->type == DRM_PLANE_TYPE_CURSOR)
> >>>return;
> >>>
> >>> + if (new_plane_state->rotation != DRM_MODE_ROTATE_0)
> >>> + goto ffu;
> >>> +
> >>
> >> I noticed that the original report was specifically on 180°.  Since
> >> you're also covering 90° and 270° with this check it sounds like it's
> >> actually problematic on those too?
> >
> > 90 & 270 are problematic too. But from what I observed the issue is
> > much more than just cursors.
>
> Got it; thanks.
>
> >
> > Kai-Heng
> >
> >>
> >>>num_clips = drm_plane_get_damage_clips_count(new_plane_state);
> >>>clips = drm_plane_get_damage_clips(new_plane_state);
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h 
> >>> b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
> >>> index 9649934ea186..e2a3aa8812df 100644
> >>> --- a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
> >>> +++ b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
> >>> @@ -465,6 +465,7 @@ struct dc_cursor_mi_param {
> >>>struct fixed31_32 v_scale_ratio;
> >>>enum dc_rotation_angle rotation;
> >>>bool mirror;
> >>> + struct dc_stream_state *stream;
> >>>};
> >>>
> >>>/* IPP related types */
> >>> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c 
> >>> b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
> >>> index 139cf31d2e45..89c3bf0fe0c9 100644
> >>> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
> >

Re: [PATCH] drm/amd/display: fix hw rotated modes when PSR-SU is enabled

2023-12-07 Thread Kai-Heng Feng
On Thu, Dec 7, 2023 at 10:10 AM Mario Limonciello
 wrote:
>
> On 12/6/2023 20:07, Kai-Heng Feng wrote:
> > On Thu, Dec 7, 2023 at 9:57 AM Mario Limonciello
> >  wrote:
> >>
> >> On 12/6/2023 19:23, Kai-Heng Feng wrote:
> >>> On Wed, Dec 6, 2023 at 4:29 AM Mario Limonciello
> >>>  wrote:
> >>>>
> >>>> On 12/5/2023 14:17, Hamza Mahfooz wrote:
> >>>>> We currently don't support dirty rectangles on hardware rotated modes.
> >>>>> So, if a user is using hardware rotated modes with PSR-SU enabled,
> >>>>> use PSR-SU FFU for all rotated planes (including cursor planes).
> >>>>>
> >>>>
> >>>> Here is the email for the original reporter to give an attribution tag.
> >>>>
> >>>> Reported-by: Kai-Heng Feng 
> >>>
> >>> For this particular issue,
> >>> Tested-by: Kai-Heng Feng 
> >>
> >> Can you confirm what kernel base you tested the issue against?
> >>
> >> I ask because Bin Li (+CC) also tested it against a 6.1-based LTS kernel
> >> but ran into problems.
> >
> > The patch was tested against ADSN.
> >
> >>
> >> I wonder if it's because of other dependency patches.  If that's the
> >> case it would be good to call them out in the Cc: @stable as
> >> dependencies so when Greg or Sasha backport this 6.1 doesn't get broken.
> >
> > Probably. I haven't really tested any older kernel series.
>
> Since you've got a good environment to test it and reproduce it would
> you mind double checking it against 6.7-rc, 6.5 and 6.1 trees?  If we
> don't have confidence it works on the older trees I think we'll need to
> drop the stable tag.

Not seeing issues here when the patch is applied against 6.5 and 6.1
(the 6.1 backport needs a minor conflict resolved).

I am not sure what happened for Bin's case.

Kai-Heng

> >
> > Kai-Heng
> >
> >>
> >> Bin,
> >>
> >> Could you run ./scripts/decode_stacktrace.sh on your kernel trace to
> >> give us a specific line number on the issue you hit?
> >>
> >> Thanks!
> >>>
> >>>>
> >>>>> Cc: sta...@vger.kernel.org
> >>>>> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2952
> >>>>> Fixes: 30ebe41582d1 ("drm/amd/display: add FB_DAMAGE_CLIPS support")
> >>>>> Signed-off-by: Hamza Mahfooz 
> >>>>> ---
> >>>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c|  4 
> >>>>> drivers/gpu/drm/amd/display/dc/dc_hw_types.h |  1 +
> >>>>> drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c| 12 
> >>>>> ++--
> >>>>> .../gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c  |  3 ++-
> >>>>> 4 files changed, 17 insertions(+), 3 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> >>>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>>> index c146dc9cba92..79f8102d2601 100644
> >>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>>> @@ -5208,6 +5208,7 @@ static void fill_dc_dirty_rects(struct drm_plane 
> >>>>> *plane,
> >>>>> bool bb_changed;
> >>>>> bool fb_changed;
> >>>>> u32 i = 0;
> >>>>> +
> >>>>
> >>>> Looks like a spurious newline here.
> >>>>
> >>>>> *dirty_regions_changed = false;
> >>>>>
> >>>>> /*
> >>>>> @@ -5217,6 +5218,9 @@ static void fill_dc_dirty_rects(struct drm_plane 
> >>>>> *plane,
> >>>>> if (plane->type == DRM_PLANE_TYPE_CURSOR)
> >>>>> return;
> >>>>>
> >>>>> + if (new_plane_state->rotation != DRM_MODE_ROTATE_0)
> >>>>> + goto ffu;
> >>>>> +
> >>>>
> >>>> I noticed that the original report was specifically on 180°.  Since
> >>>> you're also covering 90° and 270° with this check it sounds like it's
> >>>> actually problematic on those too?
> >>>
> >>> 90 & 270 are problematic too. But from what I observed

Re: [PATCH] drm/amd/display: fix hw rotated modes when PSR-SU is enabled

2023-12-07 Thread Kai-Heng Feng
On Wed, Dec 6, 2023 at 4:29 AM Mario Limonciello
 wrote:
>
> On 12/5/2023 14:17, Hamza Mahfooz wrote:
> > We currently don't support dirty rectangles on hardware rotated modes.
> > So, if a user is using hardware rotated modes with PSR-SU enabled,
> > use PSR-SU FFU for all rotated planes (including cursor planes).
> >
>
> Here is the email for the original reporter to give an attribution tag.
>
> Reported-by: Kai-Heng Feng 

For this particular issue,
Tested-by: Kai-Heng Feng 

>
> > Cc: sta...@vger.kernel.org
> > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2952
> > Fixes: 30ebe41582d1 ("drm/amd/display: add FB_DAMAGE_CLIPS support")
> > Signed-off-by: Hamza Mahfooz 
> > ---
> >   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c|  4 
> >   drivers/gpu/drm/amd/display/dc/dc_hw_types.h |  1 +
> >   drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c| 12 ++--
> >   .../gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c  |  3 ++-
> >   4 files changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index c146dc9cba92..79f8102d2601 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -5208,6 +5208,7 @@ static void fill_dc_dirty_rects(struct drm_plane 
> > *plane,
> >   bool bb_changed;
> >   bool fb_changed;
> >   u32 i = 0;
> > +
>
> Looks like a spurious newline here.
>
> >   *dirty_regions_changed = false;
> >
> >   /*
> > @@ -5217,6 +5218,9 @@ static void fill_dc_dirty_rects(struct drm_plane 
> > *plane,
> >   if (plane->type == DRM_PLANE_TYPE_CURSOR)
> >   return;
> >
> > + if (new_plane_state->rotation != DRM_MODE_ROTATE_0)
> > + goto ffu;
> > +
>
> I noticed that the original report was specifically on 180°.  Since
> you're also covering 90° and 270° with this check it sounds like it's
> actually problematic on those too?

90 & 270 are problematic too. But from what I observed the issue is
much more than just cursors.
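
To make the fallback concrete, here is a minimal standalone sketch of the
idea in the hunk above: any rotation other than 0 skips the damage clips and
reports one rect covering the whole plane. It is not the driver code. The
sketch_* names, the rect layout and the "too many clips" fallback are
assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRM rotation bits; only the idea matters. */
enum sketch_rotation {
	SKETCH_ROTATE_0   = 1 << 0,
	SKETCH_ROTATE_90  = 1 << 1,
	SKETCH_ROTATE_180 = 1 << 2,
	SKETCH_ROTATE_270 = 1 << 3,
};

struct sketch_rect {
	int x, y, w, h;
};

/*
 * Copy damage clips into out[] and return how many rects were written.
 * A rotated plane (or more clips than fit in out[]) falls back to a single
 * rect covering the whole plane, i.e. a full-frame update.
 */
static size_t sketch_fill_dirty_rects(enum sketch_rotation rotation,
				      const struct sketch_rect *clips,
				      size_t num_clips,
				      int plane_w, int plane_h,
				      struct sketch_rect *out, size_t max_out)
{
	size_t i;

	if (max_out == 0)
		return 0;

	if (rotation != SKETCH_ROTATE_0 || num_clips > max_out) {
		out[0] = (struct sketch_rect){ 0, 0, plane_w, plane_h };
		return 1;
	}

	for (i = 0; i < num_clips; i++)
		out[i] = clips[i];

	return num_clips;
}
```

With this shape, 90, 180 and 270 all take the same path, which matches the
single "!= DRM_MODE_ROTATE_0" test in the patch.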

Kai-Heng

>
> >   num_clips = drm_plane_get_damage_clips_count(new_plane_state);
> >   clips = drm_plane_get_damage_clips(new_plane_state);
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h 
> > b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
> > index 9649934ea186..e2a3aa8812df 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
> > +++ b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
> > @@ -465,6 +465,7 @@ struct dc_cursor_mi_param {
> >   struct fixed31_32 v_scale_ratio;
> >   enum dc_rotation_angle rotation;
> >   bool mirror;
> > + struct dc_stream_state *stream;
> >   };
> >
> >   /* IPP related types */
> > diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c 
> > b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
> > index 139cf31d2e45..89c3bf0fe0c9 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
> > @@ -1077,8 +1077,16 @@ void hubp2_cursor_set_position(
> >   if (src_y_offset < 0)
> >   src_y_offset = 0;
> >   /* Save necessary cursor info x, y position. w, h is saved in 
> > attribute func. */
> > - hubp->cur_rect.x = src_x_offset + param->viewport.x;
> > - hubp->cur_rect.y = src_y_offset + param->viewport.y;
> > + if (param->stream->link->psr_settings.psr_version >= 
> > DC_PSR_VERSION_SU_1 &&
> > + param->rotation != ROTATION_ANGLE_0) {
>
> Ditto on above about 90° and 270°.
>
> > + hubp->cur_rect.x = 0;
> > + hubp->cur_rect.y = 0;
> > + hubp->cur_rect.w = param->stream->timing.h_addressable;
> > + hubp->cur_rect.h = param->stream->timing.v_addressable;
> > + } else {
> > + hubp->cur_rect.x = src_x_offset + param->viewport.x;
> > + hubp->cur_rect.y = src_y_offset + param->viewport.y;
> > + }
> >   }
> >
> >   void hubp2_clk_cntl(struct hubp *hubp, bool enable)
> > diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c 
> > b/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c
> > index 2b8b8366538e..ce5613a76267 100644
> > --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c
> >

Re: [PATCH 1/2] drm/amdgpu: Reset GPU on S0ix when device supports BOCO

2023-03-30 Thread Kai-Heng Feng
On Wed, Mar 29, 2023 at 9:23 PM Mario Limonciello
 wrote:
>
>
> On 3/29/23 04:59, Kai-Heng Feng wrote:
> > When the power is lost due to ACPI power resources being turned off, the
> > driver should reset the GPU so it can work anew.
> >
> > First, _PR3 support of the hierarchy needs to be found correctly. Since
> > the GPU on some discrete GFX cards is behind a PCIe switch, checking the
> > _PR3 on the downstream port alone is not enough, as the _PR3 can be
> > associated with the root port above the PCIe switch.
>
> I think this should be split into two commits:
>
> * One of them to look at _PR3 further up in hierarchy to fix indication
> for BOCO support.

Yes, this part can be split up.
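
For the split-out commit, the lookup could look roughly like the standalone
sketch below. It models the walk from the GPU up through every bridge until
one of them has _PR3. The struct and the has_pr3 field are hypothetical
stand-ins for struct pci_dev, pci_upstream_bridge() and pci_pr3_present(); the
sketch tests the parent pointer before each lookup, as the ternary in the line
being replaced did.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PCI device: only what the walk needs. */
struct sketch_pci_dev {
	struct sketch_pci_dev *upstream; /* NULL when there is no bridge above */
	bool has_pr3;                    /* stands in for pci_pr3_present() */
};

/* Return true if any bridge above dev, up to the root port, has _PR3. */
static bool sketch_hierarchy_has_pr3(const struct sketch_pci_dev *dev)
{
	const struct sketch_pci_dev *parent;

	for (parent = dev ? dev->upstream : NULL; parent;
	     parent = parent->upstream) {
		if (parent->has_pr3)
			return true;
	}

	return false;
}
```

A GPU behind a switch (root port, switch upstream port, switch downstream
port, GPU) is then reported as having _PR3 even when only the root port
carries it.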

>
> * One to adjust policy for whether to reset

IIUC, the GPU only needs to be reset when the power status isn't certain?

Assuming power resources in _PR3 are really disabled, GPU is already
reset by itself. That means reset shouldn't be necessary for D3cold,
am I understanding it correctly?

However, this is a desktop with a plugged-in GFX card that has external
power; does that assumption still stand? Performing a reset on D3cold
would cover this scenario.

>
>
> > Once the _PR3 is found and BOCO support is correctly marked, use that
> > information to decide whether the GPU should be reset. This fixes a
> > system freeze on an Intel ADL desktop that uses S0ix for sleep and
> > supports D3cold for the GFX slot.
>
> I'm worried this is still papering over an underlying issue with L0s
> handling on ADL + Navi1x/Navi2x.

Is it possible to get the ASIC's ASPM parameter under Windows? Knowing
the difference can be useful.

>
> Also, what about runtime suspend?  If you unplug the monitor from this
> dGPU and interact with it over SSH it should go into runtime suspend.
>
> Is it working properly for that case now?

Thanks for the tip. Runtime resume doesn't work at all:
[ 1087.601631] pcieport :00:01.0: power state changed by ACPI to D0
[ 1087.613820] pcieport :00:01.0: restoring config space at offset 0x2c (was 0x43, writing 0x43)
[ 1087.613835] pcieport :00:01.0: restoring config space at offset 0x28 (was 0x41, writing 0x41)
[ 1087.613841] pcieport :00:01.0: restoring config space at offset 0x24 (was 0xfff10001, writing 0xfff10001)
[ 1087.613978] pcieport :00:01.0: PME# disabled
[ 1087.613984] pcieport :00:01.0: waiting 100 ms for downstream link, after activation
[ 1089.330956] pcieport :01:00.0: not ready 1023ms after resume; giving up
[ 1089.373036] pcieport :01:00.0: Unable to change power state from D3cold to D0, device inaccessible

After a short while the whole system froze.

So the upstream port of GFX's PCIe switch cannot be powered on again.

Kai-Heng

>
> >
> > Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
> > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
> > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
> > Signed-off-by: Kai-Heng Feng 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c   |  3 +++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  7 ++-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 12 +---
> >   3 files changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > index 60b1857f469e..407456ac0e84 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > @@ -987,6 +987,9 @@ bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device 
> > *adev)
> >   if (amdgpu_sriov_vf(adev))
> >   return false;
> >
> > + if (amdgpu_device_supports_boco(adev_to_drm(adev)))
> > + return true;
> > +
> >   #if IS_ENABLED(CONFIG_SUSPEND)
> >   return pm_suspend_target_state != PM_SUSPEND_TO_IDLE;
> >   #else
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index f5658359ff5c..d56b7a2bafa6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2181,7 +2181,12 @@ static int amdgpu_device_ip_early_init(struct 
> > amdgpu_device *adev)
> >
> >   if (!(adev->flags & AMD_IS_APU)) {
> >   parent = pci_upstream_bridge(adev->pdev);
> > - adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
> > + do {
> > + if (pci_pr3_present(parent)) {
> > + adev->has_pr3 = true;
> > + break;
> > + }

Re: [PATCH 1/2] drm/amdgpu: Reset GPU on S0ix when device supports BOCO

2023-03-29 Thread Kai-Heng Feng
On Wed, Mar 29, 2023 at 9:21 PM Alex Deucher  wrote:
>
> On Wed, Mar 29, 2023 at 6:00 AM Kai-Heng Feng
>  wrote:
> >
> > When the power is lost due to ACPI power resources being turned off, the
> > driver should reset the GPU so it can work anew.
> >
> > First, _PR3 support of the hierarchy needs to be found correctly. Since
> > the GPU on some discrete GFX cards is behind a PCIe switch, checking the
> > _PR3 on the downstream port alone is not enough, as the _PR3 can be
> > associated with the root port above the PCIe switch.
> >
> > Once the _PR3 is found and BOCO support is correctly marked, use that
> > information to decide whether the GPU should be reset. This fixes a
> > system freeze on an Intel ADL desktop that uses S0ix for sleep and
> > supports D3cold for the GFX slot.
>
> I don't think we need to reset the GPU.  If the power is turned off, a
> reset shouldn't be necessary. The reset is only necessary when the
> power is not turned off to put the GPU into a known good state.  It
> should be in that state already if the power is turned off.  It sounds
> like the device is not actually getting powered off.

I had the impression that the GPU gets reset because S3 turned the
power rail off.

So the actual reason for the GPU reset is that S3 doesn't guarantee
the power is turned off?
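
For reference, the policy before this patch, as far as the quoted hunk of
amdgpu_acpi_should_gpu_reset() shows it, reduces to the standalone sketch
below. The enum and function names are hypothetical, and the sketch ignores
the CONFIG_SUSPEND=n branch, which is not visible in the quoted context.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the suspend target state. */
enum sketch_sleep_target {
	SKETCH_TARGET_S0IX, /* suspend-to-idle */
	SKETCH_TARGET_S3,   /* suspend-to-RAM */
};

/*
 * Pre-patch policy as read from the quoted hunk: never reset an SR-IOV
 * virtual function, and otherwise reset unless the target is
 * suspend-to-idle.
 */
static bool sketch_should_gpu_reset(bool is_sriov_vf,
				    enum sketch_sleep_target target)
{
	if (is_sriov_vf)
		return false;

	return target != SKETCH_TARGET_S0IX;
}
```

The patch adds a BOCO check ahead of that last test, which is the part under
discussion here.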

Kai-Heng

>
> Alex
>
> >
> > Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
> > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
> > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
> > Signed-off-by: Kai-Heng Feng 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c   |  3 +++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  7 ++-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 12 +---
> >  3 files changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > index 60b1857f469e..407456ac0e84 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > @@ -987,6 +987,9 @@ bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device 
> > *adev)
> > if (amdgpu_sriov_vf(adev))
> > return false;
> >
> > +   if (amdgpu_device_supports_boco(adev_to_drm(adev)))
> > +   return true;
> > +
> >  #if IS_ENABLED(CONFIG_SUSPEND)
> > return pm_suspend_target_state != PM_SUSPEND_TO_IDLE;
> >  #else
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index f5658359ff5c..d56b7a2bafa6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2181,7 +2181,12 @@ static int amdgpu_device_ip_early_init(struct 
> > amdgpu_device *adev)
> >
> > if (!(adev->flags & AMD_IS_APU)) {
> > parent = pci_upstream_bridge(adev->pdev);
> > -   adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
> > +   do {
> > +   if (pci_pr3_present(parent)) {
> > +   adev->has_pr3 = true;
> > +   break;
> > +   }
> > +   } while ((parent = pci_upstream_bridge(parent)));
> > }
> >
> > amdgpu_amdkfd_device_probe(adev);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index ba5def374368..5d81fcac4b0a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -2415,10 +2415,11 @@ static int amdgpu_pmops_suspend(struct device *dev)
> > struct drm_device *drm_dev = dev_get_drvdata(dev);
> > struct amdgpu_device *adev = drm_to_adev(drm_dev);
> >
> > -   if (amdgpu_acpi_is_s0ix_active(adev))
> > -   adev->in_s0ix = true;
> > -   else if (amdgpu_acpi_is_s3_active(adev))
> > +   if (amdgpu_acpi_is_s3_active(adev) ||
> > +   amdgpu_device_supports_boco(drm_dev))
> > adev->in_s3 = true;
> > +   else if (amdgpu_acpi_is_s0ix_active(adev))
> > +   adev->in_s0ix = true;
> > if (!adev->in_s0ix && !adev->in_s3)
> > return 0;
> > return amdgpu_device_suspend(drm_dev, true);
> > @@ -2449,10 +2450,7 @@ static int amdgpu_pmops_resume(struct device *dev)
> > adev->no_hw_access = true;
> >
> > r = amdgpu_device_resume(drm_dev, true);
> > -   if (amdgpu_acpi_is_s0ix_active(adev))
> > -   adev->in_s0ix = false;
> > -   else
> > -   adev->in_s3 = false;
> > +   adev->in_s0ix = adev->in_s3 = false;
> > return r;
> >  }
> >
> > --
> > 2.34.1
> >


[PATCH 2/2] drm/amdgpu: Remove ASPM workaround on VI and NV

2023-03-29 Thread Kai-Heng Feng
Since the original issue is resolved by a new fix, the ASPM workaround
can be dropped.

Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 ---
 drivers/gpu/drm/amd/amdgpu/nv.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/vi.c|  2 +-
 4 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 8cf2cc50b3de..a19a6489b117 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1248,7 +1248,6 @@ void amdgpu_device_pci_config_reset(struct amdgpu_device 
*adev);
 int amdgpu_device_pci_reset(struct amdgpu_device *adev);
 bool amdgpu_device_need_post(struct amdgpu_device *adev);
 bool amdgpu_device_should_use_aspm(struct amdgpu_device *adev);
-bool amdgpu_device_aspm_support_quirk(void);
 
 void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
  u64 num_vis_bytes);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d56b7a2bafa6..0cacace2d6c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -81,10 +81,6 @@
 
 #include 
 
-#if IS_ENABLED(CONFIG_X86)
-#include <asm/intel-family.h>
-#endif
-
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -1377,17 +1373,6 @@ bool amdgpu_device_should_use_aspm(struct amdgpu_device 
*adev)
return pcie_aspm_enabled(adev->pdev);
 }
 
-bool amdgpu_device_aspm_support_quirk(void)
-{
-#if IS_ENABLED(CONFIG_X86)
-   struct cpuinfo_x86 *c = &cpu_data(0);
-
-   return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
-#else
-   return true;
-#endif
-}
-
 /* if we get transitioned to only one device, take VGA back */
 /**
  * amdgpu_device_vga_set_decode - enable/disable vga decode
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 47420b403871..15f3c6745ea9 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -522,7 +522,7 @@ static int nv_set_vce_clocks(struct amdgpu_device *adev, 
u32 evclk, u32 ecclk)
 
 static void nv_program_aspm(struct amdgpu_device *adev)
 {
-   if (!amdgpu_device_should_use_aspm(adev) || 
!amdgpu_device_aspm_support_quirk())
+   if (!amdgpu_device_should_use_aspm(adev))
return;
 
if (!(adev->flags & AMD_IS_APU) &&
diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 531f173ade2d..81dcb1148a60 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1122,7 +1122,7 @@ static void vi_program_aspm(struct amdgpu_device *adev)
bool bL1SS = false;
bool bClkReqSupport = true;
 
-   if (!amdgpu_device_should_use_aspm(adev) || 
!amdgpu_device_aspm_support_quirk())
+   if (!amdgpu_device_should_use_aspm(adev))
return;
 
if (adev->flags & AMD_IS_APU ||
-- 
2.34.1



[PATCH 1/2] drm/amdgpu: Reset GPU on S0ix when device supports BOCO

2023-03-29 Thread Kai-Heng Feng
When the power is lost due to ACPI power resources being turned off, the
driver should reset the GPU so it can work anew.

First, _PR3 support of the hierarchy needs to be found correctly. Since
the GPU on some discrete GFX cards is behind a PCIe switch, checking the
_PR3 on the downstream port alone is not enough, as the _PR3 can be
associated with the root port above the PCIe switch.

Once the _PR3 is found and BOCO support is correctly marked, use that
information to decide whether the GPU should be reset. This fixes a
system freeze on an Intel ADL desktop that uses S0ix for sleep and
supports D3cold for the GFX slot.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c   |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  7 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 12 +---
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index 60b1857f469e..407456ac0e84 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -987,6 +987,9 @@ bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device 
*adev)
if (amdgpu_sriov_vf(adev))
return false;
 
+   if (amdgpu_device_supports_boco(adev_to_drm(adev)))
+   return true;
+
 #if IS_ENABLED(CONFIG_SUSPEND)
return pm_suspend_target_state != PM_SUSPEND_TO_IDLE;
 #else
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f5658359ff5c..d56b7a2bafa6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2181,7 +2181,12 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
 
if (!(adev->flags & AMD_IS_APU)) {
parent = pci_upstream_bridge(adev->pdev);
-   adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
+   do {
+   if (pci_pr3_present(parent)) {
+   adev->has_pr3 = true;
+   break;
+   }
+   } while ((parent = pci_upstream_bridge(parent)));
}
 
amdgpu_amdkfd_device_probe(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index ba5def374368..5d81fcac4b0a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2415,10 +2415,11 @@ static int amdgpu_pmops_suspend(struct device *dev)
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
 
-   if (amdgpu_acpi_is_s0ix_active(adev))
-   adev->in_s0ix = true;
-   else if (amdgpu_acpi_is_s3_active(adev))
+   if (amdgpu_acpi_is_s3_active(adev) ||
+   amdgpu_device_supports_boco(drm_dev))
adev->in_s3 = true;
+   else if (amdgpu_acpi_is_s0ix_active(adev))
+   adev->in_s0ix = true;
if (!adev->in_s0ix && !adev->in_s3)
return 0;
return amdgpu_device_suspend(drm_dev, true);
@@ -2449,10 +2450,7 @@ static int amdgpu_pmops_resume(struct device *dev)
adev->no_hw_access = true;
 
r = amdgpu_device_resume(drm_dev, true);
-   if (amdgpu_acpi_is_s0ix_active(adev))
-   adev->in_s0ix = false;
-   else
-   adev->in_s3 = false;
+   adev->in_s0ix = adev->in_s3 = false;
return r;
 }
 
-- 
2.34.1



[PATCH v2] drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi

2023-03-15 Thread Kai-Heng Feng
S2idle resume freeze can be observed on Intel ADL + AMD WX5500. This is
caused by commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default").

The root cause is still not clear for now.

So extend and apply the ASPM quirk from commit e02fe3bc7aba
("drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems"), to
work around the issue on Navi cards too.
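
As a rough standalone illustration of the quirk (not the driver code), the
check is a predicate on the boot CPU's family and model, combined with the
existing ASPM test. The struct, the sketch_* names and the model constant are
assumptions for this sketch; the kernel uses struct cpuinfo_x86 and
INTEL_FAM6_ALDERLAKE.

```c
#include <stdbool.h>

/* Hypothetical subset of the CPU identification used by the quirk. */
struct sketch_cpu {
	unsigned int family;
	unsigned int model;
};

/* Assumed value, meant to mirror INTEL_FAM6_ALDERLAKE. */
#define SKETCH_MODEL_ALDERLAKE 0x97

/* Return false on the affected CPU, true everywhere else. */
static bool sketch_aspm_support_quirk(const struct sketch_cpu *c)
{
	return !(c->family == 6 && c->model == SKETCH_MODEL_ALDERLAKE);
}

/* ASPM is programmed only if it is enabled and the quirk does not veto it. */
static bool sketch_should_program_aspm(bool aspm_enabled,
				       const struct sketch_cpu *c)
{
	return aspm_enabled && sketch_aspm_support_quirk(c);
}
```

Note the inverted sense: the quirk function returns true when ASPM is
supported, so callers bail out on "!quirk()".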

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
Reviewed-by: Alex Deucher 
Signed-off-by: Kai-Heng Feng 
---
v2:
 - Rename the quirk function.

 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +++
 drivers/gpu/drm/amd/amdgpu/nv.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/vi.c| 17 +
 4 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 164141bc8b4a..5f3b139c1f99 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1272,6 +1272,7 @@ void amdgpu_device_pci_config_reset(struct amdgpu_device 
*adev);
 int amdgpu_device_pci_reset(struct amdgpu_device *adev);
 bool amdgpu_device_need_post(struct amdgpu_device *adev);
 bool amdgpu_device_should_use_aspm(struct amdgpu_device *adev);
+bool amdgpu_device_aspm_support_quirk(void);
 
 void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
  u64 num_vis_bytes);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c4a4e2fe6681..05a34ff79e78 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -80,6 +80,10 @@
 
 #include 
 
+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -1356,6 +1360,17 @@ bool amdgpu_device_should_use_aspm(struct amdgpu_device 
*adev)
return pcie_aspm_enabled(adev->pdev);
 }
 
+bool amdgpu_device_aspm_support_quirk(void)
+{
+#if IS_ENABLED(CONFIG_X86)
+   struct cpuinfo_x86 *c = &cpu_data(0);
+
+   return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
+#else
+   return true;
+#endif
+}
+
 /* if we get transitioned to only one device, take VGA back */
 /**
  * amdgpu_device_vga_set_decode - enable/disable vga decode
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 855d390c41de..26733263913e 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -578,7 +578,7 @@ static void nv_pcie_gen3_enable(struct amdgpu_device *adev)
 
 static void nv_program_aspm(struct amdgpu_device *adev)
 {
-   if (!amdgpu_device_should_use_aspm(adev))
+   if (!amdgpu_device_should_use_aspm(adev) || 
!amdgpu_device_aspm_support_quirk())
return;
 
if (!(adev->flags & AMD_IS_APU) &&
diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 12ef782eb478..ceab8783575c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,10 +81,6 @@
 #include "mxgpu_vi.h"
 #include "amdgpu_dm.h"
 
-#if IS_ENABLED(CONFIG_X86)
-#include <asm/intel-family.h>
-#endif
-
 #define ixPCIE_LC_L1_PM_SUBSTATE   0x100100C6
 #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK   
0x0001L
 #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK   0x0002L
@@ -1138,24 +1134,13 @@ static void vi_enable_aspm(struct amdgpu_device *adev)
WREG32_PCIE(ixPCIE_LC_CNTL, data);
 }
 
-static bool aspm_support_quirk_check(void)
-{
-#if IS_ENABLED(CONFIG_X86)
-   struct cpuinfo_x86 *c = &cpu_data(0);
-
-   return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
-#else
-   return true;
-#endif
-}
-
 static void vi_program_aspm(struct amdgpu_device *adev)
 {
u32 data, data1, orig;
bool bL1SS = false;
bool bClkReqSupport = true;
 
-   if (!amdgpu_device_should_use_aspm(adev) || !aspm_support_quirk_check())
+   if (!amdgpu_device_should_use_aspm(adev) || 
!amdgpu_device_aspm_support_quirk())
return;
 
if (adev->flags & AMD_IS_APU ||
-- 
2.34.1



[PATCH] drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi

2023-03-14 Thread Kai-Heng Feng
S2idle resume freeze can be observed on Intel ADL + AMD WX5500. This is
caused by commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default").

The root cause is still not clear for now.

So extend and apply the ASPM quirk from commit e02fe3bc7aba
("drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems"), to
work around the issue on Navi cards too.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +++
 drivers/gpu/drm/amd/amdgpu/nv.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/vi.c| 15 ---
 4 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 164141bc8b4a..c697580f1ee4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1272,6 +1272,7 @@ void amdgpu_device_pci_config_reset(struct amdgpu_device 
*adev);
 int amdgpu_device_pci_reset(struct amdgpu_device *adev);
 bool amdgpu_device_need_post(struct amdgpu_device *adev);
 bool amdgpu_device_should_use_aspm(struct amdgpu_device *adev);
+bool aspm_support_quirk_check(void);
 
 void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
  u64 num_vis_bytes);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c4a4e2fe6681..c09f19385628 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -80,6 +80,10 @@
 
 #include 
 
+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -1356,6 +1360,17 @@ bool amdgpu_device_should_use_aspm(struct amdgpu_device 
*adev)
return pcie_aspm_enabled(adev->pdev);
 }
 
+bool aspm_support_quirk_check(void)
+{
+#if IS_ENABLED(CONFIG_X86)
+   struct cpuinfo_x86 *c = &cpu_data(0);
+
+   return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
+#else
+   return true;
+#endif
+}
+
 /* if we get transitioned to only one device, take VGA back */
 /**
  * amdgpu_device_vga_set_decode - enable/disable vga decode
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 855d390c41de..921adf66e3c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -578,7 +578,7 @@ static void nv_pcie_gen3_enable(struct amdgpu_device *adev)
 
 static void nv_program_aspm(struct amdgpu_device *adev)
 {
-   if (!amdgpu_device_should_use_aspm(adev))
+   if (!amdgpu_device_should_use_aspm(adev) || !aspm_support_quirk_check())
return;
 
if (!(adev->flags & AMD_IS_APU) &&
diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 12ef782eb478..e61ae372d674 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,10 +81,6 @@
 #include "mxgpu_vi.h"
 #include "amdgpu_dm.h"
 
-#if IS_ENABLED(CONFIG_X86)
-#include <asm/intel-family.h>
-#endif
-
 #define ixPCIE_LC_L1_PM_SUBSTATE   0x100100C6
 #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK   
0x0001L
 #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK   0x0002L
@@ -1138,17 +1134,6 @@ static void vi_enable_aspm(struct amdgpu_device *adev)
WREG32_PCIE(ixPCIE_LC_CNTL, data);
 }
 
-static bool aspm_support_quirk_check(void)
-{
-#if IS_ENABLED(CONFIG_X86)
-   struct cpuinfo_x86 *c = &cpu_data(0);
-
-   return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
-#else
-   return true;
-#endif
-}
-
 static void vi_program_aspm(struct amdgpu_device *adev)
 {
u32 data, data1, orig;
-- 
2.34.1



[PATCH] drm/amdgpu: Ensure HDA function is suspended before ASIC reset

2022-04-07 Thread Kai-Heng Feng
DP/HDMI audio on AMD PRO VII stops working after S3:
[  149.450391] amdgpu :63:00.0: amdgpu: MODE1 reset
[  149.450395] amdgpu :63:00.0: amdgpu: GPU mode1 reset
[  149.450494] amdgpu :63:00.0: amdgpu: GPU psp mode1 reset
[  149.983693] snd_hda_intel :63:00.1: refused to change power state from D0 to D3hot
[  150.003439] amdgpu :63:00.0: refused to change power state from D0 to D3hot
...
[  155.432975] snd_hda_intel :63:00.1: CORB reset timeout#2, CORBRP = 65535

The offending commit is daf8de0874ab5b ("drm/amdgpu: always reset the asic in
suspend (v2)"). Commit 34452ac3038a7 ("drm/amdgpu: don't use BACO for
reset in S3") doesn't help, so the issue is something different.

The working assumption is that the HDA function can only resume to D0
properly if it was successfully put into D3 first. Testing supports this:
moving amdgpu_asic_reset() to the noirq callback, so that it is called
after the HDA function is in D3, makes the audio work again after S3.
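
The ordering this relies on can be sketched with a small userspace model.
It only shows that work placed in a later suspend phase runs after every
device has finished the earlier phase; it does not model when the PCI core
actually changes a function's power state. struct model_dev and all
function names below are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device with two of the suspend phases. */
struct model_dev {
    const char *name;
    void (*suspend)(void);
    void (*suspend_noirq)(void);
};

#define MAX_EVENTS 16
static const char *event_log[MAX_EVENTS];
static size_t event_count;

static void log_event(const char *what)
{
    if (event_count < MAX_EVENTS)
        event_log[event_count++] = what;
}

static void gpu_suspend(void)       { log_event("gpu: suspend"); }
static void gpu_suspend_noirq(void) { log_event("gpu: asic reset"); }
static void hda_suspend(void)       { log_event("hda: suspend"); }

/* Run one phase to completion for every device before the next phase. */
static void run_suspend(const struct model_dev *devs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (devs[i].suspend)
            devs[i].suspend();
    for (i = 0; i < n; i++)
        if (devs[i].suspend_noirq)
            devs[i].suspend_noirq();
}

/* Position of an event in the log, or event_count if it never happened. */
static size_t index_of(const char *what)
{
    size_t i;

    for (i = 0; i < event_count; i++)
        if (strcmp(event_log[i], what) == 0)
            return i;
    return event_count;
}
```

Even when the GPU entry comes first in the device array, its reset is
logged after the HDA entry's suspend step, because the reset now lives in
the later phase.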

Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)")
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index bb1c025d90019..31f7229e7ea89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2323,18 +2323,23 @@ static int amdgpu_pmops_suspend(struct device *dev)
 {
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
-   int r;
 
if (amdgpu_acpi_is_s0ix_active(adev))
adev->in_s0ix = true;
else
adev->in_s3 = true;
-   r = amdgpu_device_suspend(drm_dev, true);
-   if (r)
-   return r;
+   return amdgpu_device_suspend(drm_dev, true);
+}
+
+static int amdgpu_pmops_suspend_noirq(struct device *dev)
+{
+   struct drm_device *drm_dev = dev_get_drvdata(dev);
+   struct amdgpu_device *adev = drm_to_adev(drm_dev);
+
if (!adev->in_s0ix)
-   r = amdgpu_asic_reset(adev);
-   return r;
+   return amdgpu_asic_reset(adev);
+
+   return 0;
 }
 
 static int amdgpu_pmops_resume(struct device *dev)
@@ -2575,6 +2580,7 @@ static const struct dev_pm_ops amdgpu_pm_ops = {
.prepare = amdgpu_pmops_prepare,
.complete = amdgpu_pmops_complete,
.suspend = amdgpu_pmops_suspend,
+   .suspend_noirq = amdgpu_pmops_suspend_noirq,
.resume = amdgpu_pmops_resume,
.freeze = amdgpu_pmops_freeze,
.thaw = amdgpu_pmops_thaw,
-- 
2.34.1



[PATCH] drm/amdgpu/acp: Make PM domain really work

2021-07-20 Thread Kai-Heng Feng
75 hwmon_vid drm ip_tables x_tables autofs4 dm_mirror dm_region_hash 
dm_log hid_generic usbhid hid uas usb_storage r8169 crc32_pclmul realtek ahci 
xhci_pci i2c_p
 iix4
[   56.979521]  xhci_pci_renesas libahci video
[   56.979541] ---[ end trace cb8f6a346f18da7b ]---

Instead of finding MFD hotplugged device by its name, simply iterate
over the child devices to avoid the issue.
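
A minimal userspace sketch of the iteration pattern follows. It assumes a
helper that, like device_for_each_child(), stops at the first callback
that returns non-zero and hands that value back. struct child,
for_each_child() and both callbacks are hypothetical; the real kernel
helper takes the parent device rather than an array.

```c
#include <stddef.h>

/* Hypothetical child device with a flag for PM domain membership. */
struct child {
    int id;
    int in_domain;
};

typedef int (*child_fn)(struct child *c, void *data);

/* Visit children in order; stop at and return the first non-zero result. */
static int for_each_child(struct child *children, size_t n, void *data,
                          child_fn fn)
{
    size_t i;
    int ret = 0;

    for (i = 0; i < n && !ret; i++)
        ret = fn(&children[i], data);
    return ret;
}

/* Add path: report the error so the caller can bail out. */
static int add_to_domain(struct child *c, void *data)
{
    const int *fail_id = data; /* id to fail on, for demonstration */

    if (fail_id && c->id == *fail_id)
        return -1;
    c->in_domain = 1;
    return 0;
}

/* Remove path: always return 0 so the remaining children are visited. */
static int remove_from_domain(struct child *c, void *data)
{
    (void)data;
    c->in_domain = 0;
    return 0;
}
```

The add callback propagates its error so init can fail early, while the
remove callback swallows errors so teardown still reaches every child.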

BugLink: https://bugs.launchpad.net/bugs/1920674
Fixes: 25030321ba28 ("drm/amd: add pm domain for ACP IP sub blocks")
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c | 49 +
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
index b8655ff73a658..8522f46d5d725 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
@@ -160,17 +160,28 @@ static int acp_poweron(struct generic_pm_domain *genpd)
return 0;
 }
 
-static struct device *get_mfd_cell_dev(const char *device_name, int r)
+static int acp_genpd_add_device(struct device *dev, void *data)
 {
-   char auto_dev_name[25];
-   struct device *dev;
+   struct generic_pm_domain *gpd = data;
+   int ret;
+
+   ret = pm_genpd_add_device(gpd, dev);
+   if (ret)
+   dev_err(dev, "Failed to add dev to genpd %d\n", ret);
 
-   snprintf(auto_dev_name, sizeof(auto_dev_name),
-"%s.%d.auto", device_name, r);
-   dev = bus_find_device_by_name(&platform_bus_type, NULL, auto_dev_name);
-   dev_info(dev, "device %s added to pm domain\n", auto_dev_name);
+   return ret;
+}
 
-   return dev;
+static int acp_genpd_remove_device(struct device *dev, void *data)
+{
+   int ret;
+
+   ret = pm_genpd_remove_device(dev);
+   if (ret)
+   dev_err(dev, "Failed to remove dev from genpd %d\n", ret);
+
+   /* Continue to remove */
+   return 0;
 }
 
 /**
@@ -341,15 +352,10 @@ static int acp_hw_init(void *handle)
if (r)
goto failure;
 
-   for (i = 0; i < ACP_DEVS ; i++) {
-   dev = get_mfd_cell_dev(adev->acp.acp_cell[i].name, i);
-   r = pm_genpd_add_device(&adev->acp.acp_genpd->gpd, dev);
-   if (r) {
-   dev_err(dev, "Failed to add dev to genpd\n");
-   goto failure;
-   }
-   }
-
+   r = device_for_each_child(adev->acp.parent, &adev->acp.acp_genpd->gpd,
+ acp_genpd_add_device);
+   if (r)
+   goto failure;
 
/* Assert Soft reset of ACP */
val = cgs_read_register(adev->acp.cgs_device, mmACP_SOFT_RESET);
@@ -458,13 +464,8 @@ static int acp_hw_fini(void *handle)
udelay(100);
}
 
-   for (i = 0; i < ACP_DEVS ; i++) {
-   dev = get_mfd_cell_dev(adev->acp.acp_cell[i].name, i);
-   ret = pm_genpd_remove_device(dev);
-   /* If removal fails, dont giveup and try rest */
-   if (ret)
-   dev_err(dev, "remove dev from genpd failed\n");
-   }
+   device_for_each_child(adev->acp.parent, NULL,
+ acp_genpd_remove_device);
 
mfd_remove_devices(adev->acp.parent);
kfree(adev->acp.acp_res);
-- 
2.31.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v2] drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected

2021-05-11 Thread Kai-Heng Feng
On Fri, Apr 30, 2021 at 12:57 PM Kai-Heng Feng
 wrote:
>
> Screen flickers rapidly when two 4K 60Hz monitors are in use. This issue
> doesn't happen when one monitor is 4K 60Hz (pixelclock 594MHz) and
> another one is 4K 30Hz (pixelclock 297MHz).
>
> The issue is gone after setting "power_dpm_force_performance_level" to
> "high". Following the indication, we found that the issue occurs when
> sclk is too low.
>
> So resolve the issue by disabling sclk switching when two monitors
> require a high pixelclock (> 297MHz).
>
> v2:
>  - Only apply the fix to Oland.
> Signed-off-by: Kai-Heng Feng 

A gentle ping...

> ---
>  drivers/gpu/drm/radeon/radeon.h| 1 +
>  drivers/gpu/drm/radeon/radeon_pm.c | 8 
>  drivers/gpu/drm/radeon/si_dpm.c| 3 +++
>  3 files changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 42281fce552e6..56ed5634cebef 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -1549,6 +1549,7 @@ struct radeon_dpm {
> void*priv;
> u32 new_active_crtcs;
> int new_active_crtc_count;
> +   int high_pixelclock_count;
> u32 current_active_crtcs;
> int current_active_crtc_count;
> bool single_display;
> diff --git a/drivers/gpu/drm/radeon/radeon_pm.c 
> b/drivers/gpu/drm/radeon/radeon_pm.c
> index 0c1950f4e146f..3861c0b98fcf3 100644
> --- a/drivers/gpu/drm/radeon/radeon_pm.c
> +++ b/drivers/gpu/drm/radeon/radeon_pm.c
> @@ -1767,6 +1767,7 @@ static void radeon_pm_compute_clocks_dpm(struct 
> radeon_device *rdev)
> struct drm_device *ddev = rdev->ddev;
> struct drm_crtc *crtc;
> struct radeon_crtc *radeon_crtc;
> +   struct radeon_connector *radeon_connector;
>
> if (!rdev->pm.dpm_enabled)
> return;
> @@ -1776,6 +1777,7 @@ static void radeon_pm_compute_clocks_dpm(struct 
> radeon_device *rdev)
> /* update active crtc counts */
> rdev->pm.dpm.new_active_crtcs = 0;
> rdev->pm.dpm.new_active_crtc_count = 0;
> +   rdev->pm.dpm.high_pixelclock_count = 0;
> if (rdev->num_crtc && rdev->mode_info.mode_config_initialized) {
> list_for_each_entry(crtc,
> &ddev->mode_config.crtc_list, head) {
> @@ -1783,6 +1785,12 @@ static void radeon_pm_compute_clocks_dpm(struct 
> radeon_device *rdev)
> if (crtc->enabled) {
> rdev->pm.dpm.new_active_crtcs |= (1 << 
> radeon_crtc->crtc_id);
> rdev->pm.dpm.new_active_crtc_count++;
> +   if (!radeon_crtc->connector)
> +   continue;
> +
> +   radeon_connector = 
> to_radeon_connector(radeon_crtc->connector);
> +   if (radeon_connector->pixelclock_for_modeset 
> > 297000)
> +   rdev->pm.dpm.high_pixelclock_count++;
> }
> }
> }
> diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
> index 9186095518047..3cc2b96a7f368 100644
> --- a/drivers/gpu/drm/radeon/si_dpm.c
> +++ b/drivers/gpu/drm/radeon/si_dpm.c
> @@ -2979,6 +2979,9 @@ static void si_apply_state_adjust_rules(struct 
> radeon_device *rdev,
> (rdev->pdev->device == 0x6605)) {
> max_sclk = 75000;
> }
> +
> +   if (rdev->pm.dpm.high_pixelclock_count > 1)
> +   disable_sclk_switching = true;
> }
>
> if (rps->vce_active) {
> --
> 2.30.2
>


Re: [PATCH] drm/radeon/si_dpm: Fix SMU power state load

2021-05-10 Thread Kai-Heng Feng
On Mon, May 10, 2021 at 6:54 AM Gustavo A. R. Silva
 wrote:
>
> Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels
> and ACPIState.levels are never actually used as flexible arrays. Those
> arrays can be used as simple objects of type
> SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead.
>
> Currently, the code fails because flexible array _levels_ in
> struct SISLANDS_SMC_SWSTATE doesn't allow for code that access
> the first element of initialState.levels and ACPIState.levels
> arrays:
>
> 4353 table->initialState.levels[0].mclk.vDLL_CNTL =
> 4354 cpu_to_be32(si_pi->clock_registers.dll_cntl);
> ...
> 4555 table->ACPIState.levels[0].mclk.vDLL_CNTL =
> 4556 cpu_to_be32(dll_cntl);
>
> because such element cannot exist without previously allocating
> any dynamic memory for it (which never actually happens).
>
> That's why struct SISLANDS_SMC_SWSTATE should only be used as type
> for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is
> created as type for objects initialState, ACPIState and ULVState.
>
> Also, with the change from one-element array to flexible-array member
> in commit 96e27e8d919e ("drm/radeon/si_dpm: Replace one-element array
> with flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of
> dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be
> SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of
> SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1.
>
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1583
> Fixes: 96e27e8d919e ("drm/radeon/si_dpm: Replace one-element array with 
> flexible-array in struct SISLANDS_SMC_SWSTATE")
> Cc: sta...@vger.kernel.org
> Reported-by: Kai-Heng Feng 
> Signed-off-by: Gustavo A. R. Silva 

Tested-by: Kai-Heng Feng 
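
For readers following along, the storage difference described in the
quoted commit message can be seen in a small standalone sketch. The struct
and field names below are hypothetical simplifications, not the
definitions from sislands_smc.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for a performance level entry. */
struct perf_level {
    unsigned int mclk;
    unsigned int sclk;
};

/* Flexible-array form: the struct itself reserves no room for levels[]. */
struct swstate_flex {
    unsigned char flags;
    unsigned char level_count;
    unsigned char pad[2];
    struct perf_level levels[];
};

/* Single-level form: exactly one level is stored inside the struct. */
struct swstate_single {
    unsigned char flags;
    unsigned char level_count;
    unsigned char pad[2];
    struct perf_level level;
};

/* Bytes available for level data inside each struct. */
static size_t flex_level_bytes(void)
{
    return sizeof(struct swstate_flex) - offsetof(struct swstate_flex, levels);
}

static size_t single_level_bytes(void)
{
    return sizeof(struct swstate_single) - offsetof(struct swstate_single, level);
}
```

On common ABIs the flexible-array struct has no bytes for level data, so
writing to levels[0] of an instance that was never given extra storage
lands outside the object, while the single-level struct always has room
for one entry.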

> ---
>  drivers/gpu/drm/radeon/si_dpm.c   | 174 +-
>  drivers/gpu/drm/radeon/sislands_smc.h |  34 +++--
>  2 files changed, 109 insertions(+), 99 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
> index 91bfc4762767..2a8b9680cf6b 100644
> --- a/drivers/gpu/drm/radeon/si_dpm.c
> +++ b/drivers/gpu/drm/radeon/si_dpm.c
> @@ -4350,70 +4350,70 @@ static int si_populate_smc_initial_state(struct 
> radeon_device *rdev,
> u32 reg;
> int ret;
>
> -   table->initialState.levels[0].mclk.vDLL_CNTL =
> +   table->initialState.level.mclk.vDLL_CNTL =
> cpu_to_be32(si_pi->clock_registers.dll_cntl);
> -   table->initialState.levels[0].mclk.vMCLK_PWRMGT_CNTL =
> +   table->initialState.level.mclk.vMCLK_PWRMGT_CNTL =
> cpu_to_be32(si_pi->clock_registers.mclk_pwrmgt_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_AD_FUNC_CNTL =
> +   table->initialState.level.mclk.vMPLL_AD_FUNC_CNTL =
> cpu_to_be32(si_pi->clock_registers.mpll_ad_func_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_DQ_FUNC_CNTL =
> +   table->initialState.level.mclk.vMPLL_DQ_FUNC_CNTL =
> cpu_to_be32(si_pi->clock_registers.mpll_dq_func_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL =
> +   table->initialState.level.mclk.vMPLL_FUNC_CNTL =
> cpu_to_be32(si_pi->clock_registers.mpll_func_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL_1 =
> +   table->initialState.level.mclk.vMPLL_FUNC_CNTL_1 =
> cpu_to_be32(si_pi->clock_registers.mpll_func_cntl_1);
> -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL_2 =
> +   table->initialState.level.mclk.vMPLL_FUNC_CNTL_2 =
> cpu_to_be32(si_pi->clock_registers.mpll_func_cntl_2);
> -   table->initialState.levels[0].mclk.vMPLL_SS =
> +   table->initialState.level.mclk.vMPLL_SS =
> cpu_to_be32(si_pi->clock_registers.mpll_ss1);
> -   table->initialState.levels[0].mclk.vMPLL_SS2 =
> +   table->initialState.level.mclk.vMPLL_SS2 =
> cpu_to_be32(si_pi->clock_registers.mpll_ss2);
>
> -   table->initialState.levels[0].mclk.mclk_value =
> +   table->initialState.level.mclk.mclk_value =
> cpu_to_be32(initial_state->performance_levels[0].mclk);
>
> -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL =
> +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL =
> cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl);
> -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_2 =
> +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL_2 =
> 

[PATCH v2] drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected

2021-04-30 Thread Kai-Heng Feng
Screen flickers rapidly when two 4K 60Hz monitors are in use. This issue
doesn't happen when one monitor is 4K 60Hz (pixelclock 594MHz) and
another one is 4K 30Hz (pixelclock 297MHz).

The issue is gone after setting "power_dpm_force_performance_level" to
"high". Following the indication, we found that the issue occurs when
sclk is too low.

So resolve the issue by disabling sclk switching when two monitors
require a high pixelclock (> 297MHz).
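
The decision rule can be sketched in userspace as follows. struct
model_crtc and the function names are hypothetical; the 297000 threshold
is the value the patch compares pixelclock_for_modeset against, read here
as kHz to match the 297MHz figure above.

```c
#include <stdbool.h>
#include <stddef.h>

#define HIGH_PIXELCLOCK_KHZ 297000 /* threshold used in the patch */

/* Hypothetical, reduced view of a CRTC and its attached connector. */
struct model_crtc {
    bool enabled;
    bool has_connector;
    unsigned int pixelclock_khz;
};

/* Count enabled CRTCs whose connector needs more than the threshold. */
static int count_high_pixelclock(const struct model_crtc *crtcs, size_t n)
{
    size_t i;
    int count = 0;

    for (i = 0; i < n; i++) {
        if (!crtcs[i].enabled || !crtcs[i].has_connector)
            continue;
        if (crtcs[i].pixelclock_khz > HIGH_PIXELCLOCK_KHZ)
            count++;
    }
    return count;
}

/* sclk switching is disabled only when more than one such CRTC exists. */
static bool should_disable_sclk_switching(const struct model_crtc *crtcs,
                                          size_t n)
{
    return count_high_pixelclock(crtcs, n) > 1;
}
```

A 4K 60Hz plus 4K 30Hz pair stays below the limit because 297000 is not
strictly greater than the threshold, which matches the observation that
this combination does not flicker.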

v2:
 - Only apply the fix to Oland.
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/radeon/radeon.h| 1 +
 drivers/gpu/drm/radeon/radeon_pm.c | 8 
 drivers/gpu/drm/radeon/si_dpm.c| 3 +++
 3 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 42281fce552e6..56ed5634cebef 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1549,6 +1549,7 @@ struct radeon_dpm {
void*priv;
u32 new_active_crtcs;
int new_active_crtc_count;
+   int high_pixelclock_count;
u32 current_active_crtcs;
int current_active_crtc_count;
bool single_display;
diff --git a/drivers/gpu/drm/radeon/radeon_pm.c 
b/drivers/gpu/drm/radeon/radeon_pm.c
index 0c1950f4e146f..3861c0b98fcf3 100644
--- a/drivers/gpu/drm/radeon/radeon_pm.c
+++ b/drivers/gpu/drm/radeon/radeon_pm.c
@@ -1767,6 +1767,7 @@ static void radeon_pm_compute_clocks_dpm(struct 
radeon_device *rdev)
struct drm_device *ddev = rdev->ddev;
struct drm_crtc *crtc;
struct radeon_crtc *radeon_crtc;
+   struct radeon_connector *radeon_connector;
 
if (!rdev->pm.dpm_enabled)
return;
@@ -1776,6 +1777,7 @@ static void radeon_pm_compute_clocks_dpm(struct 
radeon_device *rdev)
/* update active crtc counts */
rdev->pm.dpm.new_active_crtcs = 0;
rdev->pm.dpm.new_active_crtc_count = 0;
+   rdev->pm.dpm.high_pixelclock_count = 0;
if (rdev->num_crtc && rdev->mode_info.mode_config_initialized) {
list_for_each_entry(crtc,
&ddev->mode_config.crtc_list, head) {
@@ -1783,6 +1785,12 @@ static void radeon_pm_compute_clocks_dpm(struct 
radeon_device *rdev)
if (crtc->enabled) {
rdev->pm.dpm.new_active_crtcs |= (1 << 
radeon_crtc->crtc_id);
rdev->pm.dpm.new_active_crtc_count++;
+   if (!radeon_crtc->connector)
+   continue;
+
+   radeon_connector = to_radeon_connector(radeon_crtc->connector);
+   if (radeon_connector->pixelclock_for_modeset > 297000)
+   rdev->pm.dpm.high_pixelclock_count++;
}
}
}
diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index 9186095518047..3cc2b96a7f368 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -2979,6 +2979,9 @@ static void si_apply_state_adjust_rules(struct 
radeon_device *rdev,
(rdev->pdev->device == 0x6605)) {
max_sclk = 75000;
}
+
+   if (rdev->pm.dpm.high_pixelclock_count > 1)
+   disable_sclk_switching = true;
}
 
if (rps->vce_active) {
-- 
2.30.2



[PATCH] drm/radeon/dpm: Disable sclk switching when two 4K 60Hz monitors are connected

2021-04-29 Thread Kai-Heng Feng
Screen flickers rapidly when two 4K 60Hz monitors are connected to an
Oland card. This issue doesn't happen when one monitor is 4K 60Hz
(pixelclock 594MHz) and another one is 4K 30Hz (pixelclock 297MHz).

The issue is gone after setting "power_dpm_force_performance_level" to
"high". Following the lead, we found that the issue only occurs when
sclk is too low.

So resolve the issue by disabling sclk switching when two monitors
require a high pixelclock (> 297MHz).

Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/radeon/radeon.h| 1 +
 drivers/gpu/drm/radeon/radeon_pm.c | 8 
 drivers/gpu/drm/radeon/si_dpm.c| 3 +++
 3 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 42281fce552e6..56ed5634cebef 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1549,6 +1549,7 @@ struct radeon_dpm {
void*priv;
u32 new_active_crtcs;
int new_active_crtc_count;
+   int high_pixelclock_count;
u32 current_active_crtcs;
int current_active_crtc_count;
bool single_display;
diff --git a/drivers/gpu/drm/radeon/radeon_pm.c 
b/drivers/gpu/drm/radeon/radeon_pm.c
index 0c1950f4e146f..3861c0b98fcf3 100644
--- a/drivers/gpu/drm/radeon/radeon_pm.c
+++ b/drivers/gpu/drm/radeon/radeon_pm.c
@@ -1767,6 +1767,7 @@ static void radeon_pm_compute_clocks_dpm(struct 
radeon_device *rdev)
struct drm_device *ddev = rdev->ddev;
struct drm_crtc *crtc;
struct radeon_crtc *radeon_crtc;
+   struct radeon_connector *radeon_connector;
 
if (!rdev->pm.dpm_enabled)
return;
@@ -1776,6 +1777,7 @@ static void radeon_pm_compute_clocks_dpm(struct 
radeon_device *rdev)
/* update active crtc counts */
rdev->pm.dpm.new_active_crtcs = 0;
rdev->pm.dpm.new_active_crtc_count = 0;
+   rdev->pm.dpm.high_pixelclock_count = 0;
if (rdev->num_crtc && rdev->mode_info.mode_config_initialized) {
list_for_each_entry(crtc,
&ddev->mode_config.crtc_list, head) {
@@ -1783,6 +1785,12 @@ static void radeon_pm_compute_clocks_dpm(struct 
radeon_device *rdev)
if (crtc->enabled) {
rdev->pm.dpm.new_active_crtcs |= (1 << 
radeon_crtc->crtc_id);
rdev->pm.dpm.new_active_crtc_count++;
+   if (!radeon_crtc->connector)
+   continue;
+
+   radeon_connector = to_radeon_connector(radeon_crtc->connector);
+   if (radeon_connector->pixelclock_for_modeset > 297000)
+   rdev->pm.dpm.high_pixelclock_count++;
}
}
}
diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index 9186095518047..be6fa3257d1bc 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -2995,6 +2995,9 @@ static void si_apply_state_adjust_rules(struct 
radeon_device *rdev,
ni_dpm_vblank_too_short(rdev))
disable_mclk_switching = true;
 
+   if (rdev->pm.dpm.high_pixelclock_count > 1)
+   disable_sclk_switching = true;
+
if (rps->vclk || rps->dclk) {
disable_mclk_switching = true;
disable_sclk_switching = true;
-- 
2.30.2



[PATCH v2] drm/amdgpu: Register VGA clients after init can no longer fail

2021-04-26 Thread Kai-Heng Feng
When an amdgpu device fails to init, another VGA device can trigger a
kernel splat:
kernel: amdgpu :08:00.0: amdgpu: amdgpu_device_ip_init failed
kernel: amdgpu :08:00.0: amdgpu: Fatal error during GPU init
kernel: amdgpu: probe of :08:00.0 failed with error -110
...
kernel: amdgpu :01:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none
kernel: BUG: kernel NULL pointer dereference, address: 0018
kernel: #PF: supervisor read access in kernel mode
kernel: #PF: error_code(0x) - not-present page
kernel: PGD 0 P4D 0
kernel: Oops:  [#1] SMP NOPTI
kernel: CPU: 6 PID: 1080 Comm: Xorg Tainted: GW 5.12.0-rc8+ #12
kernel: Hardware name: HP HP EliteDesk 805 G6/872B, BIOS S09 Ver. 02.02.00 
12/30/2020
kernel: RIP: 0010:amdgpu_device_vga_set_decode+0x13/0x30 [amdgpu]
kernel: Code: 06 31 c0 c3 b8 ea ff ff ff 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 
90 0f 1f 44 00 00 55 48 8b 87 90 06 00 00 48 89 e5 53 89 f3 <48> 8b 40 18 40 0f 
b6 f6 e8 40 58 39 fd 80 fb 01 5b 5d 19 c0 83 e0
kernel: RSP: 0018:ae3c0246bd68 EFLAGS: 00010002
kernel: RAX:  RBX:  RCX: 
kernel: RDX: 8dd1af5a8560 RSI:  RDI: 8dce8c16
kernel: RBP: ae3c0246bd70 R08: 8dd1af5985c0 R09: ae3c0246ba38
kernel: R10: 0001 R11: 0001 R12: 0246
kernel: R13:  R14: 0003 R15: 8dce8149
kernel: FS:  7f9303d8fa40() GS:8dd1af58() 
knlGS:
kernel: CS:  0010 DS:  ES:  CR0: 80050033
kernel: CR2: 0018 CR3: 000103cfa000 CR4: 00350ee0
kernel: Call Trace:
kernel:  vga_arbiter_notify_clients.part.0+0x4a/0x80
kernel:  vga_get+0x17f/0x1c0
kernel:  vga_arb_write+0x121/0x6a0
kernel:  ? apparmor_file_permission+0x1c/0x20
kernel:  ? security_file_permission+0x30/0x180
kernel:  vfs_write+0xca/0x280
kernel:  ksys_write+0x67/0xe0
kernel:  __x64_sys_write+0x1a/0x20
kernel:  do_syscall_64+0x38/0x90
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f93041e02f7
kernel: Code: 75 05 48 83 c4 58 c3 e8 f7 33 ff ff 0f 1f 80 00 00 00 00 f3 0f 1e 
fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 
77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
kernel: RSP: 002b:7fff60e49b28 EFLAGS: 0246 ORIG_RAX: 0001
kernel: RAX: ffda RBX: 000b RCX: 7f93041e02f7
kernel: RDX: 000b RSI: 7fff60e49b40 RDI: 000f
kernel: RBP: 7fff60e49b40 R08:  R09: 7fff60e499d0
kernel: R10: 7f93049350b5 R11: 0246 R12: 56111d45e808
kernel: R13:  R14: 56111d45e7f8 R15: 56111d46c980
kernel: Modules linked in: nls_iso8859_1 snd_hda_codec_realtek 
snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel 
snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_seq 
input_leds snd_seq_device snd_timer snd soundcore joydev kvm_amd serio_raw 
k10temp mac_hid hp_wmi ccp kvm sparse_keymap wmi_bmof ucsi_acpi efi_pstore 
typec_ucsi rapl typec video wmi sch_fq_codel parport_pc ppdev lp parport 
ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor 
raid6_pq raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log 
hid_generic usbhid hid amdgpu drm_ttm_helper ttm iommu_v2 gpu_sched 
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul sysimgblt 
crc32_pclmul fb_sys_fops ghash_clmulni_intel cec rc_core aesni_intel 
crypto_simd psmouse cryptd r8169 i2c_piix4 drm ahci xhci_pci realtek libahci 
xhci_pci_renesas gpio_amdpt gpio_generic
kernel: CR2: 0018
kernel: ---[ end trace 76d04313d4214c51 ]---

Commit 4192f7b57689 ("drm/amdgpu: unmap register bar on device init
failure") makes amdgpu_driver_unload_kms() skip amdgpu_device_fini(),
so the VGA clients remain registered. When
vga_arbiter_notify_clients() then iterates over the registered clients,
it hits a NULL pointer dereference.

There's no reason to register VGA clients that early, so solve the
issue by registering them after all the goto cleanups.
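
The ordering idea can be sketched in userspace like this. It is a toy
model under the assumption that the failure path does not unregister
anything; registered_clients and the init functions are hypothetical and
do not correspond to vgaarb or amdgpu APIs.

```c
#include <stdbool.h>

/* Hypothetical count of callbacks visible to an external arbiter. */
static int registered_clients;

static void model_register_client(void)
{
    registered_clients++;
}

/* Early registration: a later failure leaves a stale client behind. */
static int init_register_early(bool later_step_fails)
{
    model_register_client();
    if (later_step_fails)
        return -1; /* error path skips teardown in this model */
    return 0;
}

/* Late registration: nothing after the registration can fail. */
static int init_register_late(bool later_step_fails)
{
    if (later_step_fails)
        return -1;
    model_register_client();
    return 0;
}
```

With late registration a failed init never exposes a callback to the
outside, so there is nothing stale for another device's request to reach.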

v2:
 - Remove redundant vga_switcheroo cleanup in failed: label.

Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 28 ++
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b4ad1c055c70..7d3b54615147 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3410,19 +3410,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* doorbell bar mapping and doorbell index init*/

[PATCH] drm/amdgpu: Register VGA clients after init can no longer fail

2021-04-21 Thread Kai-Heng Feng
When an amdgpu device fails to init, another VGA device can trigger a
kernel splat:
kernel: amdgpu :08:00.0: amdgpu: amdgpu_device_ip_init failed
kernel: amdgpu :08:00.0: amdgpu: Fatal error during GPU init
kernel: amdgpu: probe of :08:00.0 failed with error -110
...
kernel: amdgpu :01:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none
kernel: BUG: kernel NULL pointer dereference, address: 0018
kernel: #PF: supervisor read access in kernel mode
kernel: #PF: error_code(0x) - not-present page
kernel: PGD 0 P4D 0
kernel: Oops:  [#1] SMP NOPTI
kernel: CPU: 6 PID: 1080 Comm: Xorg Tainted: GW 5.12.0-rc8+ #12
kernel: Hardware name: HP HP EliteDesk 805 G6/872B, BIOS S09 Ver. 02.02.00 
12/30/2020
kernel: RIP: 0010:amdgpu_device_vga_set_decode+0x13/0x30 [amdgpu]
kernel: Code: 06 31 c0 c3 b8 ea ff ff ff 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 
90 0f 1f 44 00 00 55 48 8b 87 90 06 00 00 48 89 e5 53 89 f3 <48> 8b 40 18 40 0f 
b6 f6 e8 40 58 39 fd 80 fb 01 5b 5d 19 c0 83 e0
kernel: RSP: 0018:ae3c0246bd68 EFLAGS: 00010002
kernel: RAX:  RBX:  RCX: 
kernel: RDX: 8dd1af5a8560 RSI:  RDI: 8dce8c16
kernel: RBP: ae3c0246bd70 R08: 8dd1af5985c0 R09: ae3c0246ba38
kernel: R10: 0001 R11: 0001 R12: 0246
kernel: R13:  R14: 0003 R15: 8dce8149
kernel: FS:  7f9303d8fa40() GS:8dd1af58() 
knlGS:
kernel: CS:  0010 DS:  ES:  CR0: 80050033
kernel: CR2: 0018 CR3: 000103cfa000 CR4: 00350ee0
kernel: Call Trace:
kernel:  vga_arbiter_notify_clients.part.0+0x4a/0x80
kernel:  vga_get+0x17f/0x1c0
kernel:  vga_arb_write+0x121/0x6a0
kernel:  ? apparmor_file_permission+0x1c/0x20
kernel:  ? security_file_permission+0x30/0x180
kernel:  vfs_write+0xca/0x280
kernel:  ksys_write+0x67/0xe0
kernel:  __x64_sys_write+0x1a/0x20
kernel:  do_syscall_64+0x38/0x90
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f93041e02f7
kernel: Code: 75 05 48 83 c4 58 c3 e8 f7 33 ff ff 0f 1f 80 00 00 00 00 f3 0f 1e 
fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 
77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
kernel: RSP: 002b:7fff60e49b28 EFLAGS: 0246 ORIG_RAX: 0001
kernel: RAX: ffda RBX: 000b RCX: 7f93041e02f7
kernel: RDX: 000b RSI: 7fff60e49b40 RDI: 000f
kernel: RBP: 7fff60e49b40 R08:  R09: 7fff60e499d0
kernel: R10: 7f93049350b5 R11: 0246 R12: 56111d45e808
kernel: R13:  R14: 56111d45e7f8 R15: 56111d46c980
kernel: Modules linked in: nls_iso8859_1 snd_hda_codec_realtek 
snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel 
snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_seq 
input_leds snd_seq_device snd_timer snd soundcore joydev kvm_amd serio_raw 
k10temp mac_hid hp_wmi ccp kvm sparse_keymap wmi_bmof ucsi_acpi efi_pstore 
typec_ucsi rapl typec video wmi sch_fq_codel parport_pc ppdev lp parport 
ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor 
raid6_pq raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log 
hid_generic usbhid hid amdgpu drm_ttm_helper ttm iommu_v2 gpu_sched 
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul sysimgblt 
crc32_pclmul fb_sys_fops ghash_clmulni_intel cec rc_core aesni_intel 
crypto_simd psmouse cryptd r8169 i2c_piix4 drm ahci xhci_pci realtek libahci 
xhci_pci_renesas gpio_amdpt gpio_generic
kernel: CR2: 0018
kernel: ---[ end trace 76d04313d4214c51 ]---

Commit 4192f7b57689 ("drm/amdgpu: unmap register bar on device init
failure") makes amdgpu_driver_unload_kms() skip amdgpu_device_fini(),
so the VGA clients remain registered. When
vga_arbiter_notify_clients() then iterates over the registered clients,
it hits a NULL pointer dereference.

There's no reason to register VGA clients that early, so solve the
issue by registering them after all the goto cleanups.

Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +++---
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b4ad1c055c70..115a7699e11e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3410,19 +3410,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* doorbell bar mapping and doorbell index init*/
amdgpu_device_doorbell_init(adev);
 
-   /* if we have > 1 VGA cards, th

Re: [PATCH] drm/radeon: Reset ASIC if suspend is not managed by platform firmware

2020-09-02 Thread Kai-Heng Feng



> On Sep 2, 2020, at 00:30, Alex Deucher  wrote:
> 
> On Tue, Sep 1, 2020 at 12:21 PM Kai-Heng Feng
>  wrote:
>> 
>> 
>> 
>>> On Sep 1, 2020, at 22:19, Alex Deucher  wrote:
>>> 
>>> On Tue, Sep 1, 2020 at 3:32 AM Kai-Heng Feng
>>>  wrote:
>>>> 
>>>> Suspending with s2idle, or by the following steps, causes the screen to freeze:
>>>> # echo devices > /sys/power/pm_test
>>>> # echo freeze > /sys/power/state
>>>> 
>>>> [  289.625461] [drm:uvd_v1_0_ib_test [radeon]] *ERROR* radeon: fence wait timed out.
>>>> [  289.625494] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on ring 5 (-110).
>>>> 
>>>> The issue doesn't happen on traditional S3, probably because firmware or
>>>> hardware provides extra power management.
>>>> 
>>>> Inspired by Daniel Drake's patch [1] on amdgpu, using a similar approach
>>>> can fix the issue.
>>> 
>>> It doesn't actually fix the issue.  The device is never powered down
>>> so you are using more power than you would if you did not suspend in
>>> the first place.  The reset just works around the fact that the device
>>> is never powered down.
>> 
>> So how do we properly suspend/resume the device without help from platform 
>> firmware?
> 
> I guess you don't?

Unfortunate but I guess we need to accept reality and use the default suspend 
method.

Kai-Heng

> 
> Alex
> 
> 
>> 
>> Kai-Heng
>> 
>>> 
>>> Alex
>>> 
>>>> 
>>>> [1] https://patchwork.freedesktop.org/patch/335839/
>>>> 
>>>> Signed-off-by: Kai-Heng Feng 
>>>> ---
>>>> drivers/gpu/drm/radeon/radeon_device.c | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>> 
>>>> diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
>>>> b/drivers/gpu/drm/radeon/radeon_device.c
>>>> index 266e3cbbd09b..df823b9ad79f 100644
>>>> --- a/drivers/gpu/drm/radeon/radeon_device.c
>>>> +++ b/drivers/gpu/drm/radeon/radeon_device.c
>>>> @@ -33,6 +33,7 @@
>>>> #include 
>>>> #include 
>>>> #include 
>>>> +#include 
>>>> 
>>>> #include 
>>>> #include 
>>>> @@ -1643,6 +1644,8 @@ int radeon_suspend_kms(struct drm_device *dev, bool 
>>>> suspend,
>>>>   rdev->asic->asic_reset(rdev, true);
>>>>   pci_restore_state(dev->pdev);
>>>>   } else if (suspend) {
>>>> +   if (pm_suspend_no_platform())
>>>> +   rdev->asic->asic_reset(rdev, true);
>>>>   /* Shut down the device */
>>>>   pci_disable_device(dev->pdev);
>>>>   pci_set_power_state(dev->pdev, PCI_D3hot);
>>>> --
>>>> 2.17.1
>>>> 
>>>> ___
>>>> dri-devel mailing list
>>>> dri-de...@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/radeon: Reset ASIC if suspend is not managed by platform firmware

2020-09-01 Thread Kai-Heng Feng



> On Sep 1, 2020, at 22:19, Alex Deucher  wrote:
> 
> On Tue, Sep 1, 2020 at 3:32 AM Kai-Heng Feng
>  wrote:
>> 
>> Suspending with s2idle, or by the following steps, causes the screen to freeze:
>> # echo devices > /sys/power/pm_test
>> # echo freeze > /sys/power/state
>> 
>> [  289.625461] [drm:uvd_v1_0_ib_test [radeon]] *ERROR* radeon: fence wait timed out.
>> [  289.625494] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on ring 5 (-110).
>> 
>> The issue doesn't happen on traditional S3, probably because firmware or
>> hardware provides extra power management.
>> 
>> Inspired by Daniel Drake's patch [1] on amdgpu, using a similar approach
>> can fix the issue.
> 
> It doesn't actually fix the issue.  The device is never powered down
> so you are using more power than you would if you did not suspend in
> the first place.  The reset just works around the fact that the device
> is never powered down.

So how do we properly suspend/resume the device without help from platform 
firmware?

Kai-Heng

> 
> Alex
> 
>> 
>> [1] https://patchwork.freedesktop.org/patch/335839/
>> 
>> Signed-off-by: Kai-Heng Feng 
>> ---
>> drivers/gpu/drm/radeon/radeon_device.c | 3 +++
>> 1 file changed, 3 insertions(+)
>> 
>> diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
>> b/drivers/gpu/drm/radeon/radeon_device.c
>> index 266e3cbbd09b..df823b9ad79f 100644
>> --- a/drivers/gpu/drm/radeon/radeon_device.c
>> +++ b/drivers/gpu/drm/radeon/radeon_device.c
>> @@ -33,6 +33,7 @@
>> #include 
>> #include 
>> #include 
>> +#include 
>> 
>> #include 
>> #include 
>> @@ -1643,6 +1644,8 @@ int radeon_suspend_kms(struct drm_device *dev, bool 
>> suspend,
>>rdev->asic->asic_reset(rdev, true);
>>pci_restore_state(dev->pdev);
>>} else if (suspend) {
>> +   if (pm_suspend_no_platform())
>> +   rdev->asic->asic_reset(rdev, true);
>>/* Shut down the device */
>>pci_disable_device(dev->pdev);
>>pci_set_power_state(dev->pdev, PCI_D3hot);
>> --
>> 2.17.1
>> 


[PATCH] drm/radeon: Reset ASIC if suspend is not managed by platform firmware

2020-09-01 Thread Kai-Heng Feng
Suspending with s2idle, or by the following steps, causes the screen to freeze:
 # echo devices > /sys/power/pm_test
 # echo freeze > /sys/power/state

[  289.625461] [drm:uvd_v1_0_ib_test [radeon]] *ERROR* radeon: fence wait timed out.
[  289.625494] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on ring 5 (-110).

The issue doesn't happen on traditional S3, probably because firmware or
hardware provides extra power management.

Inspired by Daniel Drake's patch [1] on amdgpu, using a similar approach
can fix the issue.

[1] https://patchwork.freedesktop.org/patch/335839/

Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/radeon/radeon_device.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index 266e3cbbd09b..df823b9ad79f 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1643,6 +1644,8 @@ int radeon_suspend_kms(struct drm_device *dev, bool 
suspend,
rdev->asic->asic_reset(rdev, true);
pci_restore_state(dev->pdev);
} else if (suspend) {
+   if (pm_suspend_no_platform())
+   rdev->asic->asic_reset(rdev, true);
/* Shut down the device */
pci_disable_device(dev->pdev);
pci_set_power_state(dev->pdev, PCI_D3hot);
-- 
2.17.1
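
The branch touched by the patch above can be modelled in user space as a small decision function. This is only a sketch under stated assumptions: `struct suspend_plan` and `plan_suspend()` are hypothetical names, the `no_platform` argument stands in for the result of pm_suspend_no_platform(), and the separate freeze branch of radeon_suspend_kms() is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the "else if (suspend)" branch would do. */
struct suspend_plan {
	bool reset_asic;   /* models rdev->asic->asic_reset(rdev, true) */
	bool enter_d3hot;  /* models pci_disable_device() + PCI_D3hot */
};

/*
 * suspend:     the caller asked for a real suspend (not just a freeze)
 * no_platform: stand-in for pm_suspend_no_platform(), i.e. platform
 *              firmware is not involved in this suspend transition
 */
struct suspend_plan plan_suspend(bool suspend, bool no_platform)
{
	struct suspend_plan plan = { false, false };

	if (suspend) {
		/* The patch adds the reset only when firmware does not help. */
		plan.reset_asic = no_platform;
		plan.enter_d3hot = true;
	}

	/* A reset is only ever planned together with the power-down. */
	assert(!plan.reset_asic || plan.enter_d3hot);
	return plan;
}
```

In this model, plan_suspend(true, true) asks for a reset followed by D3hot, while plan_suspend(true, false) keeps the pre-patch behaviour of going to D3hot without a reset.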



[PATCH] drm/radeon: Prefer lower feedback dividers

2020-08-25 Thread Kai-Heng Feng
Commit 2e26ccb119bd ("drm/radeon: prefer lower reference dividers")
fixed screen flicker on the HP Compaq nx9420 but broke other laptops such
as the Asus X50SL.

It turns out we also need to favor lower feedback dividers.

Users confirmed this change fixes the regression and doesn't regress the
original fix.

Fixes: 2e26ccb119bd ("drm/radeon: prefer lower reference dividers")
BugLink: https://bugs.launchpad.net/bugs/1791312
BugLink: https://bugs.launchpad.net/bugs/1861554
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/radeon/radeon_display.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index e0ae911ef427..7b69d6dfe44a 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -933,7 +933,7 @@ static void avivo_get_fb_ref_div(unsigned nom, unsigned 
den, unsigned post_div,
 
/* get matching reference and feedback divider */
*ref_div = min(max(den/post_div, 1u), ref_div_max);
-   *fb_div = DIV_ROUND_CLOSEST(nom * *ref_div * post_div, den);
+   *fb_div = max(nom * *ref_div * post_div / den, 1u);
 
/* limit fb divider to its maximum */
if (*fb_div > fb_div_max) {
-- 
2.17.1
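
The arithmetic difference between the removed and the added line can be checked in user space. The following is a minimal sketch, assuming plain unsigned arithmetic; `div_round_closest()`, `fb_div_nearest()` and `fb_div_lower()` are illustrative names, not functions from the driver.

```c
#include <assert.h>

/* User-space stand-in for the kernel's DIV_ROUND_CLOSEST() on unsigned values. */
static unsigned div_round_closest(unsigned x, unsigned divisor)
{
	assert(divisor != 0);
	return (x + divisor / 2) / divisor;
}

/* Removed line: round the feedback divider to the nearest integer. */
unsigned fb_div_nearest(unsigned nom, unsigned ref_div,
			unsigned post_div, unsigned den)
{
	return div_round_closest(nom * ref_div * post_div, den);
}

/* Added line: truncate towards zero, but never return less than 1. */
unsigned fb_div_lower(unsigned nom, unsigned ref_div,
		      unsigned post_div, unsigned den)
{
	unsigned fb_div;

	assert(den != 0);
	fb_div = nom * ref_div * post_div / den;
	return fb_div > 1u ? fb_div : 1u;
}
```

For example, with nom = 7, den = 2 and both other dividers set to 1, the exact ratio is 3.5: fb_div_nearest() returns 4 while fb_div_lower() returns 3, which is the "prefer lower" behaviour the subject line describes.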



[PATCH] drm/amd/display: Restore backlight brightness after system resume

2019-09-02 Thread Kai-Heng Feng
Laptops with an AMD APU don't restore the display backlight brightness
after system resume.

This issue started when DC was introduced.

Let's use BL_CORE_SUSPENDRESUME so the backlight core calls the
update_status callback after system resume to restore the backlight
level.

Tested on Dell Inspiron 3180 (Stoney Ridge) and Dell Latitude 5495
(Raven Ridge).

Cc: 
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1b0949dd7808..183ef18ac6f3 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2111,6 +2111,7 @@ static int amdgpu_dm_backlight_get_brightness(struct 
backlight_device *bd)
 }
 
 static const struct backlight_ops amdgpu_dm_backlight_ops = {
+   .options = BL_CORE_SUSPENDRESUME,
.get_brightness = amdgpu_dm_backlight_get_brightness,
.update_status  = amdgpu_dm_backlight_update_status,
 };
-- 
2.17.1
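
The effect of the one-line change above can be illustrated with a small user-space model of an opt-in resume hook. This is a sketch, not the backlight core itself: the `model_*` types and functions are hypothetical, and only the flag name BL_CORE_SUSPENDRESUME is taken from the patch (its value here is illustrative).

```c
#include <assert.h>
#include <stddef.h>

#define BL_CORE_SUSPENDRESUME (1u << 0)  /* flag name from the patch; value illustrative */

struct model_backlight;

struct model_backlight_ops {
	unsigned int options;
	int (*update_status)(struct model_backlight *bd);
};

struct model_backlight {
	const struct model_backlight_ops *ops;
	int brightness;     /* level remembered in software */
	int hw_brightness;  /* level currently programmed into the panel */
};

/* Driver callback: push the remembered level to the (modelled) hardware. */
int model_update_status(struct model_backlight *bd)
{
	bd->hw_brightness = bd->brightness;
	return 0;
}

/* Resume hook: only drivers that set the option get update_status() called. */
void model_backlight_resume(struct model_backlight *bd)
{
	assert(bd != NULL && bd->ops != NULL);
	if ((bd->ops->options & BL_CORE_SUSPENDRESUME) && bd->ops->update_status)
		bd->ops->update_status(bd);
}
```

In this model a device whose ops lack the option keeps whatever level the hardware came back with after resume, which matches the symptom described in the commit message.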



[PATCH] drm/amdgpu: Add ATPX quirk for Dell Latitude 5495

2019-08-27 Thread Kai-Heng Feng
This laptop needs ATPX rather than _PR3 to really turn off the dGPU. This
can save ~5W when the dGPU is runtime-suspended.

Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
index 92b11de19581..354c8b6106dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
@@ -575,6 +575,7 @@ static const struct amdgpu_px_quirk amdgpu_px_quirk_list[] 
= {
{ 0x1002, 0x6900, 0x1002, 0x0124, AMDGPU_PX_QUIRK_FORCE_ATPX },
{ 0x1002, 0x6900, 0x1028, 0x0812, AMDGPU_PX_QUIRK_FORCE_ATPX },
{ 0x1002, 0x6900, 0x1028, 0x0813, AMDGPU_PX_QUIRK_FORCE_ATPX },
+   { 0x1002, 0x699f, 0x1028, 0x0814, AMDGPU_PX_QUIRK_FORCE_ATPX },
{ 0x1002, 0x6900, 0x1025, 0x125A, AMDGPU_PX_QUIRK_FORCE_ATPX },
{ 0x1002, 0x6900, 0x17AA, 0x3806, AMDGPU_PX_QUIRK_FORCE_ATPX },
{ 0, 0, 0, 0, 0 },
-- 
2.17.1
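
A quirk table like the one patched above is typically consumed by a linear scan that stops at the all-zero terminator. The sketch below shows that pattern in user space; `struct px_quirk`, `QUIRK_FORCE_ATPX` and `lookup_px_quirks()` are hypothetical stand-ins, and only the ID tuples are copied from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUIRK_FORCE_ATPX (1u << 0)  /* illustrative stand-in for AMDGPU_PX_QUIRK_FORCE_ATPX */

struct px_quirk {
	uint32_t chip_vendor;
	uint32_t chip_device;
	uint32_t subsys_vendor;
	uint32_t subsys_device;
	uint32_t flags;
};

/* A few entries from the diff above, ending with the zero terminator. */
static const struct px_quirk quirk_list[] = {
	{ 0x1002, 0x6900, 0x1028, 0x0812, QUIRK_FORCE_ATPX },
	{ 0x1002, 0x6900, 0x1028, 0x0813, QUIRK_FORCE_ATPX },
	{ 0x1002, 0x699f, 0x1028, 0x0814, QUIRK_FORCE_ATPX },
	{ 0, 0, 0, 0, 0 },
};

/* Return the quirk flags for an exact four-ID match, or 0 if none matches. */
uint32_t lookup_px_quirks(uint32_t vendor, uint32_t device,
			  uint32_t sub_vendor, uint32_t sub_device)
{
	const struct px_quirk *p;

	assert(quirk_list[sizeof(quirk_list) / sizeof(quirk_list[0]) - 1].chip_device == 0);
	for (p = quirk_list; p->chip_device != 0; p++) {
		if (p->chip_vendor == vendor && p->chip_device == device &&
		    p->subsys_vendor == sub_vendor &&
		    p->subsys_device == sub_device)
			return p->flags;
	}
	return 0;
}
```

Because all four IDs must match, adding the 0x699f/0x0814 tuple affects only that one device and subsystem combination.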



Re: [PATCH] drm/amdgpu: Apply flags after amdgpu_device_ip_init()

2019-08-15 Thread Kai-Heng Feng

at 21:33, Deucher, Alexander  wrote:

Thanks for finding this!  I think the attached patch should fix the issue  
and it's much less invasive.


Yes, it also fixes the issue. Please add my Tested-by:
Tested-by: Kai-Heng Feng 

I took this more or less future-proof approach because I think this won’t
be the last chip that needs firmware information, which isn’t available in
early init, to decide its flags.

Yes, it’s intrusive to carve out all flags from the early init callbacks, but I
don’t think it’s that ugly.


Kai-Heng



Alex
From: Kai-Heng Feng 
Sent: Thursday, August 15, 2019 1:11 AM
To: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing)
Cc: Huang, Ray; anthony.w...@canonical.com; amd-gfx@lists.freedesktop.org;
dri-de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; Kai-Heng Feng

Subject: [PATCH] drm/amdgpu: Apply flags after amdgpu_device_ip_init()

After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven
series (v2)"), some Raven Ridge systems started to show rendering
corruption in browsers [1].

Chip-specific flags like cg_flags and pg_flags are applied in
amdgpu_device_ip_early_init(). For Raven Ridge, the flags may depend on
pp_feature's PP_GFXOFF_MASK bit, which can be cleared in
amdgpu_device_ip_init() based on firmware information. By then it's
already too late, since cg_flags and pg_flags have already been set.

Apply the flags after amdgpu_device_ip_init() and consolidate all flags in
one place to solve the issue.

[1]  
https://lore.kernel.org/lkml/3eb0e920-31d7-4c91-a360-dbfb4417a...@canonical.com/


Fixes: 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series (v2)")

Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 589 +
 drivers/gpu/drm/amd/amdgpu/cik.c   |  87 ---
 drivers/gpu/drm/amd/amdgpu/nv.c|  50 --
 drivers/gpu/drm/amd/amdgpu/si.c|  83 ---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 140 -
 drivers/gpu/drm/amd/amdgpu/vi.c| 162 --
 6 files changed, 589 insertions(+), 522 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 275277364a8a..10ea4899c338 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1852,6 +1852,591 @@ static int amdgpu_device_ip_init(struct  
amdgpu_device *adev)

 return r;
 }

+#define CZ_REV_BRISTOL(rev) \
+   ((rev >= 0xC8 && rev <= 0xCE) || (rev >= 0xE1 && rev <= 0xE6))
+
+static int amdgpu_device_apply_flags(struct amdgpu_device *adev)
+{
+   switch (adev->asic_type) {
+   case CHIP_TAHITI:
+   adev->cg_flags =
+   AMD_CG_SUPPORT_GFX_MGCG |
+   AMD_CG_SUPPORT_GFX_MGLS |
+   /*AMD_CG_SUPPORT_GFX_CGCG |*/
+   AMD_CG_SUPPORT_GFX_CGLS |
+   AMD_CG_SUPPORT_GFX_CGTS |
+   AMD_CG_SUPPORT_GFX_CP_LS |
+   AMD_CG_SUPPORT_MC_MGCG |
+   AMD_CG_SUPPORT_SDMA_MGCG |
+   AMD_CG_SUPPORT_BIF_LS |
+   AMD_CG_SUPPORT_VCE_MGCG |
+   AMD_CG_SUPPORT_UVD_MGCG |
+   AMD_CG_SUPPORT_HDP_LS |
+   AMD_CG_SUPPORT_HDP_MGCG;
+   adev->pg_flags = 0;
+   break;
+   case CHIP_PITCAIRN:
+   adev->cg_flags =
+   AMD_CG_SUPPORT_GFX_MGCG |
+   AMD_CG_SUPPORT_GFX_MGLS |
+   /*AMD_CG_SUPPORT_GFX_CGCG |*/
+   AMD_CG_SUPPORT_GFX_CGLS |
+   AMD_CG_SUPPORT_GFX_CGTS |
+   AMD_CG_SUPPORT_GFX_CP_LS |
+   AMD_CG_SUPPORT_GFX_RLC_LS |
+   AMD_CG_SUPPORT_MC_LS |
+   AMD_CG_SUPPORT_MC_MGCG |
+   AMD_CG_SUPPORT_SDMA_MGCG |
+   AMD_CG_SUPPORT_BIF_LS |
+   AMD_CG_SUPPORT_VCE_MGCG |
+   AMD_CG_SUPPORT_UVD_MGCG |
+   AMD_CG_SUPPORT_HDP_LS |
+   AMD_CG_SUPPORT_HDP_MGCG;
+   adev->pg_flags = 0;
+   break;
+   case CHIP_VERDE:
+   adev->cg_flags =
+   AMD_CG_SUPPORT_GFX_MGCG |
+   AMD_CG_SUPPORT_GFX_MGLS |
+   AMD_CG_SUPPORT_GFX_CGLS |
+   AMD_CG_SUPPORT_GFX_CGTS |
+   AMD_CG_SUPPORT_GFX_CGTS_LS |
+   AMD_CG_SUPPORT_GFX_CP_LS |
+   AMD_CG_SUPPORT_MC_LS |
+   AMD_CG_SUPPORT_MC_MGCG |
+   AMD_CG_SUPPORT_SDMA_MGCG |
+   AMD_CG_SUPPORT_SDMA_LS |
+   AMD_C

[PATCH] drm/amdgpu: Apply flags after amdgpu_device_ip_init()

2019-08-14 Thread Kai-Heng Feng
After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven
series (v2)"), some Raven Ridge systems started to show rendering
corruption in browsers [1].

Chip-specific flags like cg_flags and pg_flags are applied in
amdgpu_device_ip_early_init(). For Raven Ridge, the flags may depend on
pp_feature's PP_GFXOFF_MASK bit, which can be cleared in
amdgpu_device_ip_init() based on firmware information. By then it's
already too late, since cg_flags and pg_flags have already been set.

Apply the flags after amdgpu_device_ip_init() and consolidate all flags in
one place to solve the issue.

[1] 
https://lore.kernel.org/lkml/3eb0e920-31d7-4c91-a360-dbfb4417a...@canonical.com/

Fixes: 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series (v2)")
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 589 +
 drivers/gpu/drm/amd/amdgpu/cik.c   |  87 ---
 drivers/gpu/drm/amd/amdgpu/nv.c|  50 --
 drivers/gpu/drm/amd/amdgpu/si.c|  83 ---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 140 -
 drivers/gpu/drm/amd/amdgpu/vi.c| 162 --
 6 files changed, 589 insertions(+), 522 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 275277364a8a..10ea4899c338 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1852,6 +1852,591 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
return r;
 }
 
+#define CZ_REV_BRISTOL(rev) \
+   ((rev >= 0xC8 && rev <= 0xCE) || (rev >= 0xE1 && rev <= 0xE6))
+
+static int amdgpu_device_apply_flags(struct amdgpu_device *adev)
+{
+   switch (adev->asic_type) {
+   case CHIP_TAHITI:
+   adev->cg_flags =
+   AMD_CG_SUPPORT_GFX_MGCG |
+   AMD_CG_SUPPORT_GFX_MGLS |
+   /*AMD_CG_SUPPORT_GFX_CGCG |*/
+   AMD_CG_SUPPORT_GFX_CGLS |
+   AMD_CG_SUPPORT_GFX_CGTS |
+   AMD_CG_SUPPORT_GFX_CP_LS |
+   AMD_CG_SUPPORT_MC_MGCG |
+   AMD_CG_SUPPORT_SDMA_MGCG |
+   AMD_CG_SUPPORT_BIF_LS |
+   AMD_CG_SUPPORT_VCE_MGCG |
+   AMD_CG_SUPPORT_UVD_MGCG |
+   AMD_CG_SUPPORT_HDP_LS |
+   AMD_CG_SUPPORT_HDP_MGCG;
+   adev->pg_flags = 0;
+   break;
+   case CHIP_PITCAIRN:
+   adev->cg_flags =
+   AMD_CG_SUPPORT_GFX_MGCG |
+   AMD_CG_SUPPORT_GFX_MGLS |
+   /*AMD_CG_SUPPORT_GFX_CGCG |*/
+   AMD_CG_SUPPORT_GFX_CGLS |
+   AMD_CG_SUPPORT_GFX_CGTS |
+   AMD_CG_SUPPORT_GFX_CP_LS |
+   AMD_CG_SUPPORT_GFX_RLC_LS |
+   AMD_CG_SUPPORT_MC_LS |
+   AMD_CG_SUPPORT_MC_MGCG |
+   AMD_CG_SUPPORT_SDMA_MGCG |
+   AMD_CG_SUPPORT_BIF_LS |
+   AMD_CG_SUPPORT_VCE_MGCG |
+   AMD_CG_SUPPORT_UVD_MGCG |
+   AMD_CG_SUPPORT_HDP_LS |
+   AMD_CG_SUPPORT_HDP_MGCG;
+   adev->pg_flags = 0;
+   break;
+   case CHIP_VERDE:
+   adev->cg_flags =
+   AMD_CG_SUPPORT_GFX_MGCG |
+   AMD_CG_SUPPORT_GFX_MGLS |
+   AMD_CG_SUPPORT_GFX_CGLS |
+   AMD_CG_SUPPORT_GFX_CGTS |
+   AMD_CG_SUPPORT_GFX_CGTS_LS |
+   AMD_CG_SUPPORT_GFX_CP_LS |
+   AMD_CG_SUPPORT_MC_LS |
+   AMD_CG_SUPPORT_MC_MGCG |
+   AMD_CG_SUPPORT_SDMA_MGCG |
+   AMD_CG_SUPPORT_SDMA_LS |
+   AMD_CG_SUPPORT_BIF_LS |
+   AMD_CG_SUPPORT_VCE_MGCG |
+   AMD_CG_SUPPORT_UVD_MGCG |
+   AMD_CG_SUPPORT_HDP_LS |
+   AMD_CG_SUPPORT_HDP_MGCG;
+   adev->pg_flags = 0;
+   //???
+   break;
+   case CHIP_OLAND:
+   adev->cg_flags =
+   AMD_CG_SUPPORT_GFX_MGCG |
+   AMD_CG_SUPPORT_GFX_MGLS |
+   /*AMD_CG_SUPPORT_GFX_CGCG |*/
+   AMD_CG_SUPPORT_GFX_CGLS |
+   AMD_CG_SUPPORT_GFX_CGTS |
+   AMD_CG_SUPPORT_GFX_CP_LS |
+   AMD_CG_SUPPORT_GFX_RLC_LS |
+   AMD_CG_SUPPORT_MC_LS |
+   AMD_CG_SUPPORT_MC_MGCG |
+   AMD_CG_SUPPORT_SDMA_MGCG |
+   AMD_CG_SUPPORT_BIF_LS |
+  
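
The ordering problem described in the commit message above can be reproduced with a tiny user-space model. It is a sketch under stated assumptions: `struct model_dev`, `MODEL_PP_GFXOFF`, `MODEL_PG_GFXOFF`, `model_apply_flags()` and `model_ip_init()` are hypothetical stand-ins for the driver's pp_feature mask, a gfxoff-dependent pg_flags bit, and the two init stages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PP_GFXOFF (1u << 0)  /* stands in for PP_GFXOFF_MASK */
#define MODEL_PG_GFXOFF (1u << 0)  /* hypothetical pg_flags bit that depends on gfxoff */

struct model_dev {
	unsigned pp_feature;
	unsigned pg_flags;
};

/* Derive the power-gating flags from the current feature mask. */
void model_apply_flags(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->pg_flags = (dev->pp_feature & MODEL_PP_GFXOFF) ? MODEL_PG_GFXOFF : 0;
}

/* Later init stage: clear gfxoff when the firmware cannot support it. */
void model_ip_init(struct model_dev *dev, bool fw_supports_gfxoff)
{
	assert(dev != NULL);
	if (!fw_supports_gfxoff)
		dev->pp_feature &= ~MODEL_PP_GFXOFF;
}
```

Calling model_apply_flags() before model_ip_init() leaves a stale MODEL_PG_GFXOFF bit set even though gfxoff was cleared afterwards; calling it after model_ip_init() yields pg_flags == 0, which is the ordering the patch moves to.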

Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

2019-08-08 Thread Kai-Heng Feng

at 14:29, Huang, Ray  wrote:


-Original Message-
From: Kai-Heng Feng 
Sent: Thursday, August 08, 2019 1:45 AM
To: Huang, Ray 
Cc: Deucher, Alexander ; Koenig, Christian
; Zhou, David(ChunMing)
; amd-gfx list ;
dri-de...@lists.freedesktop.org; LKML ;
Anthony Wong 
Subject: Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series
(v2)"

Hi Ray,

at 00:03, Huang, Ray  wrote:


May I know all the firmware versions on your system?


This seems to be the issue we encountered with IOMMU enabled. Could you
please disable the IOMMU in SBIOS or GRUB?


Yes, "amd_iommu=off" can work around the issue.

Kai-Heng



Thanks,
Ray


# cat amdgpu_firmware_info
VCE feature version: 0, firmware version: 0x
UVD feature version: 0, firmware version: 0x
MC feature version: 0, firmware version: 0x
ME feature version: 40, firmware version: 0x0099
PFP feature version: 40, firmware version: 0x00ae
CE feature version: 40, firmware version: 0x004d
RLC feature version: 1, firmware version: 0x0213
RLC SRLC feature version: 1, firmware version: 0x0001
RLC SRLG feature version: 1, firmware version: 0x0001
RLC SRLS feature version: 1, firmware version: 0x0001
MEC feature version: 40, firmware version: 0x018b
MEC2 feature version: 40, firmware version: 0x018b
SOS feature version: 0, firmware version: 0x
ASD feature version: 0, firmware version: 0x001ad4d4
TA XGMI feature version: 0, firmware version: 0x
TA RAS feature version: 0, firmware version: 0x
SMC feature version: 0, firmware version: 0x1e44
SDMA0 feature version: 41, firmware version: 0x00a9
VCN feature version: 0, firmware version: 0x0110901c
DMCU feature version: 0, firmware version: 0x
VBIOS version: 113-RAVEN-103

Kai-Heng


Thanks,
Ray

From: Kai-Heng Feng 
Sent: Wednesday, August 7, 2019 8:50 PM
To: Huang, Ray
Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); amd-gfx
list; dri-de...@lists.freedesktop.org; LKML; Anthony Wong
Subject: [Regression] "drm/amdgpu: enable gfxoff again on raven series
(v2)"

Hi,

After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series
(v2)"), browsers on Raven Ridge systems cause serious corruption like this:
https://launchpadlibrarian.net/436319772/Screenshot%20from%202019-08-07%2004-20-34.png

Firmware for Raven Ridge is up to date.

Kai-Heng





Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

2019-08-07 Thread Kai-Heng Feng

Hi Ray,

at 00:03, Huang, Ray  wrote:


May I know all the firmware versions on your system?


# cat amdgpu_firmware_info
VCE feature version: 0, firmware version: 0x
UVD feature version: 0, firmware version: 0x
MC feature version: 0, firmware version: 0x
ME feature version: 40, firmware version: 0x0099
PFP feature version: 40, firmware version: 0x00ae
CE feature version: 40, firmware version: 0x004d
RLC feature version: 1, firmware version: 0x0213
RLC SRLC feature version: 1, firmware version: 0x0001
RLC SRLG feature version: 1, firmware version: 0x0001
RLC SRLS feature version: 1, firmware version: 0x0001
MEC feature version: 40, firmware version: 0x018b
MEC2 feature version: 40, firmware version: 0x018b
SOS feature version: 0, firmware version: 0x
ASD feature version: 0, firmware version: 0x001ad4d4
TA XGMI feature version: 0, firmware version: 0x
TA RAS feature version: 0, firmware version: 0x
SMC feature version: 0, firmware version: 0x1e44
SDMA0 feature version: 41, firmware version: 0x00a9
VCN feature version: 0, firmware version: 0x0110901c
DMCU feature version: 0, firmware version: 0x
VBIOS version: 113-RAVEN-103

Kai-Heng



Thanks,
Ray

From: Kai-Heng Feng 
Sent: Wednesday, August 7, 2019 8:50 PM
To: Huang, Ray
Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); amd-gfx  
list; dri-de...@lists.freedesktop.org; LKML; Anthony Wong
Subject: [Regression] "drm/amdgpu: enable gfxoff again on raven series  
(v2)"


Hi,

After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series
(v2)"), browsers on Raven Ridge systems cause serious corruption like this:
https://launchpadlibrarian.net/436319772/Screenshot%20from%202019-08-07%2004-20-34.png

Firmware for Raven Ridge is up to date.

Kai-Heng





[Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

2019-08-07 Thread Kai-Heng Feng

Hi,

After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series
(v2)"), browsers on Raven Ridge systems cause serious corruption like this:

https://launchpadlibrarian.net/436319772/Screenshot%20from%202019-08-07%2004-20-34.png

Firmware for Raven Ridge is up to date.

Kai-Heng


Where do I file AMDGPU bugs nowadays?

2019-07-05 Thread Kai-Heng Feng

Hi AMDGPU folks,

I filed a bug [1] a while back, but have had no response so far.
Do you still use BFO, or have you migrated to another bug tracking
system?

[1] https://bugs.freedesktop.org/show_bug.cgi?id=110886

Kai-Heng

Re: [PATCH] drm/amdgpu: Add Dell Inspiron 5575/5775 back to atpx quirk table

2018-06-13 Thread Kai-Heng Feng

at 01:41, Alex Deucher  wrote:


On Tue, Jun 5, 2018 at 2:47 AM, Kai-Heng Feng
 wrote:

The original issue on these laptops was about _PR3, not the audio controller
preventing gfx from auto-suspending.


Have you verified that this patch is still necessary with the HDA
driver fix in place?


Yes I did. And the HDA fix doesn't work for these laptops.

The HDA fix lets the HDA controller be runtime suspended. OTOH, the
ATPX quirk fixes the "atombios stuck" error on these laptops.


Kai-Heng



Alex


Commit 444d95f0eeef ("Partially revert: drm/amdgpu: add atpx quirk
handling (v2)") breaks these laptops:

[   29.572055] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 5secs aborting
[   29.572738] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing 7C36 (len 272, WS 0, PS 4) @ 0x7C7F
[   29.573436] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing 6444 (len 70, WS 0, PS 8) @ 0x646A
[   29.574125] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu asic init failed
[   29.991377] amdgpu :01:00.0: Wait for MC idle timedout !
[   30.407480] amdgpu :01:00.0: Wait for MC idle timedout !
[   30.417279] [drm] PCIE GART of 256M enabled (table at 0x00F4).
[   30.426550] amdgpu: [powerplay] smu not running, upload firmware again
[   30.435710] BUG: unable to handle kernel paging request at a52b90080fec
[   30.436982] IP: smu7_populate_single_firmware_entry.isra.5+0x65/0xe0 [amdgpu]

[   30.438056] PGD 14e942067 P4D 14e942067 PUD 0
[   30.439280] Oops: 0002 [#1] SMP NOPTI
[   30.440339] Modules linked in: cmac bnep nls_iso8859_1 arc4  
ath10k_pci ath10k_core rtsx_usb_ms memstick dell_wmi uvcvideo  
sparse_keymap videobuf2_vmalloc dell_laptop videobuf2_memops dell_smbios  
dell_wmi_descriptor videobuf2_v4l2 btusb btrtl wmi_bmof joydev dcdbas  
btbcm dell_smm_hwmon videobuf2_common kvm_amd btintel ath videodev  
snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic cdc_acm  
mac80211 media bluetooth snd_hda_intel kvm snd_hda_codec snd_hwdep  
irqbypass snd_hda_core ecdh_generic crct10dif_pclmul snd_pcm  
crc32_pclmul snd_seq ghash_clmulni_intel pcbc snd_timer snd_seq_device  
aesni_intel cfg80211 snd aes_x86_64 soundcore input_leds crypto_simd  
cryptd tpm_crb hid_multitouch ucsi_acpi glue_helper typec_ucsi serio_raw  
typec video i2c_piix4 mac_hid dell_rbtn shpchp wmi parport_pc ppdev
[   30.445678]  lp parport autofs4 btrfs xor zstd_decompress  
zstd_compress xxhash raid6_pq dm_mirror dm_region_hash dm_log amdkfd  
amd_iommu_v2 amdgpu rtsx_usb_sdmmc rtsx_usb chash i2c_algo_bit gpu_sched  
drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm  
r8169 ahci libahci i2c_hid mii hid
[   30.448702] CPU: 7 PID: 1021 Comm: gpu-manager Not tainted  
4.16.0-rc7+ #1
[   30.450256] Hardware name: Dell Inc. Inspiron 5775/Inspiron 5775,  
BIOS 1.1.0 03/26/2018
[   30.451959] RIP:  
0010:smu7_populate_single_firmware_entry.isra.5+0x65/0xe0 [amdgpu]

[   30.453492] RSP: 0018:a50f816f3a58 EFLAGS: 00010246
[   30.455116] RAX: 008c RBX: a52b90080fec RCX:  

[   30.456676] RDX: 0004 RSI: 0004 RDI:  
917dfb3a5a90
[   30.458203] RBP: a50f816f3aa8 R08: 917dfb3a5a90 R09:  
00033930
[   30.459727] R10:  R11: 0412 R12:  
0003
[   30.461246] R13: 917dfb194c14 R14: 917dfac65000 R15:  
05fe
[   30.462733] FS:  7f5248978700() GS:917e0edc()  
knlGS:

[   30.464302] CS:  0010 DS:  ES:  CR0: 80050033
[   30.465830] CR2: a52b90080fec CR3: 0001358c2000 CR4:  
003406e0

[   30.467468] Call Trace:
[   30.469068]  smu7_request_smu_load_fw+0xa9/0x360 [amdgpu]
[   30.470630]  ? vga_switcheroo_fini_domain_pm_ops+0x20/0x20
[   30.472416]  iceland_start_smu+0x39/0x70 [amdgpu]
[   30.473492]  hwmgr_resume+0x2b/0xa0 [amdgpu]
[   30.474500]  pp_resume+0x15/0x20 [amdgpu]
[   30.475472]  amdgpu_device_ip_resume_phase2+0x58/0xb0 [amdgpu]
[   30.476431]  amdgpu_device_resume+0xd8/0x370 [amdgpu]
[   30.477379]  ? __pci_set_master+0x34/0xe0
[   30.478345]  ? vga_switcheroo_fini_domain_pm_ops+0x20/0x20
[   30.479317]  amdgpu_pmops_runtime_resume+0x76/0xa0 [amdgpu]
[   30.480265]  pci_pm_runtime_resume+0x76/0xb0
[   30.481216]  vga_switcheroo_runtime_resume+0x59/0x60
[   30.482201]  __rpm_callback+0xc4/0x200
[   30.483179]  ? vga_switcheroo_fini_domain_pm_ops+0x20/0x20
[   30.484075]  rpm_callback+0x24/0x80
[   30.485025]  ? vga_switcheroo_fini_domain_pm_ops+0x20/0x20
[   30.486005]  rpm_resume+0x499/0x6a0
[   30.486946]  __pm_runtime_resume+0x4e/0x80
[   30.487880]  pci_config_pm_runtime_get+0x53/0x60
[   30.488789]  pci_read_config+0x8f/0x280
[   30.489771]  sysfs_kf_bin_read+0x4a/0x70
[   30.490750]  kernfs_fop_read+0xa9/0x190
[   30.491648]  __vfs_read+0x37/0x160
[   30.492579]  ? security_file_

[PATCH] drm/amdgpu: Add Dell Inspiron 5575/5775 back to atpx quirk table

2018-06-05 Thread Kai-Heng Feng
7ca490 RDI: 0005
[   30.500796] RBP: 7ffcbd7ca490 R08:  R09: 0028
[   30.501776] R10:  R11: 0246 R12: 0005
[   30.502758] R13: 0030 R14: 7ffcbd7ca3f8 R15: 
[   30.503659] Code: 83 fc 23 f3 48 ab 48 8b 06 be 0d 00 00 00 48 8b 40 20 77 
0a 44 89 e1 0f b6 b1 a0 46 68 c0 4c 89 c7 ff d0 85 c0 75 3e 0f b7 45 b2 <66> 44 
89 23 c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 66 89 43
[   30.504668] RIP: smu7_populate_single_firmware_entry.isra.5+0x65/0xe0 
[amdgpu] RSP: a50f816f3a58
[   30.505643] CR2: a52b90080fec
[   30.506618] ---[ end trace b443cb7ec0f49d4f ]---

So add these IDs back to the ATPX quirk table.

Fixes: 444d95f0eeef ("Partially revert: drm/amdgpu: add atpx quirk handling (v2)")
Fixes: c6f5b3155fbc ("Revert "drm/amdgpu: add new device to use atpx quirk"")
Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
index 9c493e8a48a5..1b8cb076c378 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
@@ -565,6 +565,10 @@ static const struct vga_switcheroo_handler 
amdgpu_atpx_handler = {
 };
 
 static const struct amdgpu_px_quirk amdgpu_px_quirk_list[] = {
+   /* Dell Inspiron 5575 */
+   { 0x1002, 0x6900, 0x1028, 0x0812, AMDGPU_PX_QUIRK_FORCE_ATPX },
+   /* Dell Inspiron 5775 */
+   { 0x1002, 0x6900, 0x1028, 0x0813, AMDGPU_PX_QUIRK_FORCE_ATPX },
{ 0, 0, 0, 0, 0 },
 };
 
-- 
2.17.0
