Re: [PATCH v3 0/8] Support DEVICE_GENERIC memory in migrate_vma_*

2021-07-23 Thread Sierra Guiza, Alejandro (Alex)


On 7/17/2021 2:54 PM, Sierra Guiza, Alejandro (Alex) wrote:


On 7/16/2021 5:14 PM, Felix Kuehling wrote:

On 2021-07-16 at 11:07 a.m., Theodore Y. Ts'o wrote:

On Wed, Jun 23, 2021 at 05:49:55PM -0400, Felix Kuehling wrote:
I can think of two ways to test the changes for MEMORY_DEVICE_GENERIC in
this patch series in a way that is reproducible without special hardware
and firmware:

For the reference counting changes we could use the dax driver with hmem
and use efi_fake_mem on the kernel command line to create some
DEVICE_GENERIC pages. I'm open to suggestions for good user mode tests to
exercise dax functionality on this type of memory.

Sorry for the thread necromancy, but now that the merge window is
past

No worries. Alejandro should have a new version of this series in a few
days, with updates to hmm_test and some fixes.


The v4 patch series has been sent for review:
https://marc.info/?l=dri-devel&m=162654971618911&w=2

Regards,
Alex Sierra





Today I test ext4's dax support, without having any $$$ DAX hardware,
by using the kernel command line "memmap=4G!9G:memmap=9G!14G", which
reserves memory so that it creates two pmem devices, and then I run
xfstests with DAX enabled using qemu or a Google Compute Engine VM,
using TEST_DEV=/dev/pmem0 and SCRATCH_DEV=/dev/pmem1.

If you can give me a recipe for what kernel configs I should enable,
and what magic kernel command line arguments to use, then I'd be able
to test your patch set with ext4.

That would be great!

Regarding kernel config options, it should be the same as what you're
using for DAX testing today. We're not changing or adding any Kconfig
options. But let me take a stab:

ZONE_DEVICE
HMM_MIRROR
MMU_NOTIFIER
DEVICE_PRIVATE (maybe not needed for your test)
FS_DAX

Hi Theodore,
I wonder if you have had a chance to set the kernel configs from Felix to
enable DAX in xfstests.


I've been trying to test FS DAX on my side using virtio-fs + QEMU.
Unfortunately, I'm having some problems setting up the environment with
the guest kernel. Could you share your VM (QEMU or GCE) setup to run it
with xfstests?


Regards,
Alex S.



I'm not sure what you're looking for in terms of kernel command line,
other than the memmap options you already found. There are some more
options to run hmm_test with fake SPM (DEVICE_GENERIC) memory, but we're
already running that ourselves. That will also be in the next revision
of this patch series.


To run the hmm test with the generic device type enabled, set the
following:


kernel config:
EFI_FAKE_MEMMAP
RUNTIME_TESTING_MENU
TEST_HMM=m

Kernel parameters to fake SP memory. The addresses set here are based 
on my system's usable memory enumerated by BIOS-e820 at boot. The test 
requires two SP devices of at least 256MB.

efi_fake_mem=1G@0x1:0x4,1G@0x14000:0x4

To run hmm_test with the generic device type, pass the SP addresses to
the shell script:

sudo ./test_hmm.sh smoke 0x1 0x14000



If you can run your xfstests with DAX on top of this patch series, that
would be very helpful. That's to make sure the ZONE_DEVICE page refcount
changes don't break DAX.

Regards,
   Felix



Cheers,

    - Ted



Re: [PATCH 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Matt Coffin

On 7/21/21 9:20 PM, Evan Quan wrote:

The customized OD settings can be divided into two parts: those
committed ones and non-committed ones.
   - For those changes which had been fed to SMU before S3/S4/Runpm
 suspend kicked, they are committed changes. They should be properly
 restored and fed to SMU on S3/S4/Runpm resume.
   - For those non-committed changes, they are restored only without feeding
 to SMU.

Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
Signed-off-by: Evan Quan 
---


It's been a while since I've been through the SMU code, so I don't know
all the places where restore_user_profile is used, but, just to confirm,
these would still go back to default (boot) values on a GPU reset, yes?


I ask because when pushing OD settings too far, GPU resets are often 
encountered, and you might be able to create a state where it would 
continually reset if your OD settings are remembered even through a full 
reset.


Very nice feature other than that, though, as I actually keep 
`radeontop` open in a tmux session to prevent my secondary card from 
resetting my OD settings, so this will be nice.


Thanks for listening, and for the driver in general,
Matt


Re: [PATCH 3/3] drm/amdgpu: replace dce_virtual with amdgpu_vkms (v3)

2021-07-23 Thread Taylor, Ryan
[AMD Official Use Only]

Thanks Alex! I will make these changes.

Best,
Ryan

From: Alex Deucher 
Sent: Friday, July 23, 2021 7:33 AM
To: Taylor, Ryan 
Cc: Maling list - DRI developers ; amd-gfx 
list ; kernel test robot ; 
Daniel Vetter ; Siqueira, Rodrigo 
; Melissa Wen ; Deucher, 
Alexander 
Subject: Re: [PATCH 3/3] drm/amdgpu: replace dce_virtual with amdgpu_vkms (v3)

On Wed, Jul 21, 2021 at 1:07 PM Ryan Taylor  wrote:
>
> Move dce_virtual into amdgpu_vkms and update all references to
> dce_virtual with amdgpu_vkms.
>
> v2: Removed more references to dce_virtual.
>
> v3: Restored display modes from previous implementation.
>
> Reported-by: kernel test robot 
> Suggested-by: Alex Deucher 
> Signed-off-by: Ryan Taylor 
> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile  |   3 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 234 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h |   5 +-
>  drivers/gpu/drm/amd/amdgpu/cik.c |  10 +-
>  drivers/gpu/drm/amd/amdgpu/dce_virtual.c | 223 -
>  drivers/gpu/drm/amd/amdgpu/dce_virtual.h |  30 ---
>  drivers/gpu/drm/amd/amdgpu/nv.c  |  20 +-
>  drivers/gpu/drm/amd/amdgpu/si.c  |   8 +-
>  drivers/gpu/drm/amd/amdgpu/soc15.c   |  10 +-
>  drivers/gpu/drm/amd/amdgpu/vi.c  |  14 +-
>  10 files changed, 264 insertions(+), 293 deletions(-)
>  delete mode 100644 drivers/gpu/drm/amd/amdgpu/dce_virtual.c
>  delete mode 100644 drivers/gpu/drm/amd/amdgpu/dce_virtual.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 30cbcd5ce1cc..0d814c957461 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -120,8 +120,7 @@ amdgpu-y += \
>  amdgpu-y += \
> dce_v10_0.o \
> dce_v11_0.o \
> -   amdgpu_vkms.o \
> -   dce_virtual.o
> +   amdgpu_vkms.o
>
>  # add GFX block
>  amdgpu-y += \
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> index d5c1f1c58f5f..538d41ea 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> @@ -5,6 +5,15 @@
>  #include 
>
>  #include "amdgpu.h"
> +#ifdef CONFIG_DRM_AMDGPU_SI
> +#include "dce_v6_0.h"
> +#endif
> +#ifdef CONFIG_DRM_AMDGPU_CIK
> +#include "dce_v8_0.h"
> +#endif
> +#include "dce_v10_0.h"
> +#include "dce_v11_0.h"
> +#include "ivsrcid/ivsrcid_vislands30.h"
>  #include "amdgpu_vkms.h"
>  #include "amdgpu_display.h"
>
> @@ -180,12 +189,45 @@ static const struct drm_connector_funcs 
> amdgpu_vkms_connector_funcs = {
>
>  static int amdgpu_vkms_conn_get_modes(struct drm_connector *connector)
>  {
> -   int count;
> +   struct drm_device *dev = connector->dev;
> +   struct drm_display_mode *mode = NULL;
> +   unsigned i;
> +   static const struct mode_size {
> +   int w;
> +   int h;
> +   } common_modes[] = {
> +   { 640,  480},
> +   { 720,  480},
> +   { 800,  600},
> +   { 848,  480},
> +   {1024,  768},
> +   {1152,  768},
> +   {1280,  720},
> +   {1280,  800},
> +   {1280,  854},
> +   {1280,  960},
> +   {1280, 1024},
> +   {1440,  900},
> +   {1400, 1050},
> +   {1680, 1050},
> +   {1600, 1200},
> +   {1920, 1080},
> +   {1920, 1200},
> +   {2560, 1440},
> +   {4096, 3112},
> +   {3656, 2664},
> +   {3840, 2160},
> +   {4096, 2160},
> +   };
> +
> +   for (i = 0; i < ARRAY_SIZE(common_modes); i++) {
> +   mode = drm_cvt_mode(dev, common_modes[i].w, 
> common_modes[i].h, 60, false, false, false);
> +   drm_mode_probed_add(connector, mode);
> +   }
>
> -   count = drm_add_modes_noedid(connector, XRES_MAX, YRES_MAX);
> drm_set_preferred_mode(connector, XRES_DEF, YRES_DEF);
>
> -   return count;
> +   return ARRAY_SIZE(common_modes);
>  }
>
>  static const struct drm_connector_helper_funcs amdgpu_vkms_conn_helper_funcs 
> = {
> @@ -409,3 +451,189 @@ int amdgpu_vkms_output_init(struct drm_device *dev,
>
> return ret;
>  }
> +
> +const struct drm_mode_config_funcs amdgpu_vkms_mode_funcs = {
> +   .fb_create = amdgpu_display_user_framebuffer_create,
> +   .atomic_check = drm_atomic_helper_check,
> +   .atomic_commit = drm_atomic_helper_commit,
> +};
> +
> +static int amdgpu_vkms_sw_init(void *handle)
> +{
> +   int r, i;
> +   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +
> +   adev_to_drm(adev)->max_vblank_count = 0;
> +
> +   adev_to_drm(adev)->mode_config.funcs = &amdgpu_vkms_mode_funcs;
> +
> +   adev_to_drm(adev)->mode_config.max_width = XRES_MAX;
> +   adev_to_drm(adev)->mode_config.max_height = 

Re: [PATCH V2 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Lazar, Lijo
Good one, that makes better use of flags.

Thanks,
Lijo

From: Chen, Guchun 
Sent: Friday, July 23, 2021 7:14:58 PM
To: Lazar, Lijo ; Quan, Evan ; 
amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander 
Subject: RE: [PATCH V2 1/2] drm/amd/pm: restore user customized OD settings 
properly for NV1x

[Public]

Just curious: in the patch, a variable *user_od* is added to serve as the check
for applying user customized OD settings. Instead of this, does it make sense to
use the *flags* field in struct smu_user_dpm_profile? As we have the bit below in
pm/inc/amdgpu_smu.h, can we add another bit for OD restore? This would help
unify the usage of *flags* in SMU.

#define SMU_DPM_USER_PROFILE_RESTORE (1 << 0)

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Lazar, Lijo
Sent: Friday, July 23, 2021 6:09 PM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: Re: [PATCH V2 1/2] drm/amd/pm: restore user customized OD settings 
properly for NV1x

The series looks good to me, though I prefer to use a common logic to restore
OD settings so that smu v12 and smu v13 get the restore feature by default once
they add the user table logic. I don't have a strong argument for it unless
Alex, Kenneth or others have some comments.

Anyway, the series is
Reviewed-by: Lijo Lazar 

On 7/23/2021 2:39 PM, Evan Quan wrote:
> The customized OD settings can be divided into two parts: those
> committed ones and non-committed ones.
>- For those changes which had been fed to SMU before S3/S4/Runpm
>  suspend kicked, they are committed changes. They should be properly
>  restored and fed to SMU on S3/S4/Runpm resume.
>- For those non-committed changes, they are restored only without feeding
>  to SMU.
>
> Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
> Signed-off-by: Evan Quan 
> --
> v1->v2
>- better naming and logic revised for checking OD setting
> update(Lijo)
> ---
>   drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
>   drivers/gpu/drm/amd/pm/inc/smu_v11_0.h|  2 +
>   drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 +++
>   .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 55 +--
>   .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 25 +
>   5 files changed, 82 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> index 3e89852e4820..c2c201b8e3cf 100644
> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> @@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
>uint32_t power_limit;
>uint32_t fan_speed_percent;
>uint32_t flags;
> + uint32_t user_od;
>
>/* user clock state information */
>uint32_t clk_mask[SMU_CLK_COUNT];
> @@ -352,6 +353,7 @@ struct smu_table_context
>
>void*overdrive_table;
>void*boot_overdrive_table;
> + void*user_overdrive_table;
>
>uint32_tgpu_metrics_table_size;
>void*gpu_metrics_table;
> @@ -623,6 +625,12 @@ struct pptable_funcs {
> enum PP_OD_DPM_TABLE_COMMAND type,
> long *input, uint32_t size);
>
> + /**
> +  * @restore_user_od_settings: Restore the user customized
> +  *OD settings on S3/S4/Runpm resume.
> +  */
> + int (*restore_user_od_settings)(struct smu_context *smu);
> +
>/**
> * @get_clock_by_type_with_latency: Get the speed and latency of a 
> clock
> *  domain.
> diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> index 385b2ea5379c..1e42aafbb9fd 100644
> --- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> +++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> @@ -302,5 +302,7 @@ void smu_v11_0_interrupt_work(struct smu_context
> *smu);
>
>   int smu_v11_0_set_light_sbr(struct smu_context *smu, bool enable);
>
> +int smu_v11_0_restore_user_od_settings(struct smu_context *smu);
> +
>   #endif
>   #endif
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index ebe672142808..8ca7337ea5fc 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -416,6 +416,15 @@ static void smu_restore_dpm_user_profile(struct 
> smu_context *smu)
>}
>}
>
> + /* Restore user customized OD settings */
> + if (smu->user_dpm_profile.user_od) {
> + if (smu->ppt_funcs->restore_user_od_settings) {
> + ret = smu->ppt_funcs->restore_user_od_settings(smu);
> + if (ret)
> + dev_err(smu->adev->dev, "Failed to upload 
> customized OD settings\n");

RE: [PATCH V2 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Chen, Guchun
[Public]

Just curious: in the patch, a variable *user_od* is added to serve as the check
for applying user customized OD settings. Instead of this, does it make sense to
use the *flags* field in struct smu_user_dpm_profile? As we have the bit below in
pm/inc/amdgpu_smu.h, can we add another bit for OD restore? This would help
unify the usage of *flags* in SMU.

#define SMU_DPM_USER_PROFILE_RESTORE (1 << 0)
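
A minimal sketch of that suggestion (the new bit name and the call sites are
assumptions, not part of the posted patch):

#define SMU_DPM_USER_PROFILE_RESTORE    (1 << 0)
#define SMU_DPM_USER_PROFILE_OD_RESTORE (1 << 1)   /* hypothetical new bit */

	/* instead of smu->user_dpm_profile.user_od = true; */
	smu->user_dpm_profile.flags |= SMU_DPM_USER_PROFILE_OD_RESTORE;

	/* and on resume, instead of checking user_od: */
	if ((smu->user_dpm_profile.flags & SMU_DPM_USER_PROFILE_OD_RESTORE) &&
	    smu->ppt_funcs->restore_user_od_settings)
		ret = smu->ppt_funcs->restore_user_od_settings(smu);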

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Lazar, Lijo
Sent: Friday, July 23, 2021 6:09 PM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: Re: [PATCH V2 1/2] drm/amd/pm: restore user customized OD settings 
properly for NV1x

The series looks good to me, though I prefer to use a common logic to restore
OD settings so that smu v12 and smu v13 get the restore feature by default once
they add the user table logic. I don't have a strong argument for it unless
Alex, Kenneth or others have some comments.

Anyway, the series is
Reviewed-by: Lijo Lazar 

On 7/23/2021 2:39 PM, Evan Quan wrote:
> The customized OD settings can be divided into two parts: those 
> committed ones and non-committed ones.
>- For those changes which had been fed to SMU before S3/S4/Runpm
>  suspend kicked, they are committed changes. They should be properly
>  restored and fed to SMU on S3/S4/Runpm resume.
>- For those non-committed changes, they are restored only without feeding
>  to SMU.
> 
> Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
> Signed-off-by: Evan Quan 
> --
> v1->v2
>- better naming and logic revised for checking OD setting 
> update(Lijo)
> ---
>   drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
>   drivers/gpu/drm/amd/pm/inc/smu_v11_0.h|  2 +
>   drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 +++
>   .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 55 +--
>   .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 25 +
>   5 files changed, 82 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h 
> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> index 3e89852e4820..c2c201b8e3cf 100644
> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> @@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
>   uint32_t power_limit;
>   uint32_t fan_speed_percent;
>   uint32_t flags;
> + uint32_t user_od;
>   
>   /* user clock state information */
>   uint32_t clk_mask[SMU_CLK_COUNT];
> @@ -352,6 +353,7 @@ struct smu_table_context
>   
>   void*overdrive_table;
>   void*boot_overdrive_table;
> + void*user_overdrive_table;
>   
>   uint32_tgpu_metrics_table_size;
>   void*gpu_metrics_table;
> @@ -623,6 +625,12 @@ struct pptable_funcs {
>enum PP_OD_DPM_TABLE_COMMAND type,
>long *input, uint32_t size);
>   
> + /**
> +  * @restore_user_od_settings: Restore the user customized
> +  *OD settings on S3/S4/Runpm resume.
> +  */
> + int (*restore_user_od_settings)(struct smu_context *smu);
> +
>   /**
>* @get_clock_by_type_with_latency: Get the speed and latency of a clock
>*  domain.
> diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h 
> b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> index 385b2ea5379c..1e42aafbb9fd 100644
> --- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> +++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> @@ -302,5 +302,7 @@ void smu_v11_0_interrupt_work(struct smu_context 
> *smu);
>   
>   int smu_v11_0_set_light_sbr(struct smu_context *smu, bool enable);
>   
> +int smu_v11_0_restore_user_od_settings(struct smu_context *smu);
> +
>   #endif
>   #endif
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index ebe672142808..8ca7337ea5fc 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -416,6 +416,15 @@ static void smu_restore_dpm_user_profile(struct 
> smu_context *smu)
>   }
>   }
>   
> + /* Restore user customized OD settings */
> + if (smu->user_dpm_profile.user_od) {
> + if (smu->ppt_funcs->restore_user_od_settings) {
> + ret = smu->ppt_funcs->restore_user_od_settings(smu);
> + if (ret)
> + dev_err(smu->adev->dev, "Failed to upload 
> customized OD settings\n");
> + }
> + }
> +
>   /* Disable restore flag */
>   smu->user_dpm_profile.flags &= ~SMU_DPM_USER_PROFILE_RESTORE;
>   }
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 59ea59acfb00..d7722c229ddd 100644
> --- 

Re: [PATCH V2 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Deucher, Alexander
[AMD Official Use Only]

I haven't had a chance to look at the patches too closely, but if it could be
done in a generic way, that makes sense to me. Maybe as a follow-up patch?
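
For reference, a rough sketch of what such a generic follow-up helper could
look like, assuming the user_overdrive_table pointer added by this series (the
function name and placement are assumptions, not a committed patch):

int smu_cmn_restore_user_od_settings(struct smu_context *smu)
{
	void *user_od_table = smu->smu_table.user_overdrive_table;
	int ret;

	/* push the saved user OD table back to the SMU on resume */
	ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE, 0, user_od_table, true);
	if (ret)
		dev_err(smu->adev->dev, "Failed to import overdrive table!\n");

	return ret;
}

With something like this in common code, any SMU version that fills
user_overdrive_table would get the restore behaviour without duplicating the
logic per ASIC.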

Alex


From: Lazar, Lijo 
Sent: Friday, July 23, 2021 6:09 AM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org 

Cc: Deucher, Alexander 
Subject: Re: [PATCH V2 1/2] drm/amd/pm: restore user customized OD settings 
properly for NV1x

The series looks good to me, though I prefer to use a common logic to restore
OD settings so that smu v12 and smu v13 get the restore feature by default once
they add the user table logic. I don't have a strong argument for it unless
Alex, Kenneth or others have some comments.

Anyway, the series is
Reviewed-by: Lijo Lazar 

On 7/23/2021 2:39 PM, Evan Quan wrote:
> The customized OD settings can be divided into two parts: those
> committed ones and non-committed ones.
>- For those changes which had been fed to SMU before S3/S4/Runpm
>  suspend kicked, they are committed changes. They should be properly
>  restored and fed to SMU on S3/S4/Runpm resume.
>- For those non-committed changes, they are restored only without feeding
>  to SMU.
>
> Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
> Signed-off-by: Evan Quan 
> --
> v1->v2
>- better naming and logic revised for checking OD setting update(Lijo)
> ---
>   drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
>   drivers/gpu/drm/amd/pm/inc/smu_v11_0.h|  2 +
>   drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 +++
>   .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 55 +--
>   .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 25 +
>   5 files changed, 82 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h 
> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> index 3e89852e4820..c2c201b8e3cf 100644
> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> @@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
>uint32_t power_limit;
>uint32_t fan_speed_percent;
>uint32_t flags;
> + uint32_t user_od;
>
>/* user clock state information */
>uint32_t clk_mask[SMU_CLK_COUNT];
> @@ -352,6 +353,7 @@ struct smu_table_context
>
>void*overdrive_table;
>void*boot_overdrive_table;
> + void*user_overdrive_table;
>
>uint32_tgpu_metrics_table_size;
>void*gpu_metrics_table;
> @@ -623,6 +625,12 @@ struct pptable_funcs {
> enum PP_OD_DPM_TABLE_COMMAND type,
> long *input, uint32_t size);
>
> + /**
> +  * @restore_user_od_settings: Restore the user customized
> +  *OD settings on S3/S4/Runpm resume.
> +  */
> + int (*restore_user_od_settings)(struct smu_context *smu);
> +
>/**
> * @get_clock_by_type_with_latency: Get the speed and latency of a 
> clock
> *  domain.
> diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h 
> b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> index 385b2ea5379c..1e42aafbb9fd 100644
> --- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> +++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
> @@ -302,5 +302,7 @@ void smu_v11_0_interrupt_work(struct smu_context *smu);
>
>   int smu_v11_0_set_light_sbr(struct smu_context *smu, bool enable);
>
> +int smu_v11_0_restore_user_od_settings(struct smu_context *smu);
> +
>   #endif
>   #endif
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index ebe672142808..8ca7337ea5fc 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -416,6 +416,15 @@ static void smu_restore_dpm_user_profile(struct 
> smu_context *smu)
>}
>}
>
> + /* Restore user customized OD settings */
> + if (smu->user_dpm_profile.user_od) {
> + if (smu->ppt_funcs->restore_user_od_settings) {
> + ret = smu->ppt_funcs->restore_user_od_settings(smu);
> + if (ret)
> + dev_err(smu->adev->dev, "Failed to upload 
> customized OD settings\n");
> + }
> + }
> +
>/* Disable restore flag */
>smu->user_dpm_profile.flags &= ~SMU_DPM_USER_PROFILE_RESTORE;
>   }
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 59ea59acfb00..d7722c229ddd 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2294,41 +2294,52 @@ static int navi10_set_default_od_settings(struct 
> smu_context *smu)
>(OverDriveTable_t *)smu->smu_table.overdrive_table;
>  

Re: [PATCH v2] drm/amdgpu: Check pmops for desired suspend state

2021-07-23 Thread Lazar, Lijo




On 7/23/2021 6:18 PM, Pratik Vishwakarma wrote:

[Why]
User might change the suspend behaviour from OS.

[How]
Check with pm for target suspend state and set s0ix
flag only for s2idle state.

v2: User might change default suspend state, use target state
Suggested-by: Lijo Lazar 
Signed-off-by: Pratik Vishwakarma 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index 84a1b4bc9bb4..bf59bb263816 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -1042,7 +1042,7 @@ bool amdgpu_acpi_is_s0ix_supported(struct amdgpu_device 
*adev)
  #if defined(CONFIG_AMD_PMC) || defined(CONFIG_AMD_PMC_MODULE)
if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) {
if (adev->flags & AMD_IS_APU)
-   return true;
+   return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;


Not sure if this is the right place, the name _is_s0ix_supported() gives 
the sense of a static check - whether the feature is supported.


pm_suspend_target_state is a dynamic one - actual suspend state to which 
transition happens. Ex: s0ix may be supported, but user may choose 
suspend to RAM.


Maybe rename to is_s0ix_transition? I will let Alex comment as he added
this function originally.


Thanks,
Lijo


}
  #endif
return false;




[PATCH v2] drm/amdgpu: Check pmops for desired suspend state

2021-07-23 Thread Pratik Vishwakarma
[Why]
User might change the suspend behaviour from OS.

[How]
Check with pm for target suspend state and set s0ix
flag only for s2idle state.

v2: User might change default suspend state, use target state
Suggested-by: Lijo Lazar 
Signed-off-by: Pratik Vishwakarma 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index 84a1b4bc9bb4..bf59bb263816 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -1042,7 +1042,7 @@ bool amdgpu_acpi_is_s0ix_supported(struct amdgpu_device 
*adev)
 #if defined(CONFIG_AMD_PMC) || defined(CONFIG_AMD_PMC_MODULE)
if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) {
if (adev->flags & AMD_IS_APU)
-   return true;
+   return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
}
 #endif
return false;
-- 
2.25.1



Re: [PATCH 3/3] drm/amdgpu: replace dce_virtual with amdgpu_vkms (v3)

2021-07-23 Thread Alex Deucher
On Wed, Jul 21, 2021 at 1:07 PM Ryan Taylor  wrote:
>
> Move dce_virtual into amdgpu_vkms and update all references to
> dce_virtual with amdgpu_vkms.
>
> v2: Removed more references to dce_virtual.
>
> v3: Restored display modes from previous implementation.
>
> Reported-by: kernel test robot 
> Suggested-by: Alex Deucher 
> Signed-off-by: Ryan Taylor 
> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile  |   3 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 234 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h |   5 +-
>  drivers/gpu/drm/amd/amdgpu/cik.c |  10 +-
>  drivers/gpu/drm/amd/amdgpu/dce_virtual.c | 223 -
>  drivers/gpu/drm/amd/amdgpu/dce_virtual.h |  30 ---
>  drivers/gpu/drm/amd/amdgpu/nv.c  |  20 +-
>  drivers/gpu/drm/amd/amdgpu/si.c  |   8 +-
>  drivers/gpu/drm/amd/amdgpu/soc15.c   |  10 +-
>  drivers/gpu/drm/amd/amdgpu/vi.c  |  14 +-
>  10 files changed, 264 insertions(+), 293 deletions(-)
>  delete mode 100644 drivers/gpu/drm/amd/amdgpu/dce_virtual.c
>  delete mode 100644 drivers/gpu/drm/amd/amdgpu/dce_virtual.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 30cbcd5ce1cc..0d814c957461 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -120,8 +120,7 @@ amdgpu-y += \
>  amdgpu-y += \
> dce_v10_0.o \
> dce_v11_0.o \
> -   amdgpu_vkms.o \
> -   dce_virtual.o
> +   amdgpu_vkms.o
>
>  # add GFX block
>  amdgpu-y += \
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> index d5c1f1c58f5f..538d41ea 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> @@ -5,6 +5,15 @@
>  #include 
>
>  #include "amdgpu.h"
> +#ifdef CONFIG_DRM_AMDGPU_SI
> +#include "dce_v6_0.h"
> +#endif
> +#ifdef CONFIG_DRM_AMDGPU_CIK
> +#include "dce_v8_0.h"
> +#endif
> +#include "dce_v10_0.h"
> +#include "dce_v11_0.h"
> +#include "ivsrcid/ivsrcid_vislands30.h"
>  #include "amdgpu_vkms.h"
>  #include "amdgpu_display.h"
>
> @@ -180,12 +189,45 @@ static const struct drm_connector_funcs 
> amdgpu_vkms_connector_funcs = {
>
>  static int amdgpu_vkms_conn_get_modes(struct drm_connector *connector)
>  {
> -   int count;
> +   struct drm_device *dev = connector->dev;
> +   struct drm_display_mode *mode = NULL;
> +   unsigned i;
> +   static const struct mode_size {
> +   int w;
> +   int h;
> +   } common_modes[] = {
> +   { 640,  480},
> +   { 720,  480},
> +   { 800,  600},
> +   { 848,  480},
> +   {1024,  768},
> +   {1152,  768},
> +   {1280,  720},
> +   {1280,  800},
> +   {1280,  854},
> +   {1280,  960},
> +   {1280, 1024},
> +   {1440,  900},
> +   {1400, 1050},
> +   {1680, 1050},
> +   {1600, 1200},
> +   {1920, 1080},
> +   {1920, 1200},
> +   {2560, 1440},
> +   {4096, 3112},
> +   {3656, 2664},
> +   {3840, 2160},
> +   {4096, 2160},
> +   };
> +
> +   for (i = 0; i < ARRAY_SIZE(common_modes); i++) {
> +   mode = drm_cvt_mode(dev, common_modes[i].w, 
> common_modes[i].h, 60, false, false, false);
> +   drm_mode_probed_add(connector, mode);
> +   }
>
> -   count = drm_add_modes_noedid(connector, XRES_MAX, YRES_MAX);
> drm_set_preferred_mode(connector, XRES_DEF, YRES_DEF);
>
> -   return count;
> +   return ARRAY_SIZE(common_modes);
>  }
>
>  static const struct drm_connector_helper_funcs amdgpu_vkms_conn_helper_funcs 
> = {
> @@ -409,3 +451,189 @@ int amdgpu_vkms_output_init(struct drm_device *dev,
>
> return ret;
>  }
> +
> +const struct drm_mode_config_funcs amdgpu_vkms_mode_funcs = {
> +   .fb_create = amdgpu_display_user_framebuffer_create,
> +   .atomic_check = drm_atomic_helper_check,
> +   .atomic_commit = drm_atomic_helper_commit,
> +};
> +
> +static int amdgpu_vkms_sw_init(void *handle)
> +{
> +   int r, i;
> +   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +
> +   adev_to_drm(adev)->max_vblank_count = 0;
> +
> +   adev_to_drm(adev)->mode_config.funcs = &amdgpu_vkms_mode_funcs;
> +
> +   adev_to_drm(adev)->mode_config.max_width = XRES_MAX;
> +   adev_to_drm(adev)->mode_config.max_height = YRES_MAX;
> +
> +   adev_to_drm(adev)->mode_config.preferred_depth = 24;
> +   adev_to_drm(adev)->mode_config.prefer_shadow = 1;
> +
> +   adev_to_drm(adev)->mode_config.fb_base = adev->gmc.aper_base;
> +
> +   r = amdgpu_display_modeset_create_props(adev);
> +   if (r)
> +   return r;
> +
> +   adev->amdgpu_vkms_output = kzalloc(sizeof(struct amdgpu_vkms_output) 
> * 

Re: [PATCH 1/3] drm/amdgpu: create amdgpu_vkms (v2)

2021-07-23 Thread Alex Deucher
On Wed, Jul 21, 2021 at 1:07 PM Ryan Taylor  wrote:
>
> Modify the VKMS driver into an api that dce_virtual can use to create
> virtual displays that obey drm's atomic modesetting api.
>
> v2: Made local functions static.
>
> Reported-by: kernel test robot 
> Signed-off-by: Ryan Taylor 
> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile  |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h  |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c   |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 411 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h |  29 ++
>  drivers/gpu/drm/amd/amdgpu/dce_virtual.c |  23 +-
>  7 files changed, 458 insertions(+), 11 deletions(-)
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index f089794bbdd5..30cbcd5ce1cc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -120,6 +120,7 @@ amdgpu-y += \
>  amdgpu-y += \
> dce_v10_0.o \
> dce_v11_0.o \
> +   amdgpu_vkms.o \
> dce_virtual.o
>
>  # add GFX block
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 54cf647bd018..d0a2f2ed433d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -919,6 +919,7 @@ struct amdgpu_device {
>
> /* display */
> boolenable_virtual_display;
> +   struct amdgpu_vkms_output   *amdgpu_vkms_output;
> struct amdgpu_mode_info mode_info;
> /* For pre-DCE11. DCE11 and later are in "struct amdgpu_device->dm" */
> struct work_struct  hotplug_work;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index d0c935cf4f0f..1b016e5bc75f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1230,7 +1230,7 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
> int ret, retry = 0;
> bool supports_atomic = false;
>
> -   if (!amdgpu_virtual_display &&
> +   if (amdgpu_virtual_display ||
> amdgpu_device_asic_has_dc_support(flags & AMD_ASIC_MASK))
> supports_atomic = true;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> index 09b048647523..5a143ca02cf9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> @@ -344,7 +344,7 @@ int amdgpu_fbdev_init(struct amdgpu_device *adev)
> }
>
> /* disable all the possible outputs/crtcs before entering KMS mode */
> -   if (!amdgpu_device_has_dc_support(adev))
> +   if (!amdgpu_device_has_dc_support(adev) && !amdgpu_virtual_display)
> drm_helper_disable_unused_functions(adev_to_drm(adev));
>
> drm_fb_helper_initial_config(&rfbdev->helper, bpp_sel);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> new file mode 100644
> index ..d5c1f1c58f5f
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> @@ -0,0 +1,411 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "amdgpu.h"
> +#include "amdgpu_vkms.h"
> +#include "amdgpu_display.h"
> +
> +/**
> + * DOC: amdgpu_vkms
> + *
> + * The amdgpu vkms interface provides a virtual KMS interface for several use
> + * cases: devices without display hardware, platforms where the actual 
> display
> + * hardware is not useful (e.g., servers), SR-IOV virtual functions, device
> + * emulation/simulation, and device bring up prior to display hardware being
> + * usable. We previously emulated a legacy KMS interface, but there was a 
> desire
> + * to move to the atomic KMS interface. The vkms driver did everything we
> + * needed, but we wanted KMS support natively in the driver without buffer
> + * sharing and the ability to support an instance of VKMS per device. We 
> first
> + * looked at splitting vkms into a stub driver and a helper module that other
> + * drivers could use to implement a virtual display, but this strategy ended 
> up
> + * being messy due to driver specific callbacks needed for buffer management.
> + * Ultimately, it proved easier to import the vkms code as it mostly used 
> core
> + * drm helpers anyway.
> + */
> +
> +static const u32 amdgpu_vkms_formats[] = {
> +   DRM_FORMAT_XRGB8888,
> +};
> +
> +static enum hrtimer_restart amdgpu_vkms_vblank_simulate(struct hrtimer 
> *timer)
> +{
> +   struct amdgpu_vkms_output *output = container_of(timer,
> +struct 
> amdgpu_vkms_output,
> +

Re: [PATCH V2 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Lazar, Lijo
The series looks good to me, though I prefer to use a common logic to restore
OD settings so that smu v12 and smu v13 get the restore feature by default once
they add the user table logic. I don't have a strong argument for it unless
Alex, Kenneth or others have some comments.


Anyway, the series is
Reviewed-by: Lijo Lazar 

On 7/23/2021 2:39 PM, Evan Quan wrote:

The customized OD settings can be divided into two parts: those
committed ones and non-committed ones.
   - For those changes which had been fed to SMU before S3/S4/Runpm
 suspend kicked, they are committed changes. They should be properly
 restored and fed to SMU on S3/S4/Runpm resume.
   - For those non-committed changes, they are restored only without feeding
 to SMU.

Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
Signed-off-by: Evan Quan 
--
v1->v2
   - better naming and logic revised for checking OD setting update(Lijo)
---
  drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
  drivers/gpu/drm/amd/pm/inc/smu_v11_0.h|  2 +
  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 +++
  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 55 +--
  .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 25 +
  5 files changed, 82 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
index 3e89852e4820..c2c201b8e3cf 100644
--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
@@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
uint32_t power_limit;
uint32_t fan_speed_percent;
uint32_t flags;
+   uint32_t user_od;
  
  	/* user clock state information */

uint32_t clk_mask[SMU_CLK_COUNT];
@@ -352,6 +353,7 @@ struct smu_table_context
  
  	void*overdrive_table;

void*boot_overdrive_table;
+   void*user_overdrive_table;
  
  	uint32_t			gpu_metrics_table_size;

void*gpu_metrics_table;
@@ -623,6 +625,12 @@ struct pptable_funcs {
 enum PP_OD_DPM_TABLE_COMMAND type,
 long *input, uint32_t size);
  
+	/**

+* @restore_user_od_settings: Restore the user customized
+*OD settings on S3/S4/Runpm resume.
+*/
+   int (*restore_user_od_settings)(struct smu_context *smu);
+
/**
 * @get_clock_by_type_with_latency: Get the speed and latency of a clock
 *  domain.
diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h 
b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
index 385b2ea5379c..1e42aafbb9fd 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
@@ -302,5 +302,7 @@ void smu_v11_0_interrupt_work(struct smu_context *smu);
  
  int smu_v11_0_set_light_sbr(struct smu_context *smu, bool enable);
  
+int smu_v11_0_restore_user_od_settings(struct smu_context *smu);

+
  #endif
  #endif
diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index ebe672142808..8ca7337ea5fc 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -416,6 +416,15 @@ static void smu_restore_dpm_user_profile(struct 
smu_context *smu)
}
}
  
+	/* Restore user customized OD settings */

+   if (smu->user_dpm_profile.user_od) {
+   if (smu->ppt_funcs->restore_user_od_settings) {
+   ret = smu->ppt_funcs->restore_user_od_settings(smu);
+   if (ret)
+   dev_err(smu->adev->dev, "Failed to upload customized 
OD settings\n");
+   }
+   }
+
/* Disable restore flag */
smu->user_dpm_profile.flags &= ~SMU_DPM_USER_PROFILE_RESTORE;
  }
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 59ea59acfb00..d7722c229ddd 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2294,41 +2294,52 @@ static int navi10_set_default_od_settings(struct 
smu_context *smu)
(OverDriveTable_t *)smu->smu_table.overdrive_table;
OverDriveTable_t *boot_od_table =
(OverDriveTable_t *)smu->smu_table.boot_overdrive_table;
+   OverDriveTable_t *user_od_table =
+   (OverDriveTable_t *)smu->smu_table.user_overdrive_table;
int ret = 0;
  
-	ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE, 0, (void *)od_table, false);

+   /*
+* For S3/S4/Runpm resume, no need to setup those overdrive tables 
again as
+*   - either they already have the default OD settings got during cold 
bootup
+*   - or they have some user customized OD settings which cannot be 
overwritten
+*/
+   if 

Re: [PATCH v2] drm/amdgpu: Check pmops for desired suspend state

2021-07-23 Thread Alex Deucher
On Fri, Jul 23, 2021 at 9:08 AM Lazar, Lijo  wrote:
>
>
>
> On 7/23/2021 6:18 PM, Pratik Vishwakarma wrote:
> > [Why]
> > User might change the suspend behaviour from OS.
> >
> > [How]
> > Check with pm for target suspend state and set s0ix
> > flag only for s2idle state.
> >
> > v2: User might change default suspend state, use target state
> > Suggested-by: Lijo Lazar 
> > Signed-off-by: Pratik Vishwakarma 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > index 84a1b4bc9bb4..bf59bb263816 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > @@ -1042,7 +1042,7 @@ bool amdgpu_acpi_is_s0ix_supported(struct 
> > amdgpu_device *adev)
> >   #if defined(CONFIG_AMD_PMC) || defined(CONFIG_AMD_PMC_MODULE)
> >   if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) {
> >   if (adev->flags & AMD_IS_APU)
> > - return true;
> > + return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
>
> Not sure if this is the right place, the name _is_s0ix_supported() gives
> the sense of a static check - whether the feature is supported.
>
> pm_suspend_target_state is a dynamic one - actual suspend state to which
> transition happens. Ex: s0ix may be supported, but user may choose
> suspend to RAM.
>
> Maybe rename to is_s0ix_transition? I will let Alex comment as he added
> this function originally.

Maybe change it to amdgpu_acpi_is_s0ix_active()?  That better reflects
how it's used.  But please do that as a separate follow up patch so
that we can more easily backport this patch to older kernels.  Thanks.
Reviewed-by: Alex Deucher 

Alex
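
For illustration, the suggested follow-up rename could look roughly like this
(the body is simply the patch above; only the name changes, so treat it as a
sketch rather than a committed change):

bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev)
{
#if defined(CONFIG_AMD_PMC) || defined(CONFIG_AMD_PMC_MODULE)
	if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) {
		if (adev->flags & AMD_IS_APU)
			return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
	}
#endif
	return false;
}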

>
> Thanks,
> Lijo
>
> >   }
> >   #endif
> >   return false;
> >


Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Jingwen Chen
On Fri Jul 23, 2021 at 10:45:49AM +0200, Christian König wrote:
> On 23.07.21 at 09:07, Jingwen Chen wrote:
> > [SNIP]
> > Hi Christian,
> > 
> > The thing is the vm flush fence has no job passed to amdgpu_fence_emit, so
> > going through the jobs cannot help find the vm flush fence. And keeping the
> > rest of the fences in the RCU array will lead to signaling them in the ib_test
> > right after ASIC reset, while they will be signaled again during
> > resubmission. What I'm doing here is just trying to clean up the fences
> > without a parent job and make sure the rest of the fences won't be signaled
> > twice.
> 
> It took me a moment to realize what you are talking about here.
> 
> This is for the KIQ! You need different handling for the KIQ than for the
> scheduler controlled rings.
> 
> It is not only the flush jobs, but the KIQ needs a complete reset because of
> the register writes pushed there as well.
> 
> > > And please drop any usage of DMA_FENCE_FLAG_USER_BITS. That is not 
> > > something
> > > which should be abused for reset handling.
> > > 
> > The DMA_FENCE_FLAG_USER_BITS here is to mark this fence has a parent
> > job. If this is not a proper usage here, do you have any suggestions
> > about how to identify whether the fence has a parent job?
> 
> You don't need to mark the fences at all. Everything on the KIQ ring needs
> to be force completed since none of the fences on that ring have a parent
> job.
> 
> Christian.
>
Hi Christian

Yes, KIQ ring fences all need force_completion. But we do need to mark the
fences. Say we have a gfx ring job with vm_flush=1 sent to
amdgpu_ib_schedule; then in amdgpu_vm_flush, after
amdgpu_ring_emit_vm_flush is done, we will create a vm flush fence on the
gfx ring with amdgpu_fence_emit(ring, &fence, NULL, 0).

Then this vm flush fence we create here has no parent job, but it's on the
gfx ring. If we only do force_completion for the KIQ ring and just clear the
RCU array for the scheduler-controlled rings, nobody will signal and put this
gfx ring vm_flush fence again. When this job is resubmitted, it will
just create a new vm_flush fence. This is a memory leak, and I have seen it
during my tests.
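
For context, a condensed sketch of the path being described (simplified from
amdgpu_vm_flush(); not the exact code):

	/* in amdgpu_vm_flush(), roughly: */
	amdgpu_ring_emit_vm_flush(ring, job->vmid, job->vm_pd_addr);
	...
	/* the fence is emitted with no parent amdgpu_job attached */
	r = amdgpu_fence_emit(ring, &fence, NULL, 0);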

Best Regards,
JingWen Chen
> > 
> > Best Regards,
> > JingWen Chen
> > > Regards,
> > > Christian.
> > > 
> > > > 
> > > > Best Regards,
> > > > JingWen Chen
> > > > > > Andrey
> > > > > > 
> > > > > > 
> > > > > > > > > +    }
> > > > > > > > >      /* after all hw jobs are reset, hw fence is
> > > > > > > > > meaningless, so force_completion */
> > > > > > > > >      amdgpu_fence_driver_force_completion(ring);
> > > > > > > > >      }
> > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > > > > > > > index eecf21d8ec33..815776c9a013 100644
> > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > > > > > > > @@ -156,11 +156,17 @@ int amdgpu_fence_emit(struct
> > > > > > > > > amdgpu_ring *ring, struct dma_fence **f, struct amd
> > > > > > > > >      job->ring = ring;
> > > > > > > > >      }
> > > > > > > > > -    seq = ++ring->fence_drv.sync_seq;
> > > > > > > > > -    dma_fence_init(fence, &amdgpu_fence_ops,
> > > > > > > > > -   &ring->fence_drv.lock,
> > > > > > > > > -   adev->fence_context + ring->idx,
> > > > > > > > > -   seq);
> > > > > > > > > +    if (job != NULL && job->base.resubmit_flag == 1) {
> > > > > > > > > +    /* reinit seq for resubmitted jobs */
> > > > > > > > > +    seq = ++ring->fence_drv.sync_seq;
> > > > > > > > > +    fence->seqno = seq;
> > > > > > > > > +    } else {
> > > > > > > > > +    seq = ++ring->fence_drv.sync_seq;
> > > > > > > > Seems like you could do the above line only once above if-else
> > > > > > > > as it was
> > > > > > > > before
> > > > > > > Sure, I will modify this.
> > > > > > > 
> > > > > > > 
> > > > > > > Best Regards,
> > > > > > > JingWen Chen
> > > > > > > > > +    dma_fence_init(fence, &amdgpu_fence_ops,
> > > > > > > > > +    &ring->fence_drv.lock,
> > > > > > > > > +    adev->fence_context + ring->idx,
> > > > > > > > > +    seq);
> > > > > > > > > +    }
> > > > > > > > >      if (job != NULL) {
> > > > > > > > >      /* mark this fence has a parent job */
> > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > > > index 7c426e225b24..d6f848adc3f4 100644
> > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > > > @@ -241,6 +241,7 @@ static struct dma_fence
> > > > > > > > > *amdgpu_job_run(struct drm_sched_job *sched_job)
> > > > > > > > >      dma_fence_set_error(finished, -ECANCELED);/* skip
> > > > > > > > > IB as well if VRAM lost */
> > > > > > > > >      if 

RE: [PATCH 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Quan, Evan
[Public]

I think I get your point. Please check the updated V2.

BR
Evan
> -Original Message-
> From: Lazar, Lijo 
> Sent: Friday, July 23, 2021 3:39 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: RE: [PATCH 1/2] drm/amd/pm: restore user customized OD settings
> properly for NV1x
> 
> [Public]
> 
> Hi Evan,
> 
> In that case, using "od_edit" table for the intermediate table
> 
> During commit command, what if it's done like
>   1) Copy and update table if od_edit table is different than user_od
> table
>   2) Set the flag to false if od_edit table and boot_od table are
> different
> 
> Then current od settings may always be picked from user_od table and the
> flag may be used to distinguish if it's boot od settings or custom settings 
> (for
> restore or other purposes).
> 
> Thanks,
> Lijo
> 
> -Original Message-
> From: Quan, Evan 
> Sent: Friday, July 23, 2021 12:51 PM
> To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: RE: [PATCH 1/2] drm/amd/pm: restore user customized OD settings
> properly for NV1x
> 
> [AMD Official Use Only]
> 
> Hi Lijo,
> 
> Sorry, I double-checked. The implementation for Navi1x is right.
> The original design for "restore to default settings" is a two-step process.
> That means for "case PP_OD_COMMIT_DPM_TABLE:" we must distinguish
> whether it's a usual customized OD setting update or just restoring to
> defaults.
> And my current implementation for that is below. Any comments?
> 
> +   if (memcmp(table_context->overdrive_table, table_context->boot_overdrive_table,
> +   sizeof(OverDriveTable_t))) {
> +   smu->user_dpm_profile.committed_od = true;
> +   memcpy(table_context->committed_overdrive_table, table_context->overdrive_table,
> +   sizeof(OverDriveTable_t));
> +   } else {
> +   smu->user_dpm_profile.committed_od = false;
> +   }
> 
> BR
> Evan
> > -Original Message-
> > From: Quan, Evan
> > Sent: Friday, July 23, 2021 2:48 PM
> > To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander 
> > Subject: RE: [PATCH 1/2] drm/amd/pm: restore user customized OD
> > settings properly for NV1x
> >
> > [AMD Official Use Only]
> >
> >
> >
> > > -Original Message-
> > > From: Lazar, Lijo 
> > > Sent: Thursday, July 22, 2021 5:03 PM
> > > To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> > > Cc: Deucher, Alexander 
> > > Subject: Re: [PATCH 1/2] drm/amd/pm: restore user customized OD
> > > settings properly for NV1x
> > >
> > >
> > >
> > > On 7/22/2021 2:03 PM, Quan, Evan wrote:
> > > > [AMD Official Use Only]
> > > >
> > > >
> > > >
> > > >> -Original Message-
> > > >> From: Lazar, Lijo 
> > > >> Sent: Thursday, July 22, 2021 4:10 PM
> > > >> To: Quan, Evan ; amd-
> > g...@lists.freedesktop.org
> > > >> Cc: Deucher, Alexander 
> > > >> Subject: Re: [PATCH 1/2] drm/amd/pm: restore user customized OD
> > > >> settings properly for NV1x
> > > >>
> > > >>
> > > >>
> > > >> On 7/22/2021 8:50 AM, Evan Quan wrote:
> > > >>> The customized OD settings can be divided into two parts: those
> > > >>> committed ones and non-committed ones.
> > > >>> - For those changes which had been fed to SMU before
> > S3/S4/Runpm
> > > >>>   suspend kicked, they are committed changes. They should be
> > > properly
> > > >>>   restored and fed to SMU on S3/S4/Runpm resume.
> > > >>> - For those non-committed changes, they are restored only
> > > >>> without
> > > >> feeding
> > > >>>   to SMU.
> > > >>>
> > > >>> Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
> > > >>> Signed-off-by: Evan Quan 
> > > >>> ---
> > > >>>drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
> > > >>>drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 
> > > >>>.../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 52
> > > >> ++-
> > > >>>.../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 12 +
> > > >>>4 files changed, 69 insertions(+), 12 deletions(-)
> > > >>>
> > > >>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > > >>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > > >>> index 3e89852e4820..8ba53f16d2d9 100644
> > > >>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > > >>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > > >>> @@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
> > > >>>   uint32_t power_limit;
> > > >>>   uint32_t fan_speed_percent;
> > > >>>   uint32_t flags;
> > > >>> + uint32_t committed_od;
> > > >>>
> > > >>>   /* user clock state information */
> > > >>>   uint32_t clk_mask[SMU_CLK_COUNT]; @@ -352,6 +353,7
> > @@ struct
> > > >>> smu_table_context
> > > >>>
> > > >>>   void*overdrive_table;
> > > >>>   void*boot_overdrive_table;
> > > >>> + 

[PATCH V2 2/2] drm/amd/pm: restore user customized OD settings properly for Sienna Cichlid

2021-07-23 Thread Evan Quan
Properly restore those committed and non-committed user customized OD
settings.

Change-Id: I25396df0b3ecdd7a0d9fc77ed220b0abf1fde020
Signed-off-by: Evan Quan 
---
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 37 ++-
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index e0638dc3f617..7efbdca2d3b1 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -1954,18 +1954,29 @@ static int 
sienna_cichlid_set_default_od_settings(struct smu_context *smu)
(OverDriveTable_t *)smu->smu_table.overdrive_table;
OverDriveTable_t *boot_od_table =
(OverDriveTable_t *)smu->smu_table.boot_overdrive_table;
+   OverDriveTable_t *user_od_table =
+   (OverDriveTable_t *)smu->smu_table.user_overdrive_table;
int ret = 0;
 
+   /*
+* For S3/S4/Runpm resume, no need to setup those overdrive tables 
again as
+*   - either they already have the default OD settings got during cold 
bootup
+*   - or they have some user customized OD settings which cannot be 
overwritten
+*/
+   if (smu->adev->in_suspend)
+   return 0;
+
ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE,
-  0, (void *)od_table, false);
+  0, (void *)boot_od_table, false);
if (ret) {
dev_err(smu->adev->dev, "Failed to get overdrive table!\n");
return ret;
}
 
-   memcpy(boot_od_table, od_table, sizeof(OverDriveTable_t));
+   sienna_cichlid_dump_od_table(smu, boot_od_table);
 
-   sienna_cichlid_dump_od_table(smu, od_table);
+   memcpy(od_table, boot_od_table, sizeof(OverDriveTable_t));
+   memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
 
return 0;
 }
@@ -2128,13 +2139,20 @@ static int sienna_cichlid_od_edit_dpm_table(struct 
smu_context *smu,
fallthrough;
 
case PP_OD_COMMIT_DPM_TABLE:
-   sienna_cichlid_dump_od_table(smu, od_table);
+   if (memcmp(od_table, table_context->user_overdrive_table, 
sizeof(OverDriveTable_t))) {
+   sienna_cichlid_dump_od_table(smu, od_table);
+   ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE, 0, 
(void *)od_table, true);
+   if (ret) {
+   dev_err(smu->adev->dev, "Failed to import 
overdrive table!\n");
+   return ret;
+   }
+   memcpy(table_context->user_overdrive_table, od_table, 
sizeof(OverDriveTable_t));
+   smu->user_dpm_profile.user_od = true;
 
-   ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE,
-  0, (void *)od_table, true);
-   if (ret) {
-   dev_err(smu->adev->dev, "Failed to import overdrive 
table!\n");
-   return ret;
+   if (!memcmp(table_context->user_overdrive_table,
+   table_context->boot_overdrive_table,
+   sizeof(OverDriveTable_t)))
+   smu->user_dpm_profile.user_od = false;
}
break;
 
@@ -3902,6 +3920,7 @@ static const struct pptable_funcs 
sienna_cichlid_ppt_funcs = {
.set_soft_freq_limited_range = smu_v11_0_set_soft_freq_limited_range,
.set_default_od_settings = sienna_cichlid_set_default_od_settings,
.od_edit_dpm_table = sienna_cichlid_od_edit_dpm_table,
+   .restore_user_od_settings = smu_v11_0_restore_user_od_settings,
.run_btc = sienna_cichlid_run_btc,
.set_power_source = smu_v11_0_set_power_source,
.get_pp_feature_mask = smu_cmn_get_pp_feature_mask,
-- 
2.29.0



[PATCH V2 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Evan Quan
The customized OD settings can be divided into two parts: those
committed ones and non-committed ones.
  - For those changes which had been fed to SMU before S3/S4/Runpm
suspend kicked, they are committed changes. They should be properly
restored and fed to SMU on S3/S4/Runpm resume.
  - For those non-committed changes, they are restored only without feeding
to SMU.

Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
Signed-off-by: Evan Quan 
--
v1->v2
  - better naming and logic revised for checking OD setting update(Lijo)
---
 drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
 drivers/gpu/drm/amd/pm/inc/smu_v11_0.h|  2 +
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 +++
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 55 +--
 .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 25 +
 5 files changed, 82 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
index 3e89852e4820..c2c201b8e3cf 100644
--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
@@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
uint32_t power_limit;
uint32_t fan_speed_percent;
uint32_t flags;
+   uint32_t user_od;
 
/* user clock state information */
uint32_t clk_mask[SMU_CLK_COUNT];
@@ -352,6 +353,7 @@ struct smu_table_context
 
void*overdrive_table;
void*boot_overdrive_table;
+   void*user_overdrive_table;
 
uint32_tgpu_metrics_table_size;
void*gpu_metrics_table;
@@ -623,6 +625,12 @@ struct pptable_funcs {
 enum PP_OD_DPM_TABLE_COMMAND type,
 long *input, uint32_t size);
 
+   /**
+* @restore_user_od_settings: Restore the user customized
+*OD settings on S3/S4/Runpm resume.
+*/
+   int (*restore_user_od_settings)(struct smu_context *smu);
+
/**
 * @get_clock_by_type_with_latency: Get the speed and latency of a clock
 *  domain.
diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h 
b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
index 385b2ea5379c..1e42aafbb9fd 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
@@ -302,5 +302,7 @@ void smu_v11_0_interrupt_work(struct smu_context *smu);
 
 int smu_v11_0_set_light_sbr(struct smu_context *smu, bool enable);
 
+int smu_v11_0_restore_user_od_settings(struct smu_context *smu);
+
 #endif
 #endif
diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index ebe672142808..8ca7337ea5fc 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -416,6 +416,15 @@ static void smu_restore_dpm_user_profile(struct 
smu_context *smu)
}
}
 
+   /* Restore user customized OD settings */
+   if (smu->user_dpm_profile.user_od) {
+   if (smu->ppt_funcs->restore_user_od_settings) {
+   ret = smu->ppt_funcs->restore_user_od_settings(smu);
+   if (ret)
+   dev_err(smu->adev->dev, "Failed to upload 
customized OD settings\n");
+   }
+   }
+
/* Disable restore flag */
smu->user_dpm_profile.flags &= ~SMU_DPM_USER_PROFILE_RESTORE;
 }
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 59ea59acfb00..d7722c229ddd 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2294,41 +2294,52 @@ static int navi10_set_default_od_settings(struct 
smu_context *smu)
(OverDriveTable_t *)smu->smu_table.overdrive_table;
OverDriveTable_t *boot_od_table =
(OverDriveTable_t *)smu->smu_table.boot_overdrive_table;
+   OverDriveTable_t *user_od_table =
+   (OverDriveTable_t *)smu->smu_table.user_overdrive_table;
int ret = 0;
 
-   ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE, 0, (void 
*)od_table, false);
+   /*
+* For S3/S4/Runpm resume, no need to setup those overdrive tables 
again as
+*   - either they already have the default OD settings got during cold 
bootup
+*   - or they have some user customized OD settings which cannot be 
overwritten
+*/
+   if (smu->adev->in_suspend)
+   return 0;
+
+   ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE, 0, (void 
*)boot_od_table, false);
if (ret) {
dev_err(smu->adev->dev, "Failed to get overdrive table!\n");
return ret;
}
 
-   if (!od_table->GfxclkVolt1) {
+

RE: [PATCH 2/2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Liu, Monk
[AMD Official Use Only]

Hi Christian 

How about this way:

#define AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB  DMA_FENCE_FLAG_USER_BITS

And we can use  AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB instead of 
DMA_FENCE_FLAG_USER_BITS everywhere
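
Just to illustrate (a rough sketch of the idea, not the actual diff), the
check added in amdgpu_device_pre_asic_reset() would then read something like:

	old = rcu_dereference_protected(*ptr, 1);
	if (old && test_bit(AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB, &old->flags))
		RCU_INIT_POINTER(*ptr, NULL);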

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Jingwen Chen  
Sent: Friday, July 23, 2021 3:07 PM
To: Christian König ; Grodzovsky, Andrey 
; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace ; Liu, Monk ; Zhang, 
Jack (Jian) 
Subject: Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

On Fri Jul 23, 2021 at 08:33:02AM +0200, Christian König wrote:
> Am 22.07.21 um 18:47 schrieb Jingwen Chen:
> > On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:
> > > Am 22.07.21 um 16:45 schrieb Andrey Grodzovsky:
> > > > On 2021-07-22 6:45 a.m., Jingwen Chen wrote:
> > > > > On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky wrote:
> > > > > > On 2021-07-20 11:13 p.m., Jingwen Chen wrote:
> > > > > > > [Why]
> > > > > > > After embeded hw_fence to amdgpu_job, we need to add tdr 
> > > > > > > support for this feature.
> > > > > > > 
> > > > > > > [How]
> > > > > > > 1. Add a resubmit_flag for resubmit jobs.
> > > > > > > 2. Clear job fence from RCU and force complete vm flush 
> > > > > > > fences in
> > > > > > >   pre_asic_reset
> > > > > > > 3. skip dma_fence_get for resubmit jobs and add a 
> > > > > > > dma_fence_put
> > > > > > >   for guilty jobs.
> > > > > > > 
> > > > > > > Signed-off-by: Jack Zhang 
> > > > > > > Signed-off-by: Jingwen Chen 
> > > > > > > ---
> > > > > > >     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 
> > > > > > > +++-
> > > > > > >     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 16 
> > > > > > > +++-
> > > > > > >     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  4 +++-
> > > > > > >     drivers/gpu/drm/scheduler/sched_main.c |  1 +
> > > > > > >     include/drm/gpu_scheduler.h    |  1 +
> > > > > > >     5 files changed, 27 insertions(+), 7 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > index 40461547701a..fe0237f72a09 100644
> > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > @@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct 
> > > > > > > amdgpu_device *adev)
> > > > > > >     int amdgpu_device_pre_asic_reset(struct amdgpu_device 
> > > > > > > *adev,
> > > > > > >      struct amdgpu_reset_context 
> > > > > > > *reset_context)
> > > > > > >     {
> > > > > > > -    int i, r = 0;
> > > > > > > +    int i, j, r = 0;
> > > > > > >     struct amdgpu_job *job = NULL;
> > > > > > >     bool need_full_reset =
> > > > > > >     test_bit(AMDGPU_NEED_FULL_RESET, 
> > > > > > > _context->flags); @@ -4406,6 +4406,16 @@ int 
> > > > > > > amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> > > > > > >     if (!ring || !ring->sched.thread)
> > > > > > >     continue;
> > > > > > > +    /*clear job fence from fence drv to avoid 
> > > > > > > +force_completion
> > > > > > > + *leave NULL and vm flush fence in fence drv */
> > > > > > > +    for (j = 0; j <= ring->fence_drv.num_fences_mask; 
> > > > > > > +j ++) {
> > > > > > > +    struct dma_fence *old,**ptr;
> > > > > > > +    ptr = >fence_drv.fences[j];
> > > > > > > +    old = rcu_dereference_protected(*ptr, 1);
> > > > > > > +    if (old && test_bit(DMA_FENCE_FLAG_USER_BITS,
> > > > > > > >flags))) {
> > > > > > > +    RCU_INIT_POINTER(*ptr, NULL);
> > > > > > > +    }
> > > > > > Is this to avoid premature job free because of dma_fence_put 
> > > > > > inside amdgpu_fence_process ?
> > > > > > I can't currently remember why but we probably want all the 
> > > > > > HW fences currently in the ring to be forced signaled - 
> > > > > > maybe better to test for DMA_FENCE_FLAG_USER_BITS inside 
> > > > > > amdgpu_fence_process and still do the signaling but not the 
> > > > > > dma_fence_put part
> > > > > > 
> > > > > > Andrey
> > > > > Hi Andrey,
> > > > > 
> > > > > This is to avoid signaling the same fence twice. If we still 
> > > > > do the signaling, then the job in the pending list will be 
> > > > > signaled first in force_completion, and later be signaled in 
> > > > > resubmit. This will go to
> > > > > BUG() in amdgpu_fence_process.
> > > > 
> > > > Oh, i see, how about just adding 'skip' flag to amdgpu_ring and 
> > > > setting it before calling amdgpu_fence_driver_force_completion 
> > > > and resetting it after, then inside 
> > > > amdgpu_fence_driver_force_completion
> > > > you can just skip the signaling part with this flag for fences 
> > > > with DMA_FENCE_FLAG_USER_BITS set 

RE: [PATCH 1/2] drm/amd/amdgpu embed hw_fence into amdgpu_job

2021-07-23 Thread Liu, Monk
[AMD Official Use Only]

Hi Jingwen,

In lines like the ones below:
+   if (job != NULL) {
+   /* mark this fence has a parent job */
+   set_bit(DMA_FENCE_FLAG_USER_BITS, &fence->flags);
+   }

The DMA_FENCE_FLAG_USER_BITS usage doesn't look good enough.

Please try to improve it, e.g.:

#define AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB  DMA_FENCE_FLAG_USER_BITS

This way the USER BITS could be extended to other purposes in the future:
#define AMDGPU_DMA_FENCE_FLAG_X  (AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB + 1)

Please try to use AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB instead of 
DMA_FENCE_FLAG_USER_BITS everywhere 
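
For example (only a sketch of what I mean, not the actual hunk), the marking
in amdgpu_fence_emit() would then become:

	if (job != NULL) {
		/* mark that this fence is embedded in a job */
		set_bit(AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB, &fence->flags);
	}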

For the other logic parts I do not have suggestions, since I already know the
history and the corresponding changes well enough.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Jingwen Chen  
Sent: Friday, July 23, 2021 12:42 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk ; Grodzovsky, Andrey 
; ckoenig.leichtzumer...@gmail.com; Zhang, Jack 
(Jian) ; Chen, JingWen 
Subject: [PATCH 1/2] drm/amd/amdgpu embed hw_fence into amdgpu_job

From: Jack Zhang 

Why: Previously hw fence is alloced separately with job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take fence to manage both job and fence's lifetime, 
and simplify the design of gpu-scheduler.

How:
We propose to embed hw_fence into amdgpu_job.
1. We cover the normal job submission by this method.
2. For ib_test, and submit without a parent job keep the legacy way to create a 
hw fence separately.

Signed-off-by: Jingwen Chen 
Signed-off-by: Jack Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c  |  1 -  
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c   | 62 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 35 
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  2 +-
 8 files changed, 80 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index b6d33f13b476..bad403978bac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -714,7 +714,6 @@ int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum 
kgd_engine_type engine,
ret = dma_fence_wait(f, false);
 
 err_ib_sched:
-   dma_fence_put(f);
amdgpu_job_free(job);
 err:
return ret;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 536005bff24a..277128846dd1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1414,7 +1414,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct 
amdgpu_ring *ring)
continue;
}
job = to_amdgpu_job(s_job);
-   if (preempted && job->fence == fence)
+   if (preempted && (>hw_fence) == fence)
/* mark the job as preempted */
job->preemption_status |= AMDGPU_IB_PREEMPTED;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 30772608eac6..eecf21d8ec33 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -133,25 +133,40 @@ static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
  * Emits a fence command on the requested ring (all asics).
  * Returns 0 on success, -ENOMEM on failure.
  */
-int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f,
+int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f, 
+struct amdgpu_job *job,
  unsigned flags)
 {
struct amdgpu_device *adev = ring->adev;
-   struct amdgpu_fence *fence;
+   struct dma_fence *fence;
+   struct amdgpu_fence *am_fence;
struct dma_fence __rcu **ptr;
uint32_t seq;
int r;
 
-   fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL);
-   if (fence == NULL)
-   return -ENOMEM;
+   if (job == NULL) {
+   /* create a sperate hw fence */
+   am_fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL);
+   if (am_fence == NULL)
+   return -ENOMEM;
+   fence = _fence->base;
+   am_fence->ring = ring;
+   } else {
+   /* take use of job-embedded fence */
+   fence = >hw_fence;
+   job->ring = ring;
+   }
 
seq = ++ring->fence_drv.sync_seq;
-   fence->ring = ring;
-   dma_fence_init(>base, _fence_ops,
+   dma_fence_init(fence, _fence_ops,
   

RE: [PATCH 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Lazar, Lijo
[Public]

Hi Evan,

In that case, use the "od_edit" table as the intermediate table.

During the commit command, what if it's done like this:
1) Copy and update the table if the od_edit table is different from the
user_od table
2) Set the flag to false if the od_edit table and the boot_od table are
different

Then the current OD settings can always be picked from the user_od table, and
the flag can be used to distinguish whether they are boot OD settings or
custom settings (for restore or other purposes).
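
A minimal sketch of how I read that flow (od_edit_table/user_od_table/
boot_od_table and the user_od flag are just placeholder names here, and I'm
assuming the flag simply marks "differs from boot", as in the snippet quoted
below):

	/* PP_OD_COMMIT_DPM_TABLE handling, sketch only */
	if (memcmp(od_edit_table, user_od_table, sizeof(OverDriveTable_t)))
		memcpy(user_od_table, od_edit_table, sizeof(OverDriveTable_t));

	smu->user_dpm_profile.user_od =
		!!memcmp(user_od_table, boot_od_table, sizeof(OverDriveTable_t));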

Thanks,
Lijo

-Original Message-
From: Quan, Evan  
Sent: Friday, July 23, 2021 12:51 PM
To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: RE: [PATCH 1/2] drm/amd/pm: restore user customized OD settings 
properly for NV1x

[AMD Official Use Only]

Hi Lijo,

Sorry, I doubled checked. The implementation of Navi1x is right. 
The original design for "restores to default settings" is a two-step process.
That means for "case PP_OD_COMMIT_DPM_TABLE:" we must distinguish whether it's 
an usual customized od setting update or just restoring to defaults.
And my current implementation for that is as below. Any comments?

+   if (memcmp(table_context->overdrive_table, 
table_context->boot_overdrive_table,
+   sizeof(OverDriveTable_t))) {
+   smu->user_dpm_profile.committed_od = true;
+   memcpy(table_context->committed_overdrive_table, 
table_context->overdrive_table,
+   sizeof(OverDriveTable_t));
+   } else {
+   smu->user_dpm_profile.committed_od = false;
+   }

BR
Evan
> -Original Message-
> From: Quan, Evan
> Sent: Friday, July 23, 2021 2:48 PM
> To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: RE: [PATCH 1/2] drm/amd/pm: restore user customized OD 
> settings properly for NV1x
> 
> [AMD Official Use Only]
> 
> 
> 
> > -Original Message-
> > From: Lazar, Lijo 
> > Sent: Thursday, July 22, 2021 5:03 PM
> > To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander 
> > Subject: Re: [PATCH 1/2] drm/amd/pm: restore user customized OD 
> > settings properly for NV1x
> >
> >
> >
> > On 7/22/2021 2:03 PM, Quan, Evan wrote:
> > > [AMD Official Use Only]
> > >
> > >
> > >
> > >> -Original Message-
> > >> From: Lazar, Lijo 
> > >> Sent: Thursday, July 22, 2021 4:10 PM
> > >> To: Quan, Evan ; amd-
> g...@lists.freedesktop.org
> > >> Cc: Deucher, Alexander 
> > >> Subject: Re: [PATCH 1/2] drm/amd/pm: restore user customized OD 
> > >> settings properly for NV1x
> > >>
> > >>
> > >>
> > >> On 7/22/2021 8:50 AM, Evan Quan wrote:
> > >>> The customized OD settings can be divided into two parts: those 
> > >>> committed ones and non-committed ones.
> > >>> - For those changes which had been fed to SMU before
> S3/S4/Runpm
> > >>>   suspend kicked, they are committed changes. They should be
> > properly
> > >>>   restored and fed to SMU on S3/S4/Runpm resume.
> > >>> - For those non-committed changes, they are restored only 
> > >>> without
> > >> feeding
> > >>>   to SMU.
> > >>>
> > >>> Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
> > >>> Signed-off-by: Evan Quan 
> > >>> ---
> > >>>drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
> > >>>drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 
> > >>>.../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 52
> > >> ++-
> > >>>.../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 12 +
> > >>>4 files changed, 69 insertions(+), 12 deletions(-)
> > >>>
> > >>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > >>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > >>> index 3e89852e4820..8ba53f16d2d9 100644
> > >>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > >>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > >>> @@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
> > >>> uint32_t power_limit;
> > >>> uint32_t fan_speed_percent;
> > >>> uint32_t flags;
> > >>> +   uint32_t committed_od;
> > >>>
> > >>> /* user clock state information */
> > >>> uint32_t clk_mask[SMU_CLK_COUNT]; @@ -352,6 +353,7
> @@ struct
> > >>> smu_table_context
> > >>>
> > >>> void*overdrive_table;
> > >>> void*boot_overdrive_table;
> > >>> +   void*committed_overdrive_table;
> > >>
> > >> May be rename it to user_overdrive_table - OD table with user settings?
> > > [Quan, Evan] Actually "overdrive_table " is  the 
> > > user_overdrive_table. It
> > contains all the modification including those not committed( the 
> > commit is performed by echo "c" >
> /sys/class/drm/card1/device/pp_od_clk_voltage).
> > > The new member added refers to those user customized but committed
> > only settings. That's why it was named as " committed_overdrive_table".
> > Any good 

RE: [PATCH] drm/amdgpu: retire sdma v5_2 golden settings from driver

2021-07-23 Thread Gao, Likun
[AMD Official Use Only]

Reviewed-by: Likun Gao 

Regards,
Likun

-Original Message-
From: Hawking Zhang  
Sent: Friday, July 23, 2021 2:11 PM
To: amd-gfx@lists.freedesktop.org; Gao, Likun ; Deucher, 
Alexander 
Cc: Zhang, Hawking 
Subject: [PATCH] drm/amdgpu: retire sdma v5_2 golden settings from driver

They are initalized by hardware during power up phase, starting from sdma v5_2 
generation

Signed-off-by: Hawking Zhang 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 17 -
 1 file changed, 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 7486e53..779f5c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -87,21 +87,6 @@ static u32 sdma_v5_2_get_reg_offset(struct amdgpu_device 
*adev, u32 instance, u3
return base + internal_offset;
 }
 
-static void sdma_v5_2_init_golden_registers(struct amdgpu_device *adev) -{
-   switch (adev->asic_type) {
-   case CHIP_SIENNA_CICHLID:
-   case CHIP_NAVY_FLOUNDER:
-   case CHIP_VANGOGH:
-   case CHIP_DIMGREY_CAVEFISH:
-   case CHIP_BEIGE_GOBY:
-   case CHIP_YELLOW_CARP:
-   break;
-   default:
-   break;
-   }
-}
-
 static int sdma_v5_2_init_inst_ctx(struct amdgpu_sdma_instance *sdma_inst)  {
int err = 0;
@@ -1345,8 +1330,6 @@ static int sdma_v5_2_hw_init(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-   sdma_v5_2_init_golden_registers(adev);
-
r = sdma_v5_2_start(adev);
 
return r;
--
2.7.4
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Quan, Evan
[AMD Official Use Only]

Hi Lijo,

Sorry, I double-checked. The implementation for Navi1x is right.
The original design for "restore to default settings" is a two-step process.
That means for "case PP_OD_COMMIT_DPM_TABLE:" we must distinguish whether it's
a usual customized OD setting update or just a restore to defaults.
And my current implementation for that is as below. Any comments?

+   if (memcmp(table_context->overdrive_table, 
table_context->boot_overdrive_table,
+   sizeof(OverDriveTable_t))) {
+   smu->user_dpm_profile.committed_od = true;
+   memcpy(table_context->committed_overdrive_table, 
table_context->overdrive_table,
+   sizeof(OverDriveTable_t));
+   } else {
+   smu->user_dpm_profile.committed_od = false;
+   }

BR
Evan
> -Original Message-
> From: Quan, Evan
> Sent: Friday, July 23, 2021 2:48 PM
> To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: RE: [PATCH 1/2] drm/amd/pm: restore user customized OD settings
> properly for NV1x
> 
> [AMD Official Use Only]
> 
> 
> 
> > -Original Message-
> > From: Lazar, Lijo 
> > Sent: Thursday, July 22, 2021 5:03 PM
> > To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander 
> > Subject: Re: [PATCH 1/2] drm/amd/pm: restore user customized OD
> > settings properly for NV1x
> >
> >
> >
> > On 7/22/2021 2:03 PM, Quan, Evan wrote:
> > > [AMD Official Use Only]
> > >
> > >
> > >
> > >> -Original Message-
> > >> From: Lazar, Lijo 
> > >> Sent: Thursday, July 22, 2021 4:10 PM
> > >> To: Quan, Evan ; amd-
> g...@lists.freedesktop.org
> > >> Cc: Deucher, Alexander 
> > >> Subject: Re: [PATCH 1/2] drm/amd/pm: restore user customized OD
> > >> settings properly for NV1x
> > >>
> > >>
> > >>
> > >> On 7/22/2021 8:50 AM, Evan Quan wrote:
> > >>> The customized OD settings can be divided into two parts: those
> > >>> committed ones and non-committed ones.
> > >>> - For those changes which had been fed to SMU before
> S3/S4/Runpm
> > >>>   suspend kicked, they are committed changes. They should be
> > properly
> > >>>   restored and fed to SMU on S3/S4/Runpm resume.
> > >>> - For those non-committed changes, they are restored only
> > >>> without
> > >> feeding
> > >>>   to SMU.
> > >>>
> > >>> Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
> > >>> Signed-off-by: Evan Quan 
> > >>> ---
> > >>>drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
> > >>>drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 
> > >>>.../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 52
> > >> ++-
> > >>>.../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 12 +
> > >>>4 files changed, 69 insertions(+), 12 deletions(-)
> > >>>
> > >>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > >>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > >>> index 3e89852e4820..8ba53f16d2d9 100644
> > >>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > >>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> > >>> @@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
> > >>> uint32_t power_limit;
> > >>> uint32_t fan_speed_percent;
> > >>> uint32_t flags;
> > >>> +   uint32_t committed_od;
> > >>>
> > >>> /* user clock state information */
> > >>> uint32_t clk_mask[SMU_CLK_COUNT]; @@ -352,6 +353,7
> @@ struct
> > >>> smu_table_context
> > >>>
> > >>> void*overdrive_table;
> > >>> void*boot_overdrive_table;
> > >>> +   void*committed_overdrive_table;
> > >>
> > >> May be rename it to user_overdrive_table - OD table with user settings?
> > > [Quan, Evan] Actually "overdrive_table " is  the
> > > user_overdrive_table. It
> > contains all the modification including those not committed( the
> > commit is performed by echo "c" >
> /sys/class/drm/card1/device/pp_od_clk_voltage).
> > > The new member added refers to those user customized but committed
> > only settings. That's why it was named as " committed_overdrive_table".
> > Any good suggestion for the naming?
> >
> > As far as I understand, the problem is overdrive_table can have
> > intemediary settings as the edit/commit is a two-step process. At any
> > point we can have user_od_table keep the latest user mode settings.
> > That is true when user restores to default settings also; that time we
> > can just copy the boot settings back to user_od table and keep the
> > flag as false indicating that there is no custom settings.
> [Quan, Evan] For now, on Navi1x the "restores to default settings" is also a
> two-step process. That will make "copy the boot settings back to user_od
> table and keep the flag as false" tricky. However, it seems that does not fit
> original design. As per original design, the "restores to default 

Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Jingwen Chen
On Fri Jul 23, 2021 at 08:33:02AM +0200, Christian König wrote:
> Am 22.07.21 um 18:47 schrieb Jingwen Chen:
> > On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:
> > > Am 22.07.21 um 16:45 schrieb Andrey Grodzovsky:
> > > > On 2021-07-22 6:45 a.m., Jingwen Chen wrote:
> > > > > On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky wrote:
> > > > > > On 2021-07-20 11:13 p.m., Jingwen Chen wrote:
> > > > > > > [Why]
> > > > > > > After embeded hw_fence to amdgpu_job, we need to add tdr support
> > > > > > > for this feature.
> > > > > > > 
> > > > > > > [How]
> > > > > > > 1. Add a resubmit_flag for resubmit jobs.
> > > > > > > 2. Clear job fence from RCU and force complete vm flush fences in
> > > > > > >   pre_asic_reset
> > > > > > > 3. skip dma_fence_get for resubmit jobs and add a dma_fence_put
> > > > > > >   for guilty jobs.
> > > > > > > 
> > > > > > > Signed-off-by: Jack Zhang 
> > > > > > > Signed-off-by: Jingwen Chen 
> > > > > > > ---
> > > > > > >     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++-
> > > > > > >     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 16 
> > > > > > > +++-
> > > > > > >     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  4 +++-
> > > > > > >     drivers/gpu/drm/scheduler/sched_main.c |  1 +
> > > > > > >     include/drm/gpu_scheduler.h    |  1 +
> > > > > > >     5 files changed, 27 insertions(+), 7 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > index 40461547701a..fe0237f72a09 100644
> > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > @@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct
> > > > > > > amdgpu_device *adev)
> > > > > > >     int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> > > > > > >      struct amdgpu_reset_context *reset_context)
> > > > > > >     {
> > > > > > > -    int i, r = 0;
> > > > > > > +    int i, j, r = 0;
> > > > > > >     struct amdgpu_job *job = NULL;
> > > > > > >     bool need_full_reset =
> > > > > > >     test_bit(AMDGPU_NEED_FULL_RESET, 
> > > > > > > _context->flags);
> > > > > > > @@ -4406,6 +4406,16 @@ int
> > > > > > > amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> > > > > > >     if (!ring || !ring->sched.thread)
> > > > > > >     continue;
> > > > > > > +    /*clear job fence from fence drv to avoid 
> > > > > > > force_completion
> > > > > > > + *leave NULL and vm flush fence in fence drv */
> > > > > > > +    for (j = 0; j <= ring->fence_drv.num_fences_mask; j ++) {
> > > > > > > +    struct dma_fence *old,**ptr;
> > > > > > > +    ptr = >fence_drv.fences[j];
> > > > > > > +    old = rcu_dereference_protected(*ptr, 1);
> > > > > > > +    if (old && test_bit(DMA_FENCE_FLAG_USER_BITS,
> > > > > > > >flags))) {
> > > > > > > +    RCU_INIT_POINTER(*ptr, NULL);
> > > > > > > +    }
> > > > > > Is this to avoid premature job free because of dma_fence_put inside
> > > > > > amdgpu_fence_process ?
> > > > > > I can't currently remember why but we probably want all the HW 
> > > > > > fences
> > > > > > currently in the ring to
> > > > > > be forced signaled - maybe better to test for 
> > > > > > DMA_FENCE_FLAG_USER_BITS
> > > > > > inside amdgpu_fence_process
> > > > > > and still do the signaling but not the dma_fence_put part
> > > > > > 
> > > > > > Andrey
> > > > > Hi Andrey,
> > > > > 
> > > > > This is to avoid signaling the same fence twice. If we still do the
> > > > > signaling, then the job in the pending list will be signaled first in
> > > > > force_completion, and later be signaled in resubmit. This will go to
> > > > > BUG() in amdgpu_fence_process.
> > > > 
> > > > Oh, i see, how about just adding 'skip' flag to amdgpu_ring and setting
> > > > it before calling
> > > > amdgpu_fence_driver_force_completion and resetting it after, then inside
> > > > amdgpu_fence_driver_force_completion
> > > > you can just skip the signaling part with this flag for fences with
> > > > DMA_FENCE_FLAG_USER_BITS set
> > > > Less lines of code at least.
> > > Still sounds quite a bit hacky.
> > > 
> > > I would rather suggest to completely drop the approach with
> > > amdgpu_fence_driver_force_completion(). I could never see why we would 
> > > want
> > > that in the first place.
> > > 
> > > Regards,
> > > Christian.
> > > 
> > Hi Christian,
> > 
> > I keep the amdgpu_fence_driver_force_completion here to make sure the vm
> > flush fence is signaled and put.
> > So the key question is whether the fence of ib test should be the first
> > fence to be signaled after reset.
> > If it should be, it means not only fences with DMA_FENCE_FLAG_USER_BITS
> > but also vm flush fences should be removed from RCU fence array 

RE: [PATCH 1/2] drm/amd/pm: restore user customized OD settings properly for NV1x

2021-07-23 Thread Quan, Evan
[AMD Official Use Only]



> -Original Message-
> From: Lazar, Lijo 
> Sent: Thursday, July 22, 2021 5:03 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: Re: [PATCH 1/2] drm/amd/pm: restore user customized OD settings
> properly for NV1x
> 
> 
> 
> On 7/22/2021 2:03 PM, Quan, Evan wrote:
> > [AMD Official Use Only]
> >
> >
> >
> >> -Original Message-
> >> From: Lazar, Lijo 
> >> Sent: Thursday, July 22, 2021 4:10 PM
> >> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> >> Cc: Deucher, Alexander 
> >> Subject: Re: [PATCH 1/2] drm/amd/pm: restore user customized OD
> >> settings properly for NV1x
> >>
> >>
> >>
> >> On 7/22/2021 8:50 AM, Evan Quan wrote:
> >>> The customized OD settings can be divided into two parts: those
> >>> committed ones and non-committed ones.
> >>> - For those changes which had been fed to SMU before S3/S4/Runpm
> >>>   suspend kicked, they are committed changes. They should be
> properly
> >>>   restored and fed to SMU on S3/S4/Runpm resume.
> >>> - For those non-committed changes, they are restored only
> >>> without
> >> feeding
> >>>   to SMU.
> >>>
> >>> Change-Id: Iea7cf7908dfcd919a4d0205e10bff91b1149a440
> >>> Signed-off-by: Evan Quan 
> >>> ---
> >>>drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h   |  8 +++
> >>>drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  9 
> >>>.../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 52
> >> ++-
> >>>.../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 12 +
> >>>4 files changed, 69 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> >>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> >>> index 3e89852e4820..8ba53f16d2d9 100644
> >>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> >>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h
> >>> @@ -231,6 +231,7 @@ struct smu_user_dpm_profile {
> >>>   uint32_t power_limit;
> >>>   uint32_t fan_speed_percent;
> >>>   uint32_t flags;
> >>> + uint32_t committed_od;
> >>>
> >>>   /* user clock state information */
> >>>   uint32_t clk_mask[SMU_CLK_COUNT]; @@ -352,6 +353,7 @@ struct
> >>> smu_table_context
> >>>
> >>>   void*overdrive_table;
> >>>   void*boot_overdrive_table;
> >>> + void*committed_overdrive_table;
> >>
> >> May be rename it to user_overdrive_table - OD table with user settings?
> > [Quan, Evan] Actually "overdrive_table " is  the user_overdrive_table. It
> contains all the modification including those not committed( the commit is
> performed by echo "c" > /sys/class/drm/card1/device/pp_od_clk_voltage).
> > The new member added refers to those user customized but committed
> only settings. That's why it was named as " committed_overdrive_table".
> Any good suggestion for the naming?
> 
> As far as I understand, the problem is overdrive_table can have intemediary
> settings as the edit/commit is a two-step process. At any point we can have
> user_od_table keep the latest user mode settings. That is true when user
> restores to default settings also; that time we can just copy the boot 
> settings
> back to user_od table and keep the flag as false indicating that there is no
> custom settings.
[Quan, Evan] For now, on Navi1x the "restore to default settings" is also a
two-step process. That will make "copy the boot settings
back to the user_od table and keep the flag as false" tricky. However, it seems
that does not fit the original design. As per the original design,
the "restore to default settings" should be a one-step process. Let me correct
that with a new patch series and we can proceed with further discussion then.

BR
Evan
> 
> >>
> >>>   uint32_tgpu_metrics_table_size;
> >>>   void*gpu_metrics_table;
> >>> @@ -623,6 +625,12 @@ struct pptable_funcs {
> >>>enum PP_OD_DPM_TABLE_COMMAND
> >> type,
> >>>long *input, uint32_t size);
> >>>
> >>> + /**
> >>> +  * @restore_committed_od_settings: Restore the customized and
> >> committed
> >>> +  * OD settings on S3/S4/Runpm resume.
> >>> +  */
> >>> + int (*restore_committed_od_settings)(struct smu_context *smu);
> >>> +
> >>>   /**
> >>>* @get_clock_by_type_with_latency: Get the speed and latency of
> >> a clock
> >>>*  domain.
> >>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> >>> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> >>> index ebe672142808..5f7d98e99f76 100644
> >>> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> >>> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> >>> @@ -416,6 +416,15 @@ static void
> smu_restore_dpm_user_profile(struct
> >> smu_context *smu)
> >>>   }
> >>>   }
> >>>
> >>> + 

[PATCH] drm/amdgpu: retire sdma v5_2 golden settings from driver

2021-07-23 Thread Hawking Zhang
They are initialized by hardware during the power-up phase,
starting from the sdma v5_2 generation

Signed-off-by: Hawking Zhang 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 17 -
 1 file changed, 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 7486e53..779f5c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -87,21 +87,6 @@ static u32 sdma_v5_2_get_reg_offset(struct amdgpu_device 
*adev, u32 instance, u32 internal_offset)
return base + internal_offset;
 }
 
-static void sdma_v5_2_init_golden_registers(struct amdgpu_device *adev)
-{
-   switch (adev->asic_type) {
-   case CHIP_SIENNA_CICHLID:
-   case CHIP_NAVY_FLOUNDER:
-   case CHIP_VANGOGH:
-   case CHIP_DIMGREY_CAVEFISH:
-   case CHIP_BEIGE_GOBY:
-   case CHIP_YELLOW_CARP:
-   break;
-   default:
-   break;
-   }
-}
-
 static int sdma_v5_2_init_inst_ctx(struct amdgpu_sdma_instance *sdma_inst)
 {
int err = 0;
@@ -1345,8 +1330,6 @@ static int sdma_v5_2_hw_init(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-   sdma_v5_2_init_golden_registers(adev);
-
r = sdma_v5_2_start(adev);
 
return r;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Christian König

Am 23.07.21 um 11:25 schrieb Jingwen Chen:

On Fri Jul 23, 2021 at 10:45:49AM +0200, Christian König wrote:

Am 23.07.21 um 09:07 schrieb Jingwen Chen:

[SNIP]
Hi Christian,

The thing is vm flush fence has no job passed to amdgpu_fence_emit, so
go through the jobs cannot help find the vm flush fence. And keep the
rest fences in the RCU array will lead to signaling them in the ib_test
right after ASIC reset. While they will be signaled again during
resubmission. What I'm doning here is just trying to cleanup the fences
without a parent job and make sure the rest fences won't be signaled
twice.

It took me a moment to realize what you are talking about here.

This is for the KIQ! You need different handling for the KIQ than for the
scheduler controlled rings.

It is not only the flush jobs, but the KIQ needs a complete reset because of
the register writes pushed there as well.


And please drop any usage of DMA_FENCE_FLAG_USER_BITS. That is not something
which should be abused for reset handling.


The DMA_FENCE_FLAG_USER_BITS here is to mark this fence has a parent
job. If this is not a proper usage here, do you have any suggestions
about how to identify whether the fence has a parent job?

You don't need to mark the fences at all. Everything on the KIQ ring needs
to be force completed since none of the fences on that ring have a parent
job.

Christian.


Hi Christian

Yes KIQ ring fences all need force_completion. But we do need to mark the
fences. Say we have a gfx ring job with vm_flush=1 sending to
amdgpu_ib_schedule, then in amdgpu_vm_flush, after the
amdgpu_ring_emit_vm_flush is done, we will create a vm flush fence on
gfx ring with amdgpu_fence_emit(ring, , NULL, 0).


That is illegal and needs to be fixed as well.


Then this vm flush fence we create here has no parent job but it's on
gfx ring.
If we only do force_completion for KIQ ring and just clear the
RCU array for the scheduler controlled rings, nobody will signal and put this
gfx ring vm_flush fence again. When this job is resubmitted, it will
just create a new vm_flush fence. This is a memleak and I have seen this
memleak during my test.


At least I'm better understanding the problem now as well.

But you are just trying to circumvent symptoms and not fixing the root 
cause.


See, any memory allocation during command submission must be prevented.
This also includes the flush fence.


Regards,
Christian.



Best Regards,
JingWen Chen

Best Regards,
JingWen Chen

Regards,
Christian.


Best Regards,
JingWen Chen

Andrey



+    }
      /* after all hw jobs are reset, hw fence is
meaningless, so force_completion */
      amdgpu_fence_driver_force_completion(ring);
      }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index eecf21d8ec33..815776c9a013 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -156,11 +156,17 @@ int amdgpu_fence_emit(struct
amdgpu_ring *ring, struct dma_fence **f, struct amd
      job->ring = ring;
      }
-    seq = ++ring->fence_drv.sync_seq;
-    dma_fence_init(fence, _fence_ops,
-   >fence_drv.lock,
-   adev->fence_context + ring->idx,
-   seq);
+    if (job != NULL && job->base.resubmit_flag == 1) {
+    /* reinit seq for resubmitted jobs */
+    seq = ++ring->fence_drv.sync_seq;
+    fence->seqno = seq;
+    } else {
+    seq = ++ring->fence_drv.sync_seq;

Seems like you could do the above line only once above if-else
as it was
before

Sure, I will modify this.


Best Regards,
JingWen Chen

+    dma_fence_init(fence, _fence_ops,
+    >fence_drv.lock,
+    adev->fence_context + ring->idx,
+    seq);
+    }
      if (job != NULL) {
      /* mark this fence has a parent job */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 7c426e225b24..d6f848adc3f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -241,6 +241,7 @@ static struct dma_fence
*amdgpu_job_run(struct drm_sched_job *sched_job)
      dma_fence_set_error(finished, -ECANCELED);/* skip
IB as well if VRAM lost */
      if (finished->error < 0) {
+    dma_fence_put(>hw_fence);
      DRM_INFO("Skip scheduling IBs!\n");
      } else {
      r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job,
@@ -249,7 +250,8 @@ static struct dma_fence
*amdgpu_job_run(struct drm_sched_job *sched_job)
      DRM_ERROR("Error scheduling IBs (%d)\n", r);
      }
-    dma_fence_get(fence);
+    if (!job->base.resubmit_flag)
+    dma_fence_get(fence);
      amdgpu_job_free_resources(job);
      fence = r ? ERR_PTR(r) : fence;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index f4f474944169..5a36ab5aea2d 100644
--- 

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-23 Thread Christian König

Am 23.07.21 um 11:00 schrieb Daniel Vetter:

On Fri, Jul 23, 2021 at 10:33:48AM +0200, Christian König wrote:


Am 23.07.21 um 10:21 schrieb Daniel Vetter:

On Wed, Jul 14, 2021 at 10:51 AM Christian König
 wrote:


Am 13.07.21 um 17:28 schrieb Alex Deucher:

On Tue, Jul 13, 2021 at 2:57 AM Christian König
 wrote:

Am 13.07.21 um 00:06 schrieb Felix Kuehling:

KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
is_cow_mapping returns true for these mappings. Add a check for
vm_flags & VM_WRITE to avoid mmap failures on private read-only or
PROT_NONE mappings.

v2: protect against mprotect making a mapping writable after the fact
v3: update driver-specific vm_operations_structs

Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
Signed-off-by: Felix Kuehling 
Signed-off-by: Alex Deucher 

Reviewed-by: Christian König 

Are you planning to push this to drm-misc?

Yes, just didn't found time yesterday.

This is pushed to the wrong tree drm-misc-next-fixes, should have been
in drm-misc-fixes. Please be careful with that because every time that
goes wrong the script gets confused about which the current tree is,
and pushes the wrong tree to linux-next branches.

I'm going to hard-reset drm-misc-next-fixes now and hope that's good
enough to fix things up (since Thomas is not around all the time for
this merge window).

STOP! I'm about to push a revert for this patch.

And yes that was pushed to the wrong branch, but it turned out that this was
fortunate since the patch doesn't work correctly.

Well I just hard-reset, so you can push the right patch to the right
branch now. The trouble is that outside of the merge window no one is
allowed to push to drm-misc-next-fixes. If you do, then dim pushes
drm-misc-next-fixes to for-linux-next instead of drm-misc-next, and we
have bad surprises.


Could we then make the branch read-only for that time?


Which unfortunately happens like every merge window a few times and always
takes a few days/weeks to get caught.


Yeah, at least to me it's absolutely not obvious when the merge windows 
for a certain version start/end.


Christian.


-Danie


Christian.


-Daniel



Christian.


Alex


---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
 drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
 drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
 drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
 include/drm/ttm/ttm_bo_api.h |  4 
 6 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index b3404c43a911..1aa750a6a5d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -79,7 +79,8 @@ static const struct vm_operations_struct amdgpu_gem_vm_ops = {
 .fault = amdgpu_gem_fault,
 .open = ttm_bo_vm_open,
 .close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
 };

 static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 5b27845075a1..164ea564bb7a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -70,7 +70,8 @@ static const struct vm_operations_struct nouveau_ttm_vm_ops = 
{
 .fault = nouveau_ttm_fault,
 .open = ttm_bo_vm_open,
 .close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
 };

 void
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 458f92a70887..c19ad07eb7b5 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -77,7 +77,8 @@ static const struct vm_operations_struct radeon_gem_vm_ops = {
 .fault = radeon_gem_fault,
 .open = ttm_bo_vm_open,
 .close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
 };

 static void radeon_gem_object_free(struct drm_gem_object *gobj)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..fb325bad5db6 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned 
long addr,
 }
 EXPORT_SYMBOL(ttm_bo_vm_access);

+int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
+unsigned long end, unsigned long newflags)
+{
+ /* Enforce no COW since would have really strange behavior with it. */
+ if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
+ return -EINVAL;
+
+ return 0;
+}
+EXPORT_SYMBOL(ttm_bo_vm_mprotect);
+
 

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-23 Thread Daniel Vetter
On Fri, Jul 23, 2021 at 10:33:48AM +0200, Christian König wrote:
> 
> 
> Am 23.07.21 um 10:21 schrieb Daniel Vetter:
> > On Wed, Jul 14, 2021 at 10:51 AM Christian König
> >  wrote:
> > > 
> > > 
> > > Am 13.07.21 um 17:28 schrieb Alex Deucher:
> > > > On Tue, Jul 13, 2021 at 2:57 AM Christian König
> > > >  wrote:
> > > > > 
> > > > > Am 13.07.21 um 00:06 schrieb Felix Kuehling:
> > > > > > KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
> > > > > > is_cow_mapping returns true for these mappings. Add a check for
> > > > > > vm_flags & VM_WRITE to avoid mmap failures on private read-only or
> > > > > > PROT_NONE mappings.
> > > > > > 
> > > > > > v2: protect against mprotect making a mapping writable after the 
> > > > > > fact
> > > > > > v3: update driver-specific vm_operations_structs
> > > > > > 
> > > > > > Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
> > > > > > Signed-off-by: Felix Kuehling 
> > > > > > Signed-off-by: Alex Deucher 
> > > > > Reviewed-by: Christian König 
> > > > Are you planning to push this to drm-misc?
> > > Yes, just didn't found time yesterday.
> > This is pushed to the wrong tree drm-misc-next-fixes, should have been
> > in drm-misc-fixes. Please be careful with that because every time that
> > goes wrong the script gets confused about which the current tree is,
> > and pushes the wrong tree to linux-next branches.
> > 
> > I'm going to hard-reset drm-misc-next-fixes now and hope that's good
> > enough to fix things up (since Thomas is not around all the time for
> > this merge window).
> 
> STOP! I'm about to push a revert for this patch.
> 
> And yes that was pushed to the wrong branch, but it turned out that this was
> fortunate since the patch doesn't work correctly.

Well I just hard-reset, so you can push the right patch to the right
branch now. The trouble is that outside of the merge window no one is
allowed to push to drm-misc-next-fixes. If you do, then dim pushes
drm-misc-next-fixes to for-linux-next instead of drm-misc-next, and we
have bad surprises.

Which unfortunately happens like every merge window a few times and always
takes a few days/weeks to get caught.
-Daniel

> 
> Christian.
> 
> > -Daniel
> > 
> > 
> > > Christian.
> > > 
> > > > Alex
> > > > 
> > > > > > ---
> > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
> > > > > > drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
> > > > > > drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
> > > > > > drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
> > > > > > drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
> > > > > > include/drm/ttm/ttm_bo_api.h |  4 
> > > > > > 6 files changed, 24 insertions(+), 4 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > > > > > index b3404c43a911..1aa750a6a5d2 100644
> > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > > > > > @@ -79,7 +79,8 @@ static const struct vm_operations_struct 
> > > > > > amdgpu_gem_vm_ops = {
> > > > > > .fault = amdgpu_gem_fault,
> > > > > > .open = ttm_bo_vm_open,
> > > > > > .close = ttm_bo_vm_close,
> > > > > > - .access = ttm_bo_vm_access
> > > > > > + .access = ttm_bo_vm_access,
> > > > > > + .mprotect = ttm_bo_vm_mprotect
> > > > > > };
> > > > > > 
> > > > > > static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
> > > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
> > > > > > b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > > > > index 5b27845075a1..164ea564bb7a 100644
> > > > > > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > > > > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > > > > @@ -70,7 +70,8 @@ static const struct vm_operations_struct 
> > > > > > nouveau_ttm_vm_ops = {
> > > > > > .fault = nouveau_ttm_fault,
> > > > > > .open = ttm_bo_vm_open,
> > > > > > .close = ttm_bo_vm_close,
> > > > > > - .access = ttm_bo_vm_access
> > > > > > + .access = ttm_bo_vm_access,
> > > > > > + .mprotect = ttm_bo_vm_mprotect
> > > > > > };
> > > > > > 
> > > > > > void
> > > > > > diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> > > > > > b/drivers/gpu/drm/radeon/radeon_gem.c
> > > > > > index 458f92a70887..c19ad07eb7b5 100644
> > > > > > --- a/drivers/gpu/drm/radeon/radeon_gem.c
> > > > > > +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> > > > > > @@ -77,7 +77,8 @@ static const struct vm_operations_struct 
> > > > > > radeon_gem_vm_ops = {
> > > > > > .fault = radeon_gem_fault,
> > > > > > .open = ttm_bo_vm_open,
> > > > > > .close = ttm_bo_vm_close,
> > > > > > - .access = ttm_bo_vm_access
> > > > > > + .access = ttm_bo_vm_access,
> > > > > > + .mprotect = ttm_bo_vm_mprotect
> > > > > > };
> > > > > > 
> > > > > > static void 

Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Christian König

Am 23.07.21 um 09:07 schrieb Jingwen Chen:

[SNIP]
Hi Christian,

The thing is vm flush fence has no job passed to amdgpu_fence_emit, so
go through the jobs cannot help find the vm flush fence. And keep the
rest fences in the RCU array will lead to signaling them in the ib_test
right after ASIC reset. While they will be signaled again during
resubmission. What I'm doing here is just trying to clean up the fences
without a parent job and make sure the rest fences won't be signaled
twice.


It took me a moment to realize what you are talking about here.

This is for the KIQ! You need different handling for the KIQ than for 
the scheduler controlled rings.


It is not only the flush jobs, but the KIQ needs a complete reset 
because of the register writes pushed there as well.



And please drop any usage of DMA_FENCE_FLAG_USER_BITS. That is not something
which should be abused for reset handling.


The DMA_FENCE_FLAG_USER_BITS here is to mark this fence has a parent
job. If this is not a proper usage here, do you have any suggestions
about how to identify whether the fence has a parent job?


You don't need to mark the fences at all. Everything on the KIQ ring 
needs to be force completed since none of the fences on that ring have a 
parent job.
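
Something along these lines, I guess (only a rough sketch of the idea, not
tested; it assumes the KIQ ring can be identified via adev->gfx.kiq.ring):

	/* sketch: force-complete only the ring without scheduler-managed jobs */
	struct amdgpu_ring *ring;
	int i;

	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
		ring = adev->rings[i];
		if (!ring)
			continue;

		if (ring == &adev->gfx.kiq.ring) {
			/* none of these fences have a parent job */
			amdgpu_fence_driver_force_completion(ring);
			continue;
		}

		/* scheduler-controlled rings: fences are re-created on resubmit */
	}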


Christian.



Best Regards,
JingWen Chen

Regards,
Christian.



Best Regards,
JingWen Chen

Andrey



+    }
     /* after all hw jobs are reset, hw fence is
meaningless, so force_completion */
     amdgpu_fence_driver_force_completion(ring);
     }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index eecf21d8ec33..815776c9a013 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -156,11 +156,17 @@ int amdgpu_fence_emit(struct
amdgpu_ring *ring, struct dma_fence **f, struct amd
     job->ring = ring;
     }
-    seq = ++ring->fence_drv.sync_seq;
-    dma_fence_init(fence, _fence_ops,
-   >fence_drv.lock,
-   adev->fence_context + ring->idx,
-   seq);
+    if (job != NULL && job->base.resubmit_flag == 1) {
+    /* reinit seq for resubmitted jobs */
+    seq = ++ring->fence_drv.sync_seq;
+    fence->seqno = seq;
+    } else {
+    seq = ++ring->fence_drv.sync_seq;

Seems like you could do the above line only once above if-else
as it was
before

Sure, I will modify this.


Best Regards,
JingWen Chen

+    dma_fence_init(fence, _fence_ops,
+    >fence_drv.lock,
+    adev->fence_context + ring->idx,
+    seq);
+    }
     if (job != NULL) {
     /* mark this fence has a parent job */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 7c426e225b24..d6f848adc3f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -241,6 +241,7 @@ static struct dma_fence
*amdgpu_job_run(struct drm_sched_job *sched_job)
     dma_fence_set_error(finished, -ECANCELED);/* skip
IB as well if VRAM lost */
     if (finished->error < 0) {
+    dma_fence_put(>hw_fence);
     DRM_INFO("Skip scheduling IBs!\n");
     } else {
     r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job,
@@ -249,7 +250,8 @@ static struct dma_fence
*amdgpu_job_run(struct drm_sched_job *sched_job)
     DRM_ERROR("Error scheduling IBs (%d)\n", r);
     }
-    dma_fence_get(fence);
+    if (!job->base.resubmit_flag)
+    dma_fence_get(fence);
     amdgpu_job_free_resources(job);
     fence = r ? ERR_PTR(r) : fence;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index f4f474944169..5a36ab5aea2d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -544,6 +544,7 @@ void drm_sched_resubmit_jobs_ext(struct
drm_gpu_scheduler *sched, int max)
dma_fence_set_error(_fence->finished, -ECANCELED);
     dma_fence_put(s_job->s_fence->parent);
+    s_job->resubmit_flag = 1;
     fence = sched->ops->run_job(s_job);
     i++;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 4ea8606d91fe..06c101af1f71 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -198,6 +198,7 @@ struct drm_sched_job {
     enum drm_sched_priority    s_priority;
     struct drm_sched_entity *entity;
     struct dma_fence_cb    cb;
+    int resubmit_flag;
     };
     static inline bool drm_sched_invalidate_job(struct
drm_sched_job *s_job,


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Jingwen Chen
[Why]
After embedding the hw_fence into amdgpu_job, we need to add tdr support
for this feature.

[How]
1. Add a resubmit_flag for resubmitted jobs.
2. Clear job fences from the RCU array and force-complete vm flush fences in
   pre_asic_reset
3. Skip dma_fence_get for resubmitted jobs and add a dma_fence_put
   for guilty jobs.

Signed-off-by: Jack Zhang 
Signed-off-by: Jingwen Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 13 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c|  4 +++-
 drivers/gpu/drm/scheduler/sched_main.c |  1 +
 include/drm/gpu_scheduler.h|  1 +
 5 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 280b1940e892..df73fe666e87 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct amdgpu_device *adev)
 int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
 struct amdgpu_reset_context *reset_context)
 {
-   int i, r = 0;
+   int i, j, r = 0;
struct amdgpu_job *job = NULL;
bool need_full_reset =
test_bit(AMDGPU_NEED_FULL_RESET, &reset_context->flags);
@@ -4406,6 +4406,16 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device 
*adev,
if (!ring || !ring->sched.thread)
continue;
 
+   /*clear job fence from fence drv to avoid force_completion
+*leave NULL and vm flush fence in fence drv */
+   for (j = 0; j <= ring->fence_drv.num_fences_mask; j ++) {
+   struct dma_fence *old,**ptr;
+   ptr = &ring->fence_drv.fences[j];
+   old = rcu_dereference_protected(*ptr, 1);
+   if (old && test_bit(DMA_FENCE_FLAG_USER_BITS, &(old->flags))) {
+   RCU_INIT_POINTER(*ptr, NULL);
+   }
+   }
/* after all hw jobs are reset, hw fence is meaningless, so 
force_completion */
amdgpu_fence_driver_force_completion(ring);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index eecf21d8ec33..d5b3d5f8f951 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -157,10 +157,15 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct 
dma_fence **f, struct amd
}
 
seq = ++ring->fence_drv.sync_seq;
-   dma_fence_init(fence, &amdgpu_fence_ops,
-  &ring->fence_drv.lock,
-  adev->fence_context + ring->idx,
-  seq);
+   if (job != NULL && job->base.resubmit_flag == 1) {
+   /* reset seq for resubmitted jobs */
+   fence->seqno = seq;
+   } else {
+   dma_fence_init(fence, &amdgpu_fence_ops,
+   &ring->fence_drv.lock,
+   adev->fence_context + ring->idx,
+   seq);
+   }
 
if (job != NULL) {
/* mark this fence has a parent job */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 7c426e225b24..d6f848adc3f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -241,6 +241,7 @@ static struct dma_fence *amdgpu_job_run(struct 
drm_sched_job *sched_job)
dma_fence_set_error(finished, -ECANCELED);/* skip IB as well if 
VRAM lost */
 
if (finished->error < 0) {
+   dma_fence_put(&job->hw_fence);
DRM_INFO("Skip scheduling IBs!\n");
} else {
r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job,
@@ -249,7 +250,8 @@ static struct dma_fence *amdgpu_job_run(struct 
drm_sched_job *sched_job)
DRM_ERROR("Error scheduling IBs (%d)\n", r);
}
 
-   dma_fence_get(fence);
+   if (!job->base.resubmit_flag)
+   dma_fence_get(fence);
amdgpu_job_free_resources(job);
 
fence = r ? ERR_PTR(r) : fence;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f4f474944169..5a36ab5aea2d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -544,6 +544,7 @@ void drm_sched_resubmit_jobs_ext(struct drm_gpu_scheduler 
*sched, int max)
dma_fence_set_error(&s_job->s_fence->finished, -ECANCELED);
 
dma_fence_put(s_job->s_fence->parent);
+   s_job->resubmit_flag = 1;
fence = sched->ops->run_job(s_job);
i++;
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 4ea8606d91fe..06c101af1f71 100644
--- a/include/drm/gpu_scheduler.h
+++ 

[PATCH v2 1/2] drm/amd/amdgpu embed hw_fence into amdgpu_job

2021-07-23 Thread Jingwen Chen
From: Jack Zhang 

Why: Previously the hw fence was allocated separately from the job,
which caused lifetime issues and corner cases.
The ideal situation is to let the fence manage both the job's and the
fence's lifetime, and to simplify the design of the GPU scheduler.

How:
Embed the hw_fence into amdgpu_job.
1. Normal job submission is covered by this method.
2. For ib_test and submissions without a parent job, keep the
legacy way of creating a hw fence separately.
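
For reference, the amdgpu_job change implied by the diffstat below (the
amdgpu_job.h hunk is not visible in this archive) is roughly the sketch
here; the placement of the new members is an assumption:

	struct amdgpu_job {
		struct drm_sched_job	base;
		/* ... existing members unchanged ... */
		struct dma_fence	hw_fence;	/* embedded hw fence; replaces the
							 * amdgpu_fence_slab allocation for
							 * jobs with a parent */
		struct amdgpu_ring	*ring;		/* presumably added so the fence ops
							 * can reach the ring via the job */
	};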

Signed-off-by: Jingwen Chen 
Signed-off-by: Jack Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c  |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c   | 62 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 35 
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  2 +-
 8 files changed, 80 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index b6d33f13b476..bad403978bac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -714,7 +714,6 @@ int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum 
kgd_engine_type engine,
ret = dma_fence_wait(f, false);
 
 err_ib_sched:
-   dma_fence_put(f);
amdgpu_job_free(job);
 err:
return ret;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 536005bff24a..277128846dd1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1414,7 +1414,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct 
amdgpu_ring *ring)
continue;
}
job = to_amdgpu_job(s_job);
-   if (preempted && job->fence == fence)
+	if (preempted && (&job->hw_fence) == fence)
/* mark the job as preempted */
job->preemption_status |= AMDGPU_IB_PREEMPTED;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 30772608eac6..eecf21d8ec33 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -133,25 +133,40 @@ static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
  * Emits a fence command on the requested ring (all asics).
  * Returns 0 on success, -ENOMEM on failure.
  */
-int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f,
+int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f, struct 
amdgpu_job *job,
  unsigned flags)
 {
struct amdgpu_device *adev = ring->adev;
-   struct amdgpu_fence *fence;
+   struct dma_fence *fence;
+   struct amdgpu_fence *am_fence;
struct dma_fence __rcu **ptr;
uint32_t seq;
int r;
 
-   fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL);
-   if (fence == NULL)
-   return -ENOMEM;
+	if (job == NULL) {
+		/* create a separate hw fence */
+		am_fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL);
+		if (am_fence == NULL)
+			return -ENOMEM;
+		fence = &am_fence->base;
+		am_fence->ring = ring;
+	} else {
+		/* use the job-embedded fence */
+		fence = &job->hw_fence;
+		job->ring = ring;
+	}
 
seq = ++ring->fence_drv.sync_seq;
-	fence->ring = ring;
-	dma_fence_init(&fence->base, &amdgpu_fence_ops,
+	dma_fence_init(fence, &amdgpu_fence_ops,
 		       &ring->fence_drv.lock,
 		       adev->fence_context + ring->idx,
 		       seq);
+
+	if (job != NULL) {
+		/* mark this fence has a parent job */
+		set_bit(DMA_FENCE_FLAG_USER_BITS, &fence->flags);
+   }
+
amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
   seq, flags | AMDGPU_FENCE_FLAG_INT);
pm_runtime_get_noresume(adev_to_drm(adev)->dev);
@@ -174,9 +189,9 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct 
dma_fence **f,
/* This function can't be called concurrently anyway, otherwise
 * emitting the fence would mess up the hardware ring buffer.
 */
-	rcu_assign_pointer(*ptr, dma_fence_get(&fence->base));
+	rcu_assign_pointer(*ptr, dma_fence_get(fence));
 
-	*f = &fence->base;
+   *f = fence;
 
return 0;
 }
@@ -636,8 +651,16 @@ static const char *amdgpu_fence_get_driver_name(struct 
dma_fence *fence)
 
 static const char *amdgpu_fence_get_timeline_name(struct dma_fence *f)
 {
-   struct amdgpu_fence *fence = to_amdgpu_fence(f);
-   return (const char *)fence->ring->name;
+   struct amdgpu_ring *ring;
+
+   
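
The archive truncates the hunk here. Judging from the flag and the ring/job
fields set earlier in this patch, the rest of the timeline-name lookup
presumably looks roughly like the following sketch (not the posted code):

	if (test_bit(DMA_FENCE_FLAG_USER_BITS, &f->flags)) {
		/* fence embedded in a job: reach the ring through the job */
		struct amdgpu_job *job = container_of(f, struct amdgpu_job, hw_fence);

		ring = job->ring;
	} else {
		ring = to_amdgpu_fence(f)->ring;
	}

	return (const char *)ring->name;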

[PATCH] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Jingwen Chen
[Why]
After embedding the hw_fence into amdgpu_job, we need to add TDR support
for this feature.

[How]
1. Add a resubmit_flag for resubmitted jobs.
2. Clear job fences from the RCU array and force-complete vm flush fences
   in pre_asic_reset.
3. Skip dma_fence_get for resubmitted jobs and add a dma_fence_put
   for guilty jobs.

Signed-off-by: Jack Zhang 
Signed-off-by: Jingwen Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 13 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c|  4 +++-
 drivers/gpu/drm/scheduler/sched_main.c |  1 +
 include/drm/gpu_scheduler.h|  1 +
 5 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 280b1940e892..df73fe666e87 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct amdgpu_device *adev)
 int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
 struct amdgpu_reset_context *reset_context)
 {
-   int i, r = 0;
+   int i, j, r = 0;
struct amdgpu_job *job = NULL;
bool need_full_reset =
		test_bit(AMDGPU_NEED_FULL_RESET, &reset_context->flags);
@@ -4406,6 +4406,16 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device 
*adev,
if (!ring || !ring->sched.thread)
continue;
 
+		/*clear job fence from fence drv to avoid force_completion
+		 *leave NULL and vm flush fence in fence drv */
+		for (j = 0; j <= ring->fence_drv.num_fences_mask; j++) {
+			struct dma_fence *old, **ptr;
+			ptr = &ring->fence_drv.fences[j];
+			old = rcu_dereference_protected(*ptr, 1);
+			if (old && test_bit(DMA_FENCE_FLAG_USER_BITS, &old->flags)) {
+				RCU_INIT_POINTER(*ptr, NULL);
+			}
+		}
/* after all hw jobs are reset, hw fence is meaningless, so 
force_completion */
amdgpu_fence_driver_force_completion(ring);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index eecf21d8ec33..d5b3d5f8f951 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -157,10 +157,15 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct 
dma_fence **f, struct amd
}
 
seq = ++ring->fence_drv.sync_seq;
-	dma_fence_init(fence, &amdgpu_fence_ops,
-		       &ring->fence_drv.lock,
-		       adev->fence_context + ring->idx,
-		       seq);
+	if (job != NULL && job->base.resubmit_flag == 1) {
+		/* reset seq for resubmitted jobs */
+		fence->seqno = seq;
+	} else {
+		dma_fence_init(fence, &amdgpu_fence_ops,
+			       &ring->fence_drv.lock,
+			       adev->fence_context + ring->idx,
+			       seq);
+	}
 
if (job != NULL) {
/* mark this fence has a parent job */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 7c426e225b24..d6f848adc3f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -241,6 +241,7 @@ static struct dma_fence *amdgpu_job_run(struct 
drm_sched_job *sched_job)
dma_fence_set_error(finished, -ECANCELED);/* skip IB as well if 
VRAM lost */
 
if (finished->error < 0) {
+		dma_fence_put(&job->hw_fence);
DRM_INFO("Skip scheduling IBs!\n");
} else {
r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job,
@@ -249,7 +250,8 @@ static struct dma_fence *amdgpu_job_run(struct 
drm_sched_job *sched_job)
DRM_ERROR("Error scheduling IBs (%d)\n", r);
}
 
-   dma_fence_get(fence);
+   if (!job->base.resubmit_flag)
+   dma_fence_get(fence);
amdgpu_job_free_resources(job);
 
fence = r ? ERR_PTR(r) : fence;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f4f474944169..5a36ab5aea2d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -544,6 +544,7 @@ void drm_sched_resubmit_jobs_ext(struct drm_gpu_scheduler 
*sched, int max)
		dma_fence_set_error(&s_job->s_fence->finished, -ECANCELED);
 
dma_fence_put(s_job->s_fence->parent);
+   s_job->resubmit_flag = 1;
fence = sched->ops->run_job(s_job);
i++;
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 4ea8606d91fe..06c101af1f71 100644
--- a/include/drm/gpu_scheduler.h
+++ 


Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Jingwen Chen
On Fri Jul 23, 2021 at 12:06:32AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-07-22 8:20 p.m., Jingwen Chen wrote:
> > On Thu Jul 22, 2021 at 01:50:09PM -0400, Andrey Grodzovsky wrote:
> > > On 2021-07-22 1:27 p.m., Jingwen Chen wrote:
> > > > On Thu Jul 22, 2021 at 01:17:13PM -0400, Andrey Grodzovsky wrote:
> > > > > On 2021-07-22 12:47 p.m., Jingwen Chen wrote:
> > > > > > On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:
> > > > > > > > On 22.07.21 at 16:45, Andrey Grodzovsky wrote:
> > > > > > > > On 2021-07-22 6:45 a.m., Jingwen Chen wrote:
> > > > > > > > > On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky 
> > > > > > > > > wrote:
> > > > > > > > > > On 2021-07-20 11:13 p.m., Jingwen Chen wrote:
> > > > > > > > > > > [Why]
> > > > > > > > > > > After embeded hw_fence to amdgpu_job, we need to add tdr 
> > > > > > > > > > > support
> > > > > > > > > > > for this feature.
> > > > > > > > > > > 
> > > > > > > > > > > [How]
> > > > > > > > > > > 1. Add a resubmit_flag for resubmit jobs.
> > > > > > > > > > > 2. Clear job fence from RCU and force complete vm flush 
> > > > > > > > > > > fences in
> > > > > > > > > > > pre_asic_reset
> > > > > > > > > > > 3. skip dma_fence_get for resubmit jobs and add a 
> > > > > > > > > > > dma_fence_put
> > > > > > > > > > > for guilty jobs.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Jack Zhang 
> > > > > > > > > > > Signed-off-by: Jingwen Chen 
> > > > > > > > > > > ---
> > > > > > > > > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 
> > > > > > > > > > > +++-
> > > > > > > > > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 16 
> > > > > > > > > > > +++-
> > > > > > > > > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  4 +++-
> > > > > > > > > > >   drivers/gpu/drm/scheduler/sched_main.c |  1 +
> > > > > > > > > > >   include/drm/gpu_scheduler.h    |  1 +
> > > > > > > > > > >   5 files changed, 27 insertions(+), 7 deletions(-)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > > > > > index 40461547701a..fe0237f72a09 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > > > > > @@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct
> > > > > > > > > > > amdgpu_device *adev)
> > > > > > > > > > >   int amdgpu_device_pre_asic_reset(struct 
> > > > > > > > > > > amdgpu_device *adev,
> > > > > > > > > > >    struct amdgpu_reset_context 
> > > > > > > > > > > *reset_context)
> > > > > > > > > > >   {
> > > > > > > > > > > -    int i, r = 0;
> > > > > > > > > > > +    int i, j, r = 0;
> > > > > > > > > > >   struct amdgpu_job *job = NULL;
> > > > > > > > > > >   bool need_full_reset =
> > > > > > > > > > >   test_bit(AMDGPU_NEED_FULL_RESET, 
> > > > > > > > > > > &reset_context->flags);
> > > > > > > > > > > @@ -4406,6 +4406,16 @@ int
> > > > > > > > > > > amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> > > > > > > > > > >   if (!ring || !ring->sched.thread)
> > > > > > > > > > >   continue;
> > > > > > > > > > > +    /*clear job fence from fence drv to avoid force_completion
> > > > > > > > > > > + *leave NULL and vm flush fence in fence drv */
> > > > > > > > > > > +    for (j = 0; j <= ring->fence_drv.num_fences_mask; j++) {
> > > > > > > > > > > +    struct dma_fence *old, **ptr;
> > > > > > > > > > > +    ptr = &ring->fence_drv.fences[j];
> > > > > > > > > > > +    old = rcu_dereference_protected(*ptr, 1);
> > > > > > > > > > > +    if (old && test_bit(DMA_FENCE_FLAG_USER_BITS, &old->flags)) {
> > > > > > > > > > > +    RCU_INIT_POINTER(*ptr, NULL);
> > > > > > > > > > > +    }
> > > > > > > > > > Is this to avoid premature job free because of 
> > > > > > > > > > dma_fence_put inside
> > > > > > > > > > amdgpu_fence_process ?
> > > > > > > > > > I can't currently remember why but we probably want all the 
> > > > > > > > > > HW fences
> > > > > > > > > > currently in the ring to
> > > > > > > > > > be forced signaled - maybe better to test for 
> > > > > > > > > > DMA_FENCE_FLAG_USER_BITS
> > > > > > > > > > inside amdgpu_fence_process
> > > > > > > > > > and still do the signaling but not the dma_fence_put part
> > > > > > > > > > 
> > > > > > > > > > Andrey
> > > > > > > > > Hi Andrey,
> > > > > > > > > 
> > > > > > > > > This is to avoid signaling the same fence twice. If we still 
> > > > > > > > > do the
> > > > > > > > > signaling, then the job in the pending list will be signaled 
> > > > > > > > > first in
> > > > > > > > > force_completion, and later be signaled in resubmit. 

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-23 Thread Christian König



On 23.07.21 at 10:21, Daniel Vetter wrote:

On Wed, Jul 14, 2021 at 10:51 AM Christian König
 wrote:



On 13.07.21 at 17:28, Alex Deucher wrote:

On Tue, Jul 13, 2021 at 2:57 AM Christian König
 wrote:


On 13.07.21 at 00:06, Felix Kuehling wrote:

KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
is_cow_mapping returns true for these mappings. Add a check for
vm_flags & VM_WRITE to avoid mmap failures on private read-only or
PROT_NONE mappings.

v2: protect against mprotect making a mapping writable after the fact
v3: update driver-specific vm_operations_structs

Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
Signed-off-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
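
For reference, the core-mm helper that trips here treats any private mapping
that may become writable as COW, which is why PROT_NONE, MAP_PRIVATE
mappings hit this path. A sketch from memory of include/linux/mm.h (verify
against the tree):

	static inline bool is_cow_mapping(vm_flags_t flags)
	{
		/* MAP_PRIVATE sets VM_MAYWRITE but not VM_SHARED, even for
		 * PROT_NONE or read-only mappings. */
		return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
	}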

Reviewed-by: Christian König 

Are you planning to push this to drm-misc?

Yes, just didn't find time yesterday.

This is pushed to the wrong tree drm-misc-next-fixes, should have been
in drm-misc-fixes. Please be careful with that because every time that
goes wrong the script gets confused about which the current tree is,
and pushes the wrong tree to linux-next branches.

I'm going to hard-reset drm-misc-next-fixes now and hope that's good
enough to fix things up (since Thomas is not around all the time for
this merge window).


STOP! I'm about to push a revert for this patch.

And yes that was pushed to the wrong branch, but it turned out that this 
was fortunate since the patch doesn't work correctly.


Christian.


-Daniel



Christian.


Alex


---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
include/drm/ttm/ttm_bo_api.h |  4 
6 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index b3404c43a911..1aa750a6a5d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -79,7 +79,8 @@ static const struct vm_operations_struct amdgpu_gem_vm_ops = {
.fault = amdgpu_gem_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
};

static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 5b27845075a1..164ea564bb7a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -70,7 +70,8 @@ static const struct vm_operations_struct nouveau_ttm_vm_ops = 
{
.fault = nouveau_ttm_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
};

void
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 458f92a70887..c19ad07eb7b5 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -77,7 +77,8 @@ static const struct vm_operations_struct radeon_gem_vm_ops = {
.fault = radeon_gem_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
- .access = ttm_bo_vm_access
+ .access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect
};

static void radeon_gem_object_free(struct drm_gem_object *gobj)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..fb325bad5db6 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned 
long addr,
}
EXPORT_SYMBOL(ttm_bo_vm_access);

+int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
+unsigned long end, unsigned long newflags)
+{
+ /* Enforce no COW since would have really strange behavior with it. */
+ if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
+ return -EINVAL;
+
+ return 0;
+}
+EXPORT_SYMBOL(ttm_bo_vm_mprotect);
+
static const struct vm_operations_struct ttm_bo_vm_ops = {
.fault = ttm_bo_vm_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
.access = ttm_bo_vm_access,
+ .mprotect = ttm_bo_vm_mprotect,
};

int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object 
*bo)
{
/* Enforce no COW since would have really strange behavior with it. */
- if (is_cow_mapping(vma->vm_flags))
+ if (is_cow_mapping(vma->vm_flags) && (vma->vm_flags & VM_WRITE))
return -EINVAL;

ttm_bo_get(bo);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c
index e6b1f98ec99f..e4bf7dc99320 100644
--- 

Re: [PATCH v3 1/1] drm/ttm: Fix COW check

2021-07-23 Thread Daniel Vetter
On Wed, Jul 14, 2021 at 10:51 AM Christian König
 wrote:
>
>
>
> > On 13.07.21 at 17:28, Alex Deucher wrote:
> > On Tue, Jul 13, 2021 at 2:57 AM Christian König
> >  wrote:
> >>
> >>
> >> On 13.07.21 at 00:06, Felix Kuehling wrote:
> >>> KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
> >>> is_cow_mapping returns true for these mappings. Add a check for
> >>> vm_flags & VM_WRITE to avoid mmap failures on private read-only or
> >>> PROT_NONE mappings.
> >>>
> >>> v2: protect against mprotect making a mapping writable after the fact
> >>> v3: update driver-specific vm_operations_structs
> >>>
> >>> Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
> >>> Signed-off-by: Felix Kuehling 
> >>> Signed-off-by: Alex Deucher 
> >> Reviewed-by: Christian König 
> > Are you planning to push this to drm-misc?
>
> Yes, just didn't find time yesterday.

This is pushed to the wrong tree drm-misc-next-fixes, should have been
in drm-misc-fixes. Please be careful with that because every time that
goes wrong the script gets confused about which the current tree is,
and pushes the wrong tree to linux-next branches.

I'm going to hard-reset drm-misc-next-fixes now and hope that's good
enough to fix things up (since Thomas is not around all the time for
this merge window).
-Daniel


>
> Christian.
>
> >
> > Alex
> >
> >>> ---
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c  |  3 ++-
> >>>drivers/gpu/drm/nouveau/nouveau_gem.c|  3 ++-
> >>>drivers/gpu/drm/radeon/radeon_gem.c  |  3 ++-
> >>>drivers/gpu/drm/ttm/ttm_bo_vm.c  | 14 +-
> >>>drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c |  1 +
> >>>include/drm/ttm/ttm_bo_api.h |  4 
> >>>6 files changed, 24 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> index b3404c43a911..1aa750a6a5d2 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> @@ -79,7 +79,8 @@ static const struct vm_operations_struct 
> >>> amdgpu_gem_vm_ops = {
> >>>.fault = amdgpu_gem_fault,
> >>>.open = ttm_bo_vm_open,
> >>>.close = ttm_bo_vm_close,
> >>> - .access = ttm_bo_vm_access
> >>> + .access = ttm_bo_vm_access,
> >>> + .mprotect = ttm_bo_vm_mprotect
> >>>};
> >>>
> >>>static void amdgpu_gem_object_free(struct drm_gem_object *gobj)
> >>> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
> >>> b/drivers/gpu/drm/nouveau/nouveau_gem.c
> >>> index 5b27845075a1..164ea564bb7a 100644
> >>> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> >>> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> >>> @@ -70,7 +70,8 @@ static const struct vm_operations_struct 
> >>> nouveau_ttm_vm_ops = {
> >>>.fault = nouveau_ttm_fault,
> >>>.open = ttm_bo_vm_open,
> >>>.close = ttm_bo_vm_close,
> >>> - .access = ttm_bo_vm_access
> >>> + .access = ttm_bo_vm_access,
> >>> + .mprotect = ttm_bo_vm_mprotect
> >>>};
> >>>
> >>>void
> >>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> >>> b/drivers/gpu/drm/radeon/radeon_gem.c
> >>> index 458f92a70887..c19ad07eb7b5 100644
> >>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
> >>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> >>> @@ -77,7 +77,8 @@ static const struct vm_operations_struct 
> >>> radeon_gem_vm_ops = {
> >>>.fault = radeon_gem_fault,
> >>>.open = ttm_bo_vm_open,
> >>>.close = ttm_bo_vm_close,
> >>> - .access = ttm_bo_vm_access
> >>> + .access = ttm_bo_vm_access,
> >>> + .mprotect = ttm_bo_vm_mprotect
> >>>};
> >>>
> >>>static void radeon_gem_object_free(struct drm_gem_object *gobj)
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c 
> >>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>> index f56be5bc0861..fb325bad5db6 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>> @@ -542,17 +542,29 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, 
> >>> unsigned long addr,
> >>>}
> >>>EXPORT_SYMBOL(ttm_bo_vm_access);
> >>>
> >>> +int ttm_bo_vm_mprotect(struct vm_area_struct *vma, unsigned long start,
> >>> +unsigned long end, unsigned long newflags)
> >>> +{
> >>> + /* Enforce no COW since would have really strange behavior with it. 
> >>> */
> >>> + if (is_cow_mapping(newflags) && (newflags & VM_WRITE))
> >>> + return -EINVAL;
> >>> +
> >>> + return 0;
> >>> +}
> >>> +EXPORT_SYMBOL(ttm_bo_vm_mprotect);
> >>> +
> >>>static const struct vm_operations_struct ttm_bo_vm_ops = {
> >>>.fault = ttm_bo_vm_fault,
> >>>.open = ttm_bo_vm_open,
> >>>.close = ttm_bo_vm_close,
> >>>.access = ttm_bo_vm_access,
> >>> + .mprotect = ttm_bo_vm_mprotect,
> >>>};
> >>>
> >>>int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct 
> >>> ttm_buffer_object *bo)
> >>>{
> >>>/* Enforce no 

Re: [PATCH] drm/amdgpu: Check pmops for desired suspend state

2021-07-23 Thread Lazar, Lijo




On 7/22/2021 10:57 AM, Pratik Vishwakarma wrote:

[Why]
The user might set mem_sleep to deep, which results
in amdgpu resume errors.

[How]
Check with the PM core for the default suspend state.

Signed-off-by: Pratik Vishwakarma 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index af1710971ff3..d92196429741 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1468,7 +1468,8 @@ static int amdgpu_pmops_suspend(struct device *dev)
struct amdgpu_device *adev = drm_to_adev(drm_dev);
int r;
  
-	if (amdgpu_acpi_is_s0ix_supported(adev))

+   if (amdgpu_acpi_is_s0ix_supported(adev)
+   && pm_suspend_default_s2idle())


A better way would be to use the exported pm_suspend_target_state - 
(pm_suspend_target_state == PM_SUSPEND_TO_IDLE)
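
A minimal sketch of that alternative (untested; assumes CONFIG_SUSPEND so
pm_suspend_target_state is available):

	/* Only take the S0ix path when the target state really is s2idle. */
	if (amdgpu_acpi_is_s0ix_supported(adev) &&
	    pm_suspend_target_state == PM_SUSPEND_TO_IDLE)
		adev->in_s0ix = true;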


Thanks,
Lijo


adev->in_s0ix = true;
adev->in_s3 = true;
r = amdgpu_device_suspend(drm_dev, true);




Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Andrey Grodzovsky


On 2021-07-22 8:20 p.m., Jingwen Chen wrote:

On Thu Jul 22, 2021 at 01:50:09PM -0400, Andrey Grodzovsky wrote:

On 2021-07-22 1:27 p.m., Jingwen Chen wrote:

On Thu Jul 22, 2021 at 01:17:13PM -0400, Andrey Grodzovsky wrote:

On 2021-07-22 12:47 p.m., Jingwen Chen wrote:

On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:

On 22.07.21 at 16:45, Andrey Grodzovsky wrote:

On 2021-07-22 6:45 a.m., Jingwen Chen wrote:

On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky wrote:

On 2021-07-20 11:13 p.m., Jingwen Chen wrote:

[Why]
After embeded hw_fence to amdgpu_job, we need to add tdr support
for this feature.

[How]
1. Add a resubmit_flag for resubmit jobs.
2. Clear job fence from RCU and force complete vm flush fences in
    pre_asic_reset
3. skip dma_fence_get for resubmit jobs and add a dma_fence_put
    for guilty jobs.

Signed-off-by: Jack Zhang 
Signed-off-by: Jingwen Chen 
---
      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++-
      drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 16 +++-
      drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  4 +++-
      drivers/gpu/drm/scheduler/sched_main.c |  1 +
      include/drm/gpu_scheduler.h    |  1 +
      5 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 40461547701a..fe0237f72a09 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct
amdgpu_device *adev)
      int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
       struct amdgpu_reset_context *reset_context)
      {
-    int i, r = 0;
+    int i, j, r = 0;
      struct amdgpu_job *job = NULL;
      bool need_full_reset =
      test_bit(AMDGPU_NEED_FULL_RESET, &reset_context->flags);
@@ -4406,6 +4406,16 @@ int
amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
      if (!ring || !ring->sched.thread)
      continue;
+    /*clear job fence from fence drv to avoid force_completion
+ *leave NULL and vm flush fence in fence drv */
+    for (j = 0; j <= ring->fence_drv.num_fences_mask; j++) {
+    struct dma_fence *old, **ptr;
+    ptr = &ring->fence_drv.fences[j];
+    old = rcu_dereference_protected(*ptr, 1);
+    if (old && test_bit(DMA_FENCE_FLAG_USER_BITS, &old->flags)) {
+    RCU_INIT_POINTER(*ptr, NULL);
+    }

Is this to avoid premature job free because of dma_fence_put inside
amdgpu_fence_process ?
I can't currently remember why but we probably want all the HW fences
currently in the ring to
be forced signaled - maybe better to test for DMA_FENCE_FLAG_USER_BITS
inside amdgpu_fence_process
and still do the signaling but not the dma_fence_put part

Andrey

Hi Andrey,

This is to avoid signaling the same fence twice. If we still do the
signaling, then the job in the pending list will be signaled first in
force_completion, and later be signaled in resubmit. This will go to
BUG() in amdgpu_fence_process.

Oh, I see. How about just adding a 'skip' flag to amdgpu_ring and setting
it before calling
amdgpu_fence_driver_force_completion and resetting it after? Then inside
amdgpu_fence_driver_force_completion
you can just skip the signaling part with this flag for fences with
DMA_FENCE_FLAG_USER_BITS set.
Fewer lines of code, at least.
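
For illustration only, a minimal sketch of that 'skip' idea (the
skip_job_fences ring field is hypothetical, and this is not the direction
that was ultimately taken):

	/* Would be checked from the fence-processing loop before signaling;
	 * amdgpu_device_pre_asic_reset() would set ring->skip_job_fences around
	 * the amdgpu_fence_driver_force_completion() call and clear it after. */
	static bool amdgpu_fence_skip_signaling(struct amdgpu_ring *ring,
						struct dma_fence *fence)
	{
		return ring->skip_job_fences &&
		       test_bit(DMA_FENCE_FLAG_USER_BITS, &fence->flags);
	}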

Still sounds quite a bit hacky.

I would rather suggest to completely drop the approach with
amdgpu_fence_driver_force_completion(). I could never see why we would want
that in the first place.

Regards,
Christian.


Hi Christian,

I keep the amdgpu_fence_driver_force_completion here to make sure the vm
flush fence is signaled and put.

Right, so we need to do force completion for the sake of all the fences
without parent jobs, since there is code which waits directly on them.



So the key question is whether the fence of ib test should be the first
fence to be signaled after reset.

What do you mean by 'after reset' - at this point in the code there was
no ASIC reset yet, we just stopped the schedulers and detached the
HW fences from their parent jobs (sched_fence)

I mean the ASIC reset. There will be a ib_test for each ring after ASIC
reset.


Then why wouldn't they be the first? They will emit new fences into the
ring, which will be signaled right away because the ASIC went through reset
and is not hung anymore. All the previous fences, including the VM flush
ones from before the reset, will already have been signaled by this time by
amdgpu_fence_driver_force_completion.


Hi Andrey,

Sorry I didn't make it clear. I mean if we drop force_completion here,
and keep other code unchanged, then the ib_test wouldn't be the first to
be signaled.



At least in my opinion, the order of who gets signaled first is not
important. I do think it's important
to force signal all the old fences before HW 

Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

2021-07-23 Thread Christian König

On 22.07.21 at 18:47, Jingwen Chen wrote:

On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:

On 22.07.21 at 16:45, Andrey Grodzovsky wrote:

On 2021-07-22 6:45 a.m., Jingwen Chen wrote:

On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky wrote:

On 2021-07-20 11:13 p.m., Jingwen Chen wrote:

[Why]
After embeded hw_fence to amdgpu_job, we need to add tdr support
for this feature.

[How]
1. Add a resubmit_flag for resubmit jobs.
2. Clear job fence from RCU and force complete vm flush fences in
  pre_asic_reset
3. skip dma_fence_get for resubmit jobs and add a dma_fence_put
  for guilty jobs.

Signed-off-by: Jack Zhang 
Signed-off-by: Jingwen Chen 
---
    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++-
    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 16 +++-
    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  4 +++-
    drivers/gpu/drm/scheduler/sched_main.c |  1 +
    include/drm/gpu_scheduler.h    |  1 +
    5 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 40461547701a..fe0237f72a09 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct
amdgpu_device *adev)
    int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
     struct amdgpu_reset_context *reset_context)
    {
-    int i, r = 0;
+    int i, j, r = 0;
    struct amdgpu_job *job = NULL;
    bool need_full_reset =
    test_bit(AMDGPU_NEED_FULL_RESET, &reset_context->flags);
@@ -4406,6 +4406,16 @@ int
amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
    if (!ring || !ring->sched.thread)
    continue;
+    /*clear job fence from fence drv to avoid force_completion
+ *leave NULL and vm flush fence in fence drv */
+    for (j = 0; j <= ring->fence_drv.num_fences_mask; j++) {
+    struct dma_fence *old, **ptr;
+    ptr = &ring->fence_drv.fences[j];
+    old = rcu_dereference_protected(*ptr, 1);
+    if (old && test_bit(DMA_FENCE_FLAG_USER_BITS, &old->flags)) {
+    RCU_INIT_POINTER(*ptr, NULL);
+    }

Is this to avoid premature job free because of dma_fence_put inside
amdgpu_fence_process ?
I can't currently remember why but we probably want all the HW fences
currently in the ring to
be forced signaled - maybe better to test for DMA_FENCE_FLAG_USER_BITS
inside amdgpu_fence_process
and still do the signaling but not the dma_fence_put part

Andrey

Hi Andrey,

This is to avoid signaling the same fence twice. If we still do the
signaling, then the job in the pending list will be signaled first in
force_completion, and later be signaled in resubmit. This will go to
BUG() in amdgpu_fence_process.


Oh, I see. How about just adding a 'skip' flag to amdgpu_ring and setting
it before calling
amdgpu_fence_driver_force_completion and resetting it after? Then inside
amdgpu_fence_driver_force_completion
you can just skip the signaling part with this flag for fences with
DMA_FENCE_FLAG_USER_BITS set.
Fewer lines of code, at least.

Still sounds quite a bit hacky.

I would rather suggest to completely drop the approach with
amdgpu_fence_driver_force_completion(). I could never see why we would want
that in the first place.

Regards,
Christian.


Hi Christian,

I keep the amdgpu_fence_driver_force_completion here to make sure the vm
flush fence is signaled and put.
So the key question is whether the fence of ib test should be the first
fence to be signaled after reset.
If it should be, it means not only fences with DMA_FENCE_FLAG_USER_BITS
but also vm flush fences should be removed from RCU fence array before
ib_test. Then we must do the force_completion here for vm flush
fence, otherwise there will be a memory leak here as no one will signal
and put it after we remove it from RCU.
If not, then the first fence to signal could be vm flush fence. And I'm
OK with drop amdgpu_fence_driver_force_completion here.


The key problem is that amdgpu_fence_driver_force_completion() goes over 
the RCU array and signals everything there.


What we should do instead is to go over the jobs and cleanup the ones we 
don't want to re-submit and keep the rest.


And please drop any usage of DMA_FENCE_FLAG_USER_BITS. That is not 
something which should be abused for reset handling.


Regards,
Christian.
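
For illustration, a rough sketch of that direction (not a reviewed fix;
should_resubmit() is a hypothetical predicate):

	struct drm_sched_job *s_job, *tmp;

	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
		if (should_resubmit(s_job))
			continue;	/* keep it; it is re-run after reset */

		/* Cancel the job we do not want to re-submit; its embedded
		 * hw fence is released together with the job. */
		list_del_init(&s_job->list);
		dma_fence_set_error(&s_job->s_fence->finished, -ECANCELED);
		dma_fence_signal(&s_job->s_fence->finished);
		sched->ops->free_job(s_job);
	}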




Best Regards,
JingWen Chen

Andrey



+    }
    /* after all hw jobs are reset, hw fence is
meaningless, so force_completion */
    amdgpu_fence_driver_force_completion(ring);
    }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index eecf21d8ec33..815776c9a013 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -156,11 +156,17 @@ int amdgpu_fence_emit(struct