RE: [PATCH v2] drm/amdgpu/display: Fix reload driver error

2019-05-29 Thread Deng, Emily
Hi Kazlauskas,
I have modified the patch following your suggestion. Could you please
help review it again?

Best wishes
Emily Deng



>-----Original Message-----
>From: amd-gfx  On Behalf Of Emily
>Deng
>Sent: Wednesday, May 29, 2019 11:12 AM
>To: amd-gfx@lists.freedesktop.org
>Cc: Deng, Emily 
>Subject: [PATCH v2] drm/amdgpu/display: Fix reload driver error
>
>Issue:
>The following error occurs when the driver is reloaded:
>[ 3986.567739] sysfs: cannot create duplicate filename
>'/devices/pci0000:00/0000:00:07.0/drm_dp_aux_dev'
>[ 3986.567743] CPU: 6 PID: 1767 Comm: modprobe Tainted: G   OE 5.0.0-rc1-custom #1
>[ 3986.567745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
>[ 3986.567746] Call Trace:
>...
>[ 3986.567808]  drm_dp_aux_register_devnode+0xdc/0x140 [drm_kms_helper]
>...
>[ 3986.569081] kobject_add_internal failed for drm_dp_aux_dev with -EEXIST,
>don't try to register things with the same name in the same directory.
>
>Reproduce sequence:
>1. modprobe amdgpu
>2. modprobe -r amdgpu
>3. modprobe amdgpu
>
>Root cause:
>When the driver is unloaded, it doesn't unregister the DP aux channel.
>
>v2: Don't use has_aux
>
>Signed-off-by: Emily Deng 
>---
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>index 8fe1685..941313b 100644
>--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>@@ -3760,6 +3760,13 @@ int amdgpu_dm_connector_atomic_get_property(struct drm_connector *connector,
>   return ret;
> }
>
>+static void amdgpu_dm_connector_unregister(struct drm_connector *connector)
>+{
>+  struct amdgpu_dm_connector *amdgpu_dm_connector = to_amdgpu_dm_connector(connector);
>+
>+  drm_dp_aux_unregister(&amdgpu_dm_connector->dm_dp_aux.aux);
>+}
>+
> static void amdgpu_dm_connector_destroy(struct drm_connector *connector)
> {
>   struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector);
>@@ -3788,6 +3795,11 @@ static void amdgpu_dm_connector_destroy(struct drm_connector *connector)
>   drm_dp_cec_unregister_connector(&aconnector->dm_dp_aux.aux);
>   drm_connector_unregister(connector);
>   drm_connector_cleanup(connector);
>+  if (aconnector->i2c) {
>+          i2c_del_adapter(&aconnector->i2c->base);
>+          kfree(aconnector->i2c);
>+  }
>+
>   kfree(connector);
> }
>
>@@ -3846,7 +3858,8 @@ static const struct drm_connector_funcs amdgpu_dm_connector_funcs = {
>   .atomic_duplicate_state = amdgpu_dm_connector_atomic_duplicate_state,
>   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
>   .atomic_set_property = amdgpu_dm_connector_atomic_set_property,
>-  .atomic_get_property = amdgpu_dm_connector_atomic_get_property
>+  .atomic_get_property = amdgpu_dm_connector_atomic_get_property,
>+  .early_unregister = amdgpu_dm_connector_unregister
> };
>
> static int get_modes(struct drm_connector *connector)
>--
>2.7.4
>

Re: [PATCH] drm/amdgpu: fix a race in GPU reset with IB test (v2)

2019-05-29 Thread Pan, Xinhui
looks good to me.


From: Alex Deucher 
Sent: May 30, 2019 2:44
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Pan, Xinhui
Subject: [PATCH] drm/amdgpu: fix a race in GPU reset with IB test (v2)

Split late_init into two functions, one (do_late_init) which
just does the hw init, and late_init which calls do_late_init
and schedules the IB test work.  Call do_late_init in
the GPU reset code to run the init code, but not schedule
the IB test code.  The IB test code is called directly
in the gpu reset code so no need to run the IB tests
in a separate work thread.  If we do, we end up racing.

v2: Rework late_init.  Pull out the mgpu fan boost and xgmi
pstate code into late_init so they get called in all cases.
rename the late_init worker thread to delayed work since it's
just the IB tests now which can happen later.  Schedule the
work at init and resume time.  It's not needed at reset time
because the IB tests are called directly.
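
For reference, a minimal sketch of the resulting flow (my illustration, not
the literal patch; do_late_init and delayed_init_work are named above, the
rest is simplified):

	static void amdgpu_device_delayed_init_work_handler(struct work_struct *work)
	{
		struct amdgpu_device *adev =
			container_of(work, struct amdgpu_device,
				     delayed_init_work.work);

		/* only the IB tests are deferred now */
		if (amdgpu_ib_ring_tests(adev))
			DRM_ERROR("ib ring test failed\n");
	}

	static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
	{
		int r;

		r = amdgpu_device_do_late_init(adev);	/* hw init, fan boost, pstate */
		if (r)
			return r;
		/* init/resume path: defer the IB tests */
		queue_delayed_work(system_wq, &adev->delayed_init_work,
				   msecs_to_jiffies(AMDGPU_RESUME_MS));
		return 0;
	}

	/* The GPU reset path instead calls amdgpu_device_do_late_init() and
	 * then amdgpu_ib_ring_tests() directly, so nothing races. */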

Cc: Xinhui Pan 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 116 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|   2 +-
 3 files changed, 61 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d355e9a09ad1..19a00282e34c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -925,7 +925,7 @@ struct amdgpu_device {
const struct amdgpu_df_funcs*df_funcs;

/* delayed work_func for deferring clockgating during resume */
-   struct delayed_work late_init_work;
+   struct delayed_work delayed_init_work;

struct amdgpu_virt  virt;
/* firmware VRAM reservation */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a8c2201cd04..d00fd5dd307a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1869,6 +1869,43 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power
return 0;
 }

+static int amdgpu_device_enable_mgpu_fan_boost(void)
+{
+   struct amdgpu_gpu_instance *gpu_ins;
+   struct amdgpu_device *adev;
+   int i, ret = 0;
+
+   mutex_lock(&mgpu_info.mutex);
+
+   /*
+* MGPU fan boost feature should be enabled
+* only when there are two or more dGPUs in
+* the system
+*/
+   if (mgpu_info.num_dgpu < 2)
+   goto out;
+
+   for (i = 0; i < mgpu_info.num_dgpu; i++) {
+   gpu_ins = &(mgpu_info.gpu_ins[i]);
+   adev = gpu_ins->adev;
+   if (!(adev->flags & AMD_IS_APU) &&
+   !gpu_ins->mgpu_fan_enabled &&
+   adev->powerplay.pp_funcs &&
+   adev->powerplay.pp_funcs->enable_mgpu_fan_boost) {
+   ret = amdgpu_dpm_enable_mgpu_fan_boost(adev);
+   if (ret)
+   break;
+
+   gpu_ins->mgpu_fan_enabled = 1;
+   }
+   }
+
+out:
+   mutex_unlock(&mgpu_info.mutex);
+
+   return ret;
+}
+
 /**
  * amdgpu_device_ip_late_init - run late init for hardware IPs
  *
@@ -1902,11 +1939,15 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);

-   queue_delayed_work(system_wq, &adev->late_init_work,
-  msecs_to_jiffies(AMDGPU_RESUME_MS));
-
amdgpu_device_fill_reset_magic(adev);

+   r = amdgpu_device_enable_mgpu_fan_boost();
+   if (r)
+   DRM_ERROR("enable mgpu fan boost failed (%d).\n", r);
+
+   /* set to low pstate by default */
+   amdgpu_xgmi_set_pstate(adev, 0);
+
return 0;
 }

@@ -2005,65 +2046,20 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
return 0;
 }

-static int amdgpu_device_enable_mgpu_fan_boost(void)
-{
-   struct amdgpu_gpu_instance *gpu_ins;
-   struct amdgpu_device *adev;
-   int i, ret = 0;
-
-   mutex_lock(&mgpu_info.mutex);
-
-   /*
-* MGPU fan boost feature should be enabled
-* only when there are two or more dGPUs in
-* the system
-*/
-   if (mgpu_info.num_dgpu < 2)
-   goto out;
-
-   for (i = 0; i < mgpu_info.num_dgpu; i++) {
-   gpu_ins = &(mgpu_info.gpu_ins[i]);
-   adev = gpu_ins->adev;
-   if (!(adev->flags & AMD_IS_APU) &&
-   !gpu_ins->mgpu_fan_enabled &&
-   adev->powerplay.pp_funcs &&
-   adev->powerplay.pp_funcs->enable_mgpu_fan_boost) {
-   ret = amdgpu_dpm_enable_mgpu_fan_boost(adev);
-   if (ret)
-   break;
-
-   

RE: [PATCH 3/3] drm/amdkfd: remove duplicated PCIE atomics request

2019-05-29 Thread Zhang, Hawking
Series is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-----Original Message-----
From: Xiao, Jack  
Sent: May 29, 2019 14:32
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
; Zhang, Hawking ; Koenig, 
Christian ; Kuehling, Felix 
Cc: Xiao, Jack 
Subject: [PATCH 3/3] drm/amdkfd: remove duplicated PCIE atomics request

Since amdgpu has always requested PCIE atomics, kfd doesn't need its own
duplicate PCIE atomics enablement. Referring to the amdgpu request result is
enough.
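
(For context - a simplified sketch of the amdgpu side being referred to, not
part of this patch: amdgpu_device_init() already does roughly

	r = pci_enable_atomic_ops_to_root(adev->pdev,
					  PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
					  PCI_EXP_DEVCAP2_ATOMIC_COMP64);
	adev->have_atomics_support = (r == 0);	/* result kfd can reuse */

and the new helper below simply exposes adev->have_atomics_support to kfd.)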

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  7 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c    | 10 ++++------
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 98326e3b..ddd6c52 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -620,6 +620,13 @@ bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, 
u32 vmid)
return false;
 }
 
+bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
+
+   return adev->have_atomics_support;
+}
+
 #ifndef CONFIG_HSA_AMD
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index f57f297..8d135c82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -135,6 +135,7 @@ int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
 			uint32_t vmid, uint64_t gpu_addr,
 			uint32_t *ib_cmd, uint32_t ib_len);
 void amdgpu_amdkfd_set_compute_idle(struct kgd_dev *kgd, bool idle);
+bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd);
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void);
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 7b4ea24..76a1599 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -481,17 +481,15 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 * 32 and 64-bit requests are possible and must be
 * supported.
 */
-   ret = pci_enable_atomic_ops_to_root(pdev,
-   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
-   PCI_EXP_DEVCAP2_ATOMIC_COMP64);
-   if (device_info->needs_pci_atomics && ret < 0) {
+   kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd);
+   if (device_info->needs_pci_atomics &&
+   !kfd->pci_atomic_requested) {
dev_info(kfd_device,
 "skipped device %x:%x, PCI rejects atomics\n",
 pdev->vendor, pdev->device);
kfree(kfd);
return NULL;
-   } else if (!ret)
-   kfd->pci_atomic_requested = true;
+   }
 
kfd->kgd = kgd;
kfd->device_info = device_info;
--
1.9.1


[PATCH] drm/amd/powerplay/smu7_hwmgr: replace blocking delay with non-blocking

2019-05-29 Thread Yrjan Skrimstad
This driver currently contains a repeated 500ms blocking delay call
which causes frequent major buffer underruns in PulseAudio. This patch
fixes this issue by replacing the blocking delay with a non-blocking
sleep call.
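
To illustrate the difference (my sketch, not part of the patch): mdelay()
busy-waits, keeping the CPU occupied for the whole interval, while msleep()
puts the task to sleep so the scheduler can run other work:

	mdelay(500);	/* from <linux/delay.h>: spins the CPU for 500 ms */
	msleep(500);	/* sleeps at least 500 ms; the CPU is free meanwhile */

With the 10-iteration loop below this was up to 5 seconds of busy-waiting per
query, which matches the underruns described above. Note the swap is only
valid assuming smu7_get_gpu_power() runs in process context (as sysfs/hwmon
reads normally do), since msleep() may not be called from atomic context.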

Signed-off-by: Yrjan Skrimstad 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 048757e8f494..340afdbddc72 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -3494,7 +3494,7 @@ static int smu7_get_gpu_power(struct pp_hwmgr *hwmgr, u32 
*query)
ixSMU_PM_STATUS_95, 0);
 
for (i = 0; i < 10; i++) {
-   mdelay(500);
+   msleep(500);
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogSample);
tmp = cgs_read_ind_register(hwmgr->device,
CGS_IND_REG__SMC,
-- 
2.20.1



Re: [PATCH 0/2] drm/amd/display: Add HDR output metadata support for amdgpu

2019-05-29 Thread Harry Wentland
On 2019-05-28 3:08 p.m., Nicholas Kazlauskas wrote:
> This patch series enables HDR output metadata support in amdgpu using the
> DRM HDR interface merged in drm-misc-next. Enabled for DCE and DCN ASICs
> over DP and HDMI.
> 
> It's limited to static HDR metadata support for now since that's all the
> DRM interface supports. It requires a modeset for entering and exiting HDR
> but the metadata can be changed without one.
> 
> Cc: Harry Wentland 
> 

Series is
Reviewed-by: Harry Wentland 

Harry

> Nicholas Kazlauskas (2):
>   drm/amd/display: Expose HDR output metadata for supported connectors
>   drm/amd/display: Only force modesets when toggling HDR
> 
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 155 +-
>  1 file changed, 151 insertions(+), 4 deletions(-)
> 

[pull] amdgpu, amdkfd drm-next-5.3

2019-05-29 Thread Alex Deucher
Hi Dave, Daniel,

New stuff for 5.3:
- Add new thermal sensors for vega asics
- Various RAS fixes
- Add sysfs interface for memory interface utilization
- Use HMM rather than mmu notifier for user pages
- Expose xgmi topology via kfd
- SR-IOV fixes
- Fixes for manual driver reload
- Add unique identifier for vega asics
- Clean up user fence handling with UVD/VCE/VCN blocks
- Convert DC to use core bpc attribute rather than a custom one
- Add GWS support for KFD
- Vega powerplay improvements
- Add CRC support for DCE 12
- SR-IOV support for new security policy
- Various cleanups

The following changes since commit a188339ca5a396acc588e5851ed7e19f66b0ebd9:

  Linux 5.2-rc1 (2019-05-19 15:47:09 -0700)

are available in the Git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-next-5.3

for you to fetch changes up to cf401e2856b27b2deeada498eab864e2a50cf219:

  drm/amdkfd: fix null pointer dereference on dev (2019-05-29 16:50:09 -0500)


Alex Deucher (3):
  drm/amdgpu/vega20: use mode1 reset for RAS and XGMI
  drm/amdgpu: use pcie_bandwidth_available rather than open coding it
  drm/amdgpu/soc15: skip reset on init

Amber Lin (1):
  drm/amdkfd: Add domain number into gpu_id

Anthony Koo (1):
  drm/amd/display: fix multi display seamless boot case

Aric Cyr (5):
  drm/amd/display: 3.2.28
  drm/amd/display: 3.2.29
  drm/amd/display: 3.2.30
  drm/amd/display: Use VCP for extended colorimetry
  drm/amd/display: 3.2.31

Bhawanpreet Lakha (1):
  drm/amd/powerplay: Fix maybe-uninitialized in get_ppfeature_status

Charlene Liu (5):
  drm/amd/display: add SW_USE_I2C_REG request.
  drm/amd/display: color space ycbcr709 support
  drm/amd/display: reset retimer/redriver below 340Mhz
  drm/amd/display: define v_total_min and max parameters
  drm/amd/display: enabling stream after HPD low to high happened

Chengming Gui (2):
  drm/amd/powerplay: Enable "disable dpm" feature to support swSMU debug 
(v2)
  drm/amd/powerplay: Fix code error for translating int type to bool type 
correctly

Chris Park (2):
  drm/amd/display: Support AVI InfoFrame V3 and V4
  drm/amd/display: Define Byte 14 on AVI InfoFrame

Christian König (2):
  drm/amdgpu: rename amdgpu_prime.[ch] into amdgpu_dma_buf.[ch]
  drm/amdgpu: remove static GDS, GWS and OA allocation

Chunming Zhou (1):
  drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu

Colin Ian King (2):
  drm/amdgpu: fix spelling mistake "retrived" -> "retrieved"
  drm/amdkfd: fix null pointer dereference on dev

Dmytro Laktyushkin (4):
  drm/amd/display: move signal type out of otg dlg params
  drm/amd/display: stop external access to internal optc sync params
  drm/amd/display: fix acquire_first_split_pipe function
  drm/amd/display: add null checks and set update flags

Emily Deng (2):
  drm/amdgpu: fix unload driver fail
  drm/amdgpu: Need to set the baco cap before baco reset

Eric Yang (2):
  drm/amd/display: Set dispclk and dprefclock directly
  drm/amd/display: move back vbios cmd table for set dprefclk

Evan Quan (26):
  drm/amd/powerplay: support hotspot/memory critical limit values
  drm/amd/powerplay: support temperature emergency max values
  drm/amd/powerplay: support SMU metrics table on Vega12
  drm/amd/powerplay: expose current hotspot and memory temperatures V2
  drm/amd/powerplay: support hwmon temperature channel labels V2
  drm/amd/powerplay: expose Vega12 current power
  drm/amd/powerplay: expose Vega12 current gpu activity
  drm/amd/powerplay: expose Vega20 realtime memory utilization
  drm/amd/powerplay: expose Vega12 realtime memory utilization
  drm/amd/powerplay: expose SMU7 asics realtime memory utilization
  drm/amdgpu: add new sysfs interface for memory realtime utilization
  drm/amdgpu: enable separate timeout setting for every ring type V4
  drm/amd/powerplay: fix Vega10 mclk/socclk voltage link setup
  drm/amd/powerplay: valid Vega10 DPMTABLE_OD_UPDATE_VDDC settings V2
  drm/amd/powerplay: avoid repeat AVFS enablement/disablement
  drm/amd/powerplay: update Vega10 power state on OD
  drm/amd/powerplay: force to update all clock tables on OD reset
  drm/amd/powerplay: update Vega10 ACG Avfs Gb parameters
  drm/amd/powerplay: drop unnecessary sw smu check
  drm/amd/powerplay: drop redundant smu call
  drm/amd/powerplay: support ppfeatures sysfs interface on sw smu routine
  drm/amd/powerplay: honor hw limit on fetching metrics data
  drm/amd/powerplay: support uclk activity retrieve on sw smu routine
  drm/amd/powerplay: support sw smu hotspot and memory temperature retrieval
  drm/amd/powerplay: fix sw SMU wrong UVD/VCE powergate setting
  drm/amd/powerplay: enable ppfeaturemask module parameter support on Vega20

Felix Kuehling (3):
  drm/amdgpu: 

Re: [PATCH 3/3] drm/amdkfd: remove duplicated PCIE atomics request

2019-05-29 Thread Kuehling, Felix
On 2019-05-29 2:32 a.m., Xiao, Jack wrote:
> Since amdgpu has always requested PCIE atomics, kfd doesn't need its own
> duplicate PCIE atomics enablement. Referring to the amdgpu request result
> is enough.
>
> Signed-off-by: Jack Xiao 

This patch is Reviewed-by: Felix Kuehling 


> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  7 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
>   drivers/gpu/drm/amd/amdkfd/kfd_device.c| 10 --
>   3 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 98326e3b..ddd6c52 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -620,6 +620,13 @@ bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device 
> *adev, u32 vmid)
>   return false;
>   }
>   
> +bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd)
> +{
> + struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
> +
> + return adev->have_atomics_support;
> +}
> +
>   #ifndef CONFIG_HSA_AMD
>   bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
>   {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index f57f297..8d135c82 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -135,6 +135,7 @@ int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum 
> kgd_engine_type engine,
>   uint32_t vmid, uint64_t gpu_addr,
>   uint32_t *ib_cmd, uint32_t ib_len);
>   void amdgpu_amdkfd_set_compute_idle(struct kgd_dev *kgd, bool idle);
> +bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd);
>   
>   struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void);
>   struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 7b4ea24..76a1599 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -481,17 +481,15 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>* 32 and 64-bit requests are possible and must be
>* supported.
>*/
> - ret = pci_enable_atomic_ops_to_root(pdev,
> - PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> - PCI_EXP_DEVCAP2_ATOMIC_COMP64);
> - if (device_info->needs_pci_atomics && ret < 0) {
> + kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd);
> + if (device_info->needs_pci_atomics &&
> + !kfd->pci_atomic_requested) {
>   dev_info(kfd_device,
>"skipped device %x:%x, PCI rejects atomics\n",
>pdev->vendor, pdev->device);
>   kfree(kfd);
>   return NULL;
> - } else if (!ret)
> - kfd->pci_atomic_requested = true;
> + }
>   
>   kfd->kgd = kgd;
>   kfd->device_info = device_info;

Re: [PATCH][next] drm/amdkfd: fix null pointer dereference on dev

2019-05-29 Thread Kuehling, Felix
On 2019-05-29 11:07 a.m., Colin King wrote:
> From: Colin Ian King 
>
> The pointer dev is set to null yet it is being dereferenced when
> checking dev->dqm->sched_policy.  Fix this by performing the check
> on dev->dqm->sched_policy after dev has been assigned and null
> checked.  Also remove the redundant null assignment to dev.
>
> Addresses-Coverity: ("Explicit null dereference")
> Fixes: 1a058c337676 ("drm/amdkfd: New IOCTL to allocate queue GWS")
> Signed-off-by: Colin Ian King 

Reviewed-by: Felix Kuehling 

Thanks!

   Felix

> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ---
>   1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index aab2aa6c1dee..ea82828fdc76 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1572,10 +1572,9 @@ static int kfd_ioctl_alloc_queue_gws(struct file 
> *filep,
>   {
>   int retval;
>   struct kfd_ioctl_alloc_queue_gws_args *args = data;
> - struct kfd_dev *dev = NULL;
> + struct kfd_dev *dev;
>   
> - if (!hws_gws_support ||
> - dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
> + if (!hws_gws_support)
>   return -EINVAL;
>   
>   dev = kfd_device_by_id(args->gpu_id);
> @@ -1583,6 +1582,8 @@ static int kfd_ioctl_alloc_queue_gws(struct file *filep,
>   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
>   return -EINVAL;
>   }
> + if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
> + return -EINVAL;
>   
>   mutex_lock(&p->mutex);
>   retval = pqm_set_gws(&p->pqm, args->queue_id, args->num_gws ? dev->gws : NULL);

Re: [PATCH v2 4/4] drm/amdkfd: Check against device cgroup

2019-05-29 Thread Tejun Heo
On Wed, May 29, 2019 at 08:45:44PM +0000, Kuehling, Felix wrote:
> Just to clarify, are you saying that we should upstream this change 
> through Alex Deucher's amd-staging-drm-next and Dave Airlie's drm-next 
> trees?

Yeah, sure, whichever tree is the most convenient.

Thanks.

-- 
tejun

TTM allocation failure under memory pressure on suspend

2019-05-29 Thread Lorenz Brun
Hi,

I have a RX 570 which fails to suspend properly under memory pressure and stays 
black after waking up.
It looks like an allocation failure in the TTM VRAM eviction is to blame:

[635471.240411] kworker/u24:26: page allocation failure: order:0, 
mode:0x620402(GFP_NOIO|__GFP_HIGHMEM|__GFP_RETRY_MAYFAIL|__GFP_HARDWALL), 
nodemask=(null),cpuset=/,mems_allowed=0
[635471.240416] CPU: 9 PID: 20884 Comm: kworker/u24:26 Tainted: P   OE  
   5.0.0-13-generic #14-Ubuntu
[635471.240417] Hardware name: MSI MS-7885/X99A SLI PLUS(MS-7885), BIOS 1.80 
03/20/2015
[635471.240421] Workqueue: events_unbound async_run_entry_fn
[635471.240421] Call Trace:
[635471.240426]  dump_stack+0x63/0x8a
[635471.240428]  warn_alloc.cold.119+0x7b/0xfb
[635471.240429]  __alloc_pages_slowpath+0xe63/0xea0
[635471.240432]  ? flush_tlb_all+0x1c/0x20
[635471.240433]  ? change_page_attr_set_clr+0x164/0x1f0
[635471.240434]  __alloc_pages_nodemask+0x2c4/0x2e0
[635471.240437]  alloc_pages_current+0x81/0xe0
[635471.240442]  ttm_alloc_new_pages.isra.16+0x95/0x1e0 [ttm]
[635471.240444]  ttm_page_pool_get_pages+0x16b/0x380 [ttm]
[635471.240446]  ttm_pool_populate+0x1a3/0x4a0 [ttm]
[635471.240448]  ttm_populate_and_map_pages+0x28/0x250 [ttm]
[635471.240450]  ? ttm_dma_tt_alloc_page_directory+0x2d/0x60 [ttm]
[635471.240490]  amdgpu_ttm_tt_populate+0x56/0xe0 [amdgpu]
[635471.240493]  ttm_tt_populate.part.9+0x22/0x60 [ttm]
[635471.240495]  ttm_tt_bind+0x4f/0x60 [ttm]
[635471.240497]  ttm_bo_handle_move_mem+0x26c/0x500 [ttm]
[635471.240499]  ttm_bo_evict+0x142/0x1c0 [ttm]
[635471.240501]  ttm_mem_evict_first+0x19a/0x220 [ttm]
[635471.240504]  ttm_bo_force_list_clean+0xa1/0x170 [ttm]
[635471.240506]  ttm_bo_evict_mm+0x2e/0x30 [ttm]
[635471.240531]  amdgpu_bo_evict_vram+0x1a/0x20 [amdgpu]
[635471.240554]  amdgpu_device_suspend+0x1dd/0x3d0 [amdgpu]
[635471.240578]  amdgpu_pmops_suspend+0x1f/0x30 [amdgpu]
[635471.240579]  pci_pm_suspend+0x76/0x130
[635471.240580]  ? pci_pm_freeze+0xf0/0xf0
[635471.240582]  dpm_run_callback+0x66/0x150
[635471.240582]  __device_suspend+0x110/0x490
[635471.240583]  async_suspend+0x1f/0x90
[635471.240584]  async_run_entry_fn+0x3c/0x150
[635471.240586]  process_one_work+0x20f/0x410
[635471.240587]  worker_thread+0x34/0x400
[635471.240589]  kthread+0x120/0x140
[635471.240589]  ? process_one_work+0x410/0x410
[635471.240591]  ? __kthread_parkme+0x70/0x70
[635471.240592]  ret_from_fork+0x35/0x40
…
[635471.241994] [TTM] Buffer eviction failed
[635471.627554] [TTM] Buffer eviction failed

Subsequently it fails to wake up (all 3 screens black) because of an 
initialization failure:

[635472.216323] amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)
[635472.216354] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block failed -110
[635472.216384] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-110).
[635472.216387] dpm_run_callback(): pci_pm_resume+0x0/0xb0 returns -110
[635472.216390] PM: Device 0000:04:00.0 failed to resume async: error -110

I’m pretty sure the problem is setting GFP_NOIO which makes it impossible for 
the kernel to swap anything out and it subsequently gives up trying to satisfy 
the allocation. I usually run under quite some memory pressure with a lot of 
swap (32GiB RAM + 48GiB Swap, >48GiB memory usage is regular). I have looked at 
the code in question but I’m not sure where this is coming from, it seems like 
neither ttm nor amdgpu set GFP_NOIO. TTM seems to have per-pool allocation 
flags and somehow GFP_NOIO is getting enabled there for the amdgpu pool.
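
A plausible mechanism (an assumption on my side, not verified): the GFP_NOIO
may not come from ttm or amdgpu at all. During suspend the PM core restricts
the global allowed GFP mask, so every allocation made from the suspend path
loses __GFP_IO/__GFP_FS and effectively becomes NOIO - which is exactly what
forbids swapping anything out. Simplified from mm/page_alloc.c:

	void pm_restrict_gfp_mask(void)
	{
		saved_gfp_mask = gfp_allowed_mask;
		gfp_allowed_mask &= ~(__GFP_IO | __GFP_FS);
	}

	/* __alloc_pages_nodemask() applies gfp_mask &= gfp_allowed_mask,
	 * so the mask printed in the failure above already reflects it. */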

Thanks,
Lorenz

Re: [PATCH v2 4/4] drm/amdkfd: Check against device cgroup

2019-05-29 Thread Kuehling, Felix
On 2019-05-28 3:02 p.m., Tejun Heo wrote:
> Hello,
>
> On Fri, May 17, 2019 at 08:12:17PM +, Kuehling, Felix wrote:
>> Patches 1,2,4 will be submitted through amd-staging-drm-next. Patch 3
>> goes through the cgroup tree. Patch 4 depends on patch 3. So submitting
>> patch 4 will need to wait until we rebase amd-staging-drm-next on a new
>> enough kernel release that includes patch 3.
>>
>> Patch 1 and 2 could be submitted now or wait for patch 3 as well so we
>> submit all our cgroup stuff in amdgpu and KFD together. It probably
>> makes most sense to wait since unused code tends to rot.
>>
>> Patches 1,2,4 are already reviewed by me. Feel free to add my Acked-by
>> to patch 3.
> Please feel free to add my acked-by and take patch 3 with the rest of
> the patchset.

Thank you Tejun!

Just to clarify, are you saying that we should upstream this change 
through Alex Deucher's amd-staging-drm-next and Dave Airlie's drm-next 
trees?

I added Dave and Alex to this email to make sure we're all on the same page.

Regards,
   Felix

>
> Thanks.
>

Re: [PATCH] drm/sched: Fix make htmldocs warnings.

2019-05-29 Thread Daniel Vetter
On Wed, May 29, 2019 at 04:43:45PM +0000, Grodzovsky, Andrey wrote:
> I don't, sorry.

Should we fix that? Seems like you do plenty of scheduler stuff, so would
make sense I guess ...
-Daniel

> 
> Andrey
> 
> On 5/29/19 12:42 PM, Alex Deucher wrote:
> > On Wed, May 29, 2019 at 10:29 AM Andrey Grodzovsky
> >  wrote:
> >> Signed-off-by: Andrey Grodzovsky 
> > Reviewed-by: Alex Deucher 
> >
> > I'll push it to drm-misc in a minute unless you have commit rights.
> >
> > Alex
> >
> >> ---
> >>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
> >>   1 file changed, 2 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> >> b/drivers/gpu/drm/scheduler/sched_main.c
> >> index 49e7d07..c1058ee 100644
> >> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >> @@ -353,6 +353,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
> >>* drm_sched_stop - stop the scheduler
> >>*
> >>* @sched: scheduler instance
> >> + * @bad: job which caused the time out
> >>*
> >>* Stop the scheduler and also removes and frees all completed jobs.
> >>* Note: bad job will not be freed as it might be used later and so it's
> >> @@ -422,6 +423,7 @@ EXPORT_SYMBOL(drm_sched_stop);
> >>* drm_sched_job_recovery - recover jobs after a reset
> >>*
> >>* @sched: scheduler instance
> >> + * @full_recovery: proceed with complete sched restart
> >>*
> >>*/
> >>   void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> >> --
> >> 2.7.4
> >>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH] drm/sched: Fix make htmldocs warnings.

2019-05-29 Thread Daniel Vetter
On Wed, May 29, 2019 at 10:29:40AM -0400, Andrey Grodzovsky wrote:
> Signed-off-by: Andrey Grodzovsky 

Thanks for quick fixing!

Acked-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 49e7d07..c1058ee 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -353,6 +353,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
>   * drm_sched_stop - stop the scheduler
>   *
>   * @sched: scheduler instance
> + * @bad: job which caused the time out
>   *
>   * Stop the scheduler and also removes and frees all completed jobs.
>   * Note: bad job will not be freed as it might be used later and so it's
> @@ -422,6 +423,7 @@ EXPORT_SYMBOL(drm_sched_stop);
>   * drm_sched_job_recovery - recover jobs after a reset
>   *
>   * @sched: scheduler instance
> + * @full_recovery: proceed with complete sched restart
>   *
>   */
>  void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-29 Thread Khalid Aziz
On 5/29/19 8:20 AM, Catalin Marinas wrote:
> Hi Khalid,
> 
> On Tue, May 28, 2019 at 05:33:04PM -0600, Khalid Aziz wrote:
>> On Tue, 2019-05-28 at 16:40 +0100, Catalin Marinas wrote:
>>> I think another aspect is how we define the ABI. Is allowing tags to
>>> mlock() for example something specific to arm64 or would sparc ADI
>>> need the same? In the absence of other architectures defining such
>>> ABI, my preference would be to keep the wrappers in the arch code.
>>>
>>> Assuming sparc won't implement untagged_addr(), we can place the
>>> macros back in the generic code but, as per the review here, we need
>>> to be more restrictive on where we allow tagged addresses. For
>>> example, if mmap() gets a tagged address with MAP_FIXED, is it
>>> expected to return the tag?
>>
>> I would recommend against any ABI differences between ARM64 MTE/TBI and
>> sparc ADI unless it simply can not be helped. My understanding of
>> MTE/TBI is limited, so I will explain how sparc ADI works. On sparc, a
>> tagged address has no meaning until following steps happen:
> 
> Before we go into the MTE/ADI similarities or differences, just to
> clarify that TBI is something that we supported from the start of the
> arm64 kernel port. TBI (top byte ignore) allows a user pointer to have
> non-zero top byte and dereference it without causing a fault (the
> hardware masks it out). The user/kernel ABI does not allow such tagged
> pointers into the kernel, nor would the kernel return any such tagged
> addresses.
> 
> With MTE (memory tagging extensions), the top-byte meaning is changed
> from no longer being ignored to actually being checked against a tag in
> the physical RAM (we call it allocation tag).
> 
>> 1. set the user mode PSTATE.mcde bit. This acts as the master switch to
>> enable ADI for a process.
>>
>> 2. set TTE.mcd bit on TLB entries that match the address range ADI is
>> being enabled on.
> 
> Something close enough for MTE, with the difference that enabling it is
> not a PSTATE bit but rather a system control bit (SCTLR_EL1 register),
> so only the kernel can turn it on/off for the user.
> 
>> 3. Store version tag for the range of addresses userspace wants ADI
>> enabled on using "stxa" instruction. These tags are stored in physical
>> memory by the memory controller.
> 
> Do you have an "ldxa" instruction to load the tags from physical memory?

Yes, "ldxa" can be used to read current tag for any memory location.
Kernel uses it to read the tags for a physical page being swapped out
and restores those tags when the page is swapped back in.
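
(Illustrative only, by analogy with the stxa loop quoted further down; this
is my own sketch, not the actual sparc kernel code:

	static inline unsigned long adi_read_tag(unsigned long addr)
	{
		unsigned long tag;

		asm volatile("ldxa [%1] 0x90, %0"	/* same ASI as the stxa loop */
			     : "=r" (tag)
			     : "r" (addr));
		return tag;
	}
)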

> 
>> Steps 1 and 2 are accomplished by userspace by calling mprotect() with
>> PROT_ADI. Tags are set by storing tags in a loop, for example:
>>
>> version = 10;
>> tmp_addr = shmaddr;
>> end = shmaddr + BUFFER_SIZE;
>> while (tmp_addr < end) {
>> asm volatile(
>> "stxa %1, [%0]0x90\n\t"
>> :
>> : "r" (tmp_addr), "r" (version));
>> tmp_addr += adi_blksz;
>> }
> 
> On arm64, a sequence similar to the above would live in the libc. So a
> malloc() call will tag the memory and return the tagged address to the
> user. Pre-coloring could easily be done by the user.
> 
> We were not planning for a PROT_ADI/MTE but rather have MTE enabled for
> all user memory ranges. We may revisit this before we upstream the MTE
> support (probably some marginal benefit for the hardware not fetching
> the tags from memory if we don't need to, e.g. code sections).
> 
> Given that we already have the TBI feature and with MTE enabled the top
> byte is no longer ignored, we are planning for an explicit opt-in by the
> user via prctl() to enable MTE.

OK. I had initially proposed enabling ADI for a process using prctl().
Feedback I got was prctl was not a desirable interface and I ended up
making mprotect() with PROT_ADI enable ADI on the process instead. Just
something to keep in mind.

> 
>> With these semantics, giving mmap() or shamat() a tagged address is
>> meaningless since no tags have been stored at the addresses mmap() will
>> allocate and one can not store tags before memory range has been
>> allocated. If we choose to allow tagged addresses to come into mmap()
>> and shmat(), sparc code can strip the tags unconditionally and that may
>> help simplify ABI and/or code.
> 
> We could say that with TBI (pre-MTE support), the top byte is actually
> ignored on mmap(). Now, if you pass a MAP_FIXED with a tagged address,
> should the user expect the same tagged address back or stripping the tag
> is acceptable? If we want to keep the current mmap() semantics, I'd say
> the same tag is returned. However, with MTE this also implies that the
> memory was coloured.
> 

Is assigning a tag a privileged operation on ARM64? I am thinking not
since you mentioned libc could do it in a loop for malloc'd memory.
mmap() can return the same tagged address but I am uneasy about kernel

[PATCH] drm/amdgpu: fix a race in GPU reset with IB test (v2)

2019-05-29 Thread Alex Deucher
Split late_init into two functions, one (do_late_init) which
just does the hw init, and late_init which calls do_late_init
and schedules the IB test work.  Call do_late_init in
the GPU reset code to run the init code, but not schedule
the IB test code.  The IB test code is called directly
in the gpu reset code so no need to run the IB tests
in a separate work thread.  If we do, we end up racing.

v2: Rework late_init.  Pull out the mgpu fan boost and xgmi
pstate code into late_init so they get called in all cases.
rename the late_init worker thread to delayed work since it's
just the IB tests now which can happen later.  Schedule the
work at init and resume time.  It's not needed at reset time
because the IB tests are called directly.

Cc: Xinhui Pan 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 116 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|   2 +-
 3 files changed, 61 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d355e9a09ad1..19a00282e34c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -925,7 +925,7 @@ struct amdgpu_device {
const struct amdgpu_df_funcs*df_funcs;
 
/* delayed work_func for deferring clockgating during resume */
-   struct delayed_work late_init_work;
+   struct delayed_work delayed_init_work;
 
struct amdgpu_virt  virt;
/* firmware VRAM reservation */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a8c2201cd04..d00fd5dd307a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1869,6 +1869,43 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power
return 0;
 }
 
+static int amdgpu_device_enable_mgpu_fan_boost(void)
+{
+   struct amdgpu_gpu_instance *gpu_ins;
+   struct amdgpu_device *adev;
+   int i, ret = 0;
+
+   mutex_lock(&mgpu_info.mutex);
+
+   /*
+* MGPU fan boost feature should be enabled
+* only when there are two or more dGPUs in
+* the system
+*/
+   if (mgpu_info.num_dgpu < 2)
+   goto out;
+
+   for (i = 0; i < mgpu_info.num_dgpu; i++) {
+   gpu_ins = &(mgpu_info.gpu_ins[i]);
+   adev = gpu_ins->adev;
+   if (!(adev->flags & AMD_IS_APU) &&
+   !gpu_ins->mgpu_fan_enabled &&
+   adev->powerplay.pp_funcs &&
+   adev->powerplay.pp_funcs->enable_mgpu_fan_boost) {
+   ret = amdgpu_dpm_enable_mgpu_fan_boost(adev);
+   if (ret)
+   break;
+
+   gpu_ins->mgpu_fan_enabled = 1;
+   }
+   }
+
+out:
+   mutex_unlock(&mgpu_info.mutex);
+
+   return ret;
+}
+
 /**
  * amdgpu_device_ip_late_init - run late init for hardware IPs
  *
@@ -1902,11 +1939,15 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
 
-   queue_delayed_work(system_wq, &adev->late_init_work,
-  msecs_to_jiffies(AMDGPU_RESUME_MS));
-
amdgpu_device_fill_reset_magic(adev);
 
+   r = amdgpu_device_enable_mgpu_fan_boost();
+   if (r)
+   DRM_ERROR("enable mgpu fan boost failed (%d).\n", r);
+
+   /* set to low pstate by default */
+   amdgpu_xgmi_set_pstate(adev, 0);
+
return 0;
 }
 
@@ -2005,65 +2046,20 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
return 0;
 }
 
-static int amdgpu_device_enable_mgpu_fan_boost(void)
-{
-   struct amdgpu_gpu_instance *gpu_ins;
-   struct amdgpu_device *adev;
-   int i, ret = 0;
-
-   mutex_lock(&mgpu_info.mutex);
-
-   /*
-* MGPU fan boost feature should be enabled
-* only when there are two or more dGPUs in
-* the system
-*/
-   if (mgpu_info.num_dgpu < 2)
-   goto out;
-
-   for (i = 0; i < mgpu_info.num_dgpu; i++) {
-   gpu_ins = &(mgpu_info.gpu_ins[i]);
-   adev = gpu_ins->adev;
-   if (!(adev->flags & AMD_IS_APU) &&
-   !gpu_ins->mgpu_fan_enabled &&
-   adev->powerplay.pp_funcs &&
-   adev->powerplay.pp_funcs->enable_mgpu_fan_boost) {
-   ret = amdgpu_dpm_enable_mgpu_fan_boost(adev);
-   if (ret)
-   break;
-
-   gpu_ins->mgpu_fan_enabled = 1;
-   }
-   }
-
-out:
-   mutex_unlock(&mgpu_info.mutex);
-
-   return ret;
-}
-
 /**
- * amdgpu_device_ip_late_init_func_handler - work handler for ib test
+ * 

[pull] amdgpu drm-fixes-5.2

2019-05-29 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 5.2:
- Respin the Raven DMCU patch with the ifdef fixed
- Fix for a clean display when loading the driver on Raven

The following changes since commit c074989171801171af6c5f53dd16b27f36b31deb:

  Revert "drm/amd/display: Don't load DMCU for Raven 1" (2019-05-24 19:56:50 
+1000)

are available in the Git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-fixes-5.2

for you to fetch changes up to 02122753f1d0ac39d8c89f20f541a519a3002e92:

  drm/amdgpu: reserve stollen vram for raven series (2019-05-29 09:52:10 -0500)


Flora Cui (1):
  drm/amdgpu: reserve stollen vram for raven series

Harry Wentland (1):
  drm/amd/display: Don't load DMCU for Raven 1 (v2)

 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  3 +--
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 ++--
 2 files changed, 11 insertions(+), 4 deletions(-)

Re: [PATCH] drm/sched: Fix make htmldocs warnings.

2019-05-29 Thread Grodzovsky, Andrey
I don't, sorry.

Andrey

On 5/29/19 12:42 PM, Alex Deucher wrote:
> On Wed, May 29, 2019 at 10:29 AM Andrey Grodzovsky
>  wrote:
>> Signed-off-by: Andrey Grodzovsky 
> Reviewed-by: Alex Deucher 
>
> I'll push it to drm-misc in a minute unless you have commit rights.
>
> Alex
>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 49e7d07..c1058ee 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -353,6 +353,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
>>* drm_sched_stop - stop the scheduler
>>*
>>* @sched: scheduler instance
>> + * @bad: job which caused the time out
>>*
>>* Stop the scheduler and also removes and frees all completed jobs.
>>* Note: bad job will not be freed as it might be used later and so it's
>> @@ -422,6 +423,7 @@ EXPORT_SYMBOL(drm_sched_stop);
>>* drm_sched_job_recovery - recover jobs after a reset
>>*
>>* @sched: scheduler instance
>> + * @full_recovery: proceed with complete sched restart
>>*
>>*/
>>   void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>> --
>> 2.7.4
>>

Re: [PATCH] drm/sched: Fix make htmldocs warnings.

2019-05-29 Thread Alex Deucher
On Wed, May 29, 2019 at 10:29 AM Andrey Grodzovsky
 wrote:
>
> Signed-off-by: Andrey Grodzovsky 

Reviewed-by: Alex Deucher 

I'll push it to drm-misc in a minute unless you have commit rights.

Alex

> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 49e7d07..c1058ee 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -353,6 +353,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
>   * drm_sched_stop - stop the scheduler
>   *
>   * @sched: scheduler instance
> + * @bad: job which caused the time out
>   *
>   * Stop the scheduler and also removes and frees all completed jobs.
>   * Note: bad job will not be freed as it might be used later and so it's
> @@ -422,6 +423,7 @@ EXPORT_SYMBOL(drm_sched_stop);
>   * drm_sched_job_recovery - recover jobs after a reset
>   *
>   * @sched: scheduler instance
> + * @full_recovery: proceed with complete sched restart
>   *
>   */
>  void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> --
> 2.7.4
>

Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-29 Thread Emil Velikov
On 2019/05/29, Koenig, Christian wrote:
> Am 29.05.19 um 15:03 schrieb Emil Velikov:
> > On 2019/05/29, Dave Airlie wrote:
> >> On Wed, 29 May 2019 at 02:47, Emil Velikov  
> >> wrote:
> >>> On 2019/05/28, Koenig, Christian wrote:
>  Am 28.05.19 um 18:10 schrieb Emil Velikov:
> > On 2019/05/28, Daniel Vetter wrote:
> >> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
> >>  wrote:
> >>> Am 28.05.19 um 09:38 schrieb Daniel Vetter:
>  [SNIP]
> > Might be a good idea looking into reverting it partially, so that at
> > least command submission and buffer allocation is still blocked.
>  I thought the issue is a lot more than vainfo, it's pretty much every
>  hacked up compositor under the sun getting this wrong one way or
>  another. Thinking about this some more, I also have no idea how you'd
>  want to deprecate rendering on primary nodes in general. Apparently
>  that breaks -modesetting already, and probably lots more compositors.
>  And it looks like we're finally achieve the goal kms set out to 10
>  years ago, and new compositors are sprouting up all the time. I guess
>  we could just break them all (on new hardware) and tell them to all
>  suck it up. But I don't think that's a great option. And just
>  deprecating this on amdgpu is going to be even harder, since then
>  everywhere else it'll keep working, and it's just amdgpu.ko that 
>  looks
>  broken.
> 
>  Aside: I'm not supporting Emil's idea here because it fixes any 
>  issues
>  Intel has - Intel doesn't care. I support it because reality sucks,
>  people get this render vs. primary vs. multi-gpu prime wrong all the
>  time (that's also why we have hardcoded display+gpu pairs in mesa for
>  the various soc combinations out there), and this looks like a
>  pragmatic solution. It'd be nice if every compositor and everything
>  else would perfectly support multi gpu and only use render nodes for
>  rendering, and only primary nodes for display. But reality is that
>  people hack on stuff until gears on screen and then move on to more
>  interesting things (to them). So I don't think we'll ever win this 
>  :-/
> >>> Yeah, but this is a classic case of working around user space issues 
> >>> by
> >>> making kernel changes instead of fixing user space.
> >>>
> >>> Having privileged (output control) and unprivileged (rendering 
> >>> control)
> >>> functionality behind the same node is a mistake we have made a long 
> >>> time
> >>> ago and render nodes finally seemed to be a way to fix that.
> >>>
> >>> I mean why are compositors using the primary node in the first place?
> >>> Because they want to have access to privileged resources I think and 
> >>> in
> >>> this case it is perfectly ok to do so.
> >>>
> >>> Now extending unprivileged access to the primary node actually sounds
> >>> like a step into the wrong direction to me.
> >>>
> >>> I rather think that we should go down the route of completely dropping
> >>> command submission and buffer allocation through the primary node for
> >>> non master clients. And then as next step at some point drop support 
> >>> for
> >>> authentication/flink.
> >>>
> >>> I mean we have done this with UMS as well and I don't see much other 
> >>> way
> >>> to move forward and get rid of those ancient interface in the long 
> >>> term.
> >> Well kms had some really good benefits that drove quick adoption, like
> >> "suspend/resume actually has a chance of working" or "comes with
> >> buffer management so you can run multiple gears".
> >>
> >> The render node thing is a lot more niche use case (prime, better priv
> >> separation), plus "it's cleaner design". And the "cleaner design" part
> >> is something that empirically doesn't seem to matter :-/ Just two
> >> examples:
> >> - KHR_display/leases just iterated display resources on the fd needed
> >> for rendering (and iirc there was even a patch to expose that for
> >> render nodes too so it works with DRI3), because implementing
> >> protocols is too hard. Barely managed to stop that one before it
> >> happened.
> >> - Various video players use the vblank ioctl on directly to schedule
> >> frames, without telling the compositor. I discovered that when I
> >> wanted to limite the vblank ioctl to master clients only. Again,
> >> apparently too hard to use the existing extensions, or fix the bugs in
> >> there, or whatever. One userspace got fixed last year, but it'll
> >> probably get copypasted around forever :-/
> >>
> >> So I don't think we'll ever manage to roll a clean split out, and best
> >> we can do is give in 

[PATCH][next] drm/amdkfd: fix null pointer dereference on dev

2019-05-29 Thread Colin King
From: Colin Ian King 

The pointer dev is set to null yet it is being dereferenced when
checking dev->dqm->sched_policy.  Fix this by performing the check
on dev->dqm->sched_policy after dev has been assigned and null
checked.  Also remove the redundant null assignment to dev.

Addresses-Coverity: ("Explicit null dereference")
Fixes: 1a058c337676 ("drm/amdkfd: New IOCTL to allocate queue GWS")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index aab2aa6c1dee..ea82828fdc76 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1572,10 +1572,9 @@ static int kfd_ioctl_alloc_queue_gws(struct file *filep,
 {
int retval;
struct kfd_ioctl_alloc_queue_gws_args *args = data;
-   struct kfd_dev *dev = NULL;
+   struct kfd_dev *dev;
 
-   if (!hws_gws_support ||
-   dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
+   if (!hws_gws_support)
return -EINVAL;
 
dev = kfd_device_by_id(args->gpu_id);
@@ -1583,6 +1582,8 @@ static int kfd_ioctl_alloc_queue_gws(struct file *filep,
pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
return -EINVAL;
}
+   if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
+   return -EINVAL;
 
	mutex_lock(&p->mutex);
	retval = pqm_set_gws(&p->pqm, args->queue_id, args->num_gws ? dev->gws : NULL);
-- 
2.20.1


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-29 Thread Dave Martin
On Wed, May 29, 2019 at 02:23:42PM +0100, Catalin Marinas wrote:
> On Wed, May 29, 2019 at 01:42:25PM +0100, Dave P Martin wrote:
> > On Tue, May 28, 2019 at 05:34:00PM +0100, Catalin Marinas wrote:
> > > On Tue, May 28, 2019 at 04:56:45PM +0100, Dave P Martin wrote:
> > > > On Tue, May 28, 2019 at 04:40:58PM +0100, Catalin Marinas wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > My thoughts on allowing tags (quick look):
> > > > >
> > > > > brk - no
> > > > 
> > > > [...]
> > > > 
> > > > > mlock, mlock2, munlock - yes
> > > > > mmap - no (we may change this with MTE but not for TBI)
> > > > 
> > > > [...]
> > > > 
> > > > > mprotect - yes
> > > > 
> > > > I haven't following this discussion closely... what's the rationale for
> > > > the inconsistencies here (feel free to refer me back to the discussion
> > > > if it's elsewhere).
> > > 
> > > _My_ rationale (feel free to disagree) is that mmap() by default would
> > > not return a tagged address (ignoring MTE for now). If it gets passed a
> > > tagged address or a "tagged NULL" (for lack of a better name) we don't
> > > have clear semantics of whether the returned address should be tagged in
> > > this ABI relaxation. I'd rather reserve this specific behaviour if we
> > > overload the non-zero tag meaning of mmap() for MTE. Similar reasoning
> > > for mremap(), at least on the new_address argument (not entirely sure
> > > about old_address).
> > > 
> > > munmap() should probably follow the mmap() rules.
> > > 
> > > As for brk(), I don't see why the user would need to pass a tagged
> > > address, we can't associate any meaning to this tag.
> > > 
> > > For the rest, since it's likely such addresses would have been tagged by
> > > malloc() in user space, we should allow tagged pointers.
> > 
> > Those arguments seem reasonable.  We should try to capture this
> > somewhere when documenting the ABI.
> > 
> > To be clear, I'm not sure that we should guarantee anywhere that a
> > tagged pointer is rejected: rather the behaviour should probably be
> > left unspecified.  Then we can tidy it up incrementally.
> > 
> > (The behaviour is unspecified today, in any case.)
> 
> What is specified (or rather de-facto ABI) today is that passing a user
> address above TASK_SIZE (e.g. non-zero top byte) would fail in most
> cases. If we relax this with the TBI we may end up with some de-facto

I may be being too picky, but "would fail in most cases" sounds like
"unspecified" ?

> ABI before we actually get MTE hardware. Tightening it afterwards may be
> slightly more problematic, although MTE needs to be an explicit opt-in.
> 
> IOW, I wouldn't want to unnecessarily relax the ABI if we don't need to.

So long we don't block foreseeable future developments unnecessarily
either -- I agree there's a balance to be struck.

I guess this can be reviewed when we have nailed down the details a bit
further.

Cheers
---Dave


[PATCH] drm/amdgpu: add pmu counters

2019-05-29 Thread Kim, Jonathan
Add PMU counters to monitor amdgpu device performance. Each registered PMU
is recorded per PMU type per ASIC type.
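
(Illustration based on the format attributes defined in the patch below -
df_event in config bits 0-7, df_instance in 8-15, df_unitmask in 16-23; the
helper name and masking are mine, not part of the patch:

	static u64 amdgpu_df_pmu_config(u64 event, u64 instance, u64 unitmask)
	{
		return (event & 0xffULL) |		/* config:0-7   */
		       ((instance & 0xffULL) << 8) |	/* config:8-15  */
		       ((unitmask & 0xffULL) << 16);	/* config:16-23 */
	}
)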

Change-Id: I8449f4ea824c411ee24a5b783ac066189b9de08e
Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdgpu/Makefile|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c| 394 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.h|  37 ++
 4 files changed, 437 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 11a651ff7f0d..90d4c5d299dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -54,7 +54,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
amdgpu_gmc.o amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o amdgpu_vm_cpu.o \
-   amdgpu_vm_sdma.o
+   amdgpu_vm_sdma.o amdgpu_pmu.o
 
 # add asic specific block
 amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 582f5635fcb2..51f479b357a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -61,6 +61,7 @@
 
 #include "amdgpu_xgmi.h"
 #include "amdgpu_ras.h"
+#include "amdgpu_pmu.h"
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -2748,6 +2749,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
goto failed;
}
 
+   r = amdgpu_pmu_init(adev);
+   if (r)
+   dev_err(adev->dev, "amdgpu_pmu_init failed\n");
+
/* must succeed. */
amdgpu_ras_resume(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
new file mode 100644
index ..39cff772dd9e
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
@@ -0,0 +1,394 @@
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Author: Jonathan Kim 
+ *
+ */
+
+#define pr_fmt(fmt)"perf/amdgpu_pmu: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include "amdgpu.h"
+#include "amdgpu_pmu.h"
+#include "df_v3_6.h"
+
+#define PMU_NAME_SIZE 32
+
+/* record to keep track of pmu entry per pmu type per device */
+struct amdgpu_pmu_entry {
+   struct amdgpu_device *adev;
+   struct pmu pmu;
+   unsigned int pmu_perf_type;
+};
+
+PMU_FORMAT_ATTR(df_event,  "config:0-7");
+PMU_FORMAT_ATTR(df_instance,   "config:8-15");
+PMU_FORMAT_ATTR(df_unitmask,   "config:16-23");
+
+/* df format attributes  */
+static struct attribute *amdgpu_df_format_attrs[] = {
+   &format_attr_df_event.attr,
+   &format_attr_df_instance.attr,
+   &format_attr_df_unitmask.attr,
+   NULL
+};
+
+/* df format attribute group */
+static struct attribute_group amdgpu_df_format_attr_group = {
+   .name = "format",
+   .attrs = amdgpu_df_format_attrs,
+};
+
+/* df event attribute group */
+static struct attribute_group amdgpu_df_events_attr_group = {
+   .name = "events",
+};
+
+struct AMDGPU_PMU_EVENT_DESC {
+   struct kobj_attribute attr;
+   const char *event;
+};
+
+static ssize_t _pmu_event_show(struct kobject *kobj,
+  struct kobj_attribute *attr, char *buf)
+{
+   struct AMDGPU_PMU_EVENT_DESC *event =
+   container_of(attr, struct AMDGPU_PMU_EVENT_DESC, attr);
+   return sprintf(buf, "%s\n", event->event);
+};
+
+#define AMDGPU_PMU_EVENT_DESC(_name, _event)   \
+{  

Re: [PATCH v15 01/17] uaccess: add untagged_addr definition for other arches

2019-05-29 Thread Khalid Aziz
On Mon, 2019-05-06 at 18:30 +0200, Andrey Konovalov wrote:
> To allow arm64 syscalls to accept tagged pointers from userspace, we
> must
> untag them when they are passed to the kernel. Since untagging is
> done in
> generic parts of the kernel, the untagged_addr macro needs to be
> defined
> for all architectures.
> 
> Define it as a noop for architectures other than arm64.
> 
> Acked-by: Catalin Marinas 
> Signed-off-by: Andrey Konovalov 
> ---
>  include/linux/mm.h | 4 
>  1 file changed, 4 insertions(+)

As discussed in the other thread Chris started, there is a generic need
to untag addresses in kernel and this patch gets us ready for that.

Reviewed-by: Khalid Aziz 

> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 6b10c21630f5..44041df804a6 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -99,6 +99,10 @@ extern int mmap_rnd_compat_bits __read_mostly;
>  #include 
>  #include 
>  
> +#ifndef untagged_addr
> +#define untagged_addr(addr) (addr)
> +#endif
> +
>  #ifndef __pa_symbol
>  #define __pa_symbol(x)  __pa(RELOC_HIDE((unsigned long)(x), 0))
>  #endif
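
For reference, the arm64-specific definition that pairs with this generic
fallback looks roughly like the following (a sketch, assuming the tag sits
in the top byte and bit 55 distinguishes user from kernel addresses):

	#define untagged_addr(addr)	\
		((__typeof__(addr))sign_extend64((u64)(addr), 55))

Sign-extending from bit 55 clears the top byte of a user pointer while
leaving kernel addresses intact.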



RE: [PATCH] drm/amdkfd: Fix a potential circular lock

2019-05-29 Thread Zeng, Oak


Regards,
Oak

-Original Message-
From: Kuehling, Felix  
Sent: Tuesday, May 28, 2019 5:08 PM
To: Zeng, Oak ; amd-gfx@lists.freedesktop.org
Cc: Zhao, Yong ; Liu, Alex ; Freehill, 
Chris 
Subject: Re: [PATCH] drm/amdkfd: Fix a potential circular lock

On 2019-05-28 2:28 p.m., Zeng, Oak wrote:
> The idea to break the circular lock dependency is, unlock dqm 
> temporarily before calling init_mqd in call stack #1 (see below)
>
> [  513.604034] ==
> [  513.604205] WARNING: possible circular locking dependency detected
> [  513.604375] 4.18.0-kfd-root #2 Tainted: GW
> [  513.604530] --
> [  513.604699] kswapd0/611 is trying to acquire lock:
> [  513.604840] d254022e (&dqm->lock_hidden){+.+.}, at:
> evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
> [  513.605150] but task is already holding lock:
> [  513.605307] 961547fc (&anon_vma->rwsem){}, at:
> page_lock_anon_vma_read+0xe4/0x250
> [  513.605540]
> which lock already depends on the new lock.
>
> [  513.605747]
> the existing dependency chain (in reverse order) is:
> [  513.605944]
> -> #4 (&anon_vma->rwsem){}:
> [  513.606106]__vma_adjust+0x147/0x7f0
> [  513.606231]__split_vma+0x179/0x190
> [  513.606353]mprotect_fixup+0x217/0x260
> [  513.606553]do_mprotect_pkey+0x211/0x380
> [  513.606752]__x64_sys_mprotect+0x1b/0x20
> [  513.606954]do_syscall_64+0x50/0x1a0
> [  513.607149]entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  513.607380]
> -> #3 (&mapping->i_mmap_rwsem){}:
> [  513.607678]rmap_walk_file+0x1f0/0x280
> [  513.607887]page_referenced+0xdd/0x180
> [  513.608081]shrink_page_list+0x853/0xcb0
> [  513.608279]shrink_inactive_list+0x33b/0x700
> [  513.608483]shrink_node_memcg+0x37a/0x7f0
> [  513.608682]shrink_node+0xd8/0x490
> [  513.608869]balance_pgdat+0x18b/0x3b0
> [  513.609062]kswapd+0x203/0x5c0
> [  513.609241]kthread+0x100/0x140
> [  513.609420]ret_from_fork+0x24/0x30
> [  513.609607]
> -> #2 (fs_reclaim){+.+.}:
> [  513.609883]kmem_cache_alloc_trace+0x34/0x2e0
> [  513.610093]reservation_object_reserve_shared+0x139/0x300
> [  513.610326]ttm_bo_init_reserved+0x291/0x480 [ttm]
> [  513.610567]amdgpu_bo_do_create+0x1d2/0x650 [amdgpu]
> [  513.610811]amdgpu_bo_create+0x40/0x1f0 [amdgpu]
> [  513.611041]amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu]
> [  513.611290]amdgpu_bo_create_kernel+0x12/0x70 [amdgpu]
> [  513.611584]amdgpu_ttm_init+0x2cb/0x560 [amdgpu]
> [  513.611823]gmc_v9_0_sw_init+0x400/0x750 [amdgpu]
> [  513.612491]amdgpu_device_init+0x14eb/0x1990 [amdgpu]
> [  513.612730]amdgpu_driver_load_kms+0x78/0x290 [amdgpu]
> [  513.612958]drm_dev_register+0x111/0x1a0
> [  513.613171]amdgpu_pci_probe+0x11c/0x1e0 [amdgpu]
> [  513.613389]local_pci_probe+0x3f/0x90
> [  513.613581]pci_device_probe+0x102/0x1c0
> [  513.613779]driver_probe_device+0x2a7/0x480
> [  513.613984]__driver_attach+0x10a/0x110
> [  513.614179]bus_for_each_dev+0x67/0xc0
> [  513.614372]bus_add_driver+0x1eb/0x260
> [  513.614565]driver_register+0x5b/0xe0
> [  513.614756]do_one_initcall+0xac/0x357
> [  513.614952]do_init_module+0x5b/0x213
> [  513.615145]load_module+0x2542/0x2d30
> [  513.615337]__do_sys_finit_module+0xd2/0x100
> [  513.615541]do_syscall_64+0x50/0x1a0
> [  513.615731]entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  513.615963]
> -> #1 (reservation_ww_class_mutex){+.+.}:
> [  513.616293]amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu]
> [  513.616554]init_mqd+0x223/0x260 [amdgpu]
> [  513.616779]create_queue_nocpsch+0x4d9/0x600 [amdgpu]
> [  513.617031]pqm_create_queue+0x37c/0x520 [amdgpu]
> [  513.617270]kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu]
> [  513.617522]kfd_ioctl+0x202/0x350 [amdgpu]
> [  513.617724]do_vfs_ioctl+0x9f/0x6c0
> [  513.617914]ksys_ioctl+0x66/0x70
> [  513.618095]__x64_sys_ioctl+0x16/0x20
> [  513.618286]do_syscall_64+0x50/0x1a0
> [  513.618476]entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  513.618695]
> -> #0 (&dqm->lock_hidden){+.+.}:
> [  513.618984]__mutex_lock+0x98/0x970
> [  513.619197]evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
> [  513.619459]kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
> [  513.619710]kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
> [  513.620103]amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
> [  513.620363]amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
> [  513.620614]
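
The resulting fix, as a minimal sketch (assuming the existing dqm_lock()/
dqm_unlock() helpers; argument names follow the usual amdkfd convention
and are illustrative, not a hunk from the patch):

	/* drop the DQM lock around the MQD allocation so that
	 * reservation_ww_class is no longer taken inside dqm->lock_hidden */
	dqm_unlock(dqm);
	retval = mqd_mgr->init_mqd(mqd_mgr, &q->mqd, &q->mqd_mem_obj,
				   &q->gart_mqd_addr, &q->properties);
	dqm_lock(dqm);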

Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-29 Thread Catalin Marinas
Hi Khalid,

On Tue, May 28, 2019 at 05:33:04PM -0600, Khalid Aziz wrote:
> On Tue, 2019-05-28 at 16:40 +0100, Catalin Marinas wrote:
> > I think another aspect is how we define the ABI. Is allowing tags to
> > mlock() for example something specific to arm64 or would sparc ADI
> > need the same? In the absence of other architectures defining such
> > ABI, my preference would be to keep the wrappers in the arch code.
> > 
> > Assuming sparc won't implement untagged_addr(), we can place the
> > macros back in the generic code but, as per the review here, we need
> > to be more restrictive on where we allow tagged addresses. For
> > example, if mmap() gets a tagged address with MAP_FIXED, is it
> > expected to return the tag?
> 
> I would recommend against any ABI differences between ARM64 MTE/TBI and
> sparc ADI unless it simply can not be helped. My understanding of
> MTE/TBI is limited, so I will explain how sparc ADI works. On sparc, a
> tagged address has no meaning until following steps happen:

Before we go into the MTE/ADI similarities or differences, just to
clarify that TBI is something that we supported from the start of the
arm64 kernel port. TBI (top byte ignore) allows a user pointer to have
non-zero top byte and dereference it without causing a fault (the
hardware masks it out). The user/kernel ABI does not allow such tagged
pointers into the kernel, nor would the kernel return any such tagged
addresses.
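
As a purely illustrative userspace example:

	int x = 42;
	int *p = (int *)((unsigned long)&x | (0x5bUL << 56));
	/* with TBI, *p still reads 42 -- the hardware masks the top byte */

Dereferencing the tagged pointer works, but passing it into a syscall is
exactly the case this series is about.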

With MTE (memory tagging extensions), the top-byte meaning is changed
from no longer being ignored to actually being checked against a tag in
the physical RAM (we call it allocation tag).

> 1. set the user mode PSTATE.mcde bit. This acts as the master switch to
> enable ADI for a process.
> 
> 2. set TTE.mcd bit on TLB entries that match the address range ADI is
> being enabled on.

Something close enough for MTE, with the difference that enabling it is
not a PSTATE bit but rather a system control bit (SCTLR_EL1 register),
so only the kernel can turn it on/off for the user.

> 3. Store version tag for the range of addresses userspace wants ADI
> enabled on using "stxa" instruction. These tags are stored in physical
> memory by the memory controller.

Do you have an "ldxa" instruction to load the tags from physical memory?

> Steps 1 and 2 are accomplished by userspace by calling mprotect() with
> PROT_ADI. Tags are set by storing tags in a loop, for example:
> 
> version = 10;
> tmp_addr = shmaddr;
> end = shmaddr + BUFFER_SIZE;
> while (tmp_addr < end) {
> asm volatile(
> "stxa %1, [%0]0x90\n\t"
> :
> : "r" (tmp_addr), "r" (version));
> tmp_addr += adi_blksz;
> }

On arm64, a sequence similar to the above would live in the libc. So a
malloc() call will tag the memory and return the tagged address to the
user.
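
For illustration only (the MTE support is not upstream yet), the arm64
analogue of the tag-setting loop operates on 16-byte granules with the
tag carried in the pointer itself:

	/* insert a random allocation tag into the pointer, then
	 * store that tag to the granule it points at */
	asm volatile("irg %0, %0\n\t"
		     "stg %0, [%0]\n\t"
		     : "+r" (ptr) : : "memory");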

We were not planning for a PROT_ADI/MTE but rather have MTE enabled for
all user memory ranges. We may revisit this before we upstream the MTE
support (probably some marginal benefit for the hardware not fetching
the tags from memory if we don't need to, e.g. code sections).

Given that we already have the TBI feature and with MTE enabled the top
byte is no longer ignored, we are planning for an explicit opt-in by the
user via prctl() to enable MTE.
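
Roughly (the exact interface is still under discussion, so the names
below are hypothetical):

	prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0);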

> With these semantics, giving mmap() or shmat() a tagged address is
> meaningless since no tags have been stored at the addresses mmap() will
> allocate and one can not store tags before memory range has been
> allocated. If we choose to allow tagged addresses to come into mmap()
> and shmat(), sparc code can strip the tags unconditionally and that may
> help simplify ABI and/or code.
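
If we strip unconditionally, the sparc side could be a one-liner
(hypothetical sketch, assuming ADI keeps its 4-bit version tag in VA
bits 63:60):

	#define untagged_addr(addr)	((addr) & ~(0xfUL << 60))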

We could say that with TBI (pre-MTE support), the top byte is actually
ignored on mmap(). Now, if you pass a MAP_FIXED with a tagged address,
should the user expect the same tagged address back or stripping the tag
is acceptable? If we want to keep the current mmap() semantics, I'd say
the same tag is returned. However, with MTE this also implies that the
memory was coloured.

> > My thoughts on allowing tags (quick look):
> > 
> > brk - no
> > get_mempolicy - yes
> > madvise - yes
> > mbind - yes
> > mincore - yes
> > mlock, mlock2, munlock - yes
> > mmap - no (we may change this with MTE but not for TBI)
> > mmap_pgoff - not used on arm64
> > mprotect - yes
> > mremap - yes for old_address, no for new_address (on par with mmap)
> > msync - yes
> > munmap - probably no (mmap does not return tagged ptrs)
> > remap_file_pages - no (also deprecated syscall)
> > shmat, shmdt - shall we allow tagged addresses on shared memory?
> > 
> > The above is only about the TBI ABI while ignoring hardware MTE. For
> > the latter, we may want to change the mmap() to allow pre-colouring
> > on page fault which means that munmap()/mprotect() should also
> > support tagged pointers. Possibly mremap() as well but we need to
> > decide 

Re: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3

2019-05-29 Thread Pelloux-prayer, Pierre-eric
Hi Christian,

The series is:

Tested-by: Pierre-Eric Pelloux-Prayer 


Pierre-Eric



On 29/05/2019 14:27, Christian König wrote:
> This avoids OOM situations when we have lots of threads
> submitting at the same time.
> 
> v3: apply this to the whole driver, not just CS
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c| 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 20f2955d2a55..3e2da24cd17a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
> *p,
>   }
>  
>   r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
> -&duplicates, true);
> +&duplicates, false);
>   if (unlikely(r != 0)) {
>   if (r != -ERESTARTSYS)
>   DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> index 06f83cac0d3a..f660628e6af9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> @@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, 
> struct amdgpu_vm *vm,
>   list_add(&csa_tv.head, &list);
>   amdgpu_vm_get_pd_bo(vm, &list, &pd);
>  
> - r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
> + r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
>   if (r) {
>   DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
>   return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index d513a5ad03dd..ed25a4e14404 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
>  
>   amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
>  
> - r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
> + r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
>   if (r) {
>   dev_err(adev->dev, "leaking bo va because "
>   "we fail to reserve bo (%d)\n", r);
> @@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
> *data,
>  
>   amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
>  
> - r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
> + r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, false);
>   if (r)
>   goto error_unref;
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index c430e8259038..d60593cc436e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, 
> bool no_intr)
>   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>   int r;
>  
> - r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
> + r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
>   if (unlikely(r != 0)) {
>   if (r != -ERESTARTSYS)
>   dev_err(adev->dev, "%p reserve failed\n", bo);
> 

Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-29 Thread Catalin Marinas
On Wed, May 29, 2019 at 01:42:25PM +0100, Dave P Martin wrote:
> On Tue, May 28, 2019 at 05:34:00PM +0100, Catalin Marinas wrote:
> > On Tue, May 28, 2019 at 04:56:45PM +0100, Dave P Martin wrote:
> > > On Tue, May 28, 2019 at 04:40:58PM +0100, Catalin Marinas wrote:
> > > 
> > > [...]
> > > 
> > > > My thoughts on allowing tags (quick look):
> > > >
> > > > brk - no
> > > 
> > > [...]
> > > 
> > > > mlock, mlock2, munlock - yes
> > > > mmap - no (we may change this with MTE but not for TBI)
> > > 
> > > [...]
> > > 
> > > > mprotect - yes
> > > 
> > > I haven't following this discussion closely... what's the rationale for
> > > the inconsistencies here (feel free to refer me back to the discussion
> > > if it's elsewhere).
> > 
> > _My_ rationale (feel free to disagree) is that mmap() by default would
> > not return a tagged address (ignoring MTE for now). If it gets passed a
> > tagged address or a "tagged NULL" (for lack of a better name) we don't
> > have clear semantics of whether the returned address should be tagged in
> > this ABI relaxation. I'd rather reserve this specific behaviour if we
> > overload the non-zero tag meaning of mmap() for MTE. Similar reasoning
> > for mremap(), at least on the new_address argument (not entirely sure
> > about old_address).
> > 
> > munmap() should probably follow the mmap() rules.
> > 
> > As for brk(), I don't see why the user would need to pass a tagged
> > address, we can't associate any meaning to this tag.
> > 
> > For the rest, since it's likely such addresses would have been tagged by
> > malloc() in user space, we should allow tagged pointers.
> 
> Those arguments seem reasonable.  We should try to capture this
> somewhere when documenting the ABI.
> 
> To be clear, I'm not sure that we should guarantee anywhere that a
> tagged pointer is rejected: rather the behaviour should probably be
> left unspecified.  Then we can tidy it up incrementally.
> 
> (The behaviour is unspecified today, in any case.)

What is specified (or rather de-facto ABI) today is that passing a user
address above TASK_SIZE (e.g. non-zero top byte) would fail in most
cases. If we relax this with the TBI we may end up with some de-facto
ABI before we actually get MTE hardware. Tightening it afterwards may be
slightly more problematic, although MTE needs to be an explicit opt-in.

IOW, I wouldn't want to unnecessarily relax the ABI if we don't need to.

-- 
Catalin


Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-29 Thread Koenig, Christian
Am 29.05.19 um 15:03 schrieb Emil Velikov:
> On 2019/05/29, Dave Airlie wrote:
>> On Wed, 29 May 2019 at 02:47, Emil Velikov  wrote:
>>> On 2019/05/28, Koenig, Christian wrote:
 Am 28.05.19 um 18:10 schrieb Emil Velikov:
> On 2019/05/28, Daniel Vetter wrote:
>> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
>>  wrote:
>>> Am 28.05.19 um 09:38 schrieb Daniel Vetter:
 [SNIP]
> Might be a good idea looking into reverting it partially, so that at
> least command submission and buffer allocation is still blocked.
 I thought the issue is a lot more than vainfo, it's pretty much every
 hacked up compositor under the sun getting this wrong one way or
 another. Thinking about this some more, I also have no idea how you'd
 want to deprecate rendering on primary nodes in general. Apparently
 that breaks -modesetting already, and probably lots more compositors.
 And it looks like we're finally achieve the goal kms set out to 10
 years ago, and new compositors are sprouting up all the time. I guess
 we could just break them all (on new hardware) and tell them to all
 suck it up. But I don't think that's a great option. And just
 deprecating this on amdgpu is going to be even harder, since then
 everywhere else it'll keep working, and it's just amdgpu.ko that looks
 broken.

 Aside: I'm not supporting Emil's idea here because it fixes any issues
 Intel has - Intel doesn't care. I support it because reality sucks,
 people get this render vs. primary vs. multi-gpu prime wrong all the
 time (that's also why we have hardcoded display+gpu pairs in mesa for
 the various soc combinations out there), and this looks like a
 pragmatic solution. It'd be nice if every compositor and everything
 else would perfectly support multi gpu and only use render nodes for
 rendering, and only primary nodes for display. But reality is that
 people hack on stuff until gears on screen and then move on to more
 interesting things (to them). So I don't think we'll ever win this :-/
>>> Yeah, but this is a classic case of working around user space issues by
>>> making kernel changes instead of fixing user space.
>>>
>>> Having privileged (output control) and unprivileged (rendering control)
>>> functionality behind the same node is a mistake we have made a long time
>>> ago and render nodes finally seemed to be a way to fix that.
>>>
>>> I mean why are compositors using the primary node in the first place?
>>> Because they want to have access to privileged resources I think and in
>>> this case it is perfectly ok to do so.
>>>
>>> Now extending unprivileged access to the primary node actually sounds
>>> like a step into the wrong direction to me.
>>>
>>> I rather think that we should go down the route of completely dropping
>>> command submission and buffer allocation through the primary node for
>>> non master clients. And then as next step at some point drop support for
>>> authentication/flink.
>>>
>>> I mean we have done this with UMS as well and I don't see much other way
>>> to move forward and get rid of those ancient interface in the long term.
>> Well kms had some really good benefits that drove quick adoption, like
>> "suspend/resume actually has a chance of working" or "comes with
>> buffer management so you can run multiple gears".
>>
>> The render node thing is a lot more niche use case (prime, better priv
>> separation), plus "it's cleaner design". And the "cleaner design" part
>> is something that empirically doesn't seem to matter :-/ Just two
>> examples:
>> - KHR_display/leases just iterated display resources on the fd needed
>> for rendering (and iirc there was even a patch to expose that for
>> render nodes too so it works with DRI3), because implementing
>> protocols is too hard. Barely managed to stop that one before it
>> happened.
>> - Various video players use the vblank ioctl on directly to schedule
>> frames, without telling the compositor. I discovered that when I
>> wanted to limite the vblank ioctl to master clients only. Again,
>> apparently too hard to use the existing extensions, or fix the bugs in
>> there, or whatever. One userspace got fixed last year, but it'll
>> probably get copypasted around forever :-/
>>
>> So I don't think we'll ever manage to roll a clean split out, and best
>> we can do is give in and just hand userspace what it wants. As much as
>> that's misguided and unclean and all that. Maybe it'll result in a
>> least fewer stuff getting run as root to hack around this, because
>> fixing properly seems not to be on the table.
>>
>> The beauty of kms is that we've achieved the 

RE: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3

2019-05-29 Thread Zhou, David(ChunMing)
Patch #1,#5,#6,#8,#9,#10 are Reviewed-by: Chunming Zhou 
Patch #2,#3,#4 are Acked-by: Chunming Zhou 

-David

> -Original Message-
> From: dri-devel  On Behalf Of
> Christian König
> Sent: Wednesday, May 29, 2019 8:27 PM
> To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
> 
> This avoids OOM situations when we have lots of threads submitting at the
> same time.
> 
> v3: apply this to the whole driver, not just CS
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c| 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 20f2955d2a55..3e2da24cd17a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct
> amdgpu_cs_parser *p,
>   }
> 
>   r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
> -&duplicates, true);
> +&duplicates, false);
>   if (unlikely(r != 0)) {
>   if (r != -ERESTARTSYS)
>   DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> index 06f83cac0d3a..f660628e6af9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> @@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device
> *adev, struct amdgpu_vm *vm,
>   list_add(&csa_tv.head, &list);
>   amdgpu_vm_get_pd_bo(vm, &list, &pd);
> 
> - r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
> + r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
>   if (r) {
>   DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
>   return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index d513a5ad03dd..ed25a4e14404 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct
> drm_gem_object *obj,
> 
>   amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
> 
> - r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
> + r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
>   if (r) {
>   dev_err(adev->dev, "leaking bo va because "
>   "we fail to reserve bo (%d)\n", r);
> @@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev,
> void *data,
> 
>   amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
> 
> - r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
> + r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, false);
>   if (r)
>   goto error_unref;
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index c430e8259038..d60593cc436e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct
> amdgpu_bo *bo, bool no_intr)
>   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>   int r;
> 
> - r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
> + r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
>   if (unlikely(r != 0)) {
>   if (r != -ERESTARTSYS)
>   dev_err(adev->dev, "%p reserve failed\n", bo);
> --
> 2.17.1
> 

Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-29 Thread Emil Velikov
On 2019/05/29, Dave Airlie wrote:
> On Wed, 29 May 2019 at 02:47, Emil Velikov  wrote:
> >
> > On 2019/05/28, Koenig, Christian wrote:
> > > Am 28.05.19 um 18:10 schrieb Emil Velikov:
> > > > On 2019/05/28, Daniel Vetter wrote:
> > > >> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
> > > >>  wrote:
> > > >>> Am 28.05.19 um 09:38 schrieb Daniel Vetter:
> > >  [SNIP]
> > > > Might be a good idea looking into reverting it partially, so that at
> > > > least command submission and buffer allocation is still blocked.
> > >  I thought the issue is a lot more than vainfo, it's pretty much every
> > >  hacked up compositor under the sun getting this wrong one way or
> > >  another. Thinking about this some more, I also have no idea how you'd
> > >  want to deprecate rendering on primary nodes in general. Apparently
> > >  that breaks -modesetting already, and probably lots more compositors.
> > >  And it looks like we're finally achieve the goal kms set out to 10
> > >  years ago, and new compositors are sprouting up all the time. I guess
> > >  we could just break them all (on new hardware) and tell them to all
> > >  suck it up. But I don't think that's a great option. And just
> > >  deprecating this on amdgpu is going to be even harder, since then
> > >  everywhere else it'll keep working, and it's just amdgpu.ko that 
> > >  looks
> > >  broken.
> > > 
> > >  Aside: I'm not supporting Emil's idea here because it fixes any 
> > >  issues
> > >  Intel has - Intel doesn't care. I support it because reality sucks,
> > >  people get this render vs. primary vs. multi-gpu prime wrong all the
> > >  time (that's also why we have hardcoded display+gpu pairs in mesa for
> > >  the various soc combinations out there), and this looks like a
> > >  pragmatic solution. It'd be nice if every compositor and everything
> > >  else would perfectly support multi gpu and only use render nodes for
> > >  rendering, and only primary nodes for display. But reality is that
> > >  people hack on stuff until gears on screen and then move on to more
> > >  interesting things (to them). So I don't think we'll ever win this 
> > >  :-/
> > > >>> Yeah, but this is a classic case of working around user space issues 
> > > >>> by
> > > >>> making kernel changes instead of fixing user space.
> > > >>>
> > > >>> Having privileged (output control) and unprivileged (rendering 
> > > >>> control)
> > > >>> functionality behind the same node is a mistake we have made a long 
> > > >>> time
> > > >>> ago and render nodes finally seemed to be a way to fix that.
> > > >>>
> > > >>> I mean why are compositors using the primary node in the first place?
> > > >>> Because they want to have access to privileged resources I think and 
> > > >>> in
> > > >>> this case it is perfectly ok to do so.
> > > >>>
> > > >>> Now extending unprivileged access to the primary node actually sounds
> > > >>> like a step into the wrong direction to me.
> > > >>>
> > > >>> I rather think that we should go down the route of completely dropping
> > > >>> command submission and buffer allocation through the primary node for
> > > >>> non master clients. And then as next step at some point drop support 
> > > >>> for
> > > >>> authentication/flink.
> > > >>>
> > > >>> I mean we have done this with UMS as well and I don't see much other 
> > > >>> way
> > > >>> to move forward and get rid of those ancient interface in the long 
> > > >>> term.
> > > >> Well kms had some really good benefits that drove quick adoption, like
> > > >> "suspend/resume actually has a chance of working" or "comes with
> > > >> buffer management so you can run multiple gears".
> > > >>
> > > >> The render node thing is a lot more niche use case (prime, better priv
> > > >> separation), plus "it's cleaner design". And the "cleaner design" part
> > > >> is something that empirically doesn't seem to matter :-/ Just two
> > > >> examples:
> > > >> - KHR_display/leases just iterated display resources on the fd needed
> > > >> for rendering (and iirc there was even a patch to expose that for
> > > >> render nodes too so it works with DRI3), because implementing
> > > >> protocols is too hard. Barely managed to stop that one before it
> > > >> happened.
> > > >> - Various video players use the vblank ioctl on directly to schedule
> > > >> frames, without telling the compositor. I discovered that when I
> > > >> wanted to limite the vblank ioctl to master clients only. Again,
> > > >> apparently too hard to use the existing extensions, or fix the bugs in
> > > >> there, or whatever. One userspace got fixed last year, but it'll
> > > >> probably get copypasted around forever :-/
> > > >>
> > > >> So I don't think we'll ever manage to roll a clean split out, and best
> > > >> we can do is give in and just hand userspace what it wants. As much as
> > > >> that's misguided and unclean and all 

Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-29 Thread Dave Martin
On Tue, May 28, 2019 at 05:34:00PM +0100, Catalin Marinas wrote:
> On Tue, May 28, 2019 at 04:56:45PM +0100, Dave P Martin wrote:
> > On Tue, May 28, 2019 at 04:40:58PM +0100, Catalin Marinas wrote:
> > 
> > [...]
> > 
> > > My thoughts on allowing tags (quick look):
> > >
> > > brk - no
> > 
> > [...]
> > 
> > > mlock, mlock2, munlock - yes
> > > mmap - no (we may change this with MTE but not for TBI)
> > 
> > [...]
> > 
> > > mprotect - yes
> > 
> > I haven't following this discussion closely... what's the rationale for
> > the inconsistencies here (feel free to refer me back to the discussion
> > if it's elsewhere).
> 
> _My_ rationale (feel free to disagree) is that mmap() by default would
> not return a tagged address (ignoring MTE for now). If it gets passed a
> tagged address or a "tagged NULL" (for lack of a better name) we don't
> have clear semantics of whether the returned address should be tagged in
> this ABI relaxation. I'd rather reserve this specific behaviour if we
> overload the non-zero tag meaning of mmap() for MTE. Similar reasoning
> for mremap(), at least on the new_address argument (not entirely sure
> about old_address).
> 
> munmap() should probably follow the mmap() rules.
> 
> As for brk(), I don't see why the user would need to pass a tagged
> address, we can't associate any meaning to this tag.
> 
> For the rest, since it's likely such addresses would have been tagged by
> malloc() in user space, we should allow tagged pointers.

Those arguments seem reasonable.  We should try to capture this
somewhere when documenting the ABI.

To be clear, I'm not sure that we should guarantee anywhere that a
tagged pointer is rejected: rather the behaviour should probably be
left unspecified.  Then we can tidy it up incrementally.

(The behaviour is unspecified today, in any case.)

Cheers
---Dave


[PATCH 04/10] drm/ttm: cleanup ttm_bo_mem_space

2019-05-29 Thread Christian König
We tried this once before, but that turned out to be more
complicated than expected. With all the right prerequisites
it looks like we can do this now.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 127 ++-
 1 file changed, 66 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 702cd89adbf9..0dbb23b0dedd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -893,13 +893,12 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object 
*bo,
  * space, or we've evicted everything and there isn't enough space.
  */
 static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
-   uint32_t mem_type,
-   const struct ttm_place *place,
-   struct ttm_mem_reg *mem,
-   struct ttm_operation_ctx *ctx)
+ const struct ttm_place *place,
+ struct ttm_mem_reg *mem,
+ struct ttm_operation_ctx *ctx)
 {
struct ttm_bo_device *bdev = bo->bdev;
-   struct ttm_mem_type_manager *man = &bdev->man[mem_type];
+   struct ttm_mem_type_manager *man = &bdev->man[mem->mem_type];
int ret;
 
do {
@@ -908,11 +907,11 @@ static int ttm_bo_mem_force_space(struct 
ttm_buffer_object *bo,
return ret;
if (mem->mm_node)
break;
-   ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
+   ret = ttm_mem_evict_first(bdev, mem->mem_type, place, ctx);
if (unlikely(ret != 0))
return ret;
} while (1);
-   mem->mem_type = mem_type;
+
return ttm_bo_add_move_fence(bo, man, mem);
 }
 
@@ -960,6 +959,51 @@ static bool ttm_bo_mt_compatible(struct 
ttm_mem_type_manager *man,
return true;
 }
 
+/**
+ * ttm_bo_mem_placement - check if placement is compatible
+ * @bo: BO to find memory for
+ * @place: where to search
+ * @mem: the memory object to fill in
+ * @ctx: operation context
+ *
+ * Check if placement is compatible and fill in mem structure.
+ * Returns -EBUSY if placement won't work or negative error code.
+ * 0 when placement can be used.
+ */
+static int ttm_bo_mem_placement(struct ttm_buffer_object *bo,
+   const struct ttm_place *place,
+   struct ttm_mem_reg *mem,
+   struct ttm_operation_ctx *ctx)
+{
+   struct ttm_bo_device *bdev = bo->bdev;
+   uint32_t mem_type = TTM_PL_SYSTEM;
+   struct ttm_mem_type_manager *man;
+   uint32_t cur_flags = 0;
+   int ret;
+
+   ret = ttm_mem_type_from_place(place, &mem_type);
+   if (ret)
+   return ret;
+
+   man = &bdev->man[mem_type];
+   if (!man->has_type || !man->use_type)
+   return -EBUSY;
+
+   if (!ttm_bo_mt_compatible(man, mem_type, place, &cur_flags))
+   return -EBUSY;
+
+   cur_flags = ttm_bo_select_caching(man, bo->mem.placement, cur_flags);
+   /*
+* Use the access and other non-mapping-related flag bits from
+* the memory placement flags to the current flags
+*/
+   ttm_flag_masked(&cur_flags, place->flags, ~TTM_PL_MASK_MEMTYPE);
+
+   mem->mem_type = mem_type;
+   mem->placement = cur_flags;
+   return 0;
+}
+
 /**
  * Creates space for memory region @mem according to its type.
  *
@@ -974,11 +1018,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
struct ttm_operation_ctx *ctx)
 {
struct ttm_bo_device *bdev = bo->bdev;
-   struct ttm_mem_type_manager *man;
-   uint32_t mem_type = TTM_PL_SYSTEM;
-   uint32_t cur_flags = 0;
bool type_found = false;
-   bool type_ok = false;
int i, ret;
 
ret = reservation_object_reserve_shared(bo->resv, 1);
@@ -988,37 +1028,20 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
mem->mm_node = NULL;
for (i = 0; i < placement->num_placement; ++i) {
const struct ttm_place *place = &placement->placement[i];
+   struct ttm_mem_type_manager *man;
 
-   ret = ttm_mem_type_from_place(place, &mem_type);
+   ret = ttm_bo_mem_placement(bo, place, mem, ctx);
+   if (ret == -EBUSY)
+   continue;
if (ret)
return ret;
-   man = &bdev->man[mem_type];
-   if (!man->has_type || !man->use_type)
-   continue;
-
-   type_ok = ttm_bo_mt_compatible(man, mem_type, place,
-   &cur_flags);
-
-   if (!type_ok)
-   continue;
 
type_found = true;
-   cur_flags = ttm_bo_select_caching(man, bo->mem.placement,
-  

[PATCH 08/10] drm/amdgpu: drop some validation failure messages

2019-05-29 Thread Christian König
The messages about amdgpu_cs_list_validate are duplicated because the
caller will complain into the logs as well and we can also get
interrupted by a signal here.

Also fix the caller to not report -EAGAIN from validation.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fff558cf385b..20f2955d2a55 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -671,16 +671,12 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
*p,
}
 
r = amdgpu_cs_list_validate(p, &duplicates);
-   if (r) {
-   DRM_ERROR("amdgpu_cs_list_validate(duplicates) failed.\n");
+   if (r)
goto error_validate;
-   }
 
r = amdgpu_cs_list_validate(p, &p->validated);
-   if (r) {
-   DRM_ERROR("amdgpu_cs_list_validate(validated) failed.\n");
+   if (r)
goto error_validate;
-   }
 
amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved,
 p->bytes_moved_vis);
@@ -1383,7 +1379,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
if (r) {
if (r == -ENOMEM)
DRM_ERROR("Not enough memory for command 
submission!\n");
-   else if (r != -ERESTARTSYS)
+   else if (r != -ERESTARTSYS && r != -EAGAIN)
DRM_ERROR("Failed to process the buffer list %d!\n", r);
goto out;
}
-- 
2.17.1


[PATCH 09/10] drm/amdgpu: create GDS, GWS and OA in system domain

2019-05-29 Thread Christian König
And only move them in on validation. This allows for better control
when multiple processes are fighting over those resources.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 93b2c5a48a71..30493429851e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -495,7 +495,11 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 #endif
 
bo->tbo.bdev = &adev->mman.bdev;
-   amdgpu_bo_placement_from_domain(bo, bp->domain);
+   if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA |
+ AMDGPU_GEM_DOMAIN_GDS))
+   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
+   else
+   amdgpu_bo_placement_from_domain(bo, bp->domain);
if (bp->type == ttm_bo_type_kernel)
bo->tbo.priority = 1;
 
-- 
2.17.1
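
"Only move them in on validation" means the BO is created CPU-placed and
the real GDS/GWS/OA placement is applied when the BO is validated,
roughly like this (a sketch using the usual amdgpu helpers, not a hunk
from this patch):

	amdgpu_bo_placement_from_domain(bo, bo->preferred_domains);
	r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);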


[PATCH 02/10] drm/ttm: return immediately in case of a signal

2019-05-29 Thread Christian König
When a signal arrives, we should return immediately to
handle it and not try other placements first.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 06bbcd2679b2..7b59e5ecde7f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -979,7 +979,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
uint32_t cur_flags = 0;
bool type_found = false;
bool type_ok = false;
-   bool has_erestartsys = false;
int i, ret;
 
ret = reservation_object_reserve_shared(bo->resv, 1);
@@ -1070,8 +1069,8 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
mem->placement = cur_flags;
return 0;
}
-   if (ret == -ERESTARTSYS)
-   has_erestartsys = true;
+   if (ret && ret != -EBUSY)
+   return ret;
}
 
if (!type_found) {
@@ -1079,7 +1078,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
return -EINVAL;
}
 
-   return (has_erestartsys) ? -ERESTARTSYS : -ENOMEM;
+   return -ENOMEM;
 }
 EXPORT_SYMBOL(ttm_bo_mem_space);
 
-- 
2.17.1


[PATCH 03/10] drm/ttm: remove manual placement preference

2019-05-29 Thread Christian König
If drivers don't prefer a system memory placement,
they should not put it into the placement list first.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 7b59e5ecde7f..702cd89adbf9 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1012,8 +1012,12 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
ttm_flag_masked(&cur_flags, place->flags,
~TTM_PL_MASK_MEMTYPE);
 
-   if (mem_type == TTM_PL_SYSTEM)
-   break;
+   if (mem_type == TTM_PL_SYSTEM) {
+   mem->mem_type = mem_type;
+   mem->placement = cur_flags;
+   mem->mm_node = NULL;
+   return 0;
+   }
 
ret = (*man->func->get_node)(man, bo, place, mem);
if (unlikely(ret))
@@ -1025,16 +1029,12 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
(*man->func->put_node)(man, mem);
return ret;
}
-   break;
+   mem->mem_type = mem_type;
+   mem->placement = cur_flags;
+   return 0;
}
}
 
-   if ((type_ok && (mem_type == TTM_PL_SYSTEM)) || mem->mm_node) {
-   mem->mem_type = mem_type;
-   mem->placement = cur_flags;
-   return 0;
-   }
-
for (i = 0; i < placement->num_busy_placement; ++i) {
const struct ttm_place *place = &placement->busy_placement[i];
 
-- 
2.17.1


[PATCH 07/10] drm/amd/display: use ttm_eu_reserve_buffers instead of amdgpu_bo_reserve v2

2019-05-29 Thread Christian König
From: Chunming Zhou 

Add a ticket for the display BO so that it can preempt busy BOs.

v2: fix stupid rebase error

Change-Id: I9f031cdcc8267de00e819ae303baa0a52df8ebb9
Signed-off-by: Chunming Zhou 
Reviewed-by: Christian König 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 21 ++-
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 4a1755bce96c..56f320f3fd72 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4182,6 +4182,9 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
struct amdgpu_device *adev;
struct amdgpu_bo *rbo;
struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
+   struct list_head list;
+   struct ttm_validate_buffer tv;
+   struct ww_acquire_ctx ticket;
uint64_t tiling_flags;
uint32_t domain;
int r;
@@ -4198,9 +4201,17 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
obj = new_state->fb->obj[0];
rbo = gem_to_amdgpu_bo(obj);
adev = amdgpu_ttm_adev(rbo->tbo.bdev);
-   r = amdgpu_bo_reserve(rbo, false);
-   if (unlikely(r != 0))
+   INIT_LIST_HEAD(&list);
+
+   tv.bo = &rbo->tbo;
+   tv.num_shared = 1;
+   list_add(&tv.head, &list);
+
+   r = ttm_eu_reserve_buffers(&ticket, &list, false, NULL, true);
+   if (r) {
+   dev_err(adev->dev, "fail to reserve bo (%d)\n", r);
return r;
+   }
 
if (plane->type != DRM_PLANE_TYPE_CURSOR)
domain = amdgpu_display_supported_domains(adev);
@@ -4211,21 +4222,21 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("Failed to pin framebuffer with error %d\n", 
r);
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(&ticket, &list);
return r;
}
 
r = amdgpu_ttm_alloc_gart(&rbo->tbo);
if (unlikely(r != 0)) {
amdgpu_bo_unpin(rbo);
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(&ticket, &list);
DRM_ERROR("%p bind failed\n", rbo);
return r;
}
 
amdgpu_bo_get_tiling_flags(rbo, &tiling_flags);
 
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(&ticket, &list);
 
afb->address = amdgpu_bo_gpu_offset(rbo);
 
-- 
2.17.1


[PATCH 05/10] drm/ttm: immediately move BOs to the new LRU v3

2019-05-29 Thread Christian König
Move BOs which are currently in a lower domain to the new
LRU before allocating backing space while validating.

This makes sure that we always have enough entries on the
LRU to allow for other processes to wait for an operation
to complete.

v2: generalize the test
v3: fix rebase error

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 42 +++-
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 0dbb23b0dedd..8a8859cf60e8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -166,7 +166,8 @@ static void ttm_bo_release_list(struct kref *list_kref)
ttm_mem_global_free(bdev->glob->mem_glob, acc_size);
 }
 
-void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
+static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
+ struct ttm_mem_reg *mem)
 {
struct ttm_bo_device *bdev = bo->bdev;
struct ttm_mem_type_manager *man;
@@ -176,10 +177,10 @@ void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
if (!list_empty(&bo->lru))
return;
 
-   if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT)
+   if (mem->placement & TTM_PL_FLAG_NO_EVICT)
return;
 
-   man = &bdev->man[bo->mem.mem_type];
+   man = &bdev->man[mem->mem_type];
list_add_tail(&bo->lru, &man->lru[bo->priority]);
kref_get(&bo->list_kref);
 
@@ -189,6 +190,11 @@ void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
kref_get(&bo->list_kref);
}
 }
+
+void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
+{
+   ttm_bo_add_mem_to_lru(bo, &bo->mem);
+}
 EXPORT_SYMBOL(ttm_bo_add_to_lru);
 
 static void ttm_bo_ref_bug(struct kref *list_kref)
@@ -1001,6 +1007,14 @@ static int ttm_bo_mem_placement(struct ttm_buffer_object 
*bo,
 
mem->mem_type = mem_type;
mem->placement = cur_flags;
+
+   if (bo->mem.mem_type < mem_type && !list_empty(&bo->lru)) {
+   spin_lock(&bo->bdev->glob->lru_lock);
+   ttm_bo_del_from_lru(bo);
+   ttm_bo_add_mem_to_lru(bo, mem);
+   spin_unlock(&bo->bdev->glob->lru_lock);
+   }
+
return 0;
 }
 
@@ -1034,7 +1048,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
if (ret == -EBUSY)
continue;
if (ret)
-   return ret;
+   goto error;
 
type_found = true;
mem->mm_node = NULL;
@@ -1044,13 +1058,13 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
man = &bdev->man[mem->mem_type];
ret = (*man->func->get_node)(man, bo, place, mem);
if (unlikely(ret))
-   return ret;
+   goto error;
 
if (mem->mm_node) {
ret = ttm_bo_add_move_fence(bo, man, mem);
if (unlikely(ret)) {
(*man->func->put_node)(man, mem);
-   return ret;
+   goto error;
}
return 0;
}
@@ -1063,7 +1077,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
if (ret == -EBUSY)
continue;
if (ret)
-   return ret;
+   goto error;
 
type_found = true;
mem->mm_node = NULL;
@@ -1075,15 +1089,23 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
return 0;
 
if (ret && ret != -EBUSY)
-   return ret;
+   goto error;
}
 
+   ret = -ENOMEM;
if (!type_found) {
pr_err(TTM_PFX "No compatible memory type found\n");
-   return -EINVAL;
+   ret = -EINVAL;
}
 
-   return -ENOMEM;
+error:
+   if (bo->mem.mem_type == TTM_PL_SYSTEM && !list_empty(&bo->lru)) {
+   spin_lock(&bo->bdev->glob->lru_lock);
+   ttm_bo_move_to_lru_tail(bo, NULL);
+   spin_unlock(&bo->bdev->glob->lru_lock);
+   }
+
+   return ret;
 }
 EXPORT_SYMBOL(ttm_bo_mem_space);
 
-- 
2.17.1


[PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3

2019-05-29 Thread Christian König
This avoids OOM situations when we have lots of threads
submitting at the same time.

v3: apply this to the whole driver, not just CS

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c| 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 20f2955d2a55..3e2da24cd17a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}
 
r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-  &duplicates, true);
+  &duplicates, false);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 06f83cac0d3a..f660628e6af9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
list_add(&csa_tv.head, &list);
amdgpu_vm_get_pd_bo(vm, &list, &pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
+   r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
if (r) {
DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d513a5ad03dd..ed25a4e14404 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
 
amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
+   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
if (r) {
dev_err(adev->dev, "leaking bo va because "
"we fail to reserve bo (%d)\n", r);
@@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 
amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
+   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, false);
if (r)
goto error_unref;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index c430e8259038..d60593cc436e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, 
bool no_intr)
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
int r;
 
-   r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
+   r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
dev_err(adev->dev, "%p reserve failed\n", bo);
-- 
2.17.1


[PATCH 01/10] drm/ttm: Make LRU removal optional v2

2019-05-29 Thread Christian König
We are already doing this for DMA-buf imports and also for
amdgpu VM BOs for quite a while now.

If this doesn't run into any problems we are probably going
to stop removing BOs from the LRU altogether.

v2: drop BUG_ON from ttm_bo_add_to_lru

Signed-off-by: Christian König 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  9 
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  4 ++--
 drivers/gpu/drm/qxl/qxl_release.c |  2 +-
 drivers/gpu/drm/radeon/radeon_gem.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_object.c|  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c  | 23 ++-
 drivers/gpu/drm/ttm/ttm_execbuf_util.c| 20 +---
 drivers/gpu/drm/virtio/virtgpu_ioctl.c|  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c  |  3 ++-
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.h|  2 +-
 include/drm/ttm/ttm_bo_driver.h   |  5 +++-
 include/drm/ttm/ttm_execbuf_util.h|  3 ++-
 14 files changed, 46 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index e1cae4a37113..647e18f9e136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -574,7 +574,7 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);
 
ret = ttm_eu_reserve_buffers(&ctx->ticket, &ctx->list,
-false, &ctx->duplicates);
+false, &ctx->duplicates, true);
if (!ret)
ctx->reserved = true;
else {
@@ -647,7 +647,7 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
}
 
ret = ttm_eu_reserve_buffers(&ctx->ticket, &ctx->list,
-false, &ctx->duplicates);
+false, &ctx->duplicates, true);
if (!ret)
ctx->reserved = true;
else
@@ -1800,7 +1800,8 @@ static int validate_invalid_user_pages(struct 
amdkfd_process_info *process_info)
}
 
/* Reserve all BOs and page tables for validation */
-   ret = ttm_eu_reserve_buffers(&ticket, &resv_list, false, &duplicates);
+   ret = ttm_eu_reserve_buffers(&ticket, &resv_list, false, &duplicates,
+true);
WARN(!list_empty(&duplicates), "Duplicates should be empty");
if (ret)
goto out_free;
@@ -2006,7 +2007,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, 
struct dma_fence **ef)
}
 
ret = ttm_eu_reserve_buffers(&ctx.ticket, &ctx.list,
-false, &duplicate_save);
+false, &duplicate_save, true);
if (ret) {
pr_debug("Memory eviction: TTM Reserve Failed. Try again\n");
goto ttm_reserve_fail;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index d72cc583ebd1..fff558cf385b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}
 
r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-  &duplicates);
+  &duplicates, true);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 54dd02a898b9..06f83cac0d3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
list_add(&csa_tv.head, &list);
amdgpu_vm_get_pd_bo(vm, &list, &pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
+   r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
if (r) {
DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 7b840367004c..d513a5ad03dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
 
amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates);
+   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
if (r) {
dev_err(adev->dev, "leaking bo va because "
"we fail to reserve bo (%d)\n", r);
@@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 
amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates);
+   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
if 

Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-29 Thread Catalin Marinas
On Tue, May 28, 2019 at 11:11:26PM -0700, Christoph Hellwig wrote:
> On Tue, May 28, 2019 at 04:14:45PM +0200, Andrey Konovalov wrote:
> > Thanks for a lot of valuable input! I've read through all the replies
> > and got somewhat lost. What are the changes I need to do to this
> > series?
> > 
> > 1. Should I move untagging for memory syscalls back to the generic
> > code so other arches would make use of it as well, or should I keep
> > the arm64 specific memory syscalls wrappers and address the comments
> > on that patch?
> 
> It absolutely needs to move to common code.  Having arch code leads
> to pointless (often unintentional) semantic difference between
> architectures, and lots of boilerplate code.

That's fine by me as long as we agree on the semantics (which shouldn't
be hard; Khalid is already following up). We should probably also move the
proposed ABI document [1] into a common place (or part of it, since we'll
have arm64 specifics like prctl() calls to explicitly opt in to memory
tagging).

[1] 
https://lore.kernel.org/lkml/20190318163533.26838-1-vincenzo.frasc...@arm.com/T/#u

-- 
Catalin
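
For context on the mechanism under discussion: on arm64 with Top Byte
Ignore (TBI), userspace may keep a tag in bits 63:56 of a pointer, and the
kernel has to strip that tag before treating the value as an address. A
minimal sketch of such an untagging helper -- the name untagged_addr() and
the sign-extension from bit 55 follow the arm64 convention, but treat this
as illustrative rather than the exact patch contents:

static inline u64 untagged_addr(u64 addr)
{
	/*
	 * Shift the tag byte out, then shift back arithmetically so the
	 * result is sign-extended from bit 55: user addresses (bit 55
	 * clear) get a zeroed top byte, kernel addresses (bit 55 set)
	 * keep their all-ones top bits.
	 */
	return (u64)((s64)(addr << 8) >> 8);
}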


Re: [PATCH v6 1/2] drm/sched: Refactor ring mirror list handling.

2019-05-29 Thread Daniel Vetter
On Thu, Dec 27, 2018 at 8:28 PM Andrey Grodzovsky
 wrote:
>
> Decouple sched threads stop and start and ring mirror
> list handling from the policy of what to do about the
> guilty jobs.
> When stopping the sched thread and detaching sched fences
> from non signaled HW fences, wait for all signaled HW fences
> to complete before rerunning the jobs.
>
> v2: Fix resubmission of guilty job into HW after refactoring.
>
> v4:
> Full restart for all the jobs, not only from guilty ring.
> Extract karma increase into standalone function.
>
> v5:
> Rework waiting for signaled jobs without relying on the job
> struct itself as those might already be freed for non 'guilty'
> job's schedulers.
> Expose karma increase to drivers.
>
> v6:
> Use list_for_each_entry_safe_continue and drm_sched_process_job
> in case fence already signaled.
> Call drm_sched_increase_karma only once for amdgpu and add documentation.
>
> Suggested-by: Christian Koenig 
> Signed-off-by: Andrey Grodzovsky 

./drivers/gpu/drm/scheduler/sched_main.c:429: warning: Function
parameter or member 'full_recovery' not described in 'drm_sched_start'

Please fix, thanks.
-Daniel

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  20 ++-
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c|  11 +-
>  drivers/gpu/drm/scheduler/sched_main.c | 195 
> +++--
>  drivers/gpu/drm/v3d/v3d_sched.c|  12 +-
>  include/drm/gpu_scheduler.h|   8 +-
>  5 files changed, 157 insertions(+), 89 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 98df8e4..6a0601c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3298,17 +3298,15 @@ static int amdgpu_device_pre_asic_reset(struct 
> amdgpu_device *adev,
> if (!ring || !ring->sched.thread)
> continue;
>
> -   kthread_park(ring->sched.thread);
> -
> -   if (job && job->base.sched != &ring->sched)
> -   continue;
> -
> -   drm_sched_hw_job_reset(&ring->sched, job ? &job->base : NULL);
> +   drm_sched_stop(&ring->sched, job ? &job->base : NULL);
>
> /* after all hw jobs are reset, hw fence is meaningless, so 
> force_completion */
> amdgpu_fence_driver_force_completion(ring);
> }
>
> +   if(job)
> +   drm_sched_increase_karma(&job->base);
> +
>
>
> if (!amdgpu_sriov_vf(adev)) {
> @@ -3454,14 +3452,10 @@ static void amdgpu_device_post_asic_reset(struct 
> amdgpu_device *adev,
> if (!ring || !ring->sched.thread)
> continue;
>
> -   /* only need recovery sched of the given job's ring
> -* or all rings (in the case @job is NULL)
> -* after above amdgpu_reset accomplished
> -*/
> -   if ((!job || job->base.sched == &ring->sched) && 
> !adev->asic_reset_res)
> -   drm_sched_job_recovery(&ring->sched);
> +   if (!adev->asic_reset_res)
> +   drm_sched_resubmit_jobs(&ring->sched);
>
> -   kthread_unpark(ring->sched.thread);
> +   drm_sched_start(&ring->sched, !adev->asic_reset_res);
> }
>
> if (!amdgpu_device_has_dc_support(adev)) {
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 49a6763..6f1268f 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -109,16 +109,19 @@ static void etnaviv_sched_timedout_job(struct 
> drm_sched_job *sched_job)
> }
>
> /* block scheduler */
> -   kthread_park(gpu->sched.thread);
> -   drm_sched_hw_job_reset(&gpu->sched, sched_job);
> +   drm_sched_stop(&gpu->sched, sched_job);
> +
> +   if(sched_job)
> +   drm_sched_increase_karma(sched_job);
>
> /* get the GPU back into the init state */
> etnaviv_core_dump(gpu);
> etnaviv_gpu_recover_hang(gpu);
>
> +   drm_sched_resubmit_jobs(&gpu->sched);
> +
> /* restart scheduler after GPU is usable again */
> -   drm_sched_job_recovery(&gpu->sched);
> -   kthread_unpark(gpu->sched.thread);
> +   drm_sched_start(&gpu->sched, true);
>  }
>
>  static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index dbb6906..54e809b 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -60,8 +60,6 @@
>
>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb 
> *cb);
>
> -static void drm_sched_expel_job_unlocked(struct drm_sched_job *s_job);
> -
>  /**
>   * drm_sched_rq_init - initialize a given run queue struct
>   *
> @@ -335,6 +333,51 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
> 
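
The warning quoted above comes from the kerneldoc checker: every parameter
of a documented function needs a matching @name: line in the function's
comment block. A minimal sketch of a comment that would satisfy it (wording
illustrative, not the text that was eventually merged; the same pattern
fixes the missing 'bad' description flagged for drm_sched_stop in the next
thread):

/**
 * drm_sched_start - recover jobs after a reset
 *
 * @sched: scheduler instance
 * @full_recovery: proceed with complete sched restart
 */
void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery);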

Re: [PATCH v5 3/6] drm/scheduler: rework job destruction

2019-05-29 Thread Daniel Vetter
On Thu, Apr 18, 2019 at 5:00 PM Andrey Grodzovsky
 wrote:
>
> From: Christian König 
>
> We now destroy finished jobs from the worker thread to make sure that
> we never destroy a job currently in timeout processing.
> By this we avoid holding the lock around the ring mirror list in
> drm_sched_stop, which should solve a deadlock reported by a user.
>
> v2: Remove unused variable.
> v4: Move guilty job free into sched code.
> v5:
> Move sched->hw_rq_count to drm_sched_start to account for counter
> decrement in drm_sched_stop even when we don't call resubmit jobs
> if the guilty job did signal.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692
>
> Signed-off-by: Christian König 
> Signed-off-by: Andrey Grodzovsky 

$ make htmldocs

./drivers/gpu/drm/scheduler/sched_main.c:365: warning: Function
parameter or member 'bad' not described in 'drm_sched_stop'

Please fix, thanks.
-Daniel

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   9 +-
>  drivers/gpu/drm/etnaviv/etnaviv_dump.c |   4 -
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c|   2 +-
>  drivers/gpu/drm/lima/lima_sched.c  |   2 +-
>  drivers/gpu/drm/panfrost/panfrost_job.c|   2 +-
>  drivers/gpu/drm/scheduler/sched_main.c | 159 
> +
>  drivers/gpu/drm/v3d/v3d_sched.c|   2 +-
>  include/drm/gpu_scheduler.h|   6 +-
>  8 files changed, 102 insertions(+), 84 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7cee269..a0e165c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3334,7 +3334,7 @@ static int amdgpu_device_pre_asic_reset(struct 
> amdgpu_device *adev,
> if (!ring || !ring->sched.thread)
> continue;
>
> -   drm_sched_stop(&ring->sched);
> +   drm_sched_stop(&ring->sched, &job->base);
>
> /* after all hw jobs are reset, hw fence is meaningless, so 
> force_completion */
> amdgpu_fence_driver_force_completion(ring);
> @@ -3343,8 +3343,6 @@ static int amdgpu_device_pre_asic_reset(struct 
> amdgpu_device *adev,
> if(job)
> drm_sched_increase_karma(&job->base);
>
> -
> -
> if (!amdgpu_sriov_vf(adev)) {
>
> if (!need_full_reset)
> @@ -3482,8 +3480,7 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info 
> *hive,
> return r;
>  }
>
> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev,
> - struct amdgpu_job *job)
> +static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev)
>  {
> int i;
>
> @@ -3623,7 +3620,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
>
> /* Post ASIC reset for all devs .*/
> list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
> -   amdgpu_device_post_asic_reset(tmp_adev, tmp_adev == adev ? 
> job : NULL);
> +   amdgpu_device_post_asic_reset(tmp_adev);
>
> if (r) {
> /* bad news, how to tell it to userspace ? */
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> index 33854c9..5778d9c 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> @@ -135,13 +135,11 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
> mmu_size + gpu->buffer.size;
>
> /* Add in the active command buffers */
> -   spin_lock_irqsave(&gpu->sched.job_list_lock, flags);
> list_for_each_entry(s_job, &gpu->sched.ring_mirror_list, node) {
> submit = to_etnaviv_submit(s_job);
> file_size += submit->cmdbuf.size;
> n_obj++;
> }
> -   spin_unlock_irqrestore(&gpu->sched.job_list_lock, flags);
>
> /* Add in the active buffer objects */
> list_for_each_entry(vram, &gpu->mmu->mappings, mmu_node) {
> @@ -183,14 +181,12 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>   gpu->buffer.size,
>   etnaviv_cmdbuf_get_va(&gpu->buffer));
>
> -   spin_lock_irqsave(&gpu->sched.job_list_lock, flags);
> list_for_each_entry(s_job, &gpu->sched.ring_mirror_list, node) {
> submit = to_etnaviv_submit(s_job);
> etnaviv_core_dump_mem(&iter, ETDUMP_BUF_CMD,
>   submit->cmdbuf.vaddr, 
> submit->cmdbuf.size,
>   etnaviv_cmdbuf_get_va(&submit->cmdbuf));
> }
> -   spin_unlock_irqrestore(&gpu->sched.job_list_lock, flags);
>
> /* Reserve space for the bomap */
> if (n_bomap_pages) {
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 6d24fea..a813c82 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ 

[PATCH] drm/amdgpu: gpu reset run IB test once

2019-05-29 Thread Pan, Xinhui
Currently GPU reset schedules late_init_work, which does one IB
test. That races with the IB test done in GPU reset itself.

Add another do_ip_late_init() function which skips scheduling the
late_init_work.

Signed-off-by: xinhui pan 
Suggested-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 30 +-
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 241cd2c75433..a1dcee125d1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1900,7 +1900,7 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power
 }
 
 /**
- * amdgpu_device_ip_late_init - run late init for hardware IPs
+ * amdgpu_device_do_ip_late_init - run late init for hardware IPs
  *
  * @adev: amdgpu_device pointer
  *
@@ -1911,7 +1911,7 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power
  * late in the init process.
  * Returns 0 on success, negative error code on failure.
  */
-static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
+static int amdgpu_device_do_ip_late_init(struct amdgpu_device *adev)
 {
int i = 0, r;
 
@@ -1932,14 +1932,32 @@ static int amdgpu_device_ip_late_init(struct 
amdgpu_device *adev)
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
 
-   queue_delayed_work(system_wq, &adev->late_init_work,
-  msecs_to_jiffies(AMDGPU_RESUME_MS));
-
amdgpu_device_fill_reset_magic(adev);
 
return 0;
 }
 
+/**
+ * amdgpu_device_ip_late_init - run late init for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * schedule late_init_work after late_init.
+ * Return 0 on success.
+ */
+static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
+{
+   int r;
+
+   r = amdgpu_device_do_ip_late_init(adev);
+   if (r)
+   return r;
+
+   queue_delayed_work(system_wq, &adev->late_init_work,
+  msecs_to_jiffies(AMDGPU_RESUME_MS));
+   return 0;
+}
+
 /**
  * amdgpu_device_ip_fini - run fini for hardware IPs
  *
@@ -3545,7 +3563,7 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info 
*hive,
if (vram_lost)

amdgpu_device_fill_reset_magic(tmp_adev);
 
-   r = amdgpu_device_ip_late_init(tmp_adev);
+   r = amdgpu_device_do_ip_late_init(tmp_adev);
if (r)
goto out;
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

2019-05-29 Thread Deng, Emily
No problem. Thanks for your review.

Best wishes
Emily Deng
From: Christian König 
Sent: Wednesday, May 29, 2019 3:54 PM
To: Deng, Emily ; amd-gfx@lists.freedesktop.org; Koenig, 
Christian ; Quan, Evan 
Subject: Re: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

Sorry for the delay, your patch simply got stuck in the daily wave of mails.

Reviewed-by: Christian König 


Regards,
Christian.

On 29.05.19 at 05:07, Deng, Emily wrote:

Hi Christian,
  I have reverted the previous change as you suggested, and sent this new
patch; could you help review it?

Best wishes
Emily Deng




-Original Message-
From: amd-gfx  On Behalf Of Deng,
Emily
Sent: Wednesday, May 29, 2019 10:52 AM
To: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

Ping..

Best wishes
Emily Deng




-Original Message-
From: Deng, Emily 
Sent: Tuesday, May 28, 2019 6:14 PM
To: Deng, Emily ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

Ping ..

Best wishes
Emily Deng




-Original Message-
From: amd-gfx  On Behalf Of
Emily Deng
Sent: Tuesday, May 28, 2019 4:06 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily 
Subject: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

As gfx_v9_0_sw_fini will destroy clear_state_obj and also unpin it, there
is no need to call amdgpu_bo_free_kernel in gfx_v9_0_sw_fini, or it will
trigger an unpin warning.

Signed-off-by: Emily Deng 
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index c763733..cc5a382 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1794,9 +1794,7 @@ static int gfx_v9_0_sw_fini(void *handle)

	gfx_v9_0_mec_fini(adev);
	gfx_v9_0_ngg_fini(adev);
-	amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj,
-			&adev->gfx.rlc.clear_state_gpu_addr,
-			(void **)&adev->gfx.rlc.cs_ptr);
+	amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
	if (adev->asic_type == CHIP_RAVEN) {
		amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj,
				&adev->gfx.rlc.cp_table_gpu_addr,
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

2019-05-29 Thread Christian König

Sorry for the delay, your patch simply got stuck in the daily wave of mails.

Reviewed-by: Christian König 

Regards,
Christian.

On 29.05.19 at 05:07, Deng, Emily wrote:

Hi Christian,
  I have reverted the previous change as you suggested, and sent this new
patch; could you help review it?

Best wishes
Emily Deng




-Original Message-
From: amd-gfx  On Behalf Of Deng,
Emily
Sent: Wednesday, May 29, 2019 10:52 AM
To: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

Ping..

Best wishes
Emily Deng




-Original Message-
From: Deng, Emily 
Sent: Tuesday, May 28, 2019 6:14 PM
To: Deng, Emily ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

Ping ..

Best wishes
Emily Deng




-Original Message-
From: amd-gfx  On Behalf Of
Emily Deng
Sent: Tuesday, May 28, 2019 4:06 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily 
Subject: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

As gfx_v9_0_sw_fini will destroy clear_state_obj and also unpin it, there
is no need to call amdgpu_bo_free_kernel in gfx_v9_0_sw_fini, or it will
trigger an unpin warning.

Signed-off-by: Emily Deng 
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index c763733..cc5a382 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1794,9 +1794,7 @@ static int gfx_v9_0_sw_fini(void *handle)

gfx_v9_0_mec_fini(adev);
gfx_v9_0_ngg_fini(adev);
-   amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj,
-   &adev->gfx.rlc.clear_state_gpu_addr,
-   (void **)&adev->gfx.rlc.cs_ptr);
+   amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
if (adev->asic_type == CHIP_RAVEN) {
amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj,
&adev->gfx.rlc.cp_table_gpu_addr,
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 3.0

2019-05-29 Thread Koenig, Christian
Hi Louis,

please don't create tickets for this, that is just overkill.

Those are obvious and simple to fix bugs and should go upstream immediately.

Regards,
Christian.

On 29.05.19 at 03:14, Li, Ching-shih (Louis) wrote:
Hi Christian,

Your explanation matches my code study and test results. Well, I will remove
the original read; Shirish and I will re-test it, and I will submit it after
the test is done.
As for other related code, yes, I assume there might be similar issues as
well. My plan is to create an internal ticket and write test code to check
whether any problem can be hit; then we can fix it. I have kept the current
patch focused on the current Chrome issue.

Thanks for your review and information.

BR,
Louis

From: Christian König 

Sent: Tuesday, May 28, 2019 3:24 PM
To: Li, Ching-shih (Louis) 
; Liu, Leo 
; S, Shirish 
; Grodzovsky, Andrey 
; Zhang, Jerry 
; Deng, Emily 
; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 
3.0

Wow, really good catch!

The underlying problem is most likely that the VCE block is either power or
clock gated, and because of this the readptr read always returns zero.

Now amdgpu_ring_alloc() informs the power management code that the block is
about to be used, and so the gating is turned off.

Mhm, that is probably wrong in a whole bunch of other places; at least the UVD
and VCN code comes to mind.

I agree with Leo that you should remove the original read (so as not to read
twice), and it would be really nice if you could double check the other code
(UVD/VCN) for similar problems as well.

Regards,
Christian.

On 27.05.19 at 19:20, Li, Ching-shih (Louis) wrote:
I don't mean to read it twice. The solution is to move the first read later. I
didn't remove the original read, to keep the code difference small and simple;
I guess it should work to remove the original read there.


From: Liu, Leo 
Sent: Tuesday, May 28, 2019 12:40 AM
To: Li, Ching-shih (Louis) 
; S, Shirish 
; Grodzovsky, Andrey 
; Zhang, Jerry 
; Deng, Emily 
; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 
3.0


int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
{
struct amdgpu_device *adev = ring->adev;

uint32_t rptr = amdgpu_ring_get_rptr(ring);

unsigned i;
int r, timeout = adev->usec_timeout;

/* skip ring test for sriov*/
if (amdgpu_sriov_vf(adev))
return 0;

r = amdgpu_ring_alloc(ring, 16);
if (r)
return r;

amdgpu_ring_write(ring, VCE_CMD_END);
amdgpu_ring_commit(ring);



Above is the original code, where rptr is read when the function is called,
and below is your patch. My question is: why do you need to get rptr twice?

@@ -1084,6 +1084,8 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)

if (r)

   return r;



+   rptr = amdgpu_ring_get_rptr(ring);

+

amdgpu_ring_write(ring, VCE_CMD_END);

amdgpu_ring_commit(ring);
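
Putting the quoted original and the hunk above together: once the early read
is also removed, as suggested, the fixed ring test reads rptr only after
amdgpu_ring_alloc() has ungated the block. A rough sketch of the resulting
flow (error handling abbreviated; not the literal final patch):

	r = amdgpu_ring_alloc(ring, 16);
	if (r)
		return r;

	/* read rptr only after the alloc has ungated the VCE block */
	rptr = amdgpu_ring_get_rptr(ring);

	amdgpu_ring_write(ring, VCE_CMD_END);
	amdgpu_ring_commit(ring);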




On 5/27/19 12:22 PM, Li, Ching-shih (Louis) wrote:
Hi Leo,

Yes, I confirm it is the root cause of the Chrome S3 issue. Whenever the
system is resumed, the original read always returns zero. However, I have no
idea why it fails, and I haven't verified this problem on a CRB or any other
Linux platform yet.
I think the ideal solution is an indicator, e.g. a register, for the driver to
check whether the related firmware and hardware are ready to work, so the
driver can make sure it is OK to read rptr. Without any reference document, I
can only try to solve the problem by modifying the driver. Debug traces reveal
that only the first rptr read fails, while the read in the check loop is fine.
Therefore, a solution came to mind: update rptr later to get the initial rptr
value. Tests prove it works on Chrome platforms. Fyi~

BR,
Louis

From: Liu, Leo 
Sent: Monday, May 27, 2019 9:01 PM
To: S, Shirish ; Grodzovsky, 
Andrey ; Zhang, 
Jerry ; Deng, Emily 
; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org; Li, 
Ching-shih (Louis) 
Subject: Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 
3.0



On 5/27/19 3:42 AM, S, Shirish wrote:

From: Louis Li 



[What]

Re: [PATCH] drm/amdgpu: fix a race in GPU reset with IB test

2019-05-29 Thread Christian König

On 28.05.19 at 21:29, Alex Deucher wrote:

Split late_init into two functions, one (do_late_init) which
just does the hw init, and late_init which calls do_late_init
and schedules the IB test work.  Call do_late_init in
the GPU reset code to run the init code, but not schedule
the IB test code.  The IB test code is called directly
in the gpu reset code so no need to run the IB tests
in a separate work thread.  If we do, we end up racing.


I already had the suspicion that we do something wrong in the reset path 
with this.


But I don't really like the naming. How about we call the late_init_work 
delayed_init_work instead and explicitly schedule it from 
amdgpu_device_init() ?


Christian.
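
Concretely, the suggested rename would make the init path look roughly like
this (a sketch of the idea only, with the assumed name delayed_init_work;
not the patch under review):

	/* in amdgpu_device_init(), after the IP blocks have run late init */
	r = amdgpu_device_ip_late_init(adev);
	if (r)
		goto failed;

	/* schedule the delayed (IB test) work explicitly from init */
	queue_delayed_work(system_wq, &adev->delayed_init_work,
			   msecs_to_jiffies(AMDGPU_RESUME_MS));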



Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 43 +-
  1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a8c2201cd04..6b90840307dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1869,19 +1869,7 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power
return 0;
  }
  
-/**

- * amdgpu_device_ip_late_init - run late init for hardware IPs
- *
- * @adev: amdgpu_device pointer
- *
- * Late initialization pass for hardware IPs.  The list of all the hardware
- * IPs that make up the asic is walked and the late_init callbacks are run.
- * late_init covers any special initialization that an IP requires
- * after all of the have been initialized or something that needs to happen
- * late in the init process.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
+static int amdgpu_device_do_ip_late_init(struct amdgpu_device *adev)
  {
int i = 0, r;
  
@@ -1902,14 +1890,35 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)

amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
  
-	queue_delayed_work(system_wq, &adev->late_init_work,
-			   msecs_to_jiffies(AMDGPU_RESUME_MS));
-
amdgpu_device_fill_reset_magic(adev);
  
  	return 0;

  }
  
+/**

+ * amdgpu_device_ip_late_init - run late init for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Late initialization pass for hardware IPs.  The list of all the hardware
+ * IPs that make up the asic is walked and the late_init callbacks are run.
+ * late_init covers any special initialization that an IP requires
+ * after all of the have been initialized or something that needs to happen
+ * late in the init process.
+ * Returns 0 on success, negative error code on failure.
+ */
+static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
+{
+   int r;
+
+   r = amdgpu_device_do_ip_late_init(adev);
+
+   queue_delayed_work(system_wq, &adev->late_init_work,
+  msecs_to_jiffies(AMDGPU_RESUME_MS));
+
+   return r;
+}
+
  /**
   * amdgpu_device_ip_fini - run fini for hardware IPs
   *
@@ -3502,7 +3511,7 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info 
*hive,
if (vram_lost)

amdgpu_device_fill_reset_magic(tmp_adev);
  
-	r = amdgpu_device_ip_late_init(tmp_adev);

+   r = amdgpu_device_do_ip_late_init(tmp_adev);
if (r)
goto out;
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] dri2: reply to client for WaitMSC request in any case

2019-05-29 Thread Cui, Flora
Otherwise the client would wait for the reply forever and the desktop
appears to hang.

Signed-off-by: Flora Cui 
---
 src/amdgpu_dri2.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amdgpu_dri2.c b/src/amdgpu_dri2.c
index 44316ac..34353a7 100644
--- a/src/amdgpu_dri2.c
+++ b/src/amdgpu_dri2.c
@@ -1062,6 +1062,9 @@ static int amdgpu_dri2_schedule_wait_msc(ClientPtr 
client, DrawablePtr draw,
 out_complete:
if (wait_info)
amdgpu_dri2_deferred_event(NULL, 0, wait_info);
+
+   DRI2WaitMSCComplete(client, draw, target_msc, 0, 0);
+
return TRUE;
 }
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 1/3] drm/amdgpu: add field indicating if has PCIE atomics support

2019-05-29 Thread Xiao, Jack
The new field in the amdgpu device is used to record whether the
system has PCIE atomics support. The field can be exposed to
UMD or kfd to indicate whether PCIE atomics are supported.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d355e9a..b0a01e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -767,6 +767,7 @@ struct amdgpu_device {
	struct mutex		grbm_idx_mutex;
	struct dev_pm_domain	vga_pm_domain;
	bool			have_disp_power_ref;
+	bool			have_atomics_support;
 
/* BIOS */
	bool			is_atom_fw;
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/3] drm/amdgpu: enable PCIE atomics ops support

2019-05-29 Thread Xiao, Jack
GPU atomic operations depend on PCIE atomics support.
Always enable PCIE atomics ops support in case
it hasn't been enabled.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bdd1fe73..55772eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2562,6 +2562,17 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (adev->rio_mem == NULL)
DRM_INFO("PCI I/O BAR is not found.\n");
 
+   /* enable PCIE atomic ops */
+   r = pci_enable_atomic_ops_to_root(adev->pdev,
+   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+   PCI_EXP_DEVCAP2_ATOMIC_COMP64);
+   if (r) {
+   adev->have_atomics_support = false;
+   DRM_INFO("PCIE atomic ops is not supported\n");
+   } else {
+   adev->have_atomics_support = true;
+   }
+
amdgpu_device_get_pcie_info(adev);
 
/* early init functions */
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 3/3] drm/amdkfd: remove duplicated PCIE atomics request

2019-05-29 Thread Xiao, Jack
Since amdgpu always requests PCIE atomics, kfd doesn't
need a duplicated PCIE atomics enablement. Referring to the amdgpu
request result is enough.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 10 --
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 98326e3b..ddd6c52 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -620,6 +620,13 @@ bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, 
u32 vmid)
return false;
 }
 
+bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
+
+   return adev->have_atomics_support;
+}
+
 #ifndef CONFIG_HSA_AMD
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index f57f297..8d135c82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -135,6 +135,7 @@ int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum 
kgd_engine_type engine,
uint32_t vmid, uint64_t gpu_addr,
uint32_t *ib_cmd, uint32_t ib_len);
 void amdgpu_amdkfd_set_compute_idle(struct kgd_dev *kgd, bool idle);
+bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd);
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void);
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 7b4ea24..76a1599 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -481,17 +481,15 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 * 32 and 64-bit requests are possible and must be
 * supported.
 */
-   ret = pci_enable_atomic_ops_to_root(pdev,
-   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
-   PCI_EXP_DEVCAP2_ATOMIC_COMP64);
-   if (device_info->needs_pci_atomics && ret < 0) {
+   kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd);
+   if (device_info->needs_pci_atomics &&
+   !kfd->pci_atomic_requested) {
dev_info(kfd_device,
 "skipped device %x:%x, PCI rejects atomics\n",
 pdev->vendor, pdev->device);
kfree(kfd);
return NULL;
-   } else if (!ret)
-   kfd->pci_atomic_requested = true;
+   }
 
kfd->kgd = kgd;
kfd->device_info = device_info;
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 0/3] always enable PCIE atomics

2019-05-29 Thread Xiao, Jack
GPU atomic operations depend on PCIE atomics support.
Always enable PCIE atomics ops support in case
it hasn't been enabled.

Add a field to the amdgpu device that can be exposed to UMD or
kfd, indicating whether the system has PCIE atomics support.

Jack Xiao (3):
  drm/amdgpu: add field indicating if has PCIE atomics support
  drm/amdgpu: enable PCIE atomics ops support
  drm/amdkfd: remove duplicated PCIE atomics request

 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 10 --
 5 files changed, 24 insertions(+), 6 deletions(-)

-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-29 Thread Christoph Hellwig
On Tue, May 28, 2019 at 04:14:45PM +0200, Andrey Konovalov wrote:
> Thanks for a lot of valuable input! I've read through all the replies
> and got somewhat lost. What are the changes I need to do to this
> series?
> 
> 1. Should I move untagging for memory syscalls back to the generic
> code so other arches would make use of it as well, or should I keep
> the arm64 specific memory syscalls wrappers and address the comments
> on that patch?

It absolutely needs to move to common code.  Having arch code leads
to pointless (often unintentional) semantic difference between
architectures, and lots of boilerplate code.

Btw, can anyone of the arm crowd or Khalid comment on the linux-mm
thread on generic gup where I'm dealing with the pre-existing ADI
case of pointer untagging?
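
As a concrete illustration of the common-code point, the untagging can sit at
the top of each generic memory syscall, with untagged_addr() defined as a
no-op on architectures without tagged pointers. A sketch under those
assumptions (not the merged design; do_mprotect_pkey is the existing helper
in mm/mprotect.c):

#ifndef untagged_addr
#define untagged_addr(addr) (addr)	/* arches without pointer tagging */
#endif

SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
		unsigned long, prot)
{
	start = untagged_addr(start);	/* strip the tag once, generically */
	return do_mprotect_pkey(start, len, prot, -1);
}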