[no subject]

2024-05-14 Thread Jesse Zhang
>From 3348a4bb465834b165de80dc42d11630ac5c6a83 Mon Sep 17 00:00:00 2001
From: Jesse Zhang 
Date: Tue, 14 May 2024 13:59:18 +0800
Subject: [PATCH 2/2 v2] drm/amd/pm: check specific index for aldebaran

To avoid warning problems, drop index and
use PPSMC_MSG_GfxDriverReset instead of index for aldebaran.

Signed-off-by: Jesse Zhang 
Suggested-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
index a22eb6bbb05e..d671314c46c8 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
@@ -1880,17 +1880,18 @@ static int aldebaran_mode1_reset(struct smu_context *smu)
 
 static int aldebaran_mode2_reset(struct smu_context *smu)
 {
-   int ret = 0, index;
+   int ret = 0;
struct amdgpu_device *adev = smu->adev;
int timeout = 10;
 
-   index = smu_cmn_to_asic_specific_index(smu, CMN2ASIC_MAPPING_MSG,
-   SMU_MSG_GfxDeviceDriverReset);
-   if (index < 0 )
-   return -EINVAL;
mutex_lock(&smu->message_lock);
if (smu->smc_fw_version >= 0x00441400) {
-   ret = smu_cmn_send_msg_without_waiting(smu, (uint16_t)index, SMU_RESET_MODE_2);
+   ret = smu_cmn_send_smc_msg_with_param(smu, PPSMC_MSG_GfxDriverReset,
+   SMU_RESET_MODE_2, NULL);
+   if (ret) {
+   dev_err(smu->adev->dev, "Failed to mode2 reset!\n");
+   goto out;
+   }
/* This is similar to FLR, wait till max FLR timeout */
msleep(100);
dev_dbg(smu->adev->dev, "restore config space...\n");
-- 
2.25.1
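
The added `goto out` matters because the message is sent while message_lock is held, so an early return would leak the lock. A minimal userspace sketch of that pattern follows, assuming the `out:` label (not visible in this excerpt) releases the lock. `fake_lock`, `send_reset_msg` and `mode2_reset_sketch` are hypothetical stand-ins, not SMU driver APIs.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a mutex: records whether it is held. */
struct fake_lock {
	int held;
};

static void fake_lock_acquire(struct fake_lock *l)
{
	assert(l->held == 0);	/* not recursive */
	l->held = 1;
}

static void fake_lock_release(struct fake_lock *l)
{
	assert(l->held == 1);
	l->held = 0;
}

/* Hypothetical message send: fails with -EIO when 'fail' is non-zero. */
static int send_reset_msg(int fail)
{
	return fail ? -EIO : 0;
}

/* Every exit path goes through 'out', so the lock is never leaked. */
static int mode2_reset_sketch(struct fake_lock *lock, int fail)
{
	int ret;

	fake_lock_acquire(lock);
	ret = send_reset_msg(fail);
	if (ret)
		goto out;	/* skip the post-reset steps, still unlock */

	/* ... waiting and restoring config space would go here ... */
out:
	fake_lock_release(lock);
	return ret;
}
```

Calling `mode2_reset_sketch(&lock, 1)` returns the error and still leaves `lock.held` at 0, which is the property the single exit label is meant to guarantee.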



[no subject]

2023-07-24 Thread David Francis
This is in support of a RCCL change that requires specific
coherence behaviour.

Corresponding Thunk patch is at
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88




[no subject]

2023-03-12 Thread Danila Chernetsov
Date: Sat, 11 Mar 2023 19:00:03 +
Subject: [PATCH 5.10 1/1] drm/amdgpu: add error handling for drm_fb_helper_initial_config

drm_fb_helper_initial_config() returns an int that may carry an error code,
so add error handling that releases the allocated resources and returns
when an error occurs.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: d38ceaf99ed0 (drm/amdgpu: add core driver (v4))
Signed-off-by: Danila Chernetsov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
index 43f29ee0e3b0..e445a2c9f569 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
@@ -348,8 +348,17 @@ int amdgpu_fbdev_init(struct amdgpu_device *adev)
if (!amdgpu_device_has_dc_support(adev))
drm_helper_disable_unused_functions(adev_to_drm(adev));
 
-   drm_fb_helper_initial_config(&rfbdev->helper, bpp_sel);
-   return 0;
+   ret = drm_fb_helper_initial_config(&rfbdev->helper, bpp_sel);
+   if (ret)
+   goto fini;
+
+   return 0;
+
+fini:
+   drm_fb_helper_fini(&rfbdev->helper);
+
+   kfree(rfbdev);
+   return ret;
 }
 
 void amdgpu_fbdev_fini(struct amdgpu_device *adev)
-- 
2.25.1
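
The patch follows the usual "check the return value, jump to a cleanup label" shape. A small userspace sketch of that shape is below, with hypothetical names (`fb_state`, `initial_config`, `fb_init_sketch`) and `calloc`/`free` standing in for the kernel allocation; it is an illustration of the pattern, not the amdgpu code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fbdev state allocated during init. */
struct fb_state {
	int configured;
};

/* Hypothetical config step: returns 0 or a negative errno. */
static int initial_config(struct fb_state *st, int fail)
{
	if (fail)
		return -ENOMEM;
	st->configured = 1;
	return 0;
}

/* On success *out owns the state; on failure nothing stays allocated. */
static int fb_init_sketch(struct fb_state **out, int fail)
{
	struct fb_state *st;
	int ret;

	assert(out != NULL);
	*out = NULL;

	st = calloc(1, sizeof(*st));
	if (!st)
		return -ENOMEM;

	ret = initial_config(st, fail);
	if (ret)
		goto fini;	/* previously the result was ignored */

	*out = st;
	return 0;

fini:
	free(st);
	return ret;
}
```

The caller only ever sees either a fully configured state or an error with no allocation left behind.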



Re: Subject [PATCH] drm/radeon: Fix eDP for single-display iMac11,2

2023-02-23 Thread Alex Deucher
I've applied this manually.  Please use git to generate and email
patches in the future.

Thanks!

Alex

On Sun, Feb 19, 2023 at 12:02 AM Mark Hawrylak  wrote:
>
> From Mark Hawrylak 
>
> Apple iMac11,2 (mid 2010) also with Radeon HD-4670 that has the same
> issue as iMac10,1 (late 2009) where the internal eDP panel stays dark on
> driver load.  This patch treats iMac11,2 the same as iMac10,1,
> so the eDP panel stays active.
>
> Additional steps:
> Kernel boot parameter radeon.nomodeset=0 required to keep the eDP
> panel active.
>
> This patch is an extension of the commit 
> 564d8a2cf3abf16575af48bdc3e86e92ee8a617d
> Subject: [PATCH 3.16 100/192] drm/radeon: Fix eDP for single-display iMac10,1 
> (v2)
> Date: Mon, 09 Oct 2017 13:44:24 +0100 [thread overview]
> https://lore.kernel.org/all/lsq.1507553064.833262...@decadent.org.uk/
>
> By making a contribution to this project, I certify that:
> The contribution was created in whole or in part by me and I have the 
> right to submit it under the open source license indicated in the file; or
> The contribution is based upon previous work that, to the best of my 
> knowledge, is covered under an appropriate open source license and I have the 
> right under that license to submit that work with modifications, whether 
> created in whole or in part by me, under the same open source license (unless 
> I am permitted to submit under a different license), as indicated in the 
> file; or
> The contribution was provided directly to me by some other person who 
> certified (a), (b) or (c) and I have not modified it.
> I understand and agree that this project and the contribution are 
> public and that a record of the contribution (including all personal 
> information I submit with it, including my sign-off) is maintained 
> indefinitely and may be redistributed consistent with this project or the 
> open source license(s) involved.
>
> Signed-off-by: Mark Hawrylak 
>
> ---
>
> --- linux/drivers/gpu/drm/radeon/atombios_encoders.c.orig   2023-02-19 
> 14:03:03.126499290 +1100
> +++ linux/drivers/gpu/drm/radeon/atombios_encoders.c2023-02-19 
> 14:04:15.449831506 +1100
> @@ -2122,11 +2122,11 @@ int radeon_atom_pick_dig_encoder(struct
>
> /*
>  * On DCE32 any encoder can drive any block so usually just use crtc 
> id,
> -* but Apple thinks different at least on iMac10,1, so there use 
> linkb,
> +* but Apple thinks different at least on iMac10,1 and iMac11,2, so 
> there use linkb,
>  * otherwise the internal eDP panel will stay dark.
>  */
> if (ASIC_IS_DCE32(rdev)) {
> -   if (dmi_match(DMI_PRODUCT_NAME, "iMac10,1"))
> +   if (dmi_match(DMI_PRODUCT_NAME, "iMac10,1") || 
> dmi_match(DMI_PRODUCT_NAME, "iMac11,2"))
> enc_idx = (dig->linkb) ? 1 : 0;
> else
> enc_idx = radeon_crtc->crtc_id;
>
>
> --
>
> Regards
> Mark Hawrylak
> 0425 714 725


Subject [PATCH] drm/radeon: Fix eDP for single-display iMac11,2

2023-02-18 Thread Mark Hawrylak
>From Mark Hawrylak 

Apple iMac11,2 (mid 2010) also with Radeon HD-4670 that has the same
issue as iMac10,1 (late 2009) where the internal eDP panel stays dark on
driver load.  This patch treats iMac11,2 the same as iMac10,1,
so the eDP panel stays active.

Additional steps:
Kernel boot parameter radeon.nomodeset=0 required to keep the eDP
panel active.

This patch is an extension of the commit
564d8a2cf3abf16575af48bdc3e86e92ee8a617d
Subject: [PATCH 3.16 100/192] drm/radeon: Fix eDP for single-display
iMac10,1 (v2)
Date: Mon, 09 Oct 2017 13:44:24 +0100 [thread overview]
https://lore.kernel.org/all/lsq.1507553064.833262...@decadent.org.uk/

By making a contribution to this project, I certify that:
The contribution was created in whole or in part by me and I have
the right to submit it under the open source license indicated in the file;
or
The contribution is based upon previous work that, to the best of
my knowledge, is covered under an appropriate open source license and I
have the right under that license to submit that work with modifications,
whether created in whole or in part by me, under the same open source
license (unless I am permitted to submit under a different license), as
indicated in the file; or
The contribution was provided directly to me by some other person
who certified (a), (b) or (c) and I have not modified it.
I understand and agree that this project and the contribution are
public and that a record of the contribution (including all personal
information I submit with it, including my sign-off) is maintained
indefinitely and may be redistributed consistent with this project or the
open source license(s) involved.

Signed-off-by: Mark Hawrylak 

---

--- linux/drivers/gpu/drm/radeon/atombios_encoders.c.orig   2023-02-19
14:03:03.126499290 +1100
+++ linux/drivers/gpu/drm/radeon/atombios_encoders.c2023-02-19
14:04:15.449831506 +1100
@@ -2122,11 +2122,11 @@ int radeon_atom_pick_dig_encoder(struct

/*
 * On DCE32 any encoder can drive any block so usually just use
crtc id,
-* but Apple thinks different at least on iMac10,1, so there use
linkb,
+* but Apple thinks different at least on iMac10,1 and iMac11,2, so
there use linkb,
 * otherwise the internal eDP panel will stay dark.
 */
if (ASIC_IS_DCE32(rdev)) {
-   if (dmi_match(DMI_PRODUCT_NAME, "iMac10,1"))
+   if (dmi_match(DMI_PRODUCT_NAME, "iMac10,1") ||
dmi_match(DMI_PRODUCT_NAME, "iMac11,2"))
enc_idx = (dig->linkb) ? 1 : 0;
else
enc_idx = radeon_crtc->crtc_id;


-- 

Regards
Mark Hawrylak
0425 714 725
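
If more models turn out to need the same treatment, the chained dmi_match() calls could be expressed as a table lookup. The sketch below is a userspace illustration under that assumption: `product_needs_linkb` and `pick_enc_idx` are hypothetical names, `strcmp` stands in for dmi_match(DMI_PRODUCT_NAME, ...), and the ASIC_IS_DCE32 check from the patch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Models named in the patch as needing the link-based encoder choice. */
static const char *const linkb_quirk_models[] = {
	"iMac10,1",
	"iMac11,2",
};

/* Hypothetical userspace stand-in for the dmi_match() checks. */
static int product_needs_linkb(const char *product_name)
{
	size_t i;
	size_t n = sizeof(linkb_quirk_models) / sizeof(linkb_quirk_models[0]);

	assert(product_name != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(product_name, linkb_quirk_models[i]) == 0)
			return 1;
	}
	return 0;
}

/* Mirrors the patched branch: quirked models use linkb, others the CRTC id. */
static int pick_enc_idx(const char *product_name, int linkb, int crtc_id)
{
	if (product_needs_linkb(product_name))
		return linkb ? 1 : 0;
	return crtc_id;
}
```

With this layout, adding another affected model is a one-line change to the table rather than another `||` in the condition.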


[no subject]

2022-12-06 Thread Denis Arefev
Date: Thu, 10 Nov 2022 16:47:26 +0300
Subject: [PATCH] drm/amdgpu/display: Add pointer check

The return value of dc_create_state() is dereferenced at
amdgpu_dm.c:2027 without a NULL check.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Denis Arefev 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 0f7749e9424d..529483997154 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1960,7 +1960,9 @@ static int dm_resume(void *handle)
dc_release_state(dm_state->context);
dm_state->context = dc_create_state(dm->dc);
/* TODO: Remove dc_state->dccg, use dc->dccg directly. */
-   dc_resource_state_construct(dm->dc, dm_state->context);
+   if (dm_state->context) {
+   dc_resource_state_construct(dm->dc, dm_state->context);
+   }
 
/* Before powering on DC we need to re-initialize DMUB. */
r = dm_dmub_hw_init(adev);
-- 
2.25.1



Subject: [PATCH] driver: gpu: add failure check for ftell

2022-10-31 Thread 沈言峰
Add a return-value check for ftell() to improve robustness (and avoid
abnormal behavior).

Signed-off-by: SPeak 
Signed-off-by: shenyanfeng 
---
 drivers/gpu/drm/radeon/mkregtable.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/mkregtable.c b/drivers/gpu/drm/radeon/mkregtable.c
index 52a7246fe..c31c58e5f 100644
--- a/drivers/gpu/drm/radeon/mkregtable.c
+++ b/drivers/gpu/drm/radeon/mkregtable.c
@@ -193,6 +193,7 @@ static int parser_auth(struct table *t, const char *filename)
  regmatch_t match[4];
  char buf[1024];
  size_t end;
+ long pos;
  int len;
  int done = 0;
  int r;
@@ -228,12 +229,12 @@ static int parser_auth(struct table *t, const char *filename)
  last_reg = strtol(last_reg_s, NULL, 16);

  do {
- if (fgets(buf, 1024, file) == NULL) {
+ if ((fgets(buf, 1024, file) == NULL) || (pos = ftell(file)) < 0) {
  fclose(file);
  return -1;
  }
  len = strlen(buf);
- if (ftell(file) == end)
+ if (pos == end)
  done = 1;
  if (len) {
  r = regexec(&mask_rex, buf, 4, match, 0);
--
2.37.2

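
The idea of the patch, checking ftell() before comparing its result with the end offset, can be shown in isolation. The function below is a standalone sketch with a hypothetical name (`count_lines_until`); it assumes lines shorter than the buffer and is not the mkregtable parser itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Count lines from the current position up to byte offset 'end'.
 * Returns the line count, or -1 if fgets() or ftell() fails first.
 */
static int count_lines_until(FILE *file, long end)
{
	char buf[1024];
	long pos;
	int lines = 0;

	assert(file != NULL);
	for (;;) {
		if (fgets(buf, sizeof(buf), file) == NULL)
			return -1;
		pos = ftell(file);
		if (pos < 0)
			return -1;	/* ftell() reports failure as -1L */
		lines++;
		if (pos >= end)
			return lines;
	}
}
```

Keeping the position in a signed `long` makes the failure value visible; comparing an unchecked ftell() result directly against an unsigned end offset would hide it.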


[no subject]

2022-09-12 Thread Christian König
Hey Alex,

I've decided to split this patch set into two because we still can't
figure out where the VCN regressions come from.

Ruijing tested them and confirmed that they don't regress VCN.

Can you and maybe Felix take a look and review them?

Thanks,
Christian.




Subject: [PATCH 1/1] drm/amdgpu: Update RAS trigger error block support

2021-09-09 Thread Clements, John
[AMD Official Use Only]

Submitting patch to update RAS trigger error to support additional blocks


0002-drm-amdgpu-Update-RAS-trigger-error-block-support.patch
Description: 0002-drm-amdgpu-Update-RAS-trigger-error-block-support.patch


RE: Subject: [PATCH 1/1] drm/amdgpu: Update RAS trigger error block support

2021-09-09 Thread Zhang, Hawking
[AMD Official Use Only]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Clements, John 
Sent: Thursday, September 9, 2021 15:59
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Li, 
Candice 
Subject: Subject: [PATCH 1/1] drm/amdgpu: Update RAS trigger error block support


[AMD Official Use Only]

Submitting patch to update RAS trigger error to support additional blocks


Subject: [PATCH v2 0/2] Fix a hung during memory pressure test

2021-09-05 Thread Pan, Xinhui
[AMD Official Use Only]

A long time ago, someone reported that the system hung during a memory test.
Recently I have been trying to find and understand the potential
deadlock in the ttm/amdgpu code.

This patchset aims to fix the deadlock during ttm populate.

TTM has a parameter called pages_limit. When allocated GTT memory
reaches this limit, swapout is triggered. Because ttm_bo_swapout does
not return the correct retval, populate might hang.

The UVD IB test uses GTT, which might be insufficient, so a GPU recovery
hangs if populate hangs.

I have made one drm test which allocates two GTT BOs, submits gfx copy
commands and frees these BOs without waiting for the fence. What's more,
these gfx copy commands cause a gfx ring hang, so GPU recovery is
triggered.

Now here is one possible deadlock case.
gpu_recovery
 -> stop drm scheduler
 -> asic reset
   -> ib test
  -> tt populate (uvd ib test)
->  ttm_bo_swapout (BO A) // this always fails as the fence of
BO A would not be signaled by the scheduler or HW. Hit deadlock.
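
To make the role of the return value concrete, here is a toy model of a populate loop that keeps swapping out while it is over a page limit. It is only a sketch under stated assumptions: `toy_pool`, `toy_swapout` and `toy_populate` are invented for illustration and do not correspond to TTM functions. The point is that the loop can only terminate in the stuck case if swapout reports the failure.

```c
#include <assert.h>
#include <errno.h>

/* Toy model only; none of these names are TTM APIs. */
struct toy_pool {
	long allocated;		/* pages currently allocated */
	long limit;		/* analogue of pages_limit */
	long swappable;		/* pages that swapout can still release */
};

/* Returns the number of pages freed, or -EBUSY when nothing can be freed. */
static long toy_swapout(struct toy_pool *p)
{
	if (p->swappable == 0)
		return -EBUSY;	/* e.g. the buffer's fence never signals */
	p->swappable--;
	p->allocated--;
	return 1;
}

/* Allocate 'pages'; swap out while over the limit, give up on error. */
static int toy_populate(struct toy_pool *p, long pages)
{
	assert(pages >= 0);
	while (p->allocated + pages > p->limit) {
		long freed = toy_swapout(p);

		if (freed < 0)
			return -ENOMEM;	/* propagate instead of retrying forever */
	}
	p->allocated += pages;
	return 0;
}
```

If `toy_swapout` returned 0 instead of an error in the stuck case, the `while` loop above would never exit, which is the shape of the hang described in this message.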

The drm test patch is pasted below.
#modprobe ttm pages_limit=65536
#amdgpu_test -s 1 -t 4
---
 tests/amdgpu/basic_tests.c | 32 ++--
 1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index dbf02fee..f85ed340 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -65,13 +65,16 @@ static void amdgpu_direct_gma_test(void);
 static void amdgpu_command_submission_write_linear_helper(unsigned ip_type);
 static void amdgpu_command_submission_const_fill_helper(unsigned ip_type);
 static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type);
-static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle,
+static void _amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle,
   unsigned ip_type,
   int instance, int pm4_dw, uint32_t 
*pm4_src,
   int res_cnt, amdgpu_bo_handle *resources,
   struct amdgpu_cs_ib_info *ib_info,
-  struct amdgpu_cs_request *ibs_request);
+  struct amdgpu_cs_request *ibs_request, 
int sync, int repeat);

+#define amdgpu_test_exec_cs_helper(...) \
+   _amdgpu_test_exec_cs_helper(__VA_ARGS__, 1, 1)
+
 CU_TestInfo basic_tests[] = {
{ "Query Info Test",  amdgpu_query_info_test },
{ "Userptr Test",  amdgpu_userptr_test },
@@ -1341,12 +1344,12 @@ static void amdgpu_command_submission_compute(void)
  * pm4_src, resources, ib_info, and ibs_request
  * submit command stream described in ibs_request and wait for this IB 
accomplished
  */
-static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle,
+static void _amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle,
   unsigned ip_type,
   int instance, int pm4_dw, uint32_t 
*pm4_src,
   int res_cnt, amdgpu_bo_handle *resources,
   struct amdgpu_cs_ib_info *ib_info,
-  struct amdgpu_cs_request *ibs_request)
+  struct amdgpu_cs_request *ibs_request, 
int sync, int repeat)
 {
int r;
uint32_t expired;
@@ -1395,12 +1398,15 @@ static void 
amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle,
CU_ASSERT_NOT_EQUAL(ibs_request, NULL);

/* submit CS */
-   r = amdgpu_cs_submit(context_handle, 0, ibs_request, 1);
+   while (repeat--)
+   r = amdgpu_cs_submit(context_handle, 0, ibs_request, 1);
CU_ASSERT_EQUAL(r, 0);

r = amdgpu_bo_list_destroy(ibs_request->resources);
CU_ASSERT_EQUAL(r, 0);

+   if (!sync)
+   return;
fence_status.ip_type = ip_type;
fence_status.ip_instance = 0;
fence_status.ring = ibs_request->ring;
@@ -1667,7 +1673,7 @@ static void 
amdgpu_command_submission_sdma_const_fill(void)

 static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type)
 {
-   const int sdma_write_length = 1024;
+   const int sdma_write_length = (255) << 20;
const int pm4_dw = 256;
amdgpu_context_handle context_handle;
amdgpu_bo_handle bo1, bo2;
@@ -1715,8 +1721,6 @@ static void 
amdgpu_command_submission_copy_linear_helper(unsigned ip_type)
_va_handle);
CU_ASSERT_EQUAL(r, 0);

-   /* set bo1 */
-   memset((void*)bo1_cpu, 0xaa, sdma_write_length);

/* allocate UC bo2 for sDMA use */
r = amdgpu_bo_alloc_and_map(device_handle,
@@ -1727,8 +1731,6 

[no subject]

2021-05-06 Thread David M Nieto
During stress testing we found that with some Vulkan applications
the fence information displayed in the recently added fdinfo was not
properly calculated. Two issues were discovered:

(1) A missing dma_fence_put in the loop that calculates the usage
ratios when the fence is being ignored.
(2) The approximation for the ratio calculation is not accurate
when accounting for non-active contexts. The fix is to ignore those
contexts if they have activity ratios lower than 0.01%.

Attached is also a script demonstrating how the fdinfo can be used
to monitor gpu usage on running processes.

#!/usr/bin/env python3

#
# Copyright (C) 2021 Advanced Micro Devices. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of
# this software and associated documentation files (the "Software"), to deal in
# the Software without restriction, including without limitation the rights to
# use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
# the Software, and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
# FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
# COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
# IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#

from tokenize import tokenize
import sys
import os
import pwd

total_mem = dict()
total_usage = dict()
def can_access(path):
    return os.access(path + "/fdinfo", os.X_OK)


def calc_perc(entry, metric):
    if not metric in entry:
        return 0.0
    if (type(entry[metric]) == list) :
        return sum(entry[metric])
    else :
        return entry[metric]

def process_pid(file):
    stat = dict()
    pasids = []

    for fd in os.scandir(file.path + "/fdinfo"):
        entry = {}
        with open(fd) as f:
            for line in f:
                entries = line.strip().split()
                if (entries[0] == "pdev:") :
                    entry["pdev"] = entries[1]
                elif (entries[0] == "pasid:") :
                    entry["pasid"] = entries[1]
                elif (entries[0] == "vram") :
                    entry["mem"] = int(entries[2])
                elif ("gfx" in entries[0]) :
                    if not "gfx" in entry :
                        entry["gfx"] = [0,0,0,0,0,0,0,0]
                    entry["gfx"][int(entries[0].lstrip("gfx").rstrip(":"))] = float(entries[1].rstrip("%"))
                elif ("dma" in entries[0]) :
                    if not "dma" in entry :
                        entry["dma"] = [0,0,0,0,0,0,0,0]
                    entry["dma"][int(entries[0].lstrip("dma").rstrip(":"))] = float(entries[1].rstrip("%"))
                elif ("dec" in entries[0]) :
                    if not "dec" in entry :
                        entry["dec"] = [0,0,0,0,0,0,0,0]
                    entry["dec"][int(entries[0].lstrip("dec").rstrip(":"))] = float(entries[1].rstrip("%"))
                elif ("enc" in entries[0]) :
                    if not "enc" in entry :
                        entry["enc"] = [0,0,0,0,0,0,0,0]
                    entry["enc"][int(entries[0].lstrip("enc").rstrip(":"))] = float(entries[1].rstrip("%"))
                elif ("compute" in entries[0]) :
                    if not "compute" in entry :
                        entry["compute"] = [0,0,0,0,0,0,0,0]
                    entry["compute"][int(entries[0].lstrip("compute").rstrip(":"))] = float(entries[1].rstrip("%"))

        if not "pdev" in entry:
            continue
        if not "pasid" in entry :
            continue
        if (entry["pdev"], entry["pasid"]) in pasids:
            continue
        pasids.append((entry["pdev"], entry["pasid"]))

        pdev = entry["pdev"]

        if not pdev in stat:
            stat[pdev] = dict()

        if "mem" in entry :
  

[no subject]

2021-02-27 Thread CCF_100
get 058960
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Subject: [RFC] clang tooling cleanups

2020-11-10 Thread Tom Rix

On 11/9/20 6:52 PM, Joe Perches wrote:
> On Tue, 2020-10-27 at 09:42 -0700, t...@redhat.com wrote:
>> This rfc will describe
>> An upcoming treewide cleanup.
>> How clang tooling was used to programatically do the clean up.
>> Solicit opinions on how to generally use clang tooling.
>>
>> The clang warning -Wextra-semi-stmt produces about 10k warnings.
>> Reviewing these, a subset of semicolon after a switch looks safe to
>> fix all the time.  An example problem
>>
>> void foo(int a) {
>>  switch(a) {
>>     case 1:
>> ...
>>  }; <--- extra semicolon
>> }
>>
>> Treewide, there are about 100 problems in 50 files for x86_64 allyesconfig.
>> These fixes will be the upcoming cleanup.
> coccinelle already does some of these.
>
> For instance: scripts/coccinelle/misc/semicolon.cocci
>
> Perhaps some tool coordination can be done here as
> coccinelle/checkpatch/clang/Lindent call all be used
> to do some facet or another of these cleanup issues.

Thanks for pointing this out.

I will take a look at it.

Tom

>
>



Re: Subject: [RFC] clang tooling cleanups

2020-11-10 Thread Joe Perches
On Tue, 2020-10-27 at 09:42 -0700, t...@redhat.com wrote:
> This rfc will describe
> An upcoming treewide cleanup.
> How clang tooling was used to programatically do the clean up.
> Solicit opinions on how to generally use clang tooling.
> 
> The clang warning -Wextra-semi-stmt produces about 10k warnings.
> Reviewing these, a subset of semicolon after a switch looks safe to
> fix all the time.  An example problem
> 
> void foo(int a) {
>  switch(a) {
>  case 1:
>  ...
>  }; <--- extra semicolon
> }
> 
> Treewide, there are about 100 problems in 50 files for x86_64 allyesconfig.
> These fixes will be the upcoming cleanup.

coccinelle already does some of these.

For instance: scripts/coccinelle/misc/semicolon.cocci

Perhaps some tool coordination can be done here as
coccinelle/checkpatch/clang/Lindent can all be used
to do some facet or another of these cleanup issues.





Subject: [RFC] clang tooling cleanups

2020-10-27 Thread trix
This rfc will describe
An upcoming treewide cleanup.
How clang tooling was used to programmatically do the clean up.
Solicit opinions on how to generally use clang tooling.

The clang warning -Wextra-semi-stmt produces about 10k warnings.
Reviewing these, a subset of semicolon after a switch looks safe to
fix all the time.  An example problem

void foo(int a) {
 switch(a) {
   case 1:
   ...
 }; <--- extra semicolon
}

Treewide, there are about 100 problems in 50 files for x86_64 allyesconfig.
These fixes will be the upcoming cleanup.

clang already supports fixing this problem. Add to your command line

  clang -c -Wextra-semi-stmt -Xclang -fixit foo.c

  foo.c:8:3: warning: empty expression statement has no effect;
remove unnecessary ';' to silence this warning [-Wextra-semi-stmt]
};
 ^
  foo.c:8:3: note: FIX-IT applied suggested code changes
  1 warning generated.

The big problem with using this treewide is that it will fix all 10k problems.
10k changes to analyze and upstream is not practical.

Another problem is the generic fixer only removes the semicolon.
So empty lines with some tabs need to be manually cleaned.

What is needed is a more precise fixer.

Enter clang-tidy.
https://clang.llvm.org/extra/clang-tidy/

Already part of the static checker infrastructure, invoke on the clang
build with
  make clang-tidy

It is only a matter of coding up a specific checker for the cleanup.
Upstream, the review is happening here
https://reviews.llvm.org/D90180

The development of a checker/fixer is
Start with a reproducer

void foo (int a) {
  switch (a) {};
}

Generate the abstract syntax tree (AST)

  clang -Xclang -ast-dump foo.c

`-FunctionDecl 
  |-ParmVarDecl 
  `-CompoundStmt 
|-SwitchStmt 
| |-ImplicitCastExpr
| | `-DeclRefExpr
| `-CompoundStmt
`-NullStmt

Write a matcher to get you most of the way

void SwitchSemiCheck::registerMatchers(MatchFinder *Finder) {
  Finder->addMatcher(
  compoundStmt(has(switchStmt().bind("switch"))).bind("comp"), this);
}

The 'bind' method is important, it allows a string to be associated
with a node in the AST.  In this case these are

`-FunctionDecl 
  |-ParmVarDecl 
  `-CompoundStmt < comp
|-SwitchStmt < switch
| |-ImplicitCastExpr
| | `-DeclRefExpr
| `-CompoundStmt
`-NullStmt

When a match is made the 'check' method will be called.

  void SwitchSemiCheck::check(const MatchFinder::MatchResult &Result) {
    auto *C = Result.Nodes.getNodeAs<CompoundStmt>("comp");
    auto *S = Result.Nodes.getNodeAs<SwitchStmt>("switch");

This is where the strings in the bind calls are changed to nodes

`-FunctionDecl 
  |-ParmVarDecl 
  `-CompoundStmt < comp, C
|-SwitchStmt < switch, S
| |-ImplicitCastExpr
| | `-DeclRefExpr
| `-CompoundStmt
`-NullStmt <-- looking for N

And then more logic to find the NullStmt

  auto Current = C->body_begin();
  auto Next = Current;
  Next++;
  while (Next != C->body_end()) {
if (*Current == S) {
  if (const auto *N = dyn_cast<NullStmt>(*Next)) {

When it is found, a warning is printed and a FixItHint is proposed.

  auto H = FixItHint::CreateReplacement(
SourceRange(S->getBody()->getEndLoc(), N->getSemiLoc()), "}");
  diag(N->getSemiLoc(), "unneeded semicolon") << H;

This fixit replaces from the end of switch to the semicolon with a
'}'.  Because the end of the switch is '}' this has the effect of
removing all the whitespace as well as the semicolon.
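
The walk over the compound statement's children can be mimicked without clang at all. The following is a toy analogue in plain C, assuming a body flattened into an array of statement kinds; `stmt_kind` and `count_switch_semis` are hypothetical and are not part of the clang-tidy checker or the clang API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy statement kinds; a real checker would inspect clang AST nodes. */
enum stmt_kind {
	STMT_SWITCH,
	STMT_NULL,	/* a lone ';' */
	STMT_OTHER,
};

/*
 * Count null statements that directly follow a switch statement in one
 * compound-statement body, given as an array of statement kinds.
 */
static size_t count_switch_semis(const enum stmt_kind *body, size_t n)
{
	size_t i, hits = 0;

	assert(body != NULL || n == 0);
	for (i = 0; i + 1 < n; i++) {
		if (body[i] == STMT_SWITCH && body[i + 1] == STMT_NULL)
			hits++;
	}
	return hits;
}
```

Only the null statement immediately after a switch is counted, which is what keeps the fixer restricted to the "semicolon after a switch" subset instead of all 10k warnings.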

Because the checker is placed among clang-tidy's existing linuxkernel
checkers, all that was needed to fix the tree was to add a '-fix' to the
build's clang-tidy call.

I am looking for opinions on what we want to do specifically with
cleanups and generally about other source-to-source programmatic
changes to the code base.

For cleanups, I think we need a new toplevel target

clang-tidy-fix

And an explicit list of fixers that have a very high (100%?) fix rate.

Ideally a bot should make the changes, but a bot could also nag folks.
Is there interest in a bot making the changes? Does one already exist?

The general source-to-source is a bit blue sky.  For example, it could
automagically refactor an API, outline similar cut-n-pasted functions, etc.
Is there anything on someone's wishlist you want to try out?

Signed-off-by: Tom Rix 



[no subject]

2020-07-16 Thread Mauro Rossi
The series adds SI support to AMD DC

Changelog:

[RFC]
Preliminary Proof Of Concept, with DCE8 headers still used in dce60_resources.c

[PATCH v2]
Rebase on amd-staging-drm-next dated 17-Oct-2018

[PATCH v3]
Add support for DCE6 specific headers,
ad hoc DCE6 macros, functions and fixes,
rebase on current amd-staging-drm-next


Commits [01/27]..[08/27] SI support added in various DC components

[PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
[PATCH v3 02/27] drm/amd/display: add asics info for SI parts
[PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
[PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
[PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
[PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
[PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
[PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)

Commits [09/27]..[24/27] DCE6 specific code adaptions

[PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
[PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
[PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
[PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
[PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
[PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
[PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
[PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
[PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
[PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
[PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
[PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
[PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
[PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
[PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
[PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)


Commits [25/27]..[27/27] SI support final enablements

[PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonarie and later
[PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
[PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)


Signed-off-by: Mauro Rossi 



[PATCH v2 4/5] Subject: drm/amdgpu: Redo XGMI reset synchronization.

2019-12-13 Thread Andrey Grodzovsky
Use task barrier in XGMI hive to synchronize ASIC resets
across devices in XGMI hive.

v2: Return right away with a warning if there is no XGMI hive; update doc.
Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 +-
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1d19edfa..2ae944c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -67,6 +67,7 @@
 #include "amdgpu_tmz.h"
 
 #include 
+#include 
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -2663,14 +2664,38 @@ static void amdgpu_device_xgmi_reset_func(struct work_struct *__work)
 {
struct amdgpu_device *adev =
container_of(__work, struct amdgpu_device, xgmi_reset_work);
+   struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, 0);
 
-   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)
-   adev->asic_reset_res = (adev->in_baco == false) ?
-   amdgpu_device_baco_enter(adev->ddev) :
-   qamdgpu_device_baco_exit(adev->ddev);
-   else
-   adev->asic_reset_res = amdgpu_asic_reset(adev);
+   /* It's a bug to not have a hive within this function */
+   if (WARN_ON(!hive))
+   return;
+
+   /*
+* Use task barrier to synchronize all xgmi reset works across the
+* hive. task_barrier_enter and task_barrier_exit will block
+* until all the threads running the xgmi reset works reach
+* those points. task_barrier_full will do both blocks.
+*/
+   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) {
+
+   task_barrier_enter(&hive->tb);
+   adev->asic_reset_res = amdgpu_device_baco_enter(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+
+   task_barrier_exit(&hive->tb);
+   adev->asic_reset_res = amdgpu_device_baco_exit(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+   } else {
+
+   task_barrier_full(&hive->tb);
+   adev->asic_reset_res =  amdgpu_asic_reset(adev);
+   }
 
+fail:
if (adev->asic_reset_res)
DRM_WARN("ASIC reset failed with error, %d for drm dev, %s",
 adev->asic_reset_res, adev->ddev->unique);
-- 
2.7.4
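
For readers unfamiliar with the two-phase barrier described in the patch comment, here is a minimal user-space sketch, assuming POSIX threads and semaphores. It is not the kernel's task barrier implementation. Every demo_* name is hypothetical, and the counter updates merely stand in for BACO enter and BACO exit.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define DEMO_NUM_DEVICES 4u

/* Hypothetical user-space stand-in for a two-phase task barrier. */
struct demo_barrier {
	unsigned int n;        /* number of participating threads */
	atomic_uint count;     /* threads currently between enter and exit */
	sem_t enter_turnstile;
	sem_t exit_turnstile;
};

static struct demo_barrier demo_hive_tb;
static atomic_uint demo_phase1_done;  /* stand-in for "BACO enter finished" */
static atomic_uint demo_saw_all_done; /* threads that saw every peer finish */

static void demo_down(sem_t *sem)
{
	/* Retry if a signal interrupts the wait. */
	while (sem_wait(sem) == -1 && errno == EINTR)
		;
}

static void demo_release_all(sem_t *sem, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		sem_post(sem);
}

/* Block until all n threads have arrived. */
static void demo_barrier_enter(struct demo_barrier *tb)
{
	if (atomic_fetch_add(&tb->count, 1) + 1 == tb->n)
		demo_release_all(&tb->enter_turnstile, tb->n);
	demo_down(&tb->enter_turnstile);
}

/* Block until all n threads have finished the work done after enter. */
static void demo_barrier_exit(struct demo_barrier *tb)
{
	if (atomic_fetch_sub(&tb->count, 1) - 1 == 0)
		demo_release_all(&tb->exit_turnstile, tb->n);
	demo_down(&tb->exit_turnstile);
}

static void *demo_reset_work(void *arg)
{
	(void)arg;
	demo_barrier_enter(&demo_hive_tb);
	atomic_fetch_add(&demo_phase1_done, 1); /* "BACO enter" */
	demo_barrier_exit(&demo_hive_tb);
	/* "BACO exit": by now every peer must have completed phase one. */
	if (atomic_load(&demo_phase1_done) == DEMO_NUM_DEVICES)
		atomic_fetch_add(&demo_saw_all_done, 1);
	return NULL;
}

/* Runs one simulated hive reset; returns how many threads saw phase one complete. */
unsigned int demo_run_hive_reset(void)
{
	pthread_t threads[DEMO_NUM_DEVICES];
	unsigned int i;

	demo_hive_tb.n = DEMO_NUM_DEVICES;
	atomic_store(&demo_hive_tb.count, 0);
	atomic_store(&demo_phase1_done, 0);
	atomic_store(&demo_saw_all_done, 0);
	sem_init(&demo_hive_tb.enter_turnstile, 0, 0);
	sem_init(&demo_hive_tb.exit_turnstile, 0, 0);

	for (i = 0; i < DEMO_NUM_DEVICES; i++) {
		int err = pthread_create(&threads[i], NULL, demo_reset_work, NULL);
		assert(err == 0);
		(void)err;
	}
	for (i = 0; i < DEMO_NUM_DEVICES; i++)
		pthread_join(threads[i], NULL);

	sem_destroy(&demo_hive_tb.enter_turnstile);
	sem_destroy(&demo_hive_tb.exit_turnstile);
	return atomic_load(&demo_saw_all_done);
}
```

Calling demo_run_hive_reset() from a main() built with -pthread should return DEMO_NUM_DEVICES, because no thread can pass the exit turnstile before every thread has finished its first phase.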



Re: [RESEND PATCH 4/5] Subject: drm/amdgpu: Redo XGMI reset synchronization.

2019-12-12 Thread Andrey Grodzovsky


On 12/11/19 11:05 PM, Ma, Le wrote:

[AMD Official Use Only - Internal Distribution Only]

-Original Message-
From: Andrey Grodzovsky 
Sent: Thursday, December 12, 2019 4:39 AM
To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Ma, Le ; Zhang, Hawking ; Quan, Evan ; Grodzovsky, Andrey 
Subject: [RESEND PATCH 4/5] Subject: drm/amdgpu: Redo XGMI reset synchronization.

Use task barrier in XGMI hive to synchronize ASIC resets across devices in XGMI hive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzov...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 42 +-
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1d19edfa..e4089a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -67,6 +67,7 @@
 #include "amdgpu_tmz.h"
 
 #include 
+#include 
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -2663,14 +2664,43 @@ static void amdgpu_device_xgmi_reset_func(struct work_struct *__work)
 {
struct amdgpu_device *adev =
container_of(__work, struct amdgpu_device, xgmi_reset_work);
+   struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, 0);
 
-   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)
-   adev->asic_reset_res = (adev->in_baco == false) ?
-   amdgpu_device_baco_enter(adev->ddev) :
-   qamdgpu_device_baco_exit(adev->ddev);
-   else
-   adev->asic_reset_res = amdgpu_asic_reset(adev);
+   /*
+* Use task barrier to synchronize all xgmi reset works across the
+* hive.
+* task_barrier_enter and task_barrier_exit will block until all the
+* threads running the xgmi reset works reach those points. I assume
+* guarantee of progress here for all the threads as the workqueue code
+* creates new worker threads as needed by amount of work items in queue
+* (see worker_thread) and also each thread sleeps in the barrier and by
+* this yielding the CPU for other work threads to make progress.
+*/

[Le]: This comment can be adjusted since we switch to system_unbound_wq in patch #5.

+   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) {
+
+   if (hive)
+   task_barrier_enter(&hive->tb);

[Le]: The hive condition is checked multiple times; it can be checked only once, right after the assignment.

Not sure what you meant here, but note that within amdgpu_device_xgmi_reset_func it is a bug for amdgpu_get_xgmi_hive to return NULL, so I think it is better to add WARN_ON(!hive,"...") and return right at the beginning of the function if hive == NULL.

Andrey

+
+   adev->asic_reset_res = amdgpu_device_baco_enter(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+
+   if (hive)
+   task_barrier_exit(&hive->tb);

[Le]: Same as above.

+
+   adev->asic_reset_res = amdgpu_device_baco_exit(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+   } else {
+   if (hive)
+   task_barrier_full(&hive->tb);

[Le]: Same as above.

With above addressed, Reviewed-by: Le Ma <le...@amd.com>

Regards,
Ma Le

+
+   adev->asic_reset_res =  amdgpu_asic_reset(adev);
+   }
 
+fail:
if (adev->asic_reset_res)
DRM_WARN("ASIC reset failed with error, %d for drm dev, %s",
 adev->asic_reset_res, adev->ddev->unique);

--

2.7.4
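
The early-return guard proposed in the reply above can be sketched in user space as follows. DEMO_WARN_ON is a hypothetical stand-in that, like the kernel's WARN_ON(), takes only a condition and evaluates to it (the kernel macro that also accepts a format string is WARN()). The demo_* structs are illustrative only and do not mirror the amdgpu types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in: report the condition and evaluate to it. */
#define DEMO_WARN_ON(cond) demo_warn_on(!!(cond), #cond, __FILE__, __LINE__)

static bool demo_warn_on(bool cond, const char *expr, const char *file, int line)
{
	if (cond)
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	return cond;
}

/* Illustrative types only; they do not mirror the amdgpu structures. */
struct demo_hive {
	int id;
};

struct demo_device {
	struct demo_hive *hive; /* NULL when the device is not in a hive */
	int reset_res;          /* result of the last simulated reset */
	int reset_calls;        /* how many times the reset body ran */
};

static void demo_xgmi_reset_func(struct demo_device *dev)
{
	struct demo_hive *hive;

	assert(dev != NULL);
	hive = dev->hive;

	/* One check right after the assignment replaces every later if (hive). */
	if (DEMO_WARN_ON(!hive))
		return;

	/* From here on the hive is known to be valid. */
	dev->reset_calls++;
	dev->reset_res = 0;
}
```

With the guard at the top, a device without a hive leaves reset_res and reset_calls untouched, and the rest of the function needs no further NULL checks.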



RE: [RESEND PATCH 4/5] Subject: drm/amdgpu: Redo XGMI reset synchronization.

2019-12-11 Thread Ma, Le
[AMD Official Use Only - Internal Distribution Only]

-Original Message-
From: Andrey Grodzovsky 
Sent: Thursday, December 12, 2019 4:39 AM
To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Ma, Le ; Zhang, Hawking ; Quan, Evan ; Grodzovsky, Andrey 
Subject: [RESEND PATCH 4/5] Subject: drm/amdgpu: Redo XGMI reset synchronization.

Use task barrier in XGMI hive to synchronize ASIC resets across devices in XGMI hive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzov...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 42 +-
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1d19edfa..e4089a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -67,6 +67,7 @@
 #include "amdgpu_tmz.h"
 
 #include 
+#include 
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -2663,14 +2664,43 @@ static void amdgpu_device_xgmi_reset_func(struct work_struct *__work)
 {
struct amdgpu_device *adev =
container_of(__work, struct amdgpu_device, xgmi_reset_work);
+   struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, 0);
 
-   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)
-   adev->asic_reset_res = (adev->in_baco == false) ?
-   amdgpu_device_baco_enter(adev->ddev) :
-   qamdgpu_device_baco_exit(adev->ddev);
-   else
-   adev->asic_reset_res = amdgpu_asic_reset(adev);
+   /*
+* Use task barrier to synchronize all xgmi reset works across the
+* hive.
+* task_barrier_enter and task_barrier_exit will block until all the
+* threads running the xgmi reset works reach those points. I assume
+* guarantee of progress here for all the threads as the workqueue code
+* creates new worker threads as needed by amount of work items in queue
+* (see worker_thread) and also each thread sleeps in the barrier and by
+* this yielding the CPU for other work threads to make progress.
+*/

[Le]: This comment can be adjusted since we switch to system_unbound_wq in patch #5.

+   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) {
+
+   if (hive)
+   task_barrier_enter(&hive->tb);

[Le]: The hive condition is checked multiple times; it can be checked only once, right after the assignment.

+
+   adev->asic_reset_res = amdgpu_device_baco_enter(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+
+   if (hive)
+   task_barrier_exit(&hive->tb);

[Le]: Same as above.

+
+   adev->asic_reset_res = amdgpu_device_baco_exit(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+   } else {
+   if (hive)
+   task_barrier_full(&hive->tb);

[Le]: Same as above.

With above addressed, Reviewed-by: Le Ma <le...@amd.com>

Regards,
Ma Le

+
+   adev->asic_reset_res =  amdgpu_asic_reset(adev);
+   }
 
+fail:
if (adev->asic_reset_res)
DRM_WARN("ASIC reset failed with error, %d for drm dev, %s",
 adev->asic_reset_res, adev->ddev->unique);

--

2.7.4




[RESEND PATCH 4/5] Subject: drm/amdgpu: Redo XGMI reset synchronization.

2019-12-11 Thread Andrey Grodzovsky
Use task barrier in XGMI hive to synchronize ASIC resets
across devices in XGMI hive.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 42 +-
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1d19edfa..e4089a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -67,6 +67,7 @@
 #include "amdgpu_tmz.h"
 
 #include 
+#include 
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -2663,14 +2664,43 @@ static void amdgpu_device_xgmi_reset_func(struct work_struct *__work)
 {
struct amdgpu_device *adev =
container_of(__work, struct amdgpu_device, xgmi_reset_work);
+   struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, 0);
 
-   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)
-   adev->asic_reset_res = (adev->in_baco == false) ?
-   amdgpu_device_baco_enter(adev->ddev) :
-   qamdgpu_device_baco_exit(adev->ddev);
-   else
-   adev->asic_reset_res = amdgpu_asic_reset(adev);
+   /*
+* Use task barrier to synchronize all xgmi reset works across the
+* hive.
+* task_barrier_enter and task_barrier_exit will block until all the
+* threads running the xgmi reset works reach those points. I assume
+* guarantee of progress here for all the threads as the workqueue code
+* creates new worker threads as needed by amount of work items in queue
+* (see worker_thread) and also each thread sleeps in the barrier and by
+* this yielding the CPU for other work threads to make progress.
+*/
+   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) {
+
+   if (hive)
+   task_barrier_enter(&hive->tb);
+
+   adev->asic_reset_res = amdgpu_device_baco_enter(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+
+   if (hive)
+   task_barrier_exit(&hive->tb);
+
+   adev->asic_reset_res = amdgpu_device_baco_exit(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+   } else {
+   if (hive)
+   task_barrier_full(&hive->tb);
+
+   adev->asic_reset_res =  amdgpu_asic_reset(adev);
+   }
 
+fail:
if (adev->asic_reset_res)
DRM_WARN("ASIC reset failed with error, %d for drm dev, %s",
 adev->asic_reset_res, adev->ddev->unique);
-- 
2.7.4



[PATCH 4/5] Subject: drm/amdgpu: Redo XGMI reset synchronization.

2019-12-11 Thread Andrey Grodzovsky
Use task barrier in XGMI hive to synchronize ASIC resets
across devices in XGMI hive.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 42 +-
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1d19edfa..e4089a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -67,6 +67,7 @@
 #include "amdgpu_tmz.h"
 
 #include 
+#include 
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -2663,14 +2664,43 @@ static void amdgpu_device_xgmi_reset_func(struct work_struct *__work)
 {
struct amdgpu_device *adev =
container_of(__work, struct amdgpu_device, xgmi_reset_work);
+   struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, 0);
 
-   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)
-   adev->asic_reset_res = (adev->in_baco == false) ?
-   amdgpu_device_baco_enter(adev->ddev) :
-   qamdgpu_device_baco_exit(adev->ddev);
-   else
-   adev->asic_reset_res = amdgpu_asic_reset(adev);
+   /*
+* Use task barrier to synchronize all xgmi reset works across the
+* hive.
+* task_barrier_enter and task_barrier_exit will block until all the
+* threads running the xgmi reset works reach those points. I assume
+* guarantee of progress here for all the threads as the workqueue code
+* creates new worker threads as needed by amount of work items in queue
+* (see worker_thread) and also each thread sleeps in the barrier and by
+* this yielding the CPU for other work threads to make progress.
+*/
+   if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) {
+
+   if (hive)
+   task_barrier_enter(&hive->tb);
+
+   adev->asic_reset_res = amdgpu_device_baco_enter(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+
+   if (hive)
+   task_barrier_exit(&hive->tb);
+
+   adev->asic_reset_res = amdgpu_device_baco_exit(adev->ddev);
+
+   if (adev->asic_reset_res)
+   goto fail;
+   } else {
+   if (hive)
+   task_barrier_full(&hive->tb);
+
+   adev->asic_reset_res =  amdgpu_asic_reset(adev);
+   }
 
+fail:
if (adev->asic_reset_res)
DRM_WARN("ASIC reset failed with error, %d for drm dev, %s",
 adev->asic_reset_res, adev->ddev->unique);
-- 
2.7.4



[no subject]

2018-03-02 Thread Keith Packard
Here are the patches to the modesetting driver amended for the amdgpu
driver.

-keith



[no subject]

2017-02-10 Thread Tom St Denis
Fix bug where GPU_POWER wasn't accessible because we wrote
to *size early...

