On 2019/8/29 3:22 PM, Christian König wrote:
On 29.08.19 at 07:55, zhoucm1 wrote:
On 2019/8/29 1:08 AM, Marek Olšák wrote:
It can't break an older driver, because there is no older driver
that requires the static allocation.
Note that closed source drivers don't count, because they don't
Only skimmed over the patches, but in general looks good to me.
Feel free to add an Acked-by: Christian König
to the whole series.
But somebody with more ras knowledge than I have should probably take a
look as well.
Christian.
On 29.08.19 at 10:59, Guchun Chen wrote:
These patches are
The series is:
Reviewed-by: Tao Zhou
> -----Original Message-----
> From: Guchun Chen
> Sent: August 29, 2019 16:59
> To: amd-gfx@lists.freedesktop.org; Zhang, Hawking
> ; Li, Dennis ; Koenig,
> Christian ; Deucher, Alexander
> ; Zhou1, Tao
> Cc: Li, Candice ; Chen, Guchun
>
> Subject: [PATCH
On 29.08.19 at 05:05, Tianci Yin wrote:
From: "Tianci.Yin"
stolen memory should be fixed in visible region.
Change-Id: Icbbbd39fd113e93423aad8d2555f4073c08020e5
Signed-off-by: Tianci.Yin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 --
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3
On 28.08.19 at 22:00, Andrey Grodzovsky wrote:
Problem:
Under certain conditions, when some IP blocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.
Temporary fix until proper solution in PSP/SMU is ready:
When uncorrectable
> -----Original Message-----
> From: amd-gfx On Behalf Of
> Andrey Grodzovsky
> Sent: August 29, 2019 4:00
> To: amd-gfx@lists.freedesktop.org
> Cc: alexdeuc...@gmail.com; ckoenig.leichtzumer...@gmail.com;
> Grodzovsky, Andrey ; Zhang, Hawking
>
> Subject: [PATCH 1/2] dmr/amdgpu: Avoid HW GPU reset
These patches are to remove additional external lib-jsonc
dependence, and to put all test configurations into C code.
Guchun Chen (4):
amdgpu: remove json package dependence
amdgpu: delete test configuration file
amdgpu: add ras inject unit test
amdgpu: add ras feature capability check in
Apart from the CUnit library, no additional external
library should be needed when compiling amdgpu_test.
This keeps the binary self-contained.
Change-Id: Id1935ef4431a0674c69391a67813370a3e9348e6
Suggested-by: Christian König
Signed-off-by: Guchun Chen
---
configure.ac | 18 -
Json package dependence is removed from amdgpu_test,
so this json configuration file is not needed any more.
Change-Id: Ibd64c30244c5ae894928d9de5460f1c776408054
Suggested-by: Christian König
Signed-off-by: Guchun Chen
---
data/amdgpu_ras.json | 267 ---
Both UMC and GFX ras single_correctable
inject tests are added.
Change-Id: I46c29b8761294122fc9acb620441a7aace6509e4
Signed-off-by: Guchun Chen
---
tests/amdgpu/ras_tests.c | 144 +--
1 file changed, 107 insertions(+), 37 deletions(-)
diff --git
When running the ras inject test, it needs to be aligned
with the kernel's ras enablement.
Change-Id: I7e69a1a3f6ab7a0053f67f7f1dd3fb9af64f478f
Signed-off-by: Guchun Chen
---
tests/amdgpu/ras_tests.c | 4
1 file changed, 4 insertions(+)
diff --git a/tests/amdgpu/ras_tests.c
Since DP-specific information is stored in driver's structures, every
driver needs to implement subconnector property by itself.
Reviewed-by: Emil Velikov
Signed-off-by: Oleg Vasilev
Cc: Alex Deucher
Cc: Christian König
Cc: David (ChunMing) Zhou
Cc: amd-gfx@lists.freedesktop.org
---
On 29.08.19 at 08:05, Kenny Ho wrote:
> Allow DRM TTM memory manager to register a work_struct, such that, when
> a drmcgrp is under memory pressure, memory reclaiming can be triggered
> immediately.
>
> Change-Id: I25ac04e2db9c19ff12652b88ebff18b44b2706d8
> Signed-off-by: Kenny Ho
> ---
>
Hi Eric,
Yin has already proposed patches for fixing this a few days ago. Please
help to review those instead.
Thanks,
Christian
On 28.08.19 at 16:59, Huang, JinHuiEric wrote:
It partially reverts the regression of
commit e4a67e6cf14c258619f
("drm/amdgpu/psp: move TMR to cpu invisible
Since DP-specific information is stored in driver's structures, every
driver needs to implement subconnector property by itself. Display
Core already has the subconnector information, we only need to
expose it through DRM property.
Signed-off-by: Oleg Vasilev
Tested-by: Oleg Vasilev
Cc: Alex
On 29.08.19 at 07:55, zhoucm1 wrote:
On 2019/8/29 1:08 AM, Marek Olšák wrote:
It can't break an older driver, because there is no older driver that
requires the static allocation.
Note that closed source drivers don't count, because they don't need
backward compatibility.
Yes, I agree,
Ok, I'll fix that, thanks!
From: Christian König
Sent: Thursday, August 29, 2019 15:13
To: Yin, Tianci (Rico) ; amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Ma, Le ; Xiao, Jack
; Zhang, Hawking
Subject: Re: [PATCH 2/2] drm/amdgpu: keep the stolen memory in
On 28.08.19 at 22:00, Andrey Grodzovsky wrote:
In case of a RAS error, allow the user to configure automatic system
reboot through ras_ctrl.
This is also part of the temporary workaround for the RAS
hang problem.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 18
On Mon, Aug 26, 2019 at 9:22 AM Oleg Vasilev wrote:
>
> Since DP-specific information is stored in driver's structures, every
> driver needs to implement subconnector property by itself.
>
> Reviewed-by: Emil Velikov
> Signed-off-by: Oleg Vasilev
> Cc: Alex Deucher
> Cc: Christian König
> Cc:
ras_late_init callback function will be used to do common ras
init in late init phase.
v2: call ras_late_fini to do cleanup when fails to enable interrupt
v3: rename sysfs/debugfs node name to pcie_bif_xxx
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 2 ++
call helper function in late init phase to handle ras init
for nbio ip block
v2: init local var r to 0 in case the function return failure
on asics that don't have ras_late_init implementation
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/soc15.c | 13 -
1 file
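The v2 note above (initialise r to 0 for ASICs that have no ras_late_init
implementation) can be illustrated with a small sketch. Everything except the
ras_late_init name is an assumption made for illustration: the struct, the
demo callback and late_init_sketch() are hypothetical and not taken from the
patch.

```c
#include <stddef.h>

/* Hypothetical callback table; only the ras_late_init name comes from the thread. */
struct nbio_funcs_sketch {
    int (*ras_late_init)(void *dev);
};

/* Demo callback that always fails, used to show error propagation. */
int demo_failing_ras_late_init(void *dev)
{
    (void)dev;
    return -1;
}

/* Call the optional callback if present; otherwise report success. */
int late_init_sketch(const struct nbio_funcs_sketch *funcs, void *dev)
{
    int r = 0; /* stays 0 when the ASIC provides no ras_late_init */

    if (funcs && funcs->ras_late_init)
        r = funcs->ras_late_init(dev);

    return r;
}
```

Without the initialisation, the no-callback path would return an
indeterminate value, which is presumably the failure the v2 note refers to.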
Agree, the placement of amdgpu_amdkfd_pre/post_reset in
amdgpu_device_lock/unlock_adev is a bit weird.
Andrey
On 8/29/19 10:06 AM, Koenig, Christian wrote:
Felix advised that the way to stop all KFD activity is simply to NOT
call amdgpu_amdkfd_post_reset, so that's why I added this. Do you mean
On 8/29/19 3:56 AM, Zhou1, Tao wrote:
>
>> -----Original Message-----
>> From: amd-gfx On Behalf Of
>> Andrey Grodzovsky
>> Sent: August 29, 2019 4:00
>> To: amd-gfx@lists.freedesktop.org
>> Cc: alexdeuc...@gmail.com; ckoenig.leichtzumer...@gmail.com;
>> Grodzovsky, Andrey ; Zhang, Hawking
>>
>>
On 2019-08-29 1:38 a.m., Dave Airlie wrote:
> On Thu, 29 Aug 2019 at 07:04, Bhawanpreet Lakha
> wrote:
>>
>> From: Bayan Zabihiyan
>>
>> [Why]
>> Edid Utility wishes to include DSC module from driver instead
>> of doing its own logic which will need to be updated every time
>> someone modifies
Thanks for the feedback Christian. I am still digging into this one.
Daniel suggested leveraging the Shrinker API for the functionality of this
commit in RFC v3 but I am still trying to figure out how/if ttm fits with
shrinker (though the idea behind the shrinker API seems fairly
Yeah, that's a really good idea as well.
The problem with the shrinker API is that it only applies to system memory
currently.
So you won't have a distinction which domain you need to evict stuff from.
Regards,
Christian.
On 29.08.19 at 16:07, Kenny Ho wrote:
Thanks for the feedback
The function will be called in late init phase to do mmhub
ras init
v2: check ras_late_init function pointer before invoking the
function
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.h | 1 +
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 26 --
call helper function in late init phase to handle ras init
for sdma ip block
v2: call ras_late_fini to do clean up when fail to enable interrupt
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 98 +-
1 file changed, 24 insertions(+), 74
call helper function in late init phase to handle ras init
for gfx ip block
v2: call ras_late_fini to do clean up when fail to enable interrupt
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 92 ---
1 file changed, 21 insertions(+), 71
call helper function in late init phase to handle ras init
for gmc ip block
v2: call ras_late_fini to do clean up when fail to enable interrupt
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 159 ++
1 file changed, 47 insertions(+), 112
In late_init for ras, the helper function will be used to
1). disable ras feature if the IP block is masked as disabled
2). send enable feature command if the ip block was masked as enabled
3). create debugfs/sysfs node per IP block
4). register interrupt handler
v2: check ih_info.cb to decide
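The four steps listed above could be sketched in C roughly as follows. This is
only an illustration under stated assumptions: struct ras_block_sketch, its
fields and ras_late_init_sketch() are hypothetical stand-ins, not names from
the patch, and the firmware command, node creation and interrupt registration
are reduced to boolean flags. Registering the handler only when a callback is
set is also an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-IP-block state; none of these names come from the patch. */
struct ras_block_sketch {
    bool enabled;          /* is the IP block masked as enabled? */
    bool feature_on;       /* outcome of the enable/disable command */
    bool nodes_created;    /* debugfs/sysfs nodes exist */
    bool irq_registered;   /* interrupt handler registered */
    void (*irq_cb)(void);  /* optional interrupt callback */
};

/* Returns 0 on success, -1 on a NULL block; follows the steps in order. */
int ras_late_init_sketch(struct ras_block_sketch *blk)
{
    if (!blk)
        return -1;

    if (!blk->enabled) {
        /* step 1: block masked as disabled, turn the feature off and stop */
        blk->feature_on = false;
        return 0;
    }

    blk->feature_on = true;     /* step 2: send enable feature command */
    blk->nodes_created = true;  /* step 3: create debugfs/sysfs node */

    if (blk->irq_cb)            /* step 4: register interrupt handler */
        blk->irq_registered = true;

    return 0;
}
```

Keeping the sequence in one helper is what lets each IP block's late_init
shrink to a single call, which matches the line counts removed in the
per-block patches.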
On 29.08.19 at 16:03, Grodzovsky, Andrey wrote:
> On 8/29/19 3:30 AM, Christian König wrote:
>> On 28.08.19 at 22:00, Andrey Grodzovsky wrote:
>>> Problem:
>>> Under certain conditions, when some IP blocks take a RAS error,
>>> we can get into a situation where a GPU reset is not possible
>>> due
Hi all,
Since I connected my Dell display on my Radeon R5 240 (Oland) card over
DisplayPort instead of VGA, I get the following error messages logged at every
boot:
[drm:dce_v6_0_encoder_mode_set [amdgpu]] *ERROR* Couldn't read Speaker
Allocation Data Block: -2
[drm:dce_v6_0_encoder_mode_set
On 8/29/19 3:30 AM, Christian König wrote:
> On 28.08.19 at 22:00, Andrey Grodzovsky wrote:
>> Problem:
>> Under certain conditions, when some IP blocks take a RAS error,
>> we can get into a situation where a GPU reset is not possible
>> due to issues in RAS in SMU/PSP.
>>
>> Temporary fix until
Yes, and I think it has quite a lot of coupling with mm's page and
pressure mechanisms. My current thought is to just copy the API but
have a separate implementation of "ttm_shrinker" and
"ttm_shrinker_control" or something like that. I am certainly happy
to listen to additional feedback and
On 8/29/19 12:18 PM, Kuehling, Felix wrote:
> On 2019-08-29 10:08 a.m., Grodzovsky, Andrey wrote:
>> Agree, the placement of amdgpu_amdkfd_pre/post_reset in
>> amdgpu_device_lock/unlock_adev is a bit weird.
>>
> amdgpu_device_reset_sriov already calls amdgpu_amdkfd_pre/post_reset
> itself while
On 2019-08-28 3:52 p.m., Siqueira, Rodrigo wrote:
> DP 1.4a specification defines Link Training Tunable PHY Repeater (LTTPR)
> which is required to add support for systems with Thunderbolt or other
> repeater devices.
>
> Changes since V2:
> - Drop the kernel-doc comment
> - Reorder LTTPR
From: Ramalingam C
DRM API for generating uevent for a status changes of connector's
property.
This uevent will have following details related to the status change:
HOTPLUG=1, CONNECTOR= and PROPERTY=
Pekka has completed the Weston DRM-backend review in
[Why]
All the HDCP transactions should be verified using PSP.
[How]
This patch calls psp with the correct inputs to verify the steps
of authentication.
Signed-off-by: Bhawanpreet Lakha
---
.../drm/amd/display/modules/hdcp/hdcp_psp.c | 328 ++
DTM is the display topology manager. This is needed to communicate with
psp about the display configurations.
This patch adds
-Loading the firmware
-The functions and definitions for communication with the firmware
Signed-off-by: Bhawanpreet Lakha
---
This patch set introduces HDCP 1.4 capability to Asics starting with Raven(DCN
1.0).
This only introduces the ability to authenticate and encrypt the link. These
patches by themselves don't constitute a complete and compliant
HDCP content protection solution but are a requirement for such a
From: Ramalingam C
Content protection property is created once and stored in
drm_mode_config. And attached to all HDCP capable connectors.
Signed-off-by: Ramalingam C
Reviewed-by: Daniel Vetter
Acked-by: Dave Airlie
Signed-off-by: Daniel Vetter
Link:
From: Ramalingam C
Considering the significant size of hdcp related code in drm, all
hdcp related codes are moved into separate file called drm_hdcp.c.
v2:
Rebased.
v2:
Rebased.
Signed-off-by: Ramalingam C
Suggested-by: Daniel Vetter
Reviewed-by: Daniel Vetter
Acked-by: Dave Airlie
[Why]
We need this to enable HDCP on linux, as we need events to interact
with the hdcp module
[How]
Add work queue to display manager and handle the creation and destruction
of the queue
Signed-off-by: Bhawanpreet Lakha
---
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 30
On 2019-08-29 10:08 a.m., Grodzovsky, Andrey wrote:
>
> Agree, the placement of amdgpu_amdkfd_pre/post_reset in
> amdgpu_device_lock/unlock_adev is a bit weird.
>
amdgpu_device_reset_sriov already calls amdgpu_amdkfd_pre/post_reset
itself while it has exclusive access to the GPU. It would make
From: Ramalingam C
On every hdcp revocation check request SRM is read from fw file
/lib/firmware/display_hdcp_srm.bin
SRM table is parsed and stored at drm_hdcp.c, with functions exported
for the services for revocation check from drivers (which
implements the HDCP authentication)
This patch
From: Ramalingam C
Existing functions for converting a 3-byte (be24) big-endian value
into a u32 of little endian and vice versa are renamed as
s/drm_hdcp2_seq_num_to_u32/drm_hdcp_be24_to_cpu
s/drm_hdcp2_u32_to_seq_num/drm_hdcp_cpu_to_be24
Signed-off-by: Ramalingam C
Suggested-by: Daniel
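For readers unfamiliar with be24 handling, a conversion of this kind could
look like the sketch below. It is a userspace illustration, not the kernel
code: the _sketch function names are hypothetical, and the result is simply
the host-order integer value (which is what the "cpu" in the new names
suggests).

```c
#include <stdint.h>

/* Read three big-endian bytes into a host-order 32-bit value. */
uint32_t be24_to_cpu_sketch(const uint8_t seq_num[3])
{
    return ((uint32_t)seq_num[0] << 16) |
           ((uint32_t)seq_num[1] << 8) |
           (uint32_t)seq_num[2];
}

/* Write the low 24 bits of val as three big-endian bytes. */
void cpu_to_be24_sketch(uint8_t seq_num[3], uint32_t val)
{
    seq_num[0] = (uint8_t)(val >> 16);
    seq_num[1] = (uint8_t)(val >> 8);
    seq_num[2] = (uint8_t)val;
}
```

Because the bytes are combined with shifts rather than a memory cast, the
result does not depend on the endianness of the machine running the code.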
This patch adds
-Loading the firmware
-The functions and definitions for communication with the firmware
Signed-off-by: Bhawanpreet Lakha
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 188 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 17 ++
[Why]
We need to interact with the hdcp module from the DM, the module
has to be interacted with in terms of events
[How]
Create the files needed for linux hdcp. These files manage the events
needed for the dm to interact with the hdcp module.
We use the kernel work queue to process the events
[Why]
We need to read and write specific i2c and dpcd messages.
[How]
Created static functions for packing the dpcd and i2c messages for hdcp.
Signed-off-by: Bhawanpreet Lakha
---
.../amd/display/amdgpu_dm/amdgpu_dm_hdcp.c| 40 ++-
1 file changed, 39 insertions(+), 1
From: Ramalingam C
drm function is defined and exported to update a connector's
content protection property state and to generate a uevent along
with it.
Pekka has completed the Weston DRM-backend review in
https://gitlab.freedesktop.org/wayland/weston/merge_requests/48
and the UAPI for HDCP
From: Ramalingam C
This patch adds a DRM ENUM property to the selected connectors.
This property is used for mentioning the protected content's type
from userspace to kernel HDCP authentication.
Type of the stream is decided by the protected content providers.
Type 0 content can be rendered on
[Why]
We don't support HDCP for pre RAVEN asics
[How]
Check if we are RAVEN+. Use this to attach the content_protection
property, this way usermode can't try to enable HDCP on pre DCN asics.
Also we need to update the module on hpd so guard it as well
Change-Id:
[Why]
We need to update the hdcp display parameter whenever the link is
updated, so the next time there is an update to hdcp we have the
latest display info
[How]
Create a callback, and use this anytime there is a change in the link. This will
be used later by the dm.
Signed-off-by: Bhawanpreet
[Why]
We need to manage the content protection property changes for
different use cases; once cp is DESIRED we need to maintain the
ENABLED/DESIRED status for different cases.
[How]
1. Attach the content_protection property
2. HDCP enable (UNDESIRED -> DESIRED)
call into the module with
[Why]
This is needed for DP as DP can send us info using irq.
[How]
Check if irq bit is set on short pulse and call the
function that handles cpirq in amdgpu_dm_hdcp
Signed-off-by: Bhawanpreet Lakha
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++
1 file changed, 15
[Why]
We need to use HW state to set content protection to ENABLED.
This way we know that the link is encrypted from the HW side
[How]
Create a workqueue that queries the HW every ~2 seconds, and sets it to
ENABLED or DESIRED based on the result from the hardware
Change-Id:
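The decision taken on each poll could be sketched as a pure function. This is
a minimal illustration, assuming hypothetical names (enum cp_state_sketch,
cp_poll_step) and leaving out the workqueue and the hardware query itself;
leaving UNDESIRED untouched is an assumption of the sketch, not something the
excerpt states.

```c
#include <stdbool.h>

/* Hypothetical mirror of the content protection property values. */
enum cp_state_sketch { CP_UNDESIRED, CP_DESIRED, CP_ENABLED };

/*
 * One polling step: given the current property value and whether the
 * hardware reports the link as encrypted, return the new property value.
 */
enum cp_state_sketch cp_poll_step(enum cp_state_sketch cur, bool hw_encrypted)
{
    if (cur == CP_UNDESIRED)
        return CP_UNDESIRED; /* assumption: nothing to track if not requested */

    return hw_encrypted ? CP_ENABLED : CP_DESIRED;
}
```

In a driver this step would run from delayed work that re-queues itself with
a delay of about two seconds.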
[Why]
HDCP is not fully finished, so we need to be able to
build and run the driver without it.
[How]
Add a Kconfig to toggle it
Signed-off-by: Bhawanpreet Lakha
---
drivers/gpu/drm/amd/display/Kconfig | 8
1 file changed, 8 insertions(+)
diff --git
On Sun, Aug 25, 2019 at 10:13:05PM +0800, Hillf Danton wrote:
> Can we try to add the fallback timer manually?
>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -322,6 +322,10 @@ int amdgpu_fence_wait_empty(struct amdgp
> }
>
On 8/28/19 4:58 AM, Ernst Sjöstrand wrote:
> On Tue, 27 Aug 2019 at 20:17, Andrey Grodzovsky wrote:
>> This should be checked at all places job is accessed.
>>
>> Signed-off-by: Andrey Grodzovsky
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8
>> 1 file changed, 4
From: Nicolai Hähnle
[ Upstream commit 1a701ea924815b0518733aa8d5d05c1f6fa87062 ]
Error out if the AMDGPU_CS ioctl is called with multiple SYNCOBJ_OUT and/or
TIMELINE_SIGNAL chunks, since otherwise the last chunk wins while the
allocated array as well as the reference counts of sync objects are
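The check described here, rejecting a submission that carries more than one
signal-type chunk, could be sketched as below. The enum, the function name and
the choice of -EINVAL as the error code are assumptions made for illustration;
the excerpt only says the ioctl should error out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical chunk kinds; only the two signal-type names echo the excerpt. */
enum chunk_kind_sketch {
    CHUNK_OTHER,
    CHUNK_SYNCOBJ_OUT,
    CHUNK_TIMELINE_SIGNAL
};

/* Return 0 if at most one signal-type chunk is present, -EINVAL otherwise. */
int check_signal_chunks(const enum chunk_kind_sketch *chunks, size_t n)
{
    int seen = 0;

    for (size_t i = 0; i < n; i++) {
        if (chunks[i] != CHUNK_SYNCOBJ_OUT &&
            chunks[i] != CHUNK_TIMELINE_SIGNAL)
            continue;
        if (seen)
            return -EINVAL; /* a second chunk would overwrite the first */
        seen = 1;
    }

    return 0;
}
```

Failing early avoids the "last chunk wins" behaviour, where state allocated
for an earlier chunk is silently replaced.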
Patches are
Acked-by: Harry Wentland
Harry
On 2019-08-28 3:56 p.m., Alex Deucher wrote:
> This patch set adds initial DC display support for
> Renoir. Renoir is a new APU.
>
> I have omitted the register patch due to size. The
> full tree is available here:
>
On Thu, Aug 29, 2019 at 9:11 AM Jean Delvare wrote:
>
> Hi all,
>
> Since I connected my Dell display on my Radeon R5 240 (Oland) card over
> DisplayPort instead of VGA, I get the following error messages logged at
> every boot:
>
> [drm:dce_v6_0_encoder_mode_set [amdgpu]] *ERROR* Couldn't read
On 2019-08-29 1:21 p.m., Grodzovsky, Andrey wrote:
> On 8/29/19 12:18 PM, Kuehling, Felix wrote:
>> On 2019-08-29 10:08 a.m., Grodzovsky, Andrey wrote:
>>> Agree, the placement of amdgpu_amdkfd_pre/post_reset in
>>> amdgpu_device_lock/unlock_adev is a bit weird.
>>>
>> amdgpu_device_reset_sriov
If you decide to add it back, use this instead, it's simpler:
https://patchwork.freedesktop.org/patch/318391/?series=63775&rev=1
Maybe remove OA reservation if you don't need it.
Marek
On Thu, Aug 29, 2019 at 5:06 AM zhoucm1 wrote:
>
> On 2019/8/29 下午3:22, Christian König wrote:
>
> Am 29.08.19
The same BO can be mapped with different PTE flags by different GPUs.
Therefore determine the PTE flags separately for each mapping instead
of storing them in the KFD buffer object.
Add a helper function to determine the PTE flags to be extended with
ASIC and memory-type-specific logic in
On 2019-08-29 8:53 p.m., Andrey Grodzovsky wrote:
> Problem:
> Under certain conditions, when some IP blocks take a RAS error,
> we can get into a situation where a GPU reset is not possible
> due to issues in RAS in SMU/PSP.
>
> Temporary fix until proper solution in PSP/SMU is ready:
> When
For APUs, SMU_MSG_OverridePcieParameters is unsupported.
So return directly in the smu_override_pcie_parameters function.
Signed-off-by: Aaron Liu
---
drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
For APUs, SMU_MSG_OverridePcieParameters is unsupported.
So return directly in the smu_override_pcie_parameters function.
Signed-off-by: Aaron Liu
---
drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
Reviewed-by: Evan Quan
-----Original Message-----
From: amd-gfx On Behalf Of Aaron Liu
Sent: Friday, August 30, 2019 10:10 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Huang, Ray
; Liu, Aaron
Subject: [PATCH V2] drm/amd/powerplay: SMU_MSG_OverridePcieParameters is
unsupport
> -----Original Message-----
> From: Hawking Zhang
> Sent: August 29, 2019 21:30
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Zhou1, Tao ; Chen,
> Guchun
> Cc: Zhang, Hawking
> Subject: [PATCH 1/7] drm/amdgpu: add helper function to do common
> ras_late_init/fini (v3)
>
> In
Hi John...
I added this patch series on top of Linux 5.3rc6 and ran
xfstests with no regressions...
Acked-by: Mike Marshall
-Mike
On Tue, Aug 6, 2019 at 9:50 PM John Hubbard wrote:
>
> On 8/6/19 6:32 PM, john.hubb...@gmail.com wrote:
> > From: John Hubbard
> > ...
> >
> > John Hubbard (38):
On 8/29/2019 6:29 PM, Mike Marshall wrote:
Hi John...
I added this patch series on top of Linux 5.3rc6 and ran
xfstests with no regressions...
Acked-by: Mike Marshall
Hi Mike (and I hope Ira and others are reading as well, because
I'm making a bunch of claims further down),
That's great
With the two points in patch #1 and patch #5 fixed, the series is:
Reviewed-by: Tao Zhou
> -----Original Message-----
> From: amd-gfx On Behalf Of
> Hawking Zhang
> Sent: August 29, 2019 21:31
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Zhou1, Tao ; Chen,
> Guchun
> Cc: Zhang,
Problem:
Under certain conditions, when some IP blocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.
Temporary fix until proper solution in PSP/SMU is ready:
When uncorrectable error happens the DF will unconditionally
broadcast
In case of a RAS error, allow the user to configure automatic system
reboot through ras_ctrl.
This is also part of the temporary workaround for the RAS
hang problem.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 18 ++
> -----Original Message-----
> From: Andrey Grodzovsky
> Sent: August 30, 2019 8:54
> To: amd-gfx@lists.freedesktop.org
> Cc: alexdeuc...@gmail.com; Zhang, Hawking ;
> ckoenig.leichtzumer...@gmail.com; Zhou1, Tao ;
> Grodzovsky, Andrey
> Subject: [PATCH v2 1/2] dmr/amdgpu: Avoid HW GPU reset for
> -----Original Message-----
> From: Hawking Zhang
> Sent: August 29, 2019 21:31
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Zhou1, Tao ; Chen,
> Guchun
> Cc: Zhang, Hawking
> Subject: [PATCH 5/7] drm/amdgpu: add mmhub ras_late_init callback
> function (v2)
>
> The function will
In Renoir's vega10_ih model, there is a security change in the mmIH_CHICKEN
register that limits IH to using physical addresses (FBPA, GPA) directly.
Those chicken bits need to be programmed first.
Signed-off-by: Aaron Liu
Reviewed-by: Huang Rui
Reviewed-by: Hawking Zhang
Acked-by: Alex Deucher
These wptrs must be pinned and GPU accessible when this is called
from hqd_load functions. So they should never fault. This resolves
a circular lock dependency issue involving four locks including the
DQM lock and mmap_sem.
Signed-off-by: Felix Kuehling
---
This workaround is better handled in user mode in a way that doesn't
require allocating extra memory and breaking userptr BOs.
The TLB bug is a performance bug, not a functional or security bug.
Hence it is safe to remove this kernel part of the workaround to
allow a better workaround using only
In Renoir's emulator, those chicken bits need to be programmed.
Signed-off-by: Aaron Liu
Reviewed-by: Huang Rui
Reviewed-by: Hawking Zhang
Acked-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/include/asic_reg/oss/osssys_4_0_sh_mask.h | 4
1 file changed, 4
To allow other subsystems to iterate through all stored DRM minors and
act upon them.
Also exposes drm_minor_acquire and drm_minor_release for other subsystems
to handle drm_minor. The DRM cgroup controller is the initial consumer of
these new features.
Change-Id:
The drm resource being limited here is the GEM buffer objects. User
applications allocate and free these buffers. In addition, a process
can allocate a buffer and share it with another process. The consumer
of a shared buffer can also outlive the allocator of the buffer.
For the purpose of
This is a follow up to the RFC I made previously to introduce a cgroup
controller for the GPU/DRM subsystem [v1,v2,v3]. The goal is to be able to
provide resource management for GPU resources using things like containers.
With this RFC v4, I am hoping to have some consensus on a merge plan. I
With the increased importance of machine learning, data science and
other cloud-based applications, GPUs are already in production use in
data centers today. Existing GPU resource management is very
coarse-grained, however, as sysadmins are only able to distribute workload on a
per-GPU basis. An
drm.memory.peak.stats
A read-only nested-keyed file which exists on all cgroups.
Each entry is keyed by the drm device's major:minor. The
following nested keys are defined.
== ==
system
drmcg initialization involves allocating a per cgroup, per device data
structure and setting the defaults. There are two entry points for
drmcg init:
1) When struct drmcg is created via css_alloc, initialization is done
for each device
2) When DRM devices are created after drmcgs are created
Allow DRM TTM memory manager to register a work_struct, such that, when
a drmcgrp is under memory pressure, memory reclaiming can be triggered
immediately.
Change-Id: I25ac04e2db9c19ff12652b88ebff18b44b2706d8
Signed-off-by: Kenny Ho
---
drivers/gpu/drm/ttm/ttm_bo.c| 49
drm.buffer.count.stats
A read-only flat-keyed file which exists on all cgroups. Each
entry is keyed by the drm device's major:minor.
Total number of GEM buffers allocated.
Change-Id: Id3e1809d5fee8562e47a7d2b961688956d844ec6
Signed-off-by: Kenny Ho
---
The drm resource being limited is the TTM (Translation Table Manager)
buffers. TTM manages different types of memory that a GPU might access.
These memory types include dedicated Video RAM (VRAM) and host/system
memory accessible through IOMMU (GART/GTT). TTM is currently used by
multiple drm
The number of logical GPUs (lgpu) is defined to be the number of compute
units (CU) for a device. The lgpu allocation limit only applies to
compute workloads for the moment (enforced via kfd queue creation). Any
cu_mask update is validated against the availability of the compute unit
as defined by
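Validating a cu_mask against the available compute units can be pictured as a
subset test on bitmasks. The sketch below assumes a device with at most 64
CUs, so that one uint64_t holds the whole mask, and uses a hypothetical
function name; how the available set is derived is not shown in the excerpt
and is simply passed in here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when every CU requested in `requested` is also set in
 * `available`, i.e. the request is a subset of what may be used.
 */
bool cu_mask_allowed(uint64_t requested, uint64_t available)
{
    return (requested & ~available) == 0;
}
```

For example, with four CUs available (mask 0xF), a request for CUs 0 and 1
(0x3) passes while a request for CU 4 (0x10) is rejected.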
The drm resource being measured is the TTM (Translation Table Manager)
buffers. TTM manages different types of memory that a GPU might access.
These memory types include dedicated Video RAM (VRAM) and host/system
memory accessible through IOMMU (GART/GTT). TTM is currently used by
multiple drm
drm.buffer.peak.stats
A read-only flat-keyed file which exists on all cgroups. Each
entry is keyed by the drm device's major:minor.
Largest (high water mark) GEM buffer allocated in bytes.
Change-Id: I79e56222151a3d33a76a61ba0097fe93ebb3449f
Signed-off-by: Kenny Ho
---
Before this commit, drmcg limits are updated but enforcement is delayed
until the next time the driver checks against the new limit. While this
is sufficient for certain resources, a more proactive enforcement may be
needed for other resources.
Introducing an optional drmcg_limit_updated callback
drm.buffer.peak.default
A read-only flat-keyed file which exists on the root cgroup.
Each entry is keyed by the drm device's major:minor.
Default limits on the largest GEM buffer allocation in bytes.
drm.buffer.peak.max
A read-write flat-keyed file which exists on
The drm resource being measured here is the GEM buffer objects. User
applications allocate and free these buffers. In addition, a process
can allocate a buffer and share it with another process. The consumer
of a shared buffer can also outlive the allocator of the buffer.
For the purpose of
drm.lgpu
A read-write nested-keyed file which exists on all cgroups.
Each entry is keyed by the DRM device's major:minor.
lgpu stands for logical GPU; it is an abstraction used to
subdivide a physical DRM device for the purpose of resource
management.
The bandwidth is measured by keeping track of the amount of bytes moved
by ttm within a time period. We define two types of bandwidth: burst
and average. Average bandwidth is calculated by dividing the total
amount of bytes moved within a cgroup by the lifetime of the cgroup.
Burst bandwidth is
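The average-bandwidth definition above amounts to a single division. The
sketch below assumes bytes and seconds as units, a hypothetical function name,
and a result of 0 for a cgroup whose lifetime is still zero; none of these
details are given in the excerpt. Burst bandwidth is not sketched because its
definition is cut off.

```c
#include <stdint.h>

/*
 * Average bandwidth in bytes per second: total bytes moved within the
 * cgroup divided by the cgroup's lifetime. Returns 0 for a zero lifetime
 * to avoid dividing by zero.
 */
uint64_t avg_bandwidth_sketch(uint64_t total_bytes_moved, uint64_t lifetime_secs)
{
    if (lifetime_secs == 0)
        return 0;

    return total_bytes_moved / lifetime_secs;
}
```

Integer division truncates, so short-lived cgroups that moved little data
report a rounded-down value.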