Don't have the code in front of me now but as far as I remember it will only
prematurely terminate in drm_sched_cleanup_jobs if there is timeout work in
progress which would not be the case if nothing hangs.
Andrey
From: Erico Nunes
Sent: 17 May 2019
On Fri, May 17, 2019 at 10:43 PM Grodzovsky, Andrey
wrote:
> On 5/17/19 3:35 PM, Erico Nunes wrote:
> > Lima currently defaults to an "infinite" timeout. Setting a 500ms
> > default timeout like most other drm_sched users do fixed the leak for
> > me.
>
> I am not very clear about the problem -
Hello,
I have recently observed a memory leak issue with lima using
drm-misc-next, which I initially reported here:
https://gitlab.freedesktop.org/lima/linux/issues/24
It is an easily reproducible memory leak which I was able to bisect to commit:
5918045c4ed4 drm/scheduler: rework job
On 5/17/19 3:35 PM, Erico Nunes wrote:
> [CAUTION: External Email]
>
> Hello,
>
> I have recently observed a memory leak issue with lima using
> drm-misc-next, which I initially reported here:
> https://gitlab.freedesktop.org/lima/linux/issues/24
> It is an easily reproducible memory leak which
Patches 1,2,4 will be submitted through amd-staging-drm-next. Patch 3
goes through the cgroup tree. Patch 4 depends on patch 3. So submitting
patch 4 will need to wait until we rebase amd-staging-drm-next on a new
enough kernel release that includes patch 3.
Patch 1 and 2 could be submitted
Hi Tejun,
Thanks for comments. I can definitely add more documentation but just need a
bit of clarification on this.
1). Documentation for user on how to use device cgroup for amdkfd device. I
have some more information on this in patch 4.
or
2) The reason devcgroup_check_permission() needs
On Fri, May 17, 2019 at 2:34 PM Abramov, Slava wrote:
>
> v1: replace casting to unsigned long with div64_ul
>
> Change-Id: Ia48671ed0756bb73c7b4760a800bcb6f600cbef2
> Signed-off-by: Slava Abramov
Acked-by: Alex Deucher
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++--
> 1 file
Acked-by: Amber Lin
On 2019-05-17 1:02 p.m., Alex Deucher wrote:
> [CAUTION: External Email]
>
> On Fri, May 17, 2019 at 10:47 AM Lin, Amber wrote:
>>
>>
>> On 2019-05-17 10:26 a.m., Alex Deucher wrote:
>>> [CAUTION: External Email]
>>>
>>> Not necessary on soc15 and breaks driver reload on
On Fri, May 17, 2019 at 2:31 PM Russell, Kent wrote:
>
> Add a file that provides a Unique ID for the GPU.
> This will persist across machines and is guaranteed to be unique.
> This is only available for GFX9 and newer, so older ASICs will not
> have this file in the sysfs pool
>
> v2: Store it
v1: replace casting to unsigned long with div64_ul
Change-Id: Ia48671ed0756bb73c7b4760a800bcb6f600cbef2
Signed-off-by: Slava Abramov
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
Add a file that provides a Unique ID for the GPU.
This will persist across machines and is guaranteed to be unique.
This is only available for GFX9 and newer, so older ASICs will not
have this file in the sysfs pool
v2: Store it in adev for ASICs that don't have a hwmgr
Change-Id:
On Fri, May 17, 2019 at 9:45 AM Russell, Kent wrote:
>
> Add a file that provides a Unique ID for the GPU.
> This will persist across machines and is guaranteed to be unique.
> This is only available for GFX9 and newer, so older ASICs will not
> have this file in the sysfs pool
>
> Change-Id:
Hello!
Registration & Call for Proposals are now open for XDC 2019, which will
take place at the Concordia University Conference Centre in Montréal,
Canada on October 2-4, 2019.
Thanks to LWN.net, this year we have a brand new website using the
Indico platform, a fully open source event
For SR-IOV, vram_width can't be read from ATOM as on RAVEN, and the
DF-related registers are not readable, so hardcoding is the only way
to set the correct vram_width.
Signed-off-by: Trigger Huang
Signed-off-by: Yintian Tao
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 10
On Fri, May 17, 2019 at 04:15:06PM +, Kasiviswanathan, Harish wrote:
> For AMD compute (amdkfd) driver.
>
> All AMD compute devices are exported via single device node /dev/kfd. As
> a result devices cannot be controlled individually using device cgroup.
>
> AMD compute devices will rely on
On Fri, May 17, 2019 at 10:47 AM Lin, Amber wrote:
>
>
>
> On 2019-05-17 10:26 a.m., Alex Deucher wrote:
> > [CAUTION: External Email]
> >
> > Not necessary on soc15 and breaks driver reload on server cards.
> >
> > Signed-off-by: Alex Deucher
> > Cc: sta...@vger.kernel.org
> > ---
> >
On Fri, May 17, 2019 at 11:36 AM Micah Morton wrote:
>
> On Thu, May 16, 2019 at 1:39 PM Alex Deucher wrote:
> >
> > On Thu, May 16, 2019 at 4:07 PM Micah Morton wrote:
> > >
> > > On Wed, May 15, 2019 at 7:19 PM Alex Deucher
> > > wrote:
> > > >
> > > > On Wed, May 15, 2019 at 2:26 PM Micah
On Fri, May 17, 2019 at 04:14:52PM +, Kasiviswanathan, Harish wrote:
> amdkfd (part of amdgpu) driver supports the AMD GPU compute stack.
> amdkfd exposes only a single device /dev/kfd even if multiple AMD GPU
> (compute) devices exist in a system. However, amdgpu driver exposes a
> separate
I am going to add more members after this struct that are also arrays, not only
obj[]. It looks like struct_size() does not help in the multiple-array case.
Thanks anyway.
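
For readers following the struct_size() discussion, here is a minimal
userspace sketch of the size arithmetic involved. It is not the kernel macro
itself: struct ras_like, sketch_struct_size() and two_array_size() are
hypothetical names made up for illustration. The assumption is that the helper
computes "header size plus count times element size" and saturates at SIZE_MAX
on overflow. A second trailing array is not a flexible array member of the
struct, so its bytes have to be added by hand on top of the first result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type: stands in for a struct that ends in obj[]. */
struct ras_like {
    int count;
    uint64_t obj[];   /* flexible array member, must be the last member */
};

/* Saturating size computation: base + count * elem, SIZE_MAX on overflow. */
static size_t sketch_struct_size(size_t base, size_t elem, size_t count)
{
    assert(elem != 0);
    if (count > (SIZE_MAX - base) / elem)
        return SIZE_MAX;
    return base + count * elem;
}

/*
 * Allocation size for n obj[] entries followed by m extra uint32_t entries.
 * The second array is laid out by hand after the first one.
 */
static size_t two_array_size(size_t n, size_t m)
{
    size_t first = sketch_struct_size(sizeof(struct ras_like),
                                      sizeof(uint64_t), n);

    if (first == SIZE_MAX)
        return SIZE_MAX;
    return sketch_struct_size(first, sizeof(uint32_t), m);
}
```

A caller would pass the result of two_array_size() to its allocator and treat
SIZE_MAX as an allocation failure.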
From: xiaolinkui
Sent: Friday, May 17, 2019 4:46:00 PM
To: Deucher, Alexander; Koenig,
better to use div64_ul(x, y) for compatibility.
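
To illustrate why a division helper is preferred over a narrowing cast, here is
a small userspace sketch. Both function names are hypothetical, and the
uint32_t cast only emulates what a cast to a 32-bit integer type would do to a
64-bit dividend. In the kernel the full-width division would go through a
helper such as div64_ul() rather than the plain '/' operator used here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: narrow the dividend to 32 bits first, then divide. */
static uint64_t div_after_narrow_cast(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    /* The upper 32 bits of the dividend are silently dropped here. */
    return (uint32_t)dividend / divisor;
}

/* Hypothetical stand-in for a div64_ul-style helper: keeps all 64 bits. */
static uint64_t div_full_width(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}
```

With a dividend of 4 GiB plus 8192 bytes and a 4096-byte divisor, the
full-width version returns 1048578 while the narrowed version returns 2,
because the cast discards everything above bit 31.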
From: Abramov, Slava
Sent: Friday, May 17, 2019 5:19:54 AM
To: amd-gfx@lists.freedesktop.org
Cc: Pan, Xinhui; Deucher, Alexander
Subject: [PATCH] drm/amdgpu: cast to unsigned int for 32-bit portability
Without
Participate in device cgroup. All kfd devices are exposed via /dev/kfd.
So use /dev/dri/renderN node.
Before exposing the device to a task check if it has permission to
access it. If the task (based on its cgroup) can access /dev/dri/renderN
then expose the device via kfd node.
If the task
For AMD compute (amdkfd) driver.
All AMD compute devices are exported via single device node /dev/kfd. As
a result devices cannot be controlled individually using device cgroup.
AMD compute devices will rely on their graphics counterpart that exposes
/dev/dri/renderN node for each device. For each
This is required to check against cgroup permissions.
Signed-off-by: Harish Kasiviswanathan
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 10 ++
drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 3 +++
2 files changed, 13 insertions(+)
diff --git
kfd needs drm_device to call into drm_cgroup functions
Signed-off-by: Harish Kasiviswanathan
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 +
drivers/gpu/drm/amd/amdkfd/kfd_device.c| 2 ++
amdkfd (part of amdgpu) driver supports the AMD GPU compute stack.
amdkfd exposes only a single device /dev/kfd even if multiple AMD GPU
(compute) devices exist in a system. However, amdgpu driver exposes a
separate render device file /dev/dri/renderDN for each device. To participate
in device
On Thu, May 16, 2019 at 1:39 PM Alex Deucher wrote:
>
> On Thu, May 16, 2019 at 4:07 PM Micah Morton wrote:
> >
> > On Wed, May 15, 2019 at 7:19 PM Alex Deucher wrote:
> > >
> > > On Wed, May 15, 2019 at 2:26 PM Micah Morton wrote:
> > > >
> > > > Hi folks,
> > > >
> > > > I'm interested in
Hi Alex
Many thanks for your review. I will merge these two patches into one and submit
again.
Best Regards
Yintian Tao
From: Alex Deucher
Sent: May 17, 2019 22:34:30
To: Tao, Yintian
Cc: amd-gfx@lists.freedesktop.org; Koenig, Christian; Deucher, Alexander;
Hi Andrey,
On Mon, May 06, 2019 at 06:30:46PM +0200, Andrey Konovalov wrote:
> One of the alternative approaches to untagging that was considered is to
> completely strip the pointer tag as the pointer enters the kernel with
> some kind of a syscall wrapper, but that won't work with the countless
On 2019-05-17 10:26 a.m., Alex Deucher wrote:
> [CAUTION: External Email]
>
> Not necessary on soc15 and breaks driver reload on server cards.
>
> Signed-off-by: Alex Deucher
> Cc: sta...@vger.kernel.org
> ---
> drivers/gpu/drm/amd/amdgpu/soc15.c | 5 +
> 1 file changed, 5 insertions(+)
How about combining these two patches into one? This seems cleaner.
Alex
On Thu, May 16, 2019 at 10:39 PM Tao, Yintian wrote:
>
> Ping...
>
> Hi Christian and Alex
>
>
> Can you help review this? Thanks in advance.
>
>
> Best Regards
> Yintian Tao
>
> -----Original Message-----
> From: Yintian
Not necessary on soc15 and breaks driver reload on server cards.
Signed-off-by: Alex Deucher
Cc: sta...@vger.kernel.org
---
drivers/gpu/drm/amd/amdgpu/soc15.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index
> -----Original Message-----
> From: Christian König
> Sent: Tuesday, May 14, 2019 8:31 PM
> To: Olsak, Marek ; Zhou, David(ChunMing)
> ; Liang, Prike ; dri-
> de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: [PATCH 01/11] drm/ttm: Make LRU removal optional.
>
> [CAUTION:
Add a file that provides a Unique ID for the GPU.
This will persist across machines and is guaranteed to be unique.
This is only available for GFX9 and newer, so older ASICs will not
have this file in the sysfs pool
Change-Id: I3c673f78efcc5bf93ca58d65edbe39fc3a86b42a
Signed-off-by: Kent Russell
On 2019-04-30 3:56 p.m., Mario Kleiner wrote:
> [CAUTION: External Email]
>
> On Tue, Apr 30, 2019 at 2:22 PM Kazlauskas, Nicholas
> wrote:
>>
>> On 4/30/19 3:44 AM, Michel Dänzer wrote:
>>> [CAUTION: External Email]
>>>
>>> On 2019-04-30 9:37 a.m., Mario Kleiner wrote:
Allow to detect
Use struct_size() helper to keep code simple.
Signed-off-by: xiaolinkui
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 22bd21e..4717a64
Instead, if the guest loads “sos” and “sysdrv”, there are two possible results:
1. With an older PSP boot loader, the guest would hit a failure.
2. With a newer PSP boot loader, that ucode won’t work at all, so loading
it is redundant.
/Monk
From: Tao, Yintian
Sent: Friday, May 17, 2019 3:17 PM
Hi Christian,
With this patch series, amdgpu_vm_validate_pt_bos occasionally fails to evict
amdgpu BOs and can’t
find a valid first busy BO. Another problem is that taking the lock on the
first busy BO can run into a deadlock.
/* check if other user occupy memory too long time */
On Thu, May 09, 2019 at 10:31:05AM +, Pan, Xinhui wrote:
> add badpages node.
> it will output badpages list in format
> gpu pfn : gpu page size : flags
>
> example
> 0x : 0x1000 : R
> 0x0001 : 0x1000 : R
> 0x0002 : 0x1000 : R
> 0x0003 : 0x1000 : R
>
Hi Christian
Yes, of course. Thanks for your reminder.
Best Regards
Yintian Tao
From: Christian König
Sent: May 17, 2019 15:20:54
To: Tao, Yintian; Koenig, Christian
Cc: amd-gfx@lists.freedesktop.org; Liu, Monk
Subject: Re: RE: [PATCH] drm/amdgpu: skip fw pri bo
Hi Yintian,
please add this as a code comment to the patch.
Christian.
Am 17.05.19 um 09:17 schrieb Tao, Yintian:
Hi Christian
Many thanks for your review.
The background is that this BO is there to let the PSP load sos and sysdrv, but
under SR-IOV, sos and sysdrv are loaded by the VBIOS or hypervisor
Hi Christian
Many thanks for your review.
The background is that this BO is there to let the PSP load sos and sysdrv, but
under SR-IOV, sos and sysdrv are loaded by the VBIOS or hypervisor driver.
The reason the guest driver is not allowed to load them under SR-IOV is that it
is not safe.
Best Regards
Yintian
Looks good to me now, but I don't know the technical background why this
BO is not needed under SRIOV.
So this patch is Acked-by: Christian König .
Regards,
Christian.
Am 17.05.19 um 04:41 schrieb Tao, Yintian:
> Hi Christian
>
>
> I have modified it according to your suggestion. Can you help
Hi Yintian,
sorry I have not the slightest idea how this part of the hw works. Maybe
try to explain more what the DF register is actually doing in the commit
message. I suspect that it is only about detecting which VRAM connection
is used, but I'm not 100% sure.
Regards,
Christian.
Am
43 matches