Daniel, ping. Also, please refer to the other thread with Bjorn from pci-dev
on the same topic I added you to.
Andrey
On 1/29/21 2:25 PM, Christian König wrote:
Am 29.01.21 um 18:35 schrieb Andrey Grodzovsky:
On 1/29/21 10:16 AM, Christian König wrote:
Am 28.01.21 um 18:23 schrieb Andrey
On 1/29/21 10:16 AM, Christian König wrote:
Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:
On 1/19/21 1:59 PM, Christian König wrote:
Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
On 1/19/21 1:05 PM, Daniel Vetter wrote:
[SNIP]
So say writing in a loop to some harmless scratch register
On 1/19/21 1:59 PM, Christian König wrote:
Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
On 1/19/21 1:05 PM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
wrote:
There is really no other way according to this article
https://nam11.safelinks.protection.outlook.com
Hey Daniel, just a ping.
Andrey
On 1/25/21 10:28 AM, Andrey Grodzovsky wrote:
On 1/19/21 8:56 AM, Daniel Vetter wrote:
On Mon, Jan 18, 2021 at 04:01:10PM -0500, Andrey Grodzovsky wrote:
On device removal reroute all CPU mappings to dummy page.
v3:
Remove loop to find DRM file and instead
On 1/19/21 8:56 AM, Daniel Vetter wrote:
On Mon, Jan 18, 2021 at 04:01:10PM -0500, Andrey Grodzovsky wrote:
On device removal reroute all CPU mappings to dummy page.
v3:
Remove loop to find DRM file and instead access it
by vma->vm_file->private_data. Move dummy page instal
Looks good to me
Reviewed-by: Andrey Grodzovsky
Andrey
On 1/21/21 5:21 AM, Horace Chen wrote:
Fix a racing issue when jobs on 2 rings timeout simultaneously.
If 2 rings timed out at the same time, the
amdgpu_device_gpu_recover will be reentered. Then the
adev->gmc.xgmi.head will be grab
Ping
Andrey
On 1/20/21 12:01 AM, Andrey Grodzovsky wrote:
On 1/19/21 3:48 AM, Christian König wrote:
Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
Handle all DMA IOMMU group related dependencies before the
group is removed.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd
On 1/19/21 2:16 PM, Andrey Grodzovsky wrote:
On 1/19/21 1:59 PM, Christian König wrote:
Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
On 1/19/21 1:05 PM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
wrote:
There is really no other way according to this article
On 1/20/21 9:12 AM, Horace Chen wrote:
Fix a racing issue when jobs on 2 rings timeout simultaneously.
If 2 rings timed out at the same time, the
amdgpu_device_gpu_recover will be reentered. Then the
adev->gmc.xgmi.head will be grabbed by 2 local linked lists,
which may cause wild pointer
On 1/20/21 4:05 AM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 01:18:15PM -0500, Andrey Grodzovsky wrote:
On 1/19/21 1:08 PM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
wrote:
On 1/19/21 9:16 AM, Daniel Vetter wrote:
On Mon, Jan 18, 2021 at 04:01:09PM -0500
On 1/19/21 3:48 AM, Christian König wrote:
Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
Handle all DMA IOMMU group related dependencies before the
group is removed.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 5
drivers/gpu/drm/amd/amdgpu
On 1/19/21 5:01 PM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 10:22 PM Andrey Grodzovsky
wrote:
On 1/19/21 8:45 AM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
Handle all DMA IOMMU group related
On 1/15/21 2:21 AM, Chen, Xiaogang wrote:
On 1/14/2021 1:24 AM, Grodzovsky, Andrey wrote:
On 1/14/21 12:11 AM, Chen, Xiaogang wrote:
On 1/12/2021 10:54 PM, Grodzovsky, Andrey wrote:
On 1/4/21 1:01 AM, Xiaogang.Chen wrote:
From: Xiaogang Chen
amdgpu DM handles INTERRUPT_LOW_IRQ_CONTEXT
On 1/19/21 8:45 AM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
Handle all DMA IOMMU group related dependencies before the
group is removed.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd
On 1/19/21 1:59 PM, Christian König wrote:
Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:
On 1/19/21 1:05 PM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
wrote:
There is really no other way according to this article
https://nam11.safelinks.protection.outlook.com
On 1/19/21 2:04 PM, Alex Deucher wrote:
On Tue, Jan 19, 2021 at 1:26 PM Greg KH wrote:
On Tue, Jan 19, 2021 at 11:36:01AM -0500, Andrey Grodzovsky wrote:
On 1/19/21 2:34 AM, Greg KH wrote:
On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
static struct pci_driver
On 1/19/21 1:05 PM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 4:35 PM Andrey Grodzovsky
wrote:
There is really no other way according to this article
https://lwn.net/Articles/767885/
On 1/19/21 1:08 PM, Daniel Vetter wrote:
On Tue, Jan 19, 2021 at 6:31 PM Andrey Grodzovsky
wrote:
On 1/19/21 9:16 AM, Daniel Vetter wrote:
On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
Until now extracting a card either by physical extraction (e.g. eGPU
On 1/19/21 9:16 AM, Daniel Vetter wrote:
On Mon, Jan 18, 2021 at 04:01:09PM -0500, Andrey Grodzovsky wrote:
Until now extracting a card either by physical extraction (e.g. eGPU with
thunderbolt connection or by emulation through sysfs ->
/sys/bus/pci/devices/device_id/remove)
would ca
Well, from browsing the code it shouldn't happen with the hive locked, but
then your code should reflect that: if you do fail to lock a particular adev
AFTER the hive is locked, you should not silently break the iteration but
throw an error, WARN_ON or BUG_ON. Or alternatively bail out
with
On 1/19/21 11:39 AM, Chen, Horace wrote:
[AMD Official Use Only - Internal Distribution Only]
Hi Andrey,
I think the list in the XGMI hive won't be broken in the middle if we lock the
device before we change the list. Because if 2 devices in 1 hive went into the
function, it will follow
On 1/19/21 2:34 AM, Greg KH wrote:
On Mon, Jan 18, 2021 at 04:01:19PM -0500, Andrey Grodzovsky wrote:
static struct pci_driver amdgpu_kms_pci_driver = {
.name = DRIVER_NAME,
.id_table = pciidlist,
@@ -1595,6 +1607,7 @@ static struct pci_driver amdgpu_kms_pci_driver
ould upset the PCI layer (the MMIO ranges reservation part).
Andrey
On 1/19/21 3:55 AM, Christian König wrote:
Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
This should prevent writing to memory or IO ranges possibly
already allocated for other uses after our device is removed.
Wow, that adds
Reviewed-by: Andrey Grodzovsky
Andrey
On 1/19/21 7:22 AM, Horace Chen wrote:
If 2 jobs on 2 different rings timed out within a very short
period, the reset for the second job will be skipped because the
reset is already in progress.
But it doesn't mean the second job is not guilty since it
also
On 1/19/21 7:22 AM, Horace Chen wrote:
Fix a racing issue when jobs on 2 rings timeout simultaneously.
If 2 rings timed out at the same time, the amdgpu_device_gpu_recover
will be reentered. Then the adev->gmc.xgmi.head will be grabbed
by 2 local linked lists, which may cause wild pointer
Return DRM_TASK_STATUS_ENODEV back to the scheduler when device
is not present so the timeout timer will not be rearmed.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 19 ---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers
We don't want to rearm the timer if driver hook reports
that the device is gone.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/scheduler/sched_main.c | 11 +++
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm
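The rule these two patch descriptions state (do not rearm the timeout timer once the driver hook reports that the device is gone) can be sketched as a small user-space model. This is only an illustration, not the scheduler code from the patch: `model_sched`, `model_status`, `model_timedout_job` and `model_job_timedout` are hypothetical names invented here, and the real scheduler handles locking and job lists that this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the status values the patches talk about. */
enum model_status {
    MODEL_STATUS_ALIVE,  /* device still present: timer may be rearmed */
    MODEL_STATUS_ENODEV  /* device is gone: timer must stay disarmed   */
};

/* Hypothetical, heavily simplified scheduler state. */
struct model_sched {
    bool timer_armed;
    bool device_present;
    unsigned int rearm_count;
};

typedef enum model_status (*model_timedout_fn)(struct model_sched *sched);

/* Driver-side hook: report ENODEV when the device has been unplugged. */
static enum model_status model_timedout_job(struct model_sched *sched)
{
    if (!sched->device_present)
        return MODEL_STATUS_ENODEV;
    /* A real driver would attempt recovery here. */
    return MODEL_STATUS_ALIVE;
}

/* Scheduler-side timeout handling: rearm only if the hook allows it. */
static void model_job_timedout(struct model_sched *sched,
                               model_timedout_fn hook)
{
    assert(sched != NULL && hook != NULL);

    sched->timer_armed = false;  /* the timer has just fired */

    if (hook(sched) == MODEL_STATUS_ENODEV)
        return;                  /* device gone: leave the timer off */

    sched->timer_armed = true;
    sched->rearm_count++;
}
```

Calling `model_job_timedout()` with `device_present` set leaves the timer armed again; after clearing `device_present`, the same call leaves it disarmed.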
: (By Andrey Grodzovsky)
Replace DRM_TASK_STATUS_COMPLETE with DRM_TASK_STATUS_ENODEV
to enable a hint to the scheduler for when NOT to rearm the
timeout timer.
Cc: Alexander Deucher
Cc: Andrey Grodzovsky
Cc: Christian König
Cc: Daniel Vetter
Cc: Lucas Stach
Cc: Russell King
Cc: Christian Gmeiner
Cc
This should prevent writing to memory or IO ranges possibly
already allocated for other uses after our device is removed.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c| 9
drivers
This allows removing the explicit creation and destruction
of those attrs, and thereby avoids warnings during device
finalization after physical device extraction.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 17 +
drivers/gpu/drm/amd/amdgpu
On device removal reroute all CPU mappings to dummy page
per drm_file instance or imported GEM object.
v4:
Update for modified ttm_bo_vm_dummy_page
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 -
1 file changed, 16 insertions(+), 5
We can't allocate and submit IBs post device unplug.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index
Handle all DMA IOMMU group related dependencies before the
group is removed.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 5
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
Use it to call display code dependent on device->drv_data
before it's set to NULL on device unplug
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 20
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 ++--
drivers/
the last device
reference is dropped.
v4: Change functions prefix early->hw and late->sw
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 ++
drivers/gpu/drm/amd/amdgpu/amdgpu
To avoid any possible use after free.
Signed-off-by: Andrey Grodzovsky
Reviewed-by: Christian König
---
drivers/gpu/drm/scheduler/sched_main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index 997aa15
Invalidate all BOs CPU mappings once device is removed.
v3: Move the code from TTM into drm_dev_unplug
Signed-off-by: Andrey Grodzovsky
Reviewed-by: Christian König
Reviewed-by: Daniel Vetter
---
drivers/gpu/drm/drm_drv.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm
It's needed to drop iommu backed pages on device unplug
before device's IOMMU group is released.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/ttm/ttm_tt.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 7f75a13
fault for that BO.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/ttm/ttm_bo_vm.c | 82 -
include/drm/ttm/ttm_bo_api.h| 2 +
2 files changed, 83 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
tml
[3] - Related gitlab ticket https://gitlab.freedesktop.org/drm/amd/-/issues/1081
Andrey Grodzovsky (13):
drm/ttm: Remap all page faults to per process dummy page.
drm: Unmap the entire device address space on device unplug
drm/ttm: Expose ttm_tt_unpopulate for driver use
drm/sched: Can
Reviewed-by: Andrey Grodzovsky
Andrey
On 1/14/21 8:37 AM, Horace Chen wrote:
If 2 jobs on 2 different rings timed out within a very
short period, the reset for the second job will be skipped
because the reset is already in progress.
But it doesn't mean the second job is not guilty since it also
On 1/14/21 8:37 AM, Horace Chen wrote:
Fix a racing issue when jobs on 2 rings timeout simultaneously.
If 2 rings timed out at the same time, the
amdgpu_device_gpu_recover will be reentered. Then the
adev->gmc.xgmi.head will be grabbed by 2 local linked lists,
which may cause wild pointer
On 1/14/21 12:11 AM, Chen, Xiaogang wrote:
On 1/12/2021 10:54 PM, Grodzovsky, Andrey wrote:
On 1/4/21 1:01 AM, Xiaogang.Chen wrote:
From: Xiaogang Chen
amdgpu DM handles INTERRUPT_LOW_IRQ_CONTEXT interrupt(hpd, hpd_rx) by
using work queue and uses single work_struct. If previous interrupt
On 1/13/21 4:14 AM, Christian König wrote:
Am 12.01.21 um 16:59 schrieb Andrey Grodzovsky:
On 1/12/21 7:32 AM, Christian König wrote:
Am 12.01.21 um 10:10 schrieb Daniel Vetter:
On Mon, Jan 11, 2021 at 03:45:10PM -0500, Andrey Grodzovsky wrote:
On 1/11/21 11:15 AM, Daniel Vetter wrote
On 1/4/21 1:01 AM, Xiaogang.Chen wrote:
From: Xiaogang Chen
amdgpu DM handles INTERRUPT_LOW_IRQ_CONTEXT interrupt(hpd, hpd_rx) by
using work queue and uses single work_struct. If previous interrupt
has not been handled new interrupts(same type) will be discarded and
driver just sends
On 1/12/21 7:32 AM, Christian König wrote:
Am 12.01.21 um 10:10 schrieb Daniel Vetter:
On Mon, Jan 11, 2021 at 03:45:10PM -0500, Andrey Grodzovsky wrote:
On 1/11/21 11:15 AM, Daniel Vetter wrote:
On Mon, Jan 11, 2021 at 05:13:56PM +0100, Daniel Vetter wrote:
On Fri, Jan 08, 2021 at 04:49
So, basically: allocate the page and pass it as a void * pointer to
drmm_add_action, with a release function that frees the page, right?
Andrey
On 1/12/21 4:10 AM, Daniel Vetter wrote:
drm_add_action_or_reset (for better control flow) has both a void * data
and a cleanup function (and it
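The pattern discussed in this exchange (hand a resource to the device together with a release callback, so it is freed when the device goes away) can be sketched in user space as follows. This is a model only, not the DRM managed-resource API: `model_device`, `model_add_action`, `model_add_action_or_reset`, `model_device_release` and `model_free_page` are hypothetical names, the fixed-size action table is a simplification, and running the actions in reverse registration order is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_ACTIONS 8

typedef void (*model_release_fn)(void *data);

/* Hypothetical device that remembers (release, data) pairs. */
struct model_device {
    struct {
        model_release_fn release;
        void *data;
    } actions[MODEL_MAX_ACTIONS];
    size_t count;
};

/* Register a cleanup action; returns 0 on success, -1 if the table is full. */
static int model_add_action(struct model_device *dev,
                            model_release_fn release, void *data)
{
    assert(dev != NULL && release != NULL);

    if (dev->count == MODEL_MAX_ACTIONS)
        return -1;

    dev->actions[dev->count].release = release;
    dev->actions[dev->count].data = data;
    dev->count++;
    return 0;
}

/* "or reset" flavour: if registration fails, release the data right away
 * so the caller's error path does not need its own cleanup. */
static int model_add_action_or_reset(struct model_device *dev,
                                     model_release_fn release, void *data)
{
    int ret = model_add_action(dev, release, data);

    if (ret)
        release(data);
    return ret;
}

/* Run all registered actions, newest first, when the device is released. */
static void model_device_release(struct model_device *dev)
{
    while (dev->count > 0) {
        dev->count--;
        dev->actions[dev->count].release(dev->actions[dev->count].data);
    }
}

/* Counts how many times the release callback ran, for illustration. */
static int model_pages_freed;

/* Release callback for the "dummy page": just free the allocation. */
static void model_free_page(void *data)
{
    free(data);
    model_pages_freed++;
}
```

A caller would allocate the page, pass it with `model_free_page` to `model_add_action_or_reset()`, and rely on `model_device_release()` to free it at teardown.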
So there's plenty of room still, if the size of a bo struct is really that
critical. Imo it's not.
Christian.
Am 08.01.21 um 15:46 schrieb Andrey Grodzovsky:
Daniel had some objections to this (see below) and so I guess I need
you both to agree on the approach before I proceed.
Andrey
On 1
plenty of room still, if the size of a bo struct is really that
critical. Imo it's not.
Christian.
Am 08.01.21 um 15:46 schrieb Andrey Grodzovsky:
Daniel had some objections to this (see below) and so I guess I need
you both to agree on the approach before I proceed.
Andrey
On 1/8/21 9:33 AM
Daniel had some objections to this (see below) and so I guess I need you both
to agree on the approach before I proceed.
Andrey
On 1/8/21 9:33 AM, Christian König wrote:
Am 08.01.21 um 15:26 schrieb Andrey Grodzovsky:
Hey Christian, just a ping.
Was there any question for me here?
As far
Hey Christian, just a ping.
Andrey
On 1/7/21 11:37 AM, Andrey Grodzovsky wrote:
On 1/7/21 11:30 AM, Daniel Vetter wrote:
On Thu, Jan 07, 2021 at 11:26:52AM -0500, Andrey Grodzovsky wrote:
On 1/7/21 11:21 AM, Daniel Vetter wrote:
On Tue, Jan 05, 2021 at 04:04:16PM -0500, Andrey Grodzovsky
On 1/7/21 11:30 AM, Daniel Vetter wrote:
On Thu, Jan 07, 2021 at 11:26:52AM -0500, Andrey Grodzovsky wrote:
On 1/7/21 11:21 AM, Daniel Vetter wrote:
On Tue, Jan 05, 2021 at 04:04:16PM -0500, Andrey Grodzovsky wrote:
On 11/23/20 3:01 AM, Christian König wrote:
Am 23.11.20 um 05:54 schrieb
Typo correction below
On 1/7/21 11:26 AM, Andrey Grodzovsky wrote:
Or is the idea to save the struct page * pointer? That feels a bit like
over-optimizing stuff. Better to have a simple implementation first and
then tune it if (and only if) any part of it becomes a problem for normal
usage
On 1/7/21 11:21 AM, Daniel Vetter wrote:
On Tue, Jan 05, 2021 at 04:04:16PM -0500, Andrey Grodzovsky wrote:
On 11/23/20 3:01 AM, Christian König wrote:
Am 23.11.20 um 05:54 schrieb Andrey Grodzovsky:
On 11/21/20 9:15 AM, Christian König wrote:
Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky
On 11/23/20 3:01 AM, Christian König wrote:
Am 23.11.20 um 05:54 schrieb Andrey Grodzovsky:
On 11/21/20 9:15 AM, Christian König wrote:
Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky:
Will be used to reroute CPU mapped BO's page faults once
device is removed.
Uff, one page for each
Hey Daniel, I am back from vacation and, going over our last long thread, I
think you didn't reply
to my last question below (or at least I can't find it).
Andrey
On 12/17/20 4:13 PM, Andrey Grodzovsky wrote:
Ok, so I assumed that with vmap_local you were trying to solve the problem of
quick
On 12/17/20 3:42 PM, Daniel Vetter wrote:
On Thu, Dec 17, 2020 at 8:19 PM Andrey Grodzovsky
wrote:
On 12/17/20 7:01 AM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 07:20:02PM -0500, Andrey Grodzovsky wrote:
On 12/16/20 6:15 PM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 7:26 PM Andrey
On 12/17/20 3:48 PM, Daniel Vetter wrote:
On Thu, Dec 17, 2020 at 9:38 PM Andrey Grodzovsky
wrote:
On 12/17/20 3:10 PM, Christian König wrote:
[SNIP]
By eliminating such users, and replacing them with local maps which
are strictly bound in how long they can exist (and hence we can
On 12/17/20 3:10 PM, Christian König wrote:
[SNIP]
By eliminating such users, and replacing them with local maps which
are strictly bound in how long they can exist (and hence we can
serialize against them finishing in our hotunplug code).
Not sure I see how serializing against BO map/unmap
On 12/17/20 7:01 AM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 07:20:02PM -0500, Andrey Grodzovsky wrote:
On 12/16/20 6:15 PM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 7:26 PM Andrey Grodzovsky
wrote:
On 12/16/20 12:12 PM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 5:18 PM
On 12/16/20 6:15 PM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 7:26 PM Andrey Grodzovsky
wrote:
On 12/16/20 12:12 PM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 5:18 PM Christian König
wrote:
Am 16.12.20 um 17:13 schrieb Andrey Grodzovsky:
On 12/16/20 9:21 AM, Daniel Vetter wrote
On 12/16/20 12:12 PM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 5:18 PM Christian König
wrote:
Am 16.12.20 um 17:13 schrieb Andrey Grodzovsky:
On 12/16/20 9:21 AM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 9:04 AM Christian König
wrote:
Am 15.12.20 um 21:18 schrieb Andrey Grodzovsky
On 12/16/20 9:21 AM, Daniel Vetter wrote:
On Wed, Dec 16, 2020 at 9:04 AM Christian König
wrote:
Am 15.12.20 um 21:18 schrieb Andrey Grodzovsky:
[SNIP]
While we can't control user application accesses to the mapped
buffers explicitly and hence we use page fault rerouting
I am thinking
On 11/24/20 11:44 AM, Christian König wrote:
Am 24.11.20 um 17:22 schrieb Andrey Grodzovsky:
On 11/24/20 2:41 AM, Christian König wrote:
Am 23.11.20 um 22:08 schrieb Andrey Grodzovsky:
On 11/23/20 3:41 PM, Christian König wrote:
Am 23.11.20 um 21:38 schrieb Andrey Grodzovsky:
On 11/23
PRIME callbacks
from struct drm_driver")
v2: Use amdgpu_gem_object_create() directly
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 8 --
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 41 -
2 files changed, 29 inserti
On 12/8/20 2:01 PM, Christian König wrote:
Am 08.12.20 um 19:52 schrieb Andrey Grodzovsky:
On 12/8/20 1:47 PM, Christian König wrote:
Am 08.12.20 um 19:44 schrieb Andrey Grodzovsky:
On 12/8/20 1:29 PM, Christian König wrote:
Am 08.12.20 um 19:26 schrieb Andrey Grodzovsky:
On 12/8/20 12
On 12/8/20 1:47 PM, Christian König wrote:
Am 08.12.20 um 19:44 schrieb Andrey Grodzovsky:
On 12/8/20 1:29 PM, Christian König wrote:
Am 08.12.20 um 19:26 schrieb Andrey Grodzovsky:
On 12/8/20 12:36 PM, Christian König wrote:
Am 08.12.20 um 18:10 schrieb Andrey Grodzovsky:
For BOs
On 12/8/20 1:29 PM, Christian König wrote:
Am 08.12.20 um 19:26 schrieb Andrey Grodzovsky:
On 12/8/20 12:36 PM, Christian König wrote:
Am 08.12.20 um 18:10 schrieb Andrey Grodzovsky:
For BOs imported from outside of amdgpu, setting of amdgpu_gem_object_funcs
was missing
On 12/8/20 12:36 PM, Christian König wrote:
Am 08.12.20 um 18:10 schrieb Andrey Grodzovsky:
For BOs imported from outside of amdgpu, setting of amdgpu_gem_object_funcs
was missing in amdgpu_dma_buf_create_obj. Fix by refactoring BO creation
and amdgpu_gem_object_funcs setting into single
: Remove obsolete GEM and PRIME callbacks from struct drm_driver
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 13 ++---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 22 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h | 5 +
3
On 12/7/20 2:19 PM, Christian König wrote:
Am 07.12.20 um 20:09 schrieb Andrey Grodzovsky:
On 12/7/20 1:04 PM, Christian König wrote:
Am 07.12.20 um 17:00 schrieb Andrey Grodzovsky:
On 12/7/20 6:13 AM, Christian König wrote:
Am 04.12.20 um 16:10 schrieb Andrey Grodzovsky:
On 12/4/20 3
On 12/7/20 1:04 PM, Christian König wrote:
Am 07.12.20 um 17:00 schrieb Andrey Grodzovsky:
On 12/7/20 6:13 AM, Christian König wrote:
Am 04.12.20 um 16:10 schrieb Andrey Grodzovsky:
On 12/4/20 3:13 AM, Christian König wrote:
Thinking more about that I came to the conclusion that the whole
On 12/7/20 6:13 AM, Christian König wrote:
Am 04.12.20 um 16:10 schrieb Andrey Grodzovsky:
On 12/4/20 3:13 AM, Christian König wrote:
Thinking more about that I came to the conclusion that the whole approach
here isn't correct.
See even when the job has been completed or canceled we still
.
Signed-off-by: Luben Tuikov
Reported-by: kernel test robot
Cc: Alexander Deucher
Cc: Andrey Grodzovsky
Cc: Christian König
Cc: Daniel Vetter
Cc: Lucas Stach
Cc: Russell King
Cc: Christian Gmeiner
Cc: Qiang Yu
Cc: Rob Herring
Cc: Tomeu Vizoso
Cc: Steven Price
Cc: Alyssa Rosenzweig
On 12/2/20 1:20 PM, Greg KH wrote:
On Wed, Dec 02, 2020 at 01:02:06PM -0500, Andrey Grodzovsky wrote:
On 12/2/20 12:34 PM, Greg KH wrote:
On Wed, Dec 02, 2020 at 10:48:01AM -0500, Andrey Grodzovsky wrote:
On 11/11/20 10:34 AM, Greg KH wrote:
On Wed, Nov 11, 2020 at 10:13:13AM -0500, Andrey
On 12/2/20 12:34 PM, Greg KH wrote:
On Wed, Dec 02, 2020 at 10:48:01AM -0500, Andrey Grodzovsky wrote:
On 11/11/20 10:34 AM, Greg KH wrote:
On Wed, Nov 11, 2020 at 10:13:13AM -0500, Andrey Grodzovsky wrote:
On 11/10/20 12:59 PM, Greg KH wrote:
On Tue, Nov 10, 2020 at 12:54:21PM -0500
On 11/11/20 10:34 AM, Greg KH wrote:
On Wed, Nov 11, 2020 at 10:13:13AM -0500, Andrey Grodzovsky wrote:
On 11/10/20 12:59 PM, Greg KH wrote:
On Tue, Nov 10, 2020 at 12:54:21PM -0500, Andrey Grodzovsky wrote:
Hi, back to this after a long context switch for some higher priority stuff.
So
On 11/27/20 9:59 AM, Daniel Vetter wrote:
On Wed, Nov 25, 2020 at 02:34:44PM -0500, Andrey Grodzovsky wrote:
On 11/25/20 11:36 AM, Daniel Vetter wrote:
On Wed, Nov 25, 2020 at 01:57:40PM +0100, Christian König wrote:
Am 25.11.20 um 11:40 schrieb Daniel Vetter:
On Tue, Nov 24, 2020 at 05:44
On 11/27/20 10:04 AM, Daniel Vetter wrote:
On Wed, Nov 25, 2020 at 12:39:47PM -0500, Andrey Grodzovsky wrote:
On 11/25/20 4:04 AM, Daniel Vetter wrote:
On Tue, Nov 24, 2020 at 11:27 PM Andrey Grodzovsky
wrote:
On 11/24/20 9:49 AM, Daniel Vetter wrote:
On Sat, Nov 21, 2020 at 12:21:20AM
On 11/24/20 10:17 PM, Luben Tuikov wrote:
The job timeout handler now returns status
indicating back to the DRM layer whether the job
was successfully cancelled or whether more time
should be given to the job to complete.
Signed-off-by: Luben Tuikov
---
On 11/25/20 11:36 AM, Daniel Vetter wrote:
On Wed, Nov 25, 2020 at 01:57:40PM +0100, Christian König wrote:
Am 25.11.20 um 11:40 schrieb Daniel Vetter:
On Tue, Nov 24, 2020 at 05:44:07PM +0100, Christian König wrote:
Am 24.11.20 um 17:22 schrieb Andrey Grodzovsky:
On 11/24/20 2:41 AM
On 11/25/20 5:41 AM, Daniel Vetter wrote:
On Tue, Nov 24, 2020 at 10:51:57AM -0500, Andrey Grodzovsky wrote:
On 11/24/20 9:53 AM, Daniel Vetter wrote:
On Sat, Nov 21, 2020 at 12:21:18AM -0500, Andrey Grodzovsky wrote:
Some of the stuff in amdgpu_device_fini such as HW interrupts
disable
On 11/25/20 4:04 AM, Daniel Vetter wrote:
On Tue, Nov 24, 2020 at 11:27 PM Andrey Grodzovsky
wrote:
On 11/24/20 9:49 AM, Daniel Vetter wrote:
On Sat, Nov 21, 2020 at 12:21:20AM -0500, Andrey Grodzovsky wrote:
Avoids NULL ptr due to kobj->sd being unset on device removal.
Signed-
On 11/24/20 9:49 AM, Daniel Vetter wrote:
On Sat, Nov 21, 2020 at 12:21:20AM -0500, Andrey Grodzovsky wrote:
Avoids NULL ptr due to kobj->sd being unset on device removal.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +++-
drivers/gpu/drm/amd/amd
On 11/24/20 12:11 PM, Luben Tuikov wrote:
On 2020-11-24 2:50 a.m., Christian König wrote:
Am 24.11.20 um 02:12 schrieb Luben Tuikov:
On 2020-11-23 3:06 a.m., Christian König wrote:
Am 23.11.20 um 06:37 schrieb Andrey Grodzovsky:
On 11/22/20 6:57 AM, Christian König wrote:
Am 21.11.20 um 06
On 11/24/20 2:41 AM, Christian König wrote:
Am 23.11.20 um 22:08 schrieb Andrey Grodzovsky:
On 11/23/20 3:41 PM, Christian König wrote:
Am 23.11.20 um 21:38 schrieb Andrey Grodzovsky:
On 11/23/20 3:20 PM, Christian König wrote:
Am 23.11.20 um 21:05 schrieb Andrey Grodzovsky:
On 11/25/20
On 11/24/20 9:53 AM, Daniel Vetter wrote:
On Sat, Nov 21, 2020 at 12:21:18AM -0500, Andrey Grodzovsky wrote:
Some of the stuff in amdgpu_device_fini such as HW interrupts
disable and pending fences finalization must be done right away on
pci_remove while most of the stuff which relates
On 11/23/20 3:41 PM, Christian König wrote:
Am 23.11.20 um 21:38 schrieb Andrey Grodzovsky:
On 11/23/20 3:20 PM, Christian König wrote:
Am 23.11.20 um 21:05 schrieb Andrey Grodzovsky:
On 11/25/20 5:42 AM, Christian König wrote:
Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky:
It's needed
On 11/23/20 3:20 PM, Christian König wrote:
Am 23.11.20 um 21:05 schrieb Andrey Grodzovsky:
On 11/25/20 5:42 AM, Christian König wrote:
Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky:
It's needed to drop iommu backed pages on device unplug
before device's IOMMU group is released.
It would
On 11/25/20 5:42 AM, Christian König wrote:
Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky:
It's needed to drop iommu backed pages on device unplug
before device's IOMMU group is released.
It would be cleaner if we could do the whole handling in TTM. I also need to
double check what you
On 11/22/20 6:57 AM, Christian König wrote:
Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky:
No point to try recovery if device is gone, it's meaningless.
I think that this should go into the device specific recovery function and not
in the scheduler.
The timeout timer is rearmed here, so
On 11/21/20 9:13 AM, Christian König wrote:
Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky:
Fixes oops.
That file doesn't even exist any more. What oops should this fix?
Which file ?
We set dma_address to NULL in every other place after unmap. This is so that
if dma address was already
On 11/21/20 9:15 AM, Christian König wrote:
Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky:
Will be used to reroute CPU mapped BO's page faults once
device is removed.
Uff, one page for each exported DMA-buf? That's not something we can do.
We need to find a different approach here.
Can't
We can't allocate and submit IBs post device unplug.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index
Handle all DMA IOMMU group related dependencies before the
group is removed.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 5
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
No point to try recovery if device is gone, it's meaningless.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +-
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 3 ++-
drivers/gpu/drm/lima/lima_sched.c | 3 ++-
drivers/gpu/drm/panfrost/panfrost_job.c
It's needed to drop iommu backed pages on device unplug
before device's IOMMU group is released.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/ttm/ttm_tt.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 1ccf1ef
the last device
reference is dropped.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 7 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 15
Use it to call display code dependent on device->drv_data
before it's set to NULL on device unplug
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 20
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 ++--
drivers/
Avoids NULL ptr due to kobj->sd being unset on device removal.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 4 +++-
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amd