This reverts commit 2046d46db9166bddc84778f0b3477f6d1e9068ea.
Not needed any more.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 3 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c | 4 ++--
Using the standard clear turned out to be too inflexible.
First of all it is executed on the system queue, together with buffer
moves, instead of on the per VM queue.
And second we need to fill in the page tables with more than just zero.
Signed-off-by: Christian König
---
This reverts commit 7bdc53f925af085ffa0580f10489f82b36cc2f1c and commit
330df03b3abf944f8f5180f2abc61367749984c0.
Neither is needed any more.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 7 ---
At least on x86-64 the upper range is purely used by the kernel;
avoid creating any ATS mappings there as a security precaution and to
allow proper page fault reporting in the upper range.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 83
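The "upper range" restriction described in this patch can be illustrated with a small check. This is only a sketch under the assumption of 48-bit canonical addressing on x86-64 (4-level paging), where the lower half ends at bit 47; the names `SKETCH_VA_BITS`, `SKETCH_LOWER_HALF_END` and `sketch_in_lower_half()` are hypothetical and are not taken from the amdgpu code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 48-bit virtual addresses (4-level paging). */
#define SKETCH_VA_BITS 48u
/* First address that is no longer part of the lower (user space) half. */
#define SKETCH_LOWER_HALF_END (UINT64_C(1) << (SKETCH_VA_BITS - 1))

/*
 * Hypothetical helper: returns true when addr lies in the lower half of
 * the address space, i.e. the only range where ATS mappings would be
 * created under the policy described above.
 */
static bool sketch_in_lower_half(uint64_t addr)
{
    return addr < SKETCH_LOWER_HALF_END;
}
```

A caller would skip ATS setup for any range for which this check fails, so that accesses in the upper range fault and get reported.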
Change-Id: I6535023db57d4ff5e79f7928595bf8ff1f60f23f
Signed-off-by: Chunming Zhou
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 4
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 3 +++
2 files changed, 7 insertions(+)
diff --git
There is a scheduling balance issue when getting a node, like this:
a. Process A allocates the full memory and uses it for submission.
b. Process B tries to allocate memory and waits in eviction for process A's BO
to become idle.
c. Process A completes the job; process B's eviction will put process A's BO node,
but in the
Add unknown mode instead of manual mode.
pp_dpm_sclk/pp_dpm_mclk/pp_dpm_pcie and power_dpm_force_performance_level
all change the clock range.
pp_dpm_sclk/pp_dpm_mclk/pp_dpm_pcie change the clock ranges separately.
power_dpm_force_performance_level can change all clock ranges to
some defined
I am off work, so I am replying by phone; the format may not be plain text.
Back to the topic itself:
the problem does happen with the amdgpu driver. Someone reported to me that when an
application runs as two instances, their performance differs.
I also reproduced the issue with a unit test (bo_eviction_test). They
After my investigation, this issue appears to be a defect of the TTM design
itself, which breaks scheduling balance.
Yeah, but again. This is the intended design, which we can't change easily.
Regards,
Christian.
Am 26.01.2018 um 13:36 schrieb Zhou, David(ChunMing):
I am off work, so reply mail by phone, the format
I know, but this has the same effect. You prevent concurrent allocation
from happening.
What we could do is to pipeline the reuse of deleted memory as well. This
makes it less likely to cause the problem you are seeing because the
evicting processes don't need to block for deleted BOs any
That got mixed up with the encode ring function.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 19 ++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
Yes, exactly that's the problem.
See when you want to prevent a process B from allocating the memory
process A has evicted, you need to prevent all concurrent allocation.
And we don't do that because it causes a major performance drop.
Regards,
Christian.
Am 26.01.2018 um 14:21 schrieb
Your patch will prevent concurrent allocation and will cause a large drop in
allocation performance.
Sent from Nut Pro
Christian König wrote on January 26, 2018 at 9:04 PM:
Attached is what you actually want to do cleanly implemented. But as I said
this is a NO-GO.
Regards,
I don't want to prevent all of it. My new approach is to prevent a later allocation
from jumping ahead of an earlier one and taking the memory space that the earlier
one freed up through eviction.
Sent from Nut Pro
Christian König wrote on January 26, 2018 at 9:24 PM:
Yes, exactly that's the problem.
See
Am 26.01.2018 um 15:31 schrieb Tom St Denis:
Hi all,
In the function ttm_bo_vm_access_kmap() it doesn't seem to me like the
'buf' pointer is incremented. That seems like a bug no?
Yeah, looks suspicious to me as well. But TTM questions should CC the
dri list as well.
And in this
I just realized that a change I'm thinking about for a while would solve
your problem as well, but keep concurrent allocation possible.
See ttm_mem_evict_first() unlocks the BO after evicting it:
ttm_bo_del_from_lru(bo);
spin_unlock(&glob->lru_lock);
ret = ttm_bo_evict(bo,
On 26/01/18 09:38 AM, Christian König wrote:
Am 26.01.2018 um 15:31 schrieb Tom St Denis:
Hi all,
In the function ttm_bo_vm_access_kmap() it doesn't seem to me like the
'buf' pointer is incremented. That seems like a bug no?
Yeah, looks suspicious to me as well. But TTM questions should CC
Hi all,
In the function ttm_bo_vm_access_kmap() it doesn't seem to me like the
'buf' pointer is incremented. That seems like a bug no?
Cheers,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
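Why a missing increment of 'buf' matters in a page-by-page copy loop can be shown with a small userspace sketch. This is not the TTM function itself; `sketch_paged_read()`, `SKETCH_PAGE_SIZE` and the page array are hypothetical stand-ins, and the loop only models the general pattern of copying a range that spans several pages.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Copy len bytes, starting at byte offset `offset` of a paged object,
 * into buf. `pages` is an array of pointers to page-sized chunks.
 */
static void sketch_paged_read(unsigned char *const *pages, size_t offset,
                              unsigned char *buf, size_t len)
{
    size_t page = offset / SKETCH_PAGE_SIZE;
    size_t in_page = offset % SKETCH_PAGE_SIZE;

    while (len > 0) {
        /* Never copy past the end of the current page. */
        size_t bytes = SKETCH_PAGE_SIZE - in_page;

        if (bytes > len)
            bytes = len;

        memcpy(buf, pages[page] + in_page, bytes);

        /*
         * Advance the destination. Without this step every chunk
         * would overwrite the start of buf, which is the kind of bug
         * the question above is about.
         */
        buf += bytes;
        len -= bytes;
        page++;
        in_page = 0;
    }
}
```

Reading 8 bytes that straddle a page boundary only yields the expected data when the destination pointer is advanced after each chunk.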
Hi Felix,
>That would make sense. But switching to manual mode would disable
>profiles and automatic profile selection. That was one reason why I
>objected to your plan to control profile clock limits using these files.
Rex:
I am not very clear on the old logic of gfx/compute power profile
Attached is what you actually want to do cleanly implemented. But as I
said this is a NO-GO.
Regards,
Christian.
Am 26.01.2018 um 13:43 schrieb Christian König:
After my investigation, this issue should be detect of TTM design
self, which breaks scheduling balance.
Yeah, but again. This is
Change-Id: I222f4bb2c9a91c7a4764e6aa706e7d7f2e6d948d
Signed-off-by: Shaoyun Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 19 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 6 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
3 files
Am 26.01.2018 um 18:38 schrieb Shaoyun Liu:
Change-Id: I222f4bb2c9a91c7a4764e6aa706e7d7f2e6d948d
Signed-off-by: Shaoyun Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 19 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 6 ++
If we did use the force flag and amdgpu_gpu_recovery = 0, the reset would be
ignored. I would prefer that this reset can go through, as with SR-IOV. If we
depend on the parameter amdgpu_gpu_recovery, it may think the GPU is hung
and trigger the GPU reset when ROCm submits some heavy compute
Sorry, I meant if I use the "false" flag and gpu_recovery is not enabled,
the reset will be ignored.
-Original Message-
From: Liu, Shaoyun
Sent: Friday, January 26, 2018 12:54 PM
To: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 3/3] drm/amdgpu: reset kfd
Change-Id: Id2d38642bec9ea1e5fd471e5b8aff027a08438b0
Signed-off-by: Shaoyun Liu
---
drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 10 ++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
Change-Id: If0babeaeb5237da6ce4d5c400e7df649cba5a8ac
Signed-off-by: Shaoyun Liu
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 10 ++
drivers/gpu/drm/amd/amdkfd/kfd_module.c | 2 ++
2 files changed, 12 insertions(+)
diff --git
Power containment has two parts, TDC limit and power limit. Only
changing power limit is not equal to disabling power containment.
Power containment in Fiji degrades compute performance, so we disable it
to fix that. This has been verified by a compute performance test. If you want to
replace it, it
Hi Rex,
I think I understand what you're trying to do. To summarize my concerns,
there are two reasons I'm against your plan:
1. You're breaking the semantics of the existing pp_dpm_sclk/mclk/pcie
interfaces, which affects existing tools
2. You're taking the clock limits out of the power
On 2018-01-26 09:40 AM, Tom St Denis wrote:
> On 26/01/18 09:38 AM, Christian König wrote:
>> Am 26.01.2018 um 15:31 schrieb Tom St Denis:
>>> Hi all,
>>>
>>> In the function ttm_bo_vm_access_kmap() it doesn't seem to me like
>>> the 'buf' pointer is incremented. That seems like a bug no?
>>
>>
Sorry, I meant if I use the "false" flag and gpu_recovery is not enabled,
the reset will be ignored.
And exactly that is the intention here. So please use the false flag,
apart from that the patch looks good to me.
Regards,
Christian.
Am 26.01.2018 um 18:56 schrieb Liu, Shaoyun:
Sorry
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 --
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index 373ced0b2fc2..fa44f7b15285 100644
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_bo_vm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 07b22f04b969..6311f8a481ea 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index 579c4aedc17e..aa1ec35dc187 100644
---
Instead of "fix indentation" better write "fix coding style" and add
some commit message to each patch. Something like "No functional
change..." for the style changes should be ok.
In addition to that, please move patch #11 to the top of the list and
triple check in patch #10 that this is
OK, the wrong hang detection when amdgpu_gpu_recovery is enabled may be another
issue; we can fix it later.
Changed to the 'false' flag as suggested and submitted.
Regards
Shaoyun.liu
-Original Message-
From: Koenig, Christian
Sent: Friday, January 26, 2018 1:28 PM
To: Liu, Shaoyun;
On 26/01/18 01:38 PM, Christian König wrote:
Instead of "fix indentation" better write "fix coding style" and add
some commit message to each patch. Something like "No functional
change..." for the style changes should be ok.
Additional to that please move patch #11 to the top of the list and
It could be done in the next step.
I just noticed my change is based on amd-kfd-staging; I will add you as
reviewer again.
Regards
Shaoyun.liu
-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: Friday, January 26, 2018 1:42 PM
To: Liu,
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_bo.c | 7 ++-
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d33a6bb742a1..8cf89da7030d 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_bo_util.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 153de1bf0232..33ffe286f3a5 100644
---
Explicitly return errors in ttm_tt_alloc_page_directory() and
ttm_dma_tt_alloc_page_directory() instead of relying on
further logic to detect errors.
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_tt.c | 16 ++--
1 file changed, 10 insertions(+), 6
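The idea of returning the error explicitly from the allocation helper, rather than leaving a NULL pointer for later code to notice, might look roughly like the userspace sketch below. The struct and function names are hypothetical and `calloc()` stands in for the kernel allocator; this is not the actual TTM code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a translation table object. */
struct sketch_tt {
    void **pages;      /* page directory, one entry per page */
    size_t num_pages;  /* number of entries to allocate */
};

/*
 * Allocate the page directory and report failure directly to the caller
 * with a negative errno value. Note that calloc() may return NULL for a
 * zero-sized request, so this sketch assumes num_pages > 0.
 */
static int sketch_alloc_page_directory(struct sketch_tt *tt)
{
    tt->pages = calloc(tt->num_pages, sizeof(*tt->pages));
    if (!tt->pages)
        return -ENOMEM;

    return 0;
}
```

The caller can then bail out immediately on a nonzero return value instead of testing `tt->pages` at some later point.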
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 12 +---
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index 962838cfb1a3..579c4aedc17e
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_bo.c | 14 +-
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 8cf89da7030d..4e85c32fea26 100644
---
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index 647eb5f40ab9..962838cfb1a3 100644
---
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_tt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index e90d3ed6283f..95a77dab8cc9 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_bo_vm.c | 28 +---
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 08a3c324242e..07b22f04b969 100644
---
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index 9e90d0ebc773..647eb5f40ab9 100644
This series consists mostly of changes with no functional effect that simplify
or clean up various routines.
Patch #11 includes a fix to functional behaviour.
Alternatively if you really want to add some distinction here you could
add an enum like trigger source.
Something like manual, SRIOV host, GPU scheduler, KFD, interrupt etc...
And then use the user configurable option as bitmask to enable/disable
GPU recovery for each trigger source.
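The trigger source suggestion could be sketched as follows. The enum values mirror the examples listed in the mail, but every identifier here is hypothetical; the sketch assumes that the user configurable option is read as a plain bitmask with one bit per trigger source.

```c
#include <stdbool.h>

/* Hypothetical trigger sources, taken from the examples in the mail. */
enum sketch_reset_source {
    SKETCH_RESET_MANUAL = 0,
    SKETCH_RESET_SRIOV_HOST,
    SKETCH_RESET_GPU_SCHEDULER,
    SKETCH_RESET_KFD,
    SKETCH_RESET_INTERRUPT,
};

/* One bit per trigger source in the user configurable mask. */
static unsigned int sketch_reset_bit(enum sketch_reset_source src)
{
    return 1u << src;
}

/* GPU recovery runs only when the bit for its trigger source is set. */
static bool sketch_recovery_allowed(unsigned int mask,
                                    enum sketch_reset_source src)
{
    return (mask & sketch_reset_bit(src)) != 0;
}
```

With such a mask a user could, for example, allow recovery requested by KFD while leaving scheduler-triggered recovery disabled.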
On 2018-01-26 01:29 PM, Tom St Denis wrote:
> Signed-off-by: Tom St Denis
> ---
> drivers/gpu/drm/ttm/ttm_bo_vm.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index 07b22f04b969..6311f8a481ea
>1. You're breaking the semantics of the existing pp_dpm_sclk/mclk/pcie
>interfaces, which affects existing tools
Rex: I don't think the patch will affect existing tools.
The user sets "manual" in power_performance_level and then changes the clock range
through pp_dpm_sclk/mclk/pcie.
with
On 2018-01-26 01:29 PM, Tom St Denis wrote:
> Signed-off-by: Tom St Denis
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 14 +-
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index
On 2018-01-26 01:29 PM, Tom St Denis wrote:
> Signed-off-by: Tom St Denis
> ---
> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 12 +---
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>
On 2018-01-26 02:20 PM, Zhu, Rex wrote:
>
> >1. You're breaking the semantics of the existing pp_dpm_sclk/mclk/pcie
> > interfaces, which affects existing tools
>
>
> Rex: I don't think the patch will affects existing tools.
>
>
> User set "manual" to power_performance_level, and then change
On Fri, Jan 26, 2018 at 02:23:39PM -0500, Felix Kuehling wrote:
> On 2018-01-26 01:29 PM, Tom St Denis wrote:
> > Signed-off-by: Tom St Denis
> > ---
> > drivers/gpu/drm/ttm/ttm_bo.c | 14 +-
> > 1 file changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git
Hi Shaoyun,
As I'm upstreaming GPUVM support for KFD right now, a bunch of
amdgpu_amdkfd_... changes are going to go through Oded's tree. To avoid
conflicts when merging with Alex's tree, I'd recommend we don't submit
your changes to amd-staging-drm-next.
Instead submit them to amd-kfd-staging,
Yes, understood. I already rebased the changes on amd-kfd-staging and added
you and Christian as reviewers.
Regards
Shaoyun.liu
-Original Message-
From: Kuehling, Felix
Sent: Friday, January 26, 2018 2:47 PM
To: Liu, Shaoyun; Koenig, Christian; amd-gfx@lists.freedesktop.org
That got mixed up with the encode ring function.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 19 ++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>Existing tools and users expect that switching back to auto removes the
>manual clock settings. If you allow changing the clock in auto mode,
>that won't happen any more.
I have sent patch v2 to fix this problem. The user can switch back to auto mode and
all manual clock settings will be removed.
The amdgpu_vm_clear_bo function takes over this functionality in the
first patch.
This patch only limits filling in the ATS values to the lower half of
the address range (previously they were filled in for the whole address space).
Regards,
Christian.
Am 26.01.2018 um 21:18 schrieb Felix
I think we have two use cases for the profiles:
1. automatic profile switching for different driver use cases
2. manually tweaking profiles/clocks/power for testing
How about we make the profile selection dependent on selecting the
manual force_performance_level and not add an auto to the
That got mixed up with the encode ring function.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 19 ++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
Add emit_reg_wait implementation for VCE v4.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 22 ++
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
Add an IOCTL to enable SVM for the current process.
One step further towards HMM support.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 94 +++--
Shouldn't this change come before all the reverts? Otherwise you're
briefly breaking ATS support on Raven for KFD.
Regards,
Felix
On 2018-01-26 05:04 AM, Christian König wrote:
> At least on x86-64 the upper range is purely used by the kernel,
> avoid creating any ATS mappings there as
Yeah, good point. I should better note that in the first patch.
Christian.
Am 26.01.2018 um 21:24 schrieb Felix Kuehling:
So the first patch is not a straight revert, although the title looks
like it is. I'll read the first patch more carefully.
On 2018-01-26 03:21 PM, Christian König wrote:
Add emit_reg_wait implementation for VCN v1.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 29 +
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
Allow us to bind a PASID to the current process and unbind it when the
VM isn't used any more.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 28 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h | 3 +++
Allows us to wait for a register value/mask on a ring.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 ++
2 files changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
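The condition behind "wait for a register value/mask" can be modelled on the CPU side as below. This is only an illustration of the comparison `(reg & mask) == val`; the real callback would emit a command into the ring so that the engine does the polling, and the name `sketch_reg_wait()` and the `max_polls` parameter are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Poll a register until the masked value matches val, or give up after
 * max_polls reads. Returns true when the condition was met.
 */
static bool sketch_reg_wait(const volatile uint32_t *reg, uint32_t val,
                            uint32_t mask, unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        /* Only the bits selected by mask take part in the comparison. */
        if ((*reg & mask) == val)
            return true;
    }

    return false;
}
```

Bits outside the mask are ignored, which is what allows waiting on a single status field inside a wider register.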
Add emit_reg_wait implementation for SDMA v4.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 15 +++
1 file changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
Update the PASID in the ATC as well and wait for the update to finish.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 21 -
drivers/gpu/drm/amd/amdgpu/soc15.h| 4 ++--
2 files changed, 22 insertions(+), 3
Add emit_reg_wait implementation for UVD v7.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 40 +--
1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
Completely pointless, it is the same reservation object as the root PD
anyway.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 -
1 file changed, 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
Keep that at a common place instead of spread over all engines.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 19 +--
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 4
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 18
Implement emit_reg_wait for gfx v9.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 9 +
1 file changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index
Move amd_iommu_v2 initialization into amdgpu when it is enabled.
This is WIP and really ugly since amdgpu should not depend directly on
amd_iommu_v2.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/Kconfig | 8
Reviewed-by: Alex Deucher
From: amd-gfx on behalf of Christian
König
Sent: Friday, January 26, 2018 8:24 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH]
So the first patch is not a straight revert, although the title looks
like it is. I'll read the first patch more carefully.
On 2018-01-26 03:21 PM, Christian König wrote:
> The amdgpu_vm_clear_bo function takes over this functionality in the
> first patch.
>
> This patch only limits filling in
On 01/25/2018 05:06 PM, Michel Dänzer wrote:
On 2018-01-24 06:35 PM, Mario Kleiner wrote:
On 01/22/2018 07:01 PM, Michel Dänzer wrote:
On 2018-01-22 03:14 AM, Mario Kleiner wrote:
...
Having the extension enabled in the ddx is needed for any test to work.
But when i set a breakpoint on
From: Yong Zhao
The extra fields in struct kgd_mem aren't actually needed. This struct
will be used for GPUVM allocations later.
Signed-off-by: Yong Zhao
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
Add GPUVM size and DRM render node. Also add function to query the
VMID mask to avoid hard-coding it in multiple places later.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 19 +--
When the TTM memory manager in KGD evicts BOs, all user mode queues
potentially accessing these BOs must be evicted temporarily. Once
user mode queues are evicted, the eviction fence is signaled,
allowing the migration of the BO to proceed.
A delayed worker is scheduled to restore all the BOs
I split this into an AMDGPU and AMDKFD part. The bigger patches that
add lots of new code are not cherry-picked and squashed. Instead I
copied, reorganized and cleaned up the code by hand and then split it
into some semblance of a sensible history. I acknowledged major
contributors with
The events page must be accessible in user mode by the GPU and CPU
as well as in kernel mode by the CPU. On dGPUs user mode virtual
addresses are managed by the Thunk's GPU memory allocation code.
Therefore we can't allocate the memory in kernel mode like we do
on APUs. But KFD still needs to map
From: Oak Zeng
Populate DRM render device minor in kfd topology
Signed-off-by: Oak Zeng
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4
drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 +
2 files
Create/destroy the GPUVM context during PDD creation/destruction.
Get VM page table base and program it during process registration
(HWS) or VMID allocation (non-HWS).
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 20 +
One minor nit-pick inline. Other than that and the confusing headline on
Patch 1 the series is Reviewed-by: Felix Kuehling
On 2018-01-26 05:04 AM, Christian König wrote:
> At least on x86-64 the upper range is purely used by the kernel,
> avoid creating any ATS mappings
Also used for cleaning up on process termination.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 11 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 66
2 files changed, 77 insertions(+)
diff --git
Add helpers for allocating GPUVM memory in kernel mode and use them
to allocate memory for the CWSR trap handler.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 125 +++
1 file changed, 112 insertions(+), 13
Set up the GPUVM aperture for SVM (shared virtual memory) that allows
sharing a part of virtual address space between GPUs and CPUs.
Report the GPUVM size supported by KGD accurately.
The low part of the GPUVM aperture is reserved for kernel use. This is
for kernel-allocated
Currently the number of GPUs is limited by aperture placement options
available on GFX7 and GFX8 hardware. This limitation is not necessary.
Scratch and LDS represent per-work-item and per-work-group storage
respectively. Different work-items and work-groups use the same virtual
address to access
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c| 329
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 8 +
drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 2 +
include/uapi/linux/kfd_ioctl.h
From: Harish Kasiviswanathan
Unaligned atomic operations can cause problems on some CPU
architectures. Use simpler bitmask operations instead. Atomic bit
manipulations are not necessary since dqm->lock is held during these
operations.
Signed-off-by: Harish
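The switch from atomic bit manipulation to plain bitmask operations can be sketched like this. It assumes, as the commit message states, that a lock is already held around these operations, so the sketch contains no locking of its own; `struct sketch_dqm` and both helpers are hypothetical names, not the KFD code.

```c
#include <stdint.h>

/*
 * Hypothetical allocator state. The caller is assumed to hold the lock
 * that protects it, which is why no atomic operations are needed.
 */
struct sketch_dqm {
    uint32_t free_mask; /* bit i set means slot i is free */
};

/* Take the lowest free slot; returns its index or -1 if none is free. */
static int sketch_alloc_slot(struct sketch_dqm *dqm)
{
    for (int i = 0; i < 32; i++) {
        uint32_t bit = UINT32_C(1) << i;

        if (dqm->free_mask & bit) {
            dqm->free_mask &= ~bit; /* plain read-modify-write */
            return i;
        }
    }

    return -1;
}

/* Mark a slot as free again. */
static void sketch_free_slot(struct sketch_dqm *dqm, int slot)
{
    dqm->free_mask |= UINT32_C(1) << slot;
}
```

Because the mask is an ordinary 32-bit integer, its alignment is whatever the compiler gives the struct member, and no architecture-specific atomic instruction is involved.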
On GFX7 the CP does not perform a TC flush when queues are unmapped.
To avoid TC eviction from accessing an invalid VMID, flush it
explicitly before releasing a VMID.
Signed-off-by: Amber Lin
Signed-off-by: Felix Kuehling
---
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 43c89c5..a527c22 100644
---
This can be used for flushing caches when not using the HWS.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 55 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 4 ++
Dereferencing NULL pointers will cause a BUG anyway. No need to do
an explicit check.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 6 --
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 2 --
Cloning a sync object is useful for waiting for a sync object
without locking the original structure indefinitely, blocking
other threads.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 35
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index
Simulate large-BAR system by exporting only visible memory. This
limits the amount of available VRAM to the size of the BAR, but
enables CPU access to VRAM.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 +