Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-26 Thread Andy Furniss
With current drm-next-4.9-wip plus "drm/amdgpu: fix addr handling in
amdgpu_vm_bo_update_mapping",

testdma is now working OK for my R9 285 - just did 35k tests while
simultaneously running Valley/Heaven, plus messing with
powerplay/cpufreq, and all is good: no lockups, no VM faults.


Andy Furniss wrote:

Christian König wrote:

On 22.09.2016 at 23:54, Andy Furniss wrote:

Marek Olšák wrote:

This breaks Tonga such that it hangs. Reproducible quickly with:

R600_DEBUG=testdma glxgears

It's a randomized test that runs forever. It should hang within 2
seconds.


So what is the status of this now?


The status is that I've been hammering my head on the desk for two weeks
now trying to figure out what the heck is going wrong here.


Oh, OK, I was just checking that no one thought it was really fixed.



What I've found so far is that a VM update command doesn't have the
result it should have, but I absolutely don't understand what happens
here.

The good news is that it's 100% reproducible and I already found and
fixed two bugs.

If you have any ideas or more testing results on what is going wrong here,
please let me know.


I guess you already did this anyway, or it doesn't help, but

R600_DEBUG=testdma,check_vm glxgears

does catch a VM fault and writes output to ddebug_dumps.
It initially saves me from locking up, though I will still lock up on repeated runs.



If we can't fix this before the 4.9 release we are just going to revert
the patch causing this.

Regards,
Christian.



R600_DEBUG=testdma glxgears isn't a test I had run on my R9 285 (Tonga)
until, by luck, I saw this.

On agd5f's 4.9-wip at post time it did hang/reboot after 2 seconds.

On every update since, it has still hung, though now it takes longer.

Today's update, tested both at head and reset to the drm-next tag,
still hangs.

Going back over older 4.9-wips, one built on 26th August is the first
that doesn't hang.





Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-23 Thread Andy Furniss

Christian König wrote:

On 22.09.2016 at 23:54, Andy Furniss wrote:

Marek Olšák wrote:

This breaks Tonga such that it hangs. Reproducible quickly with:

R600_DEBUG=testdma glxgears

It's a randomized test that runs forever. It should hang within 2
seconds.


So what is the status of this now?


The status is that I've been hammering my head on the desk for two weeks
now trying to figure out what the heck is going wrong here.


Oh, OK, I was just checking that no one thought it was really fixed.



What I've found so far is that a VM update command doesn't have the
result it should have, but I absolutely don't understand what happens here.

The good news is that it's 100% reproducible and I already found and
fixed two bugs.

If you have any ideas or more testing results on what is going wrong here,
please let me know.


I guess you already did this anyway, or it doesn't help, but

R600_DEBUG=testdma,check_vm glxgears

does catch a VM fault and writes output to ddebug_dumps.
It initially saves me from locking up, though I will still lock up on repeated runs.



If we can't fix this before the 4.9 release we are just going to revert
the patch causing this.

Regards,
Christian.



R600_DEBUG=testdma glxgears isn't a test I had run on my R9 285 (Tonga)
until, by luck, I saw this.

On agd5f's 4.9-wip at post time it did hang/reboot after 2 seconds.

On every update since, it has still hung, though now it takes longer.

Today's update, tested both at head and reset to the drm-next tag,
still hangs.

Going back over older 4.9-wips, one built on 26th August is the first
that doesn't hang.





Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-23 Thread Christian König

On 22.09.2016 at 23:54, Andy Furniss wrote:

Marek Olšák wrote:

This breaks Tonga such that it hangs. Reproducible quickly with:

R600_DEBUG=testdma glxgears

It's a randomized test that runs forever. It should hang within 2 
seconds.


So what is the status of this now?


The status is that I've been hammering my head on the desk for two weeks
now trying to figure out what the heck is going wrong here.


What I've found so far is that a VM update command doesn't have the 
result it should have, but I absolutely don't understand what happens here.


The good news is that it's 100% reproducible and I already found and 
fixed two bugs.


If you have any ideas or more testing results on what is going wrong here,
please let me know.


If we can't fix this before the 4.9 release we are just going to revert 
the patch causing this.


Regards,
Christian.



R600_DEBUG=testdma glxgears isn't a test I had run on my R9 285 (Tonga)
until, by luck, I saw this.

On agd5f's 4.9-wip at post time it did hang/reboot after 2 seconds.

On every update since, it has still hung, though now it takes longer.

Today's update, tested both at head and reset to the drm-next tag,
still hangs.

Going back over older 4.9-wips, one built on 26th August is the first
that doesn't hang.





Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-22 Thread Andy Furniss

Marek Olšák wrote:

This breaks Tonga such that it hangs. Reproducible quickly with:

R600_DEBUG=testdma glxgears

It's a randomized test that runs forever. It should hang within 2 seconds.


So what is the status of this now?

R600_DEBUG=testdma glxgears isn't a test I had run on my R9 285 (Tonga)
until, by luck, I saw this.

On agd5f's 4.9-wip at post time it did hang/reboot after 2 seconds.

On every update since, it has still hung, though now it takes longer.

Today's update, tested both at head and reset to the drm-next tag,
still hangs.

Going back over older 4.9-wips, one built on 26th August is the first
that doesn't hang.





Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-17 Thread Marek Olšák
This breaks Tonga such that it hangs. Reproducible quickly with:

R600_DEBUG=testdma glxgears

It's a randomized test that runs forever. It should hang within 2 seconds.

Marek

On Tue, Sep 6, 2016 at 11:41 AM, Christian König wrote:
> From: Christian König 
>
> We don't really need the GTT table any more most of the time. So bind it
> only on demand.
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 34 
> --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 33 ++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|  3 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c|  4 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c|  6 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  3 ++-
>  8 files changed, 84 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 9d39fa8..28d4a67 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -2521,6 +2521,7 @@ static inline void amdgpu_acpi_fini(struct 
> amdgpu_device *adev) { }
>  struct amdgpu_bo_va_mapping *
>  amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
>uint64_t addr, struct amdgpu_bo **bo);
> +int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser);
>
>  #if defined(CONFIG_DRM_AMD_DAL)
>  int amdgpu_dm_display_resume(struct amdgpu_device *adev );
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 20a1962..e069978 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -640,8 +640,12 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
> *p,
> }
> }
>
> -   if (p->uf_entry.robj)
> -   p->job->uf_addr += amdgpu_bo_gpu_offset(p->uf_entry.robj);
> +   if (!r && p->uf_entry.robj) {
> +   struct amdgpu_bo *uf = p->uf_entry.robj;
> +
> +   r = amdgpu_ttm_bind(uf->tbo.ttm, &uf->tbo.mem);
> +   p->job->uf_addr += amdgpu_bo_gpu_offset(uf);
> +   }
>
>  error_validate:
> if (r) {
> @@ -1156,3 +1160,29 @@ amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
>
> return NULL;
>  }
> +
> +/**
> + * amdgpu_cs_sysvm_access_required - make BOs accessible by the system VM
> + *
> + * @parser: command submission parser context
> + *
> + * Helper for UVD/VCE VM emulation, make sure BOs are accessible by the 
> system VM.
> + */
> +int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser)
> +{
> +   unsigned i;
> +   int r;
> +
> +   if (!parser->bo_list)
> +   return 0;
> +
> +   for (i = 0; i < parser->bo_list->num_entries; i++) {
> +   struct amdgpu_bo *bo = parser->bo_list->array[i].robj;
> +
> +   r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
> +   if (unlikely(r))
> +   return r;
> +   }
> +
> +   return 0;
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index b17734e..dc73f11 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -646,6 +646,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 
> domain,
> dev_err(bo->adev->dev, "%p pin failed\n", bo);
> goto error;
> }
> +   r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
> +   if (unlikely(r)) {
> +   dev_err(bo->adev->dev, "%p bind failed\n", bo);
> +   goto error;
> +   }
>
> bo->pin_count = 1;
> if (gpu_addr != NULL)
> @@ -918,6 +923,8 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct fence 
> *fence,
>  u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
>  {
> WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
> +   WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_TT &&
> +!amdgpu_ttm_is_bound(bo->tbo.ttm));
> WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
>  !bo->pin_count);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 4337b3a..c294aa9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -256,8 +256,12 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
> new_start = (u64)new_mem->start << PAGE_SHIFT;
>
> switch (old_mem->mem_type) {
> -   case TTM_PL_VRAM:
> case TTM_PL_TT:
> +   r = amdgpu_ttm_bind(bo->ttm, old_mem);
> +   if (r)
> +   return r;
> +
> +   case TTM_PL_VRAM:
>  

Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-07 Thread Felix Kuehling
Yeah, you save yourself the GTT update, which always uses the CPU.
Instead you write the page table entries to IBs using the CPU. Will be
interesting to see if that's a net speed-up or slow-down, if the
difference is significant at all.
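
(To make that concrete - a toy comparison with made-up names, not driver
code: in both cases the CPU produces one 64-bit entry per page, the
difference is only where it writes them:

	for (i = 0; i < num_pages; ++i)
		gart_table[slot + i] = pte_of(dma_addr[i]);	/* old: CPU fills the GART table */

	for (i = 0; i < num_pages; ++i)
		ib[i] = pte_of(dma_addr[i]);	/* new: CPU fills an IB, the GPU applies it to the VM */

where gart_table, ib and pte_of() are placeholders.)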

The patch looks good to me. Thanks for tackling this. I didn't review it
too thoroughly, and if you forgot to map to GART somewhere important, I
would probably have missed it.

Acked-by: Felix Kuehling 

Once we pull this change into the KFD branch, we can probably revert our
ridiculous GART size without losing the ability to map lots of system
memory to GPUVM. :)

Regards,
  Felix

On 16-09-07 04:18 AM, Christian König wrote:
> On 07.09.2016 at 09:18, zhoucm1 wrote:
>>
>>
>> On 2016-09-06 18:48, Christian König wrote:
>>> On 06.09.2016 at 11:53, zhoucm1 wrote:


 On 2016-09-06 17:41, Christian König wrote:
> From: Christian König 
>
> We don't really need the GTT table any more most of the time.
 Why? I thought a GTT BO always needs to be bound when the GPU is
 trying to access it, doesn't it?
>>>
>>> We only need it to be bound when we try to access it from the system
>>> VM (UVD/VCE, ring buffers, fences etc...).
>>>
>>> The VMs for each process can work without it and it is just overhead
>>> to bind it all the time.
>> Yes, I see what you mean; copying PTEs from GART to VM confused me, I
>> thought binding the GTT table was a must, but that's wrong.
>> Going through the vm code again, the PTEs can be created by
>> amdgpu_vm_map_gart if there is no GTT PTE source, when we don't bind it
>> at all under the per-process VM - am I right now?
>
> Yes exactly. I still need to measure the performance impact of that,
> adding the copy from the GTT made quite a difference when I introduced
> that.
>
> I hope that doing the write when we update the VM is still faster
> than doing the GTT write.
>
> Regards,
> Christian.
>
>>
>> Regards,
>> David Zhou
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>

 Regards,
 David Zhou
>   So bind it
> only on demand.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 34
> --
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 33
> ++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|  3 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c|  4 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c|  6 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  3 ++-
>   8 files changed, 84 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 9d39fa8..28d4a67 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -2521,6 +2521,7 @@ static inline void amdgpu_acpi_fini(struct
> amdgpu_device *adev) { }
>   struct amdgpu_bo_va_mapping *
>   amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
>  uint64_t addr, struct amdgpu_bo **bo);
> +int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser
> *parser);
> #if defined(CONFIG_DRM_AMD_DAL)
>   int amdgpu_dm_display_resume(struct amdgpu_device *adev );
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 20a1962..e069978 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -640,8 +640,12 @@ static int amdgpu_cs_parser_bos(struct
> amdgpu_cs_parser *p,
>   }
>   }
>   -if (p->uf_entry.robj)
> -p->job->uf_addr += amdgpu_bo_gpu_offset(p->uf_entry.robj);
> +if (!r && p->uf_entry.robj) {
> +struct amdgpu_bo *uf = p->uf_entry.robj;
> +
> +r = amdgpu_ttm_bind(uf->tbo.ttm, &uf->tbo.mem);
> +p->job->uf_addr += amdgpu_bo_gpu_offset(uf);
> +}
> error_validate:
>   if (r) {
> @@ -1156,3 +1160,29 @@ amdgpu_cs_find_mapping(struct
> amdgpu_cs_parser *parser,
> return NULL;
>   }
> +
> +/**
> + * amdgpu_cs_sysvm_access_required - make BOs accessible by the
> system VM
> + *
> + * @parser: command submission parser context
> + *
> + * Helper for UVD/VCE VM emulation, make sure BOs are accessible
> by the system VM.
> + */
> +int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser)
> +{
> +unsigned i;
> +int r;
> +
> +if (!parser->bo_list)
> +return 0;
> +
> +for (i = 0; i < parser->bo_list->num_entries; i++) {
> +struct amdgpu_bo *bo = parser->bo_list->array[i].robj;
> +
> +r = 

Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-07 Thread zhoucm1



On 2016-09-06 18:48, Christian König wrote:

On 06.09.2016 at 11:53, zhoucm1 wrote:



On 2016-09-06 17:41, Christian König wrote:

From: Christian König 

We don't really need the GTT table any more most of the time.
Why? I thought a GTT BO always needs to be bound when the GPU is trying
to access it, doesn't it?


We only need it to be bound when we try to access it from the system 
VM (UVD/VCE, ring buffers, fences etc...).


The VMs for each process can work without it and it is just overhead 
to bind it all the time.
Yes, I see what you mean; copying PTEs from GART to VM confused me, I thought
binding the GTT table was a must, but that's wrong.
Going through the vm code again, the PTEs can be created by amdgpu_vm_map_gart
if there is no GTT PTE source, when we don't bind it at all under the
per-process VM - am I right now?
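
(For reference, my rough understanding of what amdgpu_vm_map_gart does - a
sketch from memory, not a verbatim copy of the code, assuming its current
(pages_addr, addr) arguments:

	/* translate a GTT offset into the DMA address of the backing
	 * system page, so a VM PTE can be built without a bound GTT */
	result = pages_addr[addr >> PAGE_SHIFT];
	result |= addr & (~PAGE_MASK);

so the per-process VM update can fill PTEs straight from pages_addr when
the GTT was never bound.)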


Regards,
David Zhou



Regards,
Christian.




Regards,
David Zhou

  So bind it
only on demand.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 34 
--

  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 33 
++---

  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|  3 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c|  4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c|  6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  3 ++-
  8 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index 9d39fa8..28d4a67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -2521,6 +2521,7 @@ static inline void amdgpu_acpi_fini(struct 
amdgpu_device *adev) { }

  struct amdgpu_bo_va_mapping *
  amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
 uint64_t addr, struct amdgpu_bo **bo);
+int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser);
#if defined(CONFIG_DRM_AMD_DAL)
  int amdgpu_dm_display_resume(struct amdgpu_device *adev );
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c

index 20a1962..e069978 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -640,8 +640,12 @@ static int amdgpu_cs_parser_bos(struct 
amdgpu_cs_parser *p,

  }
  }
  -if (p->uf_entry.robj)
-p->job->uf_addr += amdgpu_bo_gpu_offset(p->uf_entry.robj);
+if (!r && p->uf_entry.robj) {
+struct amdgpu_bo *uf = p->uf_entry.robj;
+
+r = amdgpu_ttm_bind(uf->tbo.ttm, &uf->tbo.mem);
+p->job->uf_addr += amdgpu_bo_gpu_offset(uf);
+}
error_validate:
  if (r) {
@@ -1156,3 +1160,29 @@ amdgpu_cs_find_mapping(struct 
amdgpu_cs_parser *parser,

return NULL;
  }
+
+/**
+ * amdgpu_cs_sysvm_access_required - make BOs accessible by the 
system VM

+ *
+ * @parser: command submission parser context
+ *
+ * Helper for UVD/VCE VM emulation, make sure BOs are accessible by 
the system VM.

+ */
+int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser)
+{
+unsigned i;
+int r;
+
+if (!parser->bo_list)
+return 0;
+
+for (i = 0; i < parser->bo_list->num_entries; i++) {
+struct amdgpu_bo *bo = parser->bo_list->array[i].robj;
+
+r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
+if (unlikely(r))
+return r;
+}
+
+return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index b17734e..dc73f11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -646,6 +646,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo 
*bo, u32 domain,

  dev_err(bo->adev->dev, "%p pin failed\n", bo);
  goto error;
  }
+r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
+if (unlikely(r)) {
+dev_err(bo->adev->dev, "%p bind failed\n", bo);
+goto error;
+}
bo->pin_count = 1;
  if (gpu_addr != NULL)
@@ -918,6 +923,8 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, 
struct fence *fence,

  u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
  {
  WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
+WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_TT &&
+ !amdgpu_ttm_is_bound(bo->tbo.ttm));
WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
   !bo->pin_count);
  diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 4337b3a..c294aa9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -256,8 +256,12 @@ static int amdgpu_move_blit(struct 
ttm_buffer_object *bo,

  new_start = (u64)new_mem->start << PAGE_SHIFT;
switch (old_mem->mem_type) {
-case TTM_PL_VRAM:
  case TTM_PL_TT:
+r = 

Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-06 Thread Christian König

On 06.09.2016 at 11:53, zhoucm1 wrote:



On 2016-09-06 17:41, Christian König wrote:

From: Christian König 

We don't really need the GTT table any more most of the time.
Why? I thought a GTT BO always needs to be bound when the GPU is trying
to access it, doesn't it?


We only need it to be bound when we try to access it from the system VM 
(UVD/VCE, ring buffers, fences etc...).


The VMs for each process can work without it and it is just overhead to 
bind it all the time.
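
(In other words, the pattern in the patch below is, roughly: any path that
needs a system-VM/GART address does the bind right before taking the
offset - a condensed sketch of what the changed callers do, not extra code:

	r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
	if (unlikely(r))
		return r;

	gpu_addr = amdgpu_bo_gpu_offset(bo);	/* now WARNs if a GTT BO is still unbound */

where gpu_addr is just a local placeholder name.)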


Regards,
Christian.




Regards,
David Zhou

  So bind it
only on demand.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 34 
--

  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 33 
++---

  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|  3 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c|  4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c|  6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  3 ++-
  8 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index 9d39fa8..28d4a67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -2521,6 +2521,7 @@ static inline void amdgpu_acpi_fini(struct 
amdgpu_device *adev) { }

  struct amdgpu_bo_va_mapping *
  amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
 uint64_t addr, struct amdgpu_bo **bo);
+int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser);
#if defined(CONFIG_DRM_AMD_DAL)
  int amdgpu_dm_display_resume(struct amdgpu_device *adev );
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c

index 20a1962..e069978 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -640,8 +640,12 @@ static int amdgpu_cs_parser_bos(struct 
amdgpu_cs_parser *p,

  }
  }
  -if (p->uf_entry.robj)
-p->job->uf_addr += amdgpu_bo_gpu_offset(p->uf_entry.robj);
+if (!r && p->uf_entry.robj) {
+struct amdgpu_bo *uf = p->uf_entry.robj;
+
+r = amdgpu_ttm_bind(uf->tbo.ttm, &uf->tbo.mem);
+p->job->uf_addr += amdgpu_bo_gpu_offset(uf);
+}
error_validate:
  if (r) {
@@ -1156,3 +1160,29 @@ amdgpu_cs_find_mapping(struct amdgpu_cs_parser 
*parser,

return NULL;
  }
+
+/**
+ * amdgpu_cs_sysvm_access_required - make BOs accessible by the 
system VM

+ *
+ * @parser: command submission parser context
+ *
+ * Helper for UVD/VCE VM emulation, make sure BOs are accessible by 
the system VM.

+ */
+int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser)
+{
+unsigned i;
+int r;
+
+if (!parser->bo_list)
+return 0;
+
+for (i = 0; i < parser->bo_list->num_entries; i++) {
+struct amdgpu_bo *bo = parser->bo_list->array[i].robj;
+
+r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
+if (unlikely(r))
+return r;
+}
+
+return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index b17734e..dc73f11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -646,6 +646,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo 
*bo, u32 domain,

  dev_err(bo->adev->dev, "%p pin failed\n", bo);
  goto error;
  }
+r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
+if (unlikely(r)) {
+dev_err(bo->adev->dev, "%p bind failed\n", bo);
+goto error;
+}
bo->pin_count = 1;
  if (gpu_addr != NULL)
@@ -918,6 +923,8 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct 
fence *fence,

  u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
  {
  WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
+WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_TT &&
+ !amdgpu_ttm_is_bound(bo->tbo.ttm));
WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
   !bo->pin_count);
  diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 4337b3a..c294aa9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -256,8 +256,12 @@ static int amdgpu_move_blit(struct 
ttm_buffer_object *bo,

  new_start = (u64)new_mem->start << PAGE_SHIFT;
switch (old_mem->mem_type) {
-case TTM_PL_VRAM:
  case TTM_PL_TT:
+r = amdgpu_ttm_bind(bo->ttm, old_mem);
+if (r)
+return r;
+
+case TTM_PL_VRAM:
  old_start += bo->bdev->man[old_mem->mem_type].gpu_offset;
  break;
  default:
@@ -265,8 +269,12 @@ static int amdgpu_move_blit(struct 
ttm_buffer_object *bo,

  return -EINVAL;
  }
  switch 

Re: [PATCH 2/2] drm/amdgpu: bind GTT on demand

2016-09-06 Thread zhoucm1



On 2016-09-06 17:41, Christian König wrote:

From: Christian König 

We don't really need the GTT table any more most of the time.
Why? I thought a GTT BO always needs to be bound when the GPU is trying to
access it, doesn't it?


Regards,
David Zhou

  So bind it
only on demand.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 34 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 33 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|  3 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c|  4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c|  6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  3 ++-
  8 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 9d39fa8..28d4a67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -2521,6 +2521,7 @@ static inline void amdgpu_acpi_fini(struct amdgpu_device 
*adev) { }
  struct amdgpu_bo_va_mapping *
  amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
   uint64_t addr, struct amdgpu_bo **bo);
+int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser);
  
  #if defined(CONFIG_DRM_AMD_DAL)

  int amdgpu_dm_display_resume(struct amdgpu_device *adev );
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 20a1962..e069978 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -640,8 +640,12 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}
}
  
-	if (p->uf_entry.robj)

-   p->job->uf_addr += amdgpu_bo_gpu_offset(p->uf_entry.robj);
+   if (!r && p->uf_entry.robj) {
+   struct amdgpu_bo *uf = p->uf_entry.robj;
+
+   r = amdgpu_ttm_bind(uf->tbo.ttm, &uf->tbo.mem);
+   p->job->uf_addr += amdgpu_bo_gpu_offset(uf);
+   }
  
  error_validate:

if (r) {
@@ -1156,3 +1160,29 @@ amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
  
  	return NULL;

  }
+
+/**
+ * amdgpu_cs_sysvm_access_required - make BOs accessible by the system VM
+ *
+ * @parser: command submission parser context
+ *
+ * Helper for UVD/VCE VM emulation, make sure BOs are accessible by the system 
VM.
+ */
+int amdgpu_cs_sysvm_access_required(struct amdgpu_cs_parser *parser)
+{
+   unsigned i;
+   int r;
+
+   if (!parser->bo_list)
+   return 0;
+
+   for (i = 0; i < parser->bo_list->num_entries; i++) {
+   struct amdgpu_bo *bo = parser->bo_list->array[i].robj;
+
+   r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
+   if (unlikely(r))
+   return r;
+   }
+
+   return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index b17734e..dc73f11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -646,6 +646,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 
domain,
dev_err(bo->adev->dev, "%p pin failed\n", bo);
goto error;
}
+   r = amdgpu_ttm_bind(bo->tbo.ttm, &bo->tbo.mem);
+   if (unlikely(r)) {
+   dev_err(bo->adev->dev, "%p bind failed\n", bo);
+   goto error;
+   }
  
  	bo->pin_count = 1;

if (gpu_addr != NULL)
@@ -918,6 +923,8 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct fence 
*fence,
  u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
  {
WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
+   WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_TT &&
+!amdgpu_ttm_is_bound(bo->tbo.ttm));
	WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
 !bo->pin_count);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 4337b3a..c294aa9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -256,8 +256,12 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
new_start = (u64)new_mem->start << PAGE_SHIFT;
  
  	switch (old_mem->mem_type) {

-   case TTM_PL_VRAM:
case TTM_PL_TT:
+   r = amdgpu_ttm_bind(bo->ttm, old_mem);
+   if (r)
+   return r;
+
+   case TTM_PL_VRAM:
old_start += bo->bdev->man[old_mem->mem_type].gpu_offset;
break;
default:
@@ -265,8 +269,12 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
return -EINVAL;
}
switch (new_mem->mem_type) {
-   case TTM_PL_VRAM:
case TTM_PL_TT:
+