RE: [PATCH 06/18] drm/amdgpu/sriov:fix memory leak after gpu reset

2017-09-19 Thread Liu, Monk
Oh, I see your point, but that is really a cleanup, while mine adds a condition 
to fix a memory leak. I think they serve different purposes and should be 
separated.

I can add one more patch to clean it up with bo_create_kernel, to make the code 
tighter and cleaner.

BR Monk

-Original Message-
From: Koenig, Christian 
Sent: 2017年9月18日 19:35
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 06/18] drm/amdgpu/sriov:fix memory leak after gpu reset

Am 18.09.2017 um 12:47 schrieb Liu, Monk:
> I didn't get your point... how could bo_create_kernel solve my issue?

It doesn't solve the underlying issue, you just need less code for your 
workaround.

With bo_create_kernel you can do create/pin/kmap in just one function call.

>
> The thing here is that during gpu reset we invoke hw_init for every hw 
> component, and by design hw_init shouldn't do anything software 
> related, thus allocating the BO in hw_init is wrong,

Yeah, but your patch doesn't fix that either as far as I can see.

> Even switch to bo_create_kernel won't address the issue ...

See the implementation of bo_create_kernel():
> if (!*bo_ptr) {
> r = amdgpu_bo_create(adev, size, align, true, domain,

> }

> r = amdgpu_bo_pin(*bo_ptr, domain, gpu_addr);
...
> if (cpu_addr) {
> r = amdgpu_bo_kmap(*bo_ptr, cpu_addr);
...
> }

Creating is actually optional, but the function always pins the BO once more 
and figures out its CPU address.

As far as I can see that should solve your problem for now.

Christian.


>
>
> BR Monk
>
> -Original Message-
> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
> Sent: 2017年9月18日 17:13
> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 06/18] drm/amdgpu/sriov:fix memory leak after gpu 
> reset
>
> Am 18.09.2017 um 08:11 schrieb Monk Liu:
>> Doing a gpu reset reruns all hw_init callbacks, and thus ucode_init_bo is 
>> invoked again, so we need to skip the fw_buf allocation during SRIOV 
>> gpu reset to avoid a memory leak.
>>
>> Change-Id: I31131eda1bd45ea2f5bdc50c5da5fc5a9fe9027d
>> Signed-off-by: Monk Liu 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  3 ++
>>drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 64 
>> +++
>>2 files changed, 35 insertions(+), 32 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 6ff2959..3d0c633 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -1185,6 +1185,9 @@ struct amdgpu_firmware {
>>
>>  /* gpu info firmware data pointer */
>>  const struct firmware *gpu_info_fw;
>> +
>> +void *fw_buf_ptr;
>> +uint64_t fw_buf_mc;
>>};
>>
>>/*
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
>> index f306374..6564902 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
>> @@ -360,8 +360,6 @@ static int amdgpu_ucode_patch_jt(struct 
>> amdgpu_firmware_info *ucode,
>>int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
>>{
>>  struct amdgpu_bo **bo = >firmware.fw_buf;
>> -uint64_t fw_mc_addr;
>> -void *fw_buf_ptr = NULL;
>>  uint64_t fw_offset = 0;
>>  int i, err;
>>  struct amdgpu_firmware_info *ucode = NULL; @@ -372,37 +370,39 @@ 
>> int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
>>  return 0;
>>  }
>>
>> -err = amdgpu_bo_create(adev, adev->firmware.fw_size, PAGE_SIZE, true,
>> -amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM 
>> : AMDGPU_GEM_DOMAIN_GTT,
>> -AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
>> -NULL, NULL, 0, bo);
>> -if (err) {
>> -dev_err(adev->dev, "(%d) Firmware buffer allocate failed\n", err);
>> -goto failed;
>> -}
>> +if (!amdgpu_sriov_vf(adev) || !adev->in_sriov_reset) {
> Instead of all this better use amdgpu_bo_create_kernel(), this should already 
> include most of the handling necessary here.
>
> Christian.
>
>> +err = amdgpu_bo_create(adev, adev->firmware.fw_size, PAGE_SIZE, 
>> true,
>> +amdgpu_sriov_vf(adev) ? 
>> AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
>> +AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
>> +NULL, NULL, 0, bo);
>> +if (err) {
>> +dev_err(adev->dev, "(%d) Firmware buffer allocate failed\n", err);
>> +goto failed;
>> +}
>>
>> -err = amdgpu_bo_reserve(*bo, false);
>> -if (err) {
>> -dev_err(adev->dev, "(%d) Firmware buffer reserve failed\n", err);
>> -goto 

RE: [PATCH 04/18] drm/amdgpu/sriov:don't load psp fw during gpu reset

2017-09-19 Thread Liu, Monk
This condition is wrong, you are right, and I followed up with a patch to 
correct it. It should be:

if (!amdgpu_sriov_vf(adev) || !adev->in_sriov_reset)


BR Monk

-Original Message-
From: Quan, Evan 
Sent: 2017年9月20日 9:32
To: Koenig, Christian ; Liu, Monk ; 
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 04/18] drm/amdgpu/sriov:don't load psp fw during gpu reset

Hi Monk,

I think your change affects the bare metal case. Per my confirmation, Vega10 
cannot boot with the change applied.
If the change is only intended to cover the gpu reset case of SRIOV, maybe the 
logic should be: if (!(amdgpu_sriov_vf(adev) && adev->in_sriov_reset))

Regards,
Evan
>-Original Message-
>From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf 
>Of Christian König
>Sent: Monday, September 18, 2017 5:06 PM
>To: Liu, Monk ; amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH 04/18] drm/amdgpu/sriov:don't load psp fw during 
>gpu reset
>
>Am 18.09.2017 um 08:11 schrieb Monk Liu:
>> At least for SRIOV we found that reloading PSP fw during gpu reset causes a 
>> PSP hang.
>>
>> Change-Id: I5f273187a10bb8571b77651dfba7656ce0429af0
>> Signed-off-by: Monk Liu 
>
>Acked-by: Christian König 
>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 15 +--
>>   1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> index 8a1ee97..4eee2ef 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> @@ -253,15 +253,18 @@ static int psp_asd_load(struct psp_context 
>> *psp)
>>
>>   static int psp_hw_start(struct psp_context *psp)
>>   {
>> +struct amdgpu_device *adev = psp->adev;
>>  int ret;
>>
>> -ret = psp_bootloader_load_sysdrv(psp);
>> -if (ret)
>> -return ret;
>> +if (amdgpu_sriov_vf(adev) && !adev->in_sriov_reset) {
>> +ret = psp_bootloader_load_sysdrv(psp);
>> +if (ret)
>> +return ret;
>>
>> -ret = psp_bootloader_load_sos(psp);
>> -if (ret)
>> -return ret;
>> +ret = psp_bootloader_load_sos(psp);
>> +if (ret)
>> +return ret;
>> +}
>>
>>  ret = psp_ring_create(psp, PSP_RING_TYPE__KM);
>>  if (ret)
>
>
>___
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 04/18] drm/amdgpu/sriov:don't load psp fw during gpu reset

2017-09-19 Thread Quan, Evan
Hi Monk,

I think your change affects the bare metal case. Per my confirmation, Vega10 
cannot boot with the change applied.
If the change is only intended to cover the gpu reset case of SRIOV, maybe the 
logic should be:
if (!(amdgpu_sriov_vf(adev) && adev->in_sriov_reset))

Regards,
Evan
>-Original Message-
>From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
>Christian König
>Sent: Monday, September 18, 2017 5:06 PM
>To: Liu, Monk ; amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH 04/18] drm/amdgpu/sriov:don't load psp fw during gpu reset
>
>Am 18.09.2017 um 08:11 schrieb Monk Liu:
>> At least for SRIOV we found that reloading PSP fw during
>> gpu reset causes a PSP hang.
>>
>> Change-Id: I5f273187a10bb8571b77651dfba7656ce0429af0
>> Signed-off-by: Monk Liu 
>
>Acked-by: Christian König 
>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 15 +--
>>   1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> index 8a1ee97..4eee2ef 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
>> @@ -253,15 +253,18 @@ static int psp_asd_load(struct psp_context *psp)
>>
>>   static int psp_hw_start(struct psp_context *psp)
>>   {
>> +struct amdgpu_device *adev = psp->adev;
>>  int ret;
>>
>> -ret = psp_bootloader_load_sysdrv(psp);
>> -if (ret)
>> -return ret;
>> +if (amdgpu_sriov_vf(adev) && !adev->in_sriov_reset) {
>> +ret = psp_bootloader_load_sysdrv(psp);
>> +if (ret)
>> +return ret;
>>
>> -ret = psp_bootloader_load_sos(psp);
>> -if (ret)
>> -return ret;
>> +ret = psp_bootloader_load_sos(psp);
>> +if (ret)
>> +return ret;
>> +}
>>
>>  ret = psp_ring_create(psp, PSP_RING_TYPE__KM);
>>  if (ret)
>
>


Re: [PATCH v2 1/1] drm/amdgpu: Add gem_prime_mmap support

2017-09-19 Thread Samuel Li
>> +vma->vm_pgoff = amdgpu_bo_mmap_offset(bo) >> PAGE_SHIFT;
> Maybe better use "vma->vm_pgoff += amdgpu_bo_mmap_offset(bo) >> PAGE_SHIFT;", 
> but I'm not sure.
> How other drivers handle this?
This is a good catch. Looks like the caller's pgoff should be honored during 
prime mmap rather than overwritten with a fake offset here.

>> +dma_buf->ops = _dmabuf_ops;
> This isn't race free and needs to be fixed.
> Better add callbacks to drm_prime.c similar to drm_gem_dmabuf_mmap().
What do you mean "This isn't race free"?

Regards,
Sam



On 2017-09-15 11:05 AM, Christian König wrote:
> Am 14.09.2017 um 00:39 schrieb Samuel Li:
>> v2: drop hdp invalidate/flush.
>>
>> Signed-off-by: Samuel Li 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  3 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  3 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 77 
>> ++-
>>   3 files changed, 81 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index d2aaad7..188b705 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -395,11 +395,14 @@ amdgpu_gem_prime_import_sg_table(struct drm_device 
>> *dev,
>>   struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
>>   struct drm_gem_object *gobj,
>>   int flags);
>> +struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
>> +struct dma_buf *dma_buf);
>>   int amdgpu_gem_prime_pin(struct drm_gem_object *obj);
>>   void amdgpu_gem_prime_unpin(struct drm_gem_object *obj);
>>   struct reservation_object *amdgpu_gem_prime_res_obj(struct drm_gem_object 
>> *);
>>   void *amdgpu_gem_prime_vmap(struct drm_gem_object *obj);
>>   void amdgpu_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
>> +int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct 
>> *vma);
>>   int amdgpu_gem_debugfs_init(struct amdgpu_device *adev);
>> /* sub-allocation manager, it has to be protected by another lock.
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 2cdf844..9b63ac5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -835,7 +835,7 @@ static struct drm_driver kms_driver = {
>>   .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
>>   .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
>>   .gem_prime_export = amdgpu_gem_prime_export,
>> -.gem_prime_import = drm_gem_prime_import,
>> +.gem_prime_import = amdgpu_gem_prime_import,
>>   .gem_prime_pin = amdgpu_gem_prime_pin,
>>   .gem_prime_unpin = amdgpu_gem_prime_unpin,
>>   .gem_prime_res_obj = amdgpu_gem_prime_res_obj,
>> @@ -843,6 +843,7 @@ static struct drm_driver kms_driver = {
>>   .gem_prime_import_sg_table = amdgpu_gem_prime_import_sg_table,
>>   .gem_prime_vmap = amdgpu_gem_prime_vmap,
>>   .gem_prime_vunmap = amdgpu_gem_prime_vunmap,
>> +.gem_prime_mmap = amdgpu_gem_prime_mmap,
>> .name = DRIVER_NAME,
>>   .desc = DRIVER_DESC,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
>> index 5b3f928..13c977a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
>> @@ -57,6 +57,40 @@ void amdgpu_gem_prime_vunmap(struct drm_gem_object *obj, 
>> void *vaddr)
>>   ttm_bo_kunmap(>dma_buf_vmap);
>>   }
>>   +int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct 
>> vm_area_struct *vma)
>> +{
>> +struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
>> +struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>> +unsigned asize = amdgpu_bo_size(bo);
>> +int ret;
>> +
>> +if (!vma->vm_file)
>> +return -ENODEV;
>> +
>> +if (adev == NULL)
>> +return -ENODEV;
>> +
>> +/* Check for valid size. */
>> +if (asize < vma->vm_end - vma->vm_start)
>> +return -EINVAL;
>> +
>> +if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm) ||
>> +(bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)) {
>> +return -EPERM;
>> +}
>> +vma->vm_pgoff = amdgpu_bo_mmap_offset(bo) >> PAGE_SHIFT;
> 
> Maybe better use "vma->vm_pgoff += amdgpu_bo_mmap_offset(bo) >> PAGE_SHIFT;", 
> but I'm not sure.
> 
> How other drivers handle this?
> 
>> +
>> +/* prime mmap does not need to check access, so allow here */
>> +ret = drm_vma_node_allow(>vma_node, vma->vm_file->private_data);
>> +if (ret)
>> +return ret;
>> +
>> +ret = ttm_bo_mmap(vma->vm_file, vma, >mman.bdev);
>> +drm_vma_node_revoke(>vma_node, vma->vm_file->private_data);
>> +
>> +return ret;
>> +}
>> +
>>   struct drm_gem_object *
>>   amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
>>struct dma_buf_attachment *attach,
>> @@ -130,14 +164,55 @@ struct 

Re: [PATCH] drm/amd/amdgpu: Fix iova debugfs for non-iommu case

2017-09-19 Thread Tom St Denis

On 19/09/17 02:33 PM, Christian König wrote:
[root@carrizo ~]# xxd -l 1024 -s 0xC 
/sys/kernel/debug/dri/0/amdgpu_iova


Actually 0xC is a special address, e.g. video BIOS if I'm not 
completely mistaken.


Not sure why that would be mapped by the driver, but I can also read the 
kernel's .bss section with this...


$ xxd -l 1048576 -s 0x01e4c000  /sys/kernel/debug/dri/0/amdgpu_iova

..
01e6a430: 4c69 6e75 7820 7665 7273 696f 6e20 342e  Linux version 4.
01e6a440: 3133 2e30 2d72 6335 2b20 2872 6f6f 7440  13.0-rc5+ (root@
01e6a450: 6361 7272 697a 6f29 2028 6763 6320 7665  carrizo) (gcc ve
01e6a460: 7273 696f 6e20 362e 332e 3120 3230 3136  rsion 6.3.1 2016
01e6a470: 3132 3231 2028 5265 6420 4861 7420 362e  1221 (Red Hat 6.
01e6a480: 332e 312d 3129 2028 4743 4329 2920 2333  3.1-1) (GCC)) #3
01e6a490: 3120 534d 5020 5475 6520 5365 7020 3139  1 SMP Tue Sep 19
01e6a4a0: 2030 373a 3138 3a33 3120 4544 5420 3230   07:18:31 EDT 20


That's part of the dmesg buffer apparently.

I pointed it at all sorts of addresses (BIOS/system RAM/etc.) and it will 
pretty much read anything.


Tom


[PATCH 2/2] drm/amd: Avoid using x86-specific _BITOPS_LONG_SHIFT

2017-09-19 Thread Felix Kuehling
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/include/linux/chash.h | 22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/linux/chash.h 
b/drivers/gpu/drm/amd/include/linux/chash.h
index c89b92b..6dc1599 100644
--- a/drivers/gpu/drm/amd/include/linux/chash.h
+++ b/drivers/gpu/drm/amd/include/linux/chash.h
@@ -27,7 +27,15 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+
+#if BITS_PER_LONG == 32
+# define _CHASH_LONG_SHIFT 5
+#elif BITS_PER_LONG == 64
+# define _CHASH_LONG_SHIFT 6
+#else
+# error "Unexpected BITS_PER_LONG"
+#endif
 
 struct __chash_table {
u8 bits;
@@ -283,31 +291,31 @@ struct chash_iter {
 static inline bool chash_iter_is_valid(const struct chash_iter iter)
 {
BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
-   return !!(iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &
+   return !!(iter.table->valid_bitmap[iter.slot >> _CHASH_LONG_SHIFT] &
  iter.mask);
 }
 static inline bool chash_iter_is_empty(const struct chash_iter iter)
 {
BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
-   return !(iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &
+   return !(iter.table->occup_bitmap[iter.slot >> _CHASH_LONG_SHIFT] &
 iter.mask);
 }
 
 static inline void chash_iter_set_valid(const struct chash_iter iter)
 {
BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
-   iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] |= iter.mask;
-   iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] |= iter.mask;
+   iter.table->valid_bitmap[iter.slot >> _CHASH_LONG_SHIFT] |= iter.mask;
+   iter.table->occup_bitmap[iter.slot >> _CHASH_LONG_SHIFT] |= iter.mask;
 }
 static inline void chash_iter_set_invalid(const struct chash_iter iter)
 {
BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
-   iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &= ~iter.mask;
+   iter.table->valid_bitmap[iter.slot >> _CHASH_LONG_SHIFT] &= ~iter.mask;
 }
 static inline void chash_iter_set_empty(const struct chash_iter iter)
 {
BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
-   iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &= ~iter.mask;
+   iter.table->occup_bitmap[iter.slot >> _CHASH_LONG_SHIFT] &= ~iter.mask;
 }
 
 static inline u32 chash_iter_key32(const struct chash_iter iter)
-- 
2.7.4



[PATCH 1/2] drm/amd: Use portable do_div helper

2017-09-19 Thread Felix Kuehling
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/lib/chash.c | 30 +++---
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/lib/chash.c b/drivers/gpu/drm/amd/lib/chash.c
index 1bc4287..e07e6f3 100644
--- a/drivers/gpu/drm/amd/lib/chash.c
+++ b/drivers/gpu/drm/amd/lib/chash.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /**
@@ -70,9 +71,21 @@ EXPORT_SYMBOL(chash_table_free);
 #ifdef CONFIG_CHASH_STATS
 
 #define DIV_FRAC(nom, denom, quot, frac, frac_digits) do { \
-   (quot) = (nom) / (denom);   \
-   (frac) = ((nom) % (denom) * (frac_digits) + \
- (denom) / 2) / (denom);   \
+   u64 __nom = (nom);  \
+   u64 __denom = (denom);  \
+   u64 __quot, __frac; \
+   u32 __rem;  \
+   \
+   while (__denom >> 32) { \
+   __nom   >>= 1;  \
+   __denom >>= 1;  \
+   }   \
+   __quot = __nom; \
+   __rem  = do_div(__quot, __denom);   \
+   __frac = __rem * (frac_digits) + (__denom >> 1);\
+   do_div(__frac, __denom);\
+   (quot) = __quot;\
+   (frac) = __frac;\
} while (0)
 
 void __chash_table_dump_stats(struct __chash_table *table)
@@ -562,7 +575,7 @@ module_param_named(test_iters, chash_test_iters, ulong, 0444);
 static int __init chash_init(void)
 {
int ret;
-   u64 ts1_ns, ts_delta_us;
+   u64 ts1_ns;
 
/* Skip self test on user errors */
if (chash_test_bits < 4 || chash_test_bits > 20) {
@@ -603,10 +616,13 @@ static int __init chash_init(void)
  chash_test_minfill, chash_test_maxfill,
  chash_test_iters);
if (!ret) {
-   ts_delta_us = (local_clock() - ts1_ns) / 1000;
+   u64 ts_delta_us = local_clock() - ts1_ns;
+   u64 iters_per_second = (u64)chash_test_iters * 100;
+
+   do_div(ts_delta_us, 1000);
+   do_div(iters_per_second, ts_delta_us);
pr_info("chash: self test took %llu us, %llu iterations/s\n",
-   ts_delta_us,
-   (u64)chash_test_iters * 100 / ts_delta_us);
+   ts_delta_us, iters_per_second);
} else {
pr_err("chash: self test failed: %d\n", ret);
}
-- 
2.7.4



Re: [PATCH] drm/amdgpu: clarify license in amdgpu_trace_points.c

2017-09-19 Thread Dave Airlie
Fine from my end.

Acked-by: Dave Airlie 
Dave.

On 20 September 2017 at 00:27, Christian König
 wrote:
> Am 19.09.2017 um 16:22 schrieb Alex Deucher:
>>
>> It was not clear.  The rest of the driver is MIT/X11.
>>
>> Signed-off-by: Alex Deucher 
>
>
> Reviewed-by: Christian König 
>
>
>> ---
>>
>> Dave, any objections?
>>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c | 19 +++
>>   1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
>> index 385b7e1..9ec96b9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
>> @@ -1,4 +1,23 @@
>>   /* Copyright Red Hat Inc 2010.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be
>> included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES
>> OR
>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>> + * OTHER DEALINGS IN THE SOFTWARE.
>> + *
>>* Author : Dave Airlie 
>>*/
>>   #include 
>
>
>


[PATCH 2/4] drm/amd/powerplay: add register thermal interrupt in hwmgr_hw_init

2017-09-19 Thread Eric Huang
Signed-off-by: Eric Huang 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c | 75 -
 drivers/gpu/drm/amd/powerplay/inc/hwmgr.h   |  6 +++
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
index d1960c1..ae60f74 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
@@ -26,8 +26,8 @@
 #include 
 #include 
 #include 
+#include 
 #include 
-#include "cgs_common.h"
 #include "power_state.h"
 #include "hwmgr.h"
 #include "pppcielanes.h"
@@ -50,6 +50,75 @@ uint8_t convert_to_vid(uint16_t vddc)
return (uint8_t) ((6200 - (vddc * VOLTAGE_SCALE)) / 25);
 }
 
+static int phm_get_pci_bus_devfn(struct pp_hwmgr *hwmgr,
+   struct cgs_system_info *sys_info)
+{
+   sys_info->size = sizeof(struct cgs_system_info);
+   sys_info->info_id = CGS_SYSTEM_INFO_PCIE_BUS_DEVFN;
+
+   return cgs_query_system_info(hwmgr->device, sys_info);
+}
+
+static int phm_thermal_l2h_irq(void *private_data,
+unsigned src_id, const uint32_t *iv_entry)
+{
+   struct pp_hwmgr *hwmgr = (struct pp_hwmgr *)private_data;
+   struct cgs_system_info sys_info = {0};
+   int result;
+
+   result = phm_get_pci_bus_devfn(hwmgr, _info);
+   if (result)
+   return -EINVAL;
+
+   pr_warn("GPU over temperature range detected on PCIe %lld:%lld.%lld!\n",
+   PCI_BUS_NUM(sys_info.value),
+   PCI_SLOT(sys_info.value),
+   PCI_FUNC(sys_info.value));
+   return 0;
+}
+
+static int phm_thermal_h2l_irq(void *private_data,
+unsigned src_id, const uint32_t *iv_entry)
+{
+   struct pp_hwmgr *hwmgr = (struct pp_hwmgr *)private_data;
+   struct cgs_system_info sys_info = {0};
+   int result;
+
+   result = phm_get_pci_bus_devfn(hwmgr, _info);
+   if (result)
+   return -EINVAL;
+
+   pr_warn("GPU under temperature range detected on PCIe %lld:%lld.%lld!\n",
+   PCI_BUS_NUM(sys_info.value),
+   PCI_SLOT(sys_info.value),
+   PCI_FUNC(sys_info.value));
+   return 0;
+}
+
+static int phm_ctf_irq(void *private_data,
+unsigned src_id, const uint32_t *iv_entry)
+{
+   struct pp_hwmgr *hwmgr = (struct pp_hwmgr *)private_data;
+   struct cgs_system_info sys_info = {0};
+   int result;
+
+   result = phm_get_pci_bus_devfn(hwmgr, _info);
+   if (result)
+   return -EINVAL;
+
+   pr_warn("GPU Critical Temperature Fault detected on PCIe %lld:%lld.%lld!\n",
+   PCI_BUS_NUM(sys_info.value),
+   PCI_SLOT(sys_info.value),
+   PCI_FUNC(sys_info.value));
+   return 0;
+}
+
+static const struct cgs_irq_src_funcs thermal_irq_src[3] = {
+   {NULL, phm_thermal_l2h_irq},
+   {NULL, phm_thermal_h2l_irq},
+   {NULL, phm_ctf_irq}
+};
+
 int hwmgr_early_init(struct pp_instance *handle)
 {
struct pp_hwmgr *hwmgr;
@@ -170,6 +239,10 @@ int hwmgr_hw_init(struct pp_instance *handle)
if (ret)
goto err2;
 
+   ret = phm_register_thermal_interrupt(hwmgr, _irq_src);
+   if (ret)
+   goto err2;
+
return 0;
 err2:
if (hwmgr->hwmgr_func->backend_fini)
diff --git a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h 
b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
index 3bbe7d5..7f6d1cb 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
@@ -32,6 +32,7 @@
 #include "ppatomctrl.h"
 #include "hwmgr_ppt.h"
 #include "power_state.h"
+#include "cgs_linux.h"
 
 struct pp_instance;
 struct pp_hwmgr;
@@ -796,6 +797,11 @@ struct pp_hwmgr {
bool en_umd_pstate;
 };
 
+struct cgs_irq_src_funcs {
+   cgs_irq_source_set_func_t set;
+   cgs_irq_handler_func_t handler;
+};
+
 extern int hwmgr_early_init(struct pp_instance *handle);
 extern int hwmgr_hw_init(struct pp_instance *handle);
 extern int hwmgr_hw_fini(struct pp_instance *handle);
-- 
2.7.4



[PATCH 0/4] some changes for thermal interrupts

2017-09-19 Thread Eric Huang
Eric Huang (4):
  drm/amdgpu: add cgs query info of pci bus devfn
  drm/amd/powerplay: add register thermal interrupt in hwmgr_hw_init
  drm/amd/powerplay: implement register thermal interrupt for Vega10
  drm/amd/powerplay: change alert temperature range

 drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c|  3 +
 drivers/gpu/drm/amd/include/cgs_common.h   |  1 +
 .../gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c  |  4 +-
 drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c| 75 +-
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 33 ++
 drivers/gpu/drm/amd/powerplay/inc/hwmgr.h  |  6 ++
 6 files changed, 119 insertions(+), 3 deletions(-)

-- 
2.7.4



[PATCH 4/4] drm/amd/powerplay: change alert temperature range

2017-09-19 Thread Eric Huang
Change to a more meaningful range that triggers thermal
interrupts.

Signed-off-by: Eric Huang 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
index ce378bd..15bb32a 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
@@ -27,8 +27,8 @@
 #include "power_state.h"
 
 
-#define TEMP_RANGE_MIN (90 * 1000)
-#define TEMP_RANGE_MAX (120 * 1000)
+#define TEMP_RANGE_MIN (0)
+#define TEMP_RANGE_MAX (80 * 1000)
 
 #define PHM_FUNC_CHECK(hw) \
do {\
-- 
2.7.4



[PATCH 2/2] drm/amdgpu: Add copy_pte_num_dw member in amdgpu_vm_pte_funcs

2017-09-19 Thread Yong Zhao
Use it to replace the hard-coded value in amdgpu_vm_bo_update_mapping().

Change-Id: I85d89d401b8dbcf01ca9c55c281e552db874fde5
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h| 4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c  | 2 ++
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 2 ++
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 2 ++
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 ++
 drivers/gpu/drm/amd/amdgpu/si_dma.c| 2 ++
 7 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 8708476..e7de600 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -296,10 +296,14 @@ struct amdgpu_buffer_funcs {
 
 /* provided by hw blocks that can write ptes, e.g., sdma */
 struct amdgpu_vm_pte_funcs {
+   /* number of dw to reserve per operation */
+   unsignedcopy_pte_num_dw;
+
/* copy pte entries from GART */
void (*copy_pte)(struct amdgpu_ib *ib,
 uint64_t pe, uint64_t src,
 unsigned count);
+
/* write pte one entry at a time with addr mapping */
void (*write_pte)(struct amdgpu_ib *ib, uint64_t pe,
  uint64_t value, unsigned count,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 28d16781..8fcc743 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1597,7 +1597,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
 
if (pages_addr) {
/* copy commands needed */
-   ndw += ncmds * 7;
+   ndw += ncmds * adev->vm_manager.vm_pte_funcs->copy_pte_num_dw;
 
/* and also PTEs */
ndw += nptes * 2;
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c64dcd1..60cecd1 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -1387,7 +1387,9 @@ static void cik_sdma_set_buffer_funcs(struct 
amdgpu_device *adev)
 }
 
 static const struct amdgpu_vm_pte_funcs cik_sdma_vm_pte_funcs = {
+   .copy_pte_num_dw = 7,
.copy_pte = cik_sdma_vm_copy_pte,
+
.write_pte = cik_sdma_vm_write_pte,
 
.set_max_nums_pte_pde = 0x1f >> 3,
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index c05eb74..acdee3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -1324,7 +1324,9 @@ static void sdma_v2_4_set_buffer_funcs(struct 
amdgpu_device *adev)
 }
 
 static const struct amdgpu_vm_pte_funcs sdma_v2_4_vm_pte_funcs = {
+   .copy_pte_num_dw = 7,
.copy_pte = sdma_v2_4_vm_copy_pte,
+
.write_pte = sdma_v2_4_vm_write_pte,
 
.set_max_nums_pte_pde = 0x1f >> 3,
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 2079340..72f31cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -1748,7 +1748,9 @@ static void sdma_v3_0_set_buffer_funcs(struct 
amdgpu_device *adev)
 }
 
 static const struct amdgpu_vm_pte_funcs sdma_v3_0_vm_pte_funcs = {
+   .copy_pte_num_dw = 7,
.copy_pte = sdma_v3_0_vm_copy_pte,
+
.write_pte = sdma_v3_0_vm_write_pte,
 
/* not 0x3f due to HW limitation */
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 798fc2d23..7bf25271 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1714,7 +1714,9 @@ static void sdma_v4_0_set_buffer_funcs(struct 
amdgpu_device *adev)
 }
 
 static const struct amdgpu_vm_pte_funcs sdma_v4_0_vm_pte_funcs = {
+   .copy_pte_num_dw = 7,
.copy_pte = sdma_v4_0_vm_copy_pte,
+
.write_pte = sdma_v4_0_vm_write_pte,
 
.set_max_nums_pte_pde = 0x40 >> 3,
diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c 
b/drivers/gpu/drm/amd/amdgpu/si_dma.c
index adb6ae7..3fa2fbf 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
@@ -887,7 +887,9 @@ static void si_dma_set_buffer_funcs(struct amdgpu_device 
*adev)
 }
 
 static const struct amdgpu_vm_pte_funcs si_dma_vm_pte_funcs = {
+   .copy_pte_num_dw = 5,
.copy_pte = si_dma_vm_copy_pte,
+
.write_pte = si_dma_vm_write_pte,
 
.set_max_nums_pte_pde = 0x8 >> 3,
-- 
2.7.4



[PATCH 1/4] drm/amdgpu: add cgs query info of pci bus devfn

2017-09-19 Thread Eric Huang
Signed-off-by: Eric Huang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c  | 3 +++
 drivers/gpu/drm/amd/include/cgs_common.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
index 53d1591..c316f8f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
@@ -796,6 +796,9 @@ static int amdgpu_cgs_query_system_info(struct cgs_device *cgs_device,
case CGS_SYSTEM_INFO_PCIE_SUB_SYS_VENDOR_ID:
sys_info->value = adev->pdev->subsystem_vendor;
break;
+   case CGS_SYSTEM_INFO_PCIE_BUS_DEVFN:
+   sys_info->value = adev->pdev->devfn;
+   break;
default:
return -ENODEV;
}
diff --git a/drivers/gpu/drm/amd/include/cgs_common.h b/drivers/gpu/drm/amd/include/cgs_common.h
index 2c1f13e..030b146 100644
--- a/drivers/gpu/drm/amd/include/cgs_common.h
+++ b/drivers/gpu/drm/amd/include/cgs_common.h
@@ -100,6 +100,7 @@ enum cgs_system_info_id {
CGS_SYSTEM_INFO_GFX_SE_INFO,
CGS_SYSTEM_INFO_PCIE_SUB_SYS_ID,
CGS_SYSTEM_INFO_PCIE_SUB_SYS_VENDOR_ID,
+   CGS_SYSTEM_INFO_PCIE_BUS_DEVFN,
CGS_SYSTEM_INFO_ID_MAXIMUM,
 };
 
-- 
2.7.4



Re: [PATCH] drm/amd/amdgpu: Fix iova debugfs for non-iommu case

2017-09-19 Thread Christian König
[root@carrizo ~]# xxd -l 1024 -s 0xC /sys/kernel/debug/dri/0/amdgpu_iova


Actually 0xC is a special address, e.g. video BIOS if I'm not 
completely mistaken.


Take a look at /proc/iomem and try reading something which is marked as 
"System RAM".


If that still works we are lost and need to get back to square one.

On the other hand give me a day to check if we can't figure out through 
reverse mapping if a page is TTM allocated or not.


Christian.


Am 19.09.2017 um 19:37 schrieb Tom St Denis:

As I predicted ...

[root@carrizo ~]# xxd -l 1024 -s 0xC /sys/kernel/debug/dri/0/amdgpu_iova

000c: 55aa 74e9 a902      U.t.
000c0010:     4c02   4942 L.IB
000c0020: 4daa 2e93      0004 M...
000c0030: 2037 3631 3239 3535 3230    761295520..
000c0040: a102    2802    (...
000c0050: 3037 2f32 372f 3136 2030 343a 3036   07/27/16 04:06..
000c0060: 3600  e9b4 0300 e9c3 0300  f400 6...
000c0070: 0013  00d0 0100 1a16 20e1 0280 7e00  .. ...~.
000c0080: a200 4402 1200   003c 400e 0207 ..D<@...
000c0090: 3c01 1a00 0400  eea0 ff06 0008 3040 <.0@
000c00a0: 0e01    1403    
000c00b0: be7e 1100 b907 1ad6 502c    .~..P,..
000c00c0:   1440 4143   1000  .@AC
000c00d0: 4200  407c 0600 2000 2000 1200 0e00  B...@|.. . .
000c00e0:         
000c00f0:   3131 332d 4339 3534 3031 3031 113-C9540101
000c0100: 2d30 3036 0046 474c 2045 4c4c 4553 4d45  -006.FGL ELLESME
000c0110: 5245 0050 4349 5f45 5850 5245 5353 0047 RE.PCI_EXPRESS.G
000c0120:  5235 000d 0a43 3935 3420 506f 6c61  DDR5...C954 Pola
000c0130: 7269 7331 3020 474c 2041 3120 4744 4452  ris10 GL A1 GDDR
000c0140: 3520 3235 364d 7833 3220 3847 4220 3330  5 256Mx32 8GB 30
000c0150: 3065 2f33 3030 6d20 2020 2020 2020 2020  0e/300m
000c0160: 2020 2020 2020 2020 2020 2020 2020 2020
000c0170: 2020 200d 0a00 0d0a 200d 0a00 2843 2920 . ...(C)
000c0180: 3139 3838 2d32 3031 302c 2041 6476 616e  1988-2010, Advan
000c0190: 6365 6420 4d69 6372 6f20 4465 7669 6365  ced Micro Device
000c01a0: 732c 2049 6e63 2e00 4154 4f4d 4249 4f53  s, Inc..ATOMBIOS
000c01b0: 424b 2d41 4d44 2056 4552 3031 352e 3035  BK-AMD VER015.05
000c01c0: 302e 3030 302e 3030 302e 3030 3730 3036 0.000.000.007006
000c01d0: 0043 3935 3430 3130 312e 3030 3600 3132 .C9540101.006.12
000c01e0: 3935 3838 3720 0033 3534 3731 3220 2000  95887 .354712 .
000c01f0: 2020 2020 2020 2020 0041 4d44 5f45 4c4c .AMD_ELL
000c0200: 4553 4d45 5245 5f43 3935 3430 315f 5854 ESMERE_C95401_XT
000c0210: 5f41 315f 4744 355f 3847 425c 636f 6e66 _A1_GD5_8GB\conf
000c0220: 6967 2e68  0090 2400 0101 4154 4f4d ig.h$...ATOM

I can dump arbitrary system memory with the iommu enabled in the bios 
via my carrizo "dev" in the debugfs entry.


So if the issue is arbitrary access is a no no (and I don't get why) 
then the entire patch is a NAK because clearly I can "abuse" it.


Tom

On 19/09/17 01:26 PM, Tom St Denis wrote:

On 19/09/17 01:23 PM, Christian König wrote:

Am 19.09.2017 um 19:20 schrieb Tom St Denis:

On 19/09/17 01:18 PM, Christian König wrote:

Am 19.09.2017 um 19:14 schrieb Tom St Denis:

On 19/09/17 01:10 PM, Christian König wrote:
As far as I know we don't need #ifdefs cause there are dummy 
functions when IOMMU is not compiled in.


But this patch is a NAK since it circumvents the /dev/mem 
restriction when IOMMU is disabled and that is not something we 
can easily allow.


I raised this objection 24 hours ago and was told to go ahead 
with the read/write methods anyways.


Short of making a list of mappings/allocations in the ttm layer I 
have no idea how we would track this in the non-iommu case which 
means the entire iova debugfs entry should have been NAK'ed in 
the first place.


I'm open to suggestions.


As long as IOMMU is enabled the entry is perfectly fine, cause 
only memory mapped to the IOMMU is accessible to the GPU and so 
accessible using the debugfs entry as well.


On devices where there's no translation (e.g. Carrizo) does the 
iommu layer track mappings?  I'm wondering if on those I could seek 
outside of boundaries and read system memory anyways.


Why do you think there isn't any translation on CZ? I mean the whole 
ATC thing uses the IOMMU (v2 in this case) on APUs.


On my Carrizo devel system (A12-9800) the GPU dma addresses are 
physical addresses.  I only first saw iommu with APUs enabled on Raven.


I'll see if I can read non-mapped pages via the iova file on my Carrizo.

Only when IOMMU is disabled we can't go this way cause we can't 
know which memory is mapped to the driver and which isn't (or 
could we somehow track this?).


I suggest to not create the file in the first place if IOMMU is 
disabled.



Re: [PATCH] drm/amd/amdgpu: Fix iova debugfs for non-iommu case

2017-09-19 Thread Tom St Denis

As I predicted ...

[root@carrizo ~]# xxd -l 1024 -s 0xC /sys/kernel/debug/dri/0/amdgpu_iova

000c: 55aa 74e9 a902       U.t.
000c0010:     4c02   4942  L.IB
000c0020: 4daa 2e93      0004  M...
000c0030: 2037 3631 3239 3535 3230      761295520..
000c0040: a102    2802     (...
000c0050: 3037 2f32 372f 3136 2030 343a 3036   07/27/16 04:06..
000c0060: 3600  e9b4 0300 e9c3 0300  f400  6...
000c0070: 0013  00d0 0100 1a16 20e1 0280 7e00  .. ...~.
000c0080: a200 4402 1200   003c 400e 0207  ..D<@...
000c0090: 3c01 1a00 0400  eea0 ff06 0008 3040  <.0@
000c00a0: 0e01    1403     
000c00b0: be7e 1100 b907 1ad6 502c     .~..P,..
000c00c0:   1440 4143   1000   .@AC
000c00d0: 4200  407c 0600 2000 2000 1200 0e00  B...@|.. . .
000c00e0:          
000c00f0:   3131 332d 4339 3534 3031 3031  113-C9540101
000c0100: 2d30 3036 0046 474c 2045 4c4c 4553 4d45  -006.FGL ELLESME
000c0110: 5245 0050 4349 5f45 5850 5245 5353 0047  RE.PCI_EXPRESS.G
000c0120:  5235 000d 0a43 3935 3420 506f 6c61  DDR5...C954 Pola
000c0130: 7269 7331 3020 474c 2041 3120 4744 4452  ris10 GL A1 GDDR
000c0140: 3520 3235 364d 7833 3220 3847 4220 3330  5 256Mx32 8GB 30
000c0150: 3065 2f33 3030 6d20 2020 2020 2020 2020  0e/300m
000c0160: 2020 2020 2020 2020 2020 2020 2020 2020
000c0170: 2020 200d 0a00 0d0a 200d 0a00 2843 2920 . ...(C)
000c0180: 3139 3838 2d32 3031 302c 2041 6476 616e  1988-2010, Advan
000c0190: 6365 6420 4d69 6372 6f20 4465 7669 6365  ced Micro Device
000c01a0: 732c 2049 6e63 2e00 4154 4f4d 4249 4f53  s, Inc..ATOMBIOS
000c01b0: 424b 2d41 4d44 2056 4552 3031 352e 3035  BK-AMD VER015.05
000c01c0: 302e 3030 302e 3030 302e 3030 3730 3036  0.000.000.007006
000c01d0: 0043 3935 3430 3130 312e 3030 3600 3132  .C9540101.006.12
000c01e0: 3935 3838 3720 0033 3534 3731 3220 2000  95887 .354712  .
000c01f0: 2020 2020 2020 2020 0041 4d44 5f45 4c4c  .AMD_ELL
000c0200: 4553 4d45 5245 5f43 3935 3430 315f 5854  ESMERE_C95401_XT
000c0210: 5f41 315f 4744 355f 3847 425c 636f 6e66  _A1_GD5_8GB\conf
000c0220: 6967 2e68  0090 2400 0101 4154 4f4d  ig.h$...ATOM

I can dump arbitrary system memory with the iommu enabled in the bios 
via my carrizo "dev" in the debugfs entry.


So if the issue is arbitrary access is a no no (and I don't get why) 
then the entire patch is a NAK because clearly I can "abuse" it.


Tom

On 19/09/17 01:26 PM, Tom St Denis wrote:

On 19/09/17 01:23 PM, Christian König wrote:

Am 19.09.2017 um 19:20 schrieb Tom St Denis:

On 19/09/17 01:18 PM, Christian König wrote:

Am 19.09.2017 um 19:14 schrieb Tom St Denis:

On 19/09/17 01:10 PM, Christian König wrote:
As far as I know we don't need #ifdefs cause there are dummy 
functions when IOMMU is not compiled in.


But this patch is a NAK since it circumvents the /dev/mem 
restriction when IOMMU is disabled and that is not something we 
can easily allow.


I raised this objection 24 hours ago and was told to go ahead with 
the read/write methods anyways.


Short of making a list of mappings/allocations in the ttm layer I 
have no idea how we would track this in the non-iommu case which 
means the entire iova debugfs entry should have been NAK'ed in the 
first place.


I'm open to suggestions.


As long as IOMMU is enabled the entry is perfectly fine, cause only 
memory mapped to the IOMMU is accessible to the GPU and so 
accessible using the debugfs entry as well.


On devices where there's no translation (e.g. Carrizo) does the iommu 
layer track mappings?  I'm wondering if on those I could seek outside 
of boundaries and read system memory anyways.


Why do you think there isn't any translation on CZ? I mean the whole 
ATC thing uses the IOMMU (v2 in this case) on APUs.


On my Carrizo devel system (A12-9800) the GPU dma addresses are physical 
addresses.  I only first saw iommu with APUs enabled on Raven.


I'll see if I can read non-mapped pages via the iova file on my Carrizo.

Only when IOMMU is disabled we can't go this way cause we can't know 
which memory is mapped to the driver and which isn't (or could we 
somehow track this?).


I suggest to not create the file in the first place if IOMMU is 
disabled.


I could easily do this (basically check if the domain is not null at 
debugfs init time).


Yeah, please do this for now.

I mean it would be great to have this interface even with IOMMU 
disabled, but of hand I can't think of a way to allow this.


Sent to the list already.  Tested with enabled/disabled.  Seems to work 
for me.


Tom

[PATCH] drm/amd/amdgpu: Fix iova debugfs for non-iommu case (v2)

2017-09-19 Thread Tom St Denis
Signed-off-by: Tom St Denis 

(v2): Don't create iova file when iommu is disabled
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 0e5f78f3a97e..566b24bc06ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1902,6 +1902,10 @@ static int amdgpu_ttm_debugfs_init(struct amdgpu_device *adev)
struct dentry *ent, *root = minor->debugfs_root;
 
for (count = 0; count < ARRAY_SIZE(ttm_debugfs_entries); count++) {
+   if (ttm_debugfs_entries[count].domain == TTM_PL_SYSTEM) {
+   if (!iommu_get_domain_for_dev(adev->dev))
+   continue;
+   }
ent = debugfs_create_file(
ttm_debugfs_entries[count].name,
S_IFREG | S_IRUGO, root,
-- 
2.12.0



[PATCH 1/2] drm/amdgpu: Fix a bug in amdgpu_fill_buffer()

2017-09-19 Thread Yong Zhao
When max_bytes is not 8-byte aligned and the bo size is larger than
max_bytes, the last 8 bytes in a ttm node may be left unchanged.
For example, on pre-SDMA-4.0 hardware with max_bytes = 0x1f and a bo
size of 0x20, the problem occurs.

To fix this, we store the maximum number of PTEs/PDEs a single
operation can set directly in the amdgpu_vm_pte_funcs structure,
rather than inferring it from the byte limit of the SDMA constant
fill, i.e. fill_max_bytes.

Together with the fix, we replace the hard-coded value "10" in
amdgpu_vm_bo_update_mapping() with the corresponding value from
struct amdgpu_vm_pte_funcs.

Change-Id: I37c588a57cb63f1a8251fb5ead2eff4b39e047c9
Signed-off-by: Yong Zhao 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 5 +++--
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c   | 3 +++
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c  | 3 +++
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c  | 4 
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c  | 3 +++
 drivers/gpu/drm/amd/amdgpu/si_dma.c | 3 +++
 8 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e45e6e9..8708476 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -304,6 +304,13 @@ struct amdgpu_vm_pte_funcs {
void (*write_pte)(struct amdgpu_ib *ib, uint64_t pe,
  uint64_t value, unsigned count,
  uint32_t incr);
+
+   /* maximum nums of PTEs/PDEs in a single operation */
+   uint32_tset_max_nums_pte_pde;
+
+   /* number of dw to reserve per operation */
+   unsignedset_pte_pde_num_dw;
+
/* for linear pte/pde updates without addr mapping */
void (*set_pte_pde)(struct amdgpu_ib *ib,
uint64_t pe,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 0e5f78f..bd43268 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1527,8 +1527,8 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
   struct dma_fence **fence)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
-   /* max_bytes applies to SDMA_OP_PTEPDE as well as SDMA_OP_CONST_FILL*/
-   uint32_t max_bytes = adev->mman.buffer_funcs->fill_max_bytes;
+   uint32_t max_bytes = 8 *
+   adev->vm_manager.vm_pte_funcs->set_max_nums_pte_pde;
struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
 
struct drm_mm_node *mm_node;
@@ -1560,8 +1560,8 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
++mm_node;
}
 
-   /* 10 double words for each SDMA_OP_PTEPDE cmd */
-   num_dw = num_loops * 10;
+   /* num of dwords for each SDMA_OP_PTEPDE cmd */
+   num_dw = num_loops * adev->vm_manager.vm_pte_funcs->set_pte_pde_num_dw;
 
/* for IB padding */
num_dw += 64;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 6c11332..28d16781 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1606,10 +1606,11 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 
} else {
/* set page commands needed */
-   ndw += ncmds * 10;
+   ndw += ncmds * adev->vm_manager.vm_pte_funcs->set_pte_pde_num_dw;
 
/* extra commands for begin/end fragments */
-   ndw += 2 * 10 * adev->vm_manager.fragment_size;
+   ndw += 2 * adev->vm_manager.vm_pte_funcs->set_pte_pde_num_dw
+   * adev->vm_manager.fragment_size;
 
params.func = amdgpu_vm_do_set_ptes;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index f508f4d..c64dcd1 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -1389,6 +1389,9 @@ static void cik_sdma_set_buffer_funcs(struct amdgpu_device *adev)
 static const struct amdgpu_vm_pte_funcs cik_sdma_vm_pte_funcs = {
.copy_pte = cik_sdma_vm_copy_pte,
.write_pte = cik_sdma_vm_write_pte,
+
+   .set_max_nums_pte_pde = 0x1f >> 3,
+   .set_pte_pde_num_dw = 10,
.set_pte_pde = cik_sdma_vm_set_pte_pde,
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index f2d0710..c05eb74 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -1326,6 +1326,9 @@ static void sdma_v2_4_set_buffer_funcs(struct amdgpu_device *adev)
 static const struct amdgpu_vm_pte_funcs sdma_v2_4_vm_pte_funcs = {
.copy_pte = sdma_v2_4_vm_copy_pte,
  

Re: [PATCH] drm/amd/amdgpu: Fix iova debugfs for non-iommu case

2017-09-19 Thread Tom St Denis

On 19/09/17 01:18 PM, Christian König wrote:

Am 19.09.2017 um 19:14 schrieb Tom St Denis:

On 19/09/17 01:10 PM, Christian König wrote:
As far as I know we don't need #ifdefs cause there are dummy 
functions when IOMMU is not compiled in.


But this patch is a NAK since it circumvents the /dev/mem restriction 
when IOMMU is disabled and that is not something we can easily allow.


I raised this objection 24 hours ago and was told to go ahead with the 
read/write methods anyways.


Short of making a list of mappings/allocations in the ttm layer I have 
no idea how we would track this in the non-iommu case which means the 
entire iova debugfs entry should have been NAK'ed in the first place.


I'm open to suggestions.


As long as IOMMU is enabled the entry is perfectly fine, cause only 
memory mapped to the IOMMU is accessible to the GPU and so accessible 
using the debugfs entry as well.


On devices where there's no translation (e.g. Carrizo) does the iommu 
layer track mappings?  I'm wondering if on those I could seek outside of 
boundaries and read system memory anyways.


Only when IOMMU is disabled we can't go this way cause we can't know 
which memory is mapped to the driver and which isn't (or could we 
somehow track this?).


I suggest to not create the file in the first place if IOMMU is disabled.


I could easily do this (basically check if the domain is not null at 
debugfs init time).


Tom


Re: [PATCH] drm/amd/amdgpu: Fix iova debugfs for non-iommu case

2017-09-19 Thread Christian König

Am 19.09.2017 um 19:14 schrieb Tom St Denis:

On 19/09/17 01:10 PM, Christian König wrote:
As far as I know we don't need #ifdefs cause there are dummy 
functions when IOMMU is not compiled in.


But this patch is a NAK since it circumvents the /dev/mem restriction 
when IOMMU is disabled and that is not something we can easily allow.


I raised this objection 24 hours ago and was told to go ahead with the 
read/write methods anyways.


Short of making a list of mappings/allocations in the ttm layer I have 
no idea how we would track this in the non-iommu case which means the 
entire iova debugfs entry should have been NAK'ed in the first place.


I'm open to suggestions.


As long as IOMMU is enabled the entry is perfectly fine, cause only 
memory mapped to the IOMMU is accessible to the GPU and so accessible 
using the debugfs entry as well.


Only when IOMMU is disabled we can't go this way cause we can't know 
which memory is mapped to the driver and which isn't (or could we 
somehow track this?).


I suggest to not create the file in the first place if IOMMU is disabled.

Christian.



Tom





Re: [PATCH] drm/amd/amdgpu: Fix iova debugfs for non-iommu case

2017-09-19 Thread Tom St Denis

On 19/09/17 01:10 PM, Christian König wrote:
As far as I know we don't need #ifdefs cause there are dummy functions 
when IOMMU is not compiled in.


But this patch is a NAK since it circumvents the /dev/mem restriction 
when IOMMU is disabled and that is not something we can easily allow.


I raised this objection 24 hours ago and was told to go ahead with the 
read/write methods anyways.


Short of making a list of mappings/allocations in the ttm layer I have 
no idea how we would track this in the non-iommu case which means the 
entire iova debugfs entry should have been NAK'ed in the first place.


I'm open to suggestions.

Tom


Re: [PATCH] drm/amd/amdgpu: Fix iova debugfs for non-iommu case

2017-09-19 Thread Christian König
As far as I know we don't need #ifdefs cause there are dummy functions 
when IOMMU is not compiled in.


But this patch is a NAK since it circumvents the /dev/mem restriction 
when IOMMU is disabled and that is not something we can easily allow.


Regards,
Christian.

Am 19.09.2017 um 17:36 schrieb Tom St Denis:

Do we need to put #ifdefs of IOMMU_SUPPORT around this too?

The cavium driver doesn't seem to have any depends on IOMMU_* 
anywhere... (for instance)


Tom


On 19/09/17 11:30 AM, Tom St Denis wrote:

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index 0e5f78f3a97e..1ee51a15a56c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1791,13 +1791,14 @@ static ssize_t amdgpu_iova_to_phys_read(struct file *f, char __user *buf,

  struct iommu_domain *dom;
dom = iommu_get_domain_for_dev(adev->dev);
-if (!dom)
-return -EFAULT;
result = 0;
  while (size) {
  // get physical address and map
-phys = iommu_iova_to_phys(dom, *pos);
+if (dom)
+phys = iommu_iova_to_phys(dom, *pos);
+else
+phys = *pos;
// copy upto one page
  if (size > PAGE_SIZE)
@@ -1837,13 +1838,14 @@ static ssize_t amdgpu_iova_to_phys_write(struct file *f, const char __user *buf,

  struct iommu_domain *dom;
dom = iommu_get_domain_for_dev(adev->dev);
-if (!dom)
-return -EFAULT;
result = 0;
  while (size) {
  // get physical address and map
-phys = iommu_iova_to_phys(dom, *pos);
+if (dom)
+phys = iommu_iova_to_phys(dom, *pos);
+else
+phys = *pos;
// copy upto one page
  if (size > PAGE_SIZE)








[PATCH 0/2] [reference] Disable implicit sync for non-wsi bos

2017-09-19 Thread Andres Rodriguez
This is a reference patch series for the kernel series:
"drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2"

Andres Rodriguez (2):
  radv: factor out radv_alloc_memory
  radv: disable implicit sync for radv allocated bos v2

 src/amd/vulkan/radv_device.c  | 22 +-
 src/amd/vulkan/radv_private.h | 11 +++
 src/amd/vulkan/radv_radeon_winsys.h   |  1 +
 src/amd/vulkan/radv_wsi.c |  3 ++-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c |  2 ++
 5 files changed, 33 insertions(+), 6 deletions(-)

-- 
2.9.3



[PATCH 2/2] radv: disable implicit sync for radv allocated bos v2

2017-09-19 Thread Andres Rodriguez
Implicit sync kicks in when a buffer is used by two different amdgpu
contexts simultaneously. Jobs that use explicit synchronization
mechanisms end up needlessly waiting to be scheduled for long periods
of time in order to achieve serialized execution.

This patch disables implicit synchronization for all radv allocations
except for wsi bos. The only systems that require implicit
synchronization are DRI2/3 and PRIME.

v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC

Signed-off-by: Andres Rodriguez 
---
 src/amd/vulkan/radv_device.c  | 3 +++
 src/amd/vulkan/radv_radeon_winsys.h   | 1 +
 src/amd/vulkan/radv_wsi.c | 3 ++-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c | 2 ++
 4 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 106eaf6..26944d2 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2309,6 +2309,9 @@ VkResult radv_alloc_memory(VkDevice _device,
if (pAllocateInfo->memoryTypeIndex == RADV_MEM_TYPE_GTT_WRITE_COMBINE)
flags |= RADEON_FLAG_GTT_WC;
 
+   if (mem_flags & RADV_MEM_IMPLICIT_SYNC)
+   flags |= RADEON_FLAG_IMPLICIT_SYNC;
+
	mem->bo = device->ws->buffer_create(device->ws, alloc_size, device->physical_device->rad_info.max_alignment,
   domain, flags);
 
diff --git a/src/amd/vulkan/radv_radeon_winsys.h b/src/amd/vulkan/radv_radeon_winsys.h
index 82ec5fe..02e0243 100644
--- a/src/amd/vulkan/radv_radeon_winsys.h
+++ b/src/amd/vulkan/radv_radeon_winsys.h
@@ -53,6 +53,7 @@ enum radeon_bo_flag { /* bitfield */
RADEON_FLAG_NO_CPU_ACCESS = (1 << 2),
RADEON_FLAG_VIRTUAL =   (1 << 3),
RADEON_FLAG_VA_UNCACHED =   (1 << 4),
+   RADEON_FLAG_IMPLICIT_SYNC = (1 << 5),
 };
 
 enum radeon_bo_usage { /* bitfield */
diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index aa44b7d..f8307ee 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd/vulkan/radv_wsi.c
@@ -193,7 +193,7 @@ radv_wsi_image_create(VkDevice device_h,
.image = image_h
};
 
-   result = radv_AllocateMemory(device_h,
+   result = radv_alloc_memory(device_h,
 &(VkMemoryAllocateInfo) {
 .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
 .pNext = _alloc,
@@ -201,6 +201,7 @@ radv_wsi_image_create(VkDevice device_h,
 .memoryTypeIndex = linear ? 1 : 0,
 },
 NULL /* XXX: pAllocator */,
+RADV_MEM_IMPLICIT_SYNC,
 _h);
if (result != VK_SUCCESS)
goto fail_create_image;
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
index 325f875..cd0efbe 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
@@ -330,6 +330,8 @@ radv_amdgpu_winsys_bo_create(struct radeon_winsys *_ws,
request.flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
if (flags & RADEON_FLAG_GTT_WC)
request.flags |= AMDGPU_GEM_CREATE_CPU_GTT_USWC;
+   if (!(flags & RADEON_FLAG_IMPLICIT_SYNC))
+   request.flags |= AMDGPU_GEM_CREATE_EXPLICIT_SYNC;
 
r = amdgpu_bo_alloc(ws->dev, , _handle);
if (r) {
-- 
2.9.3



[PATCH 1/2] radv: factor out radv_alloc_memory

2017-09-19 Thread Andres Rodriguez
This allows us to pass extra parameters to the memory allocation
operation that are not defined in the Vulkan spec. This is useful for
internal use.

Signed-off-by: Andres Rodriguez 
---
 src/amd/vulkan/radv_device.c  | 19 ++-
 src/amd/vulkan/radv_private.h | 11 +++
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 5d96070..106eaf6 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2242,11 +2242,11 @@ bool radv_get_memory_fd(struct radv_device *device,
 pFD);
 }
 
-VkResult radv_AllocateMemory(
-   VkDevice_device,
-   const VkMemoryAllocateInfo* pAllocateInfo,
-   const VkAllocationCallbacks*pAllocator,
-   VkDeviceMemory* pMem)
+VkResult radv_alloc_memory(VkDevice_device,
+  const VkMemoryAllocateInfo* pAllocateInfo,
+  const VkAllocationCallbacks*pAllocator,
+  enum radv_mem_flags_bitsmem_flags,
+  VkDeviceMemory* pMem)
 {
RADV_FROM_HANDLE(radv_device, device, _device);
struct radv_device_memory *mem;
@@ -2328,6 +2328,15 @@ fail:
return result;
 }
 
+VkResult radv_AllocateMemory(
+   VkDevice_device,
+   const VkMemoryAllocateInfo* pAllocateInfo,
+   const VkAllocationCallbacks*pAllocator,
+   VkDeviceMemory* pMem)
+{
+   return radv_alloc_memory(_device, pAllocateInfo, pAllocator, 0, pMem);
+}
+
 void radv_FreeMemory(
VkDevice_device,
VkDeviceMemory  _mem,
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index a9da9e7..8cb3807 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -105,6 +105,11 @@ enum radv_mem_type {
RADV_MEM_TYPE_COUNT
 };
 
+enum radv_mem_flags_bits {
+   /* enable implicit synchronization when accessing the underlying bo */
+   RADV_MEM_IMPLICIT_SYNC = 1 << 0,
+};
+
 #define radv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))
 
 static inline uint32_t
@@ -935,6 +940,12 @@ void radv_cmd_buffer_trace_emit(struct radv_cmd_buffer *cmd_buffer);
 bool radv_get_memory_fd(struct radv_device *device,
struct radv_device_memory *memory,
int *pFD);
+VkResult radv_alloc_memory(VkDevice _device,
+  const VkMemoryAllocateInfo* pAllocateInfo,
+  const VkAllocationCallbacks* pAllocator,
+  enum radv_mem_flags_bits flags,
+  VkDeviceMemory* pMem);
+
 /*
  * Takes x,y,z as exact numbers of invocations, instead of blocks.
  *
-- 
2.9.3



Re: [PATCH] drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2

2017-09-19 Thread Mao, David
Hi Andres,
The explicit sync should not be used for DrI3  and DRI2 but for cross process 
memory sharing, right?
We still have to rely on implicit sync to guarantee the. Correct order of 
rendering and present.
Could you confirm?

Thanks.

Sent from my iPhone

On 19 Sep 2017, at 9:57 PM, Andres Rodriguez 
> wrote:



On 2017-09-19 09:24 AM, Christian König wrote:
Am 19.09.2017 um 14:59 schrieb Andres Rodriguez:
Introduce a flag to signal that access to a BO will be synchronized
through an external mechanism.

Currently all buffers shared between contexts are subject to implicit
synchronization. However, this is only required for protocols that
currently don't support an explicit synchronization mechanism (DRI2/3).

This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC, so that
users can specify when it is safe to disable implicit sync.

v2: only disable explicit sync in amdgpu_cs_ioctl

Signed-off-by: Andres Rodriguez >
---

Hey Christian,

I kept the amdgpu_bo_explicit_sync() function since it makes it easier
to maintain an 80 line wrap in amdgpu_cs_sync_rings()
Looks good to me, but I would like to see the matching user space code as well.
Especially I have no idea how you want to have DRI3 compatibility with that?

No problem. I'm fixing the radv patch atm and I'll re-send it for your 
reference.

Regards,
Andres

Regards,
Christian.

Regards,
Andres

  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 8 
  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c   | 7 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h   | 3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 5 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 
  include/uapi/drm/amdgpu_drm.h  | 2 ++
  8 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index db97e78..bc8a403 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -704,7 +704,8 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
  list_for_each_entry(e, >validated, tv.head) {
  struct reservation_object *resv = e->robj->tbo.resv;
-r = amdgpu_sync_resv(p->adev, >job->sync, resv, p->filp);
+r = amdgpu_sync_resv(p->adev, >job->sync, resv, p->filp,
+ amdgpu_bo_explicit_sync(e->robj));
  if (r)
  return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index b0d45c8..21e9936 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -212,7 +212,9 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
AMDGPU_GEM_CREATE_CPU_GTT_USWC |
AMDGPU_GEM_CREATE_VRAM_CLEARED |
-  AMDGPU_GEM_CREATE_VM_ALWAYS_VALID))
+  AMDGPU_GEM_CREATE_VM_ALWAYS_VALID |
+  AMDGPU_GEM_CREATE_EXPLICIT_SYNC))
+
  return -EINVAL;
  /* reject invalid gem domains */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index c26ef53..428aae0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -193,6 +193,14 @@ static inline bool amdgpu_bo_gpu_accessible(struct amdgpu_bo *bo)
  }
  }
+/**
+ * amdgpu_bo_explicit_sync - return whether the bo is explicitly synced
+ */
+static inline bool amdgpu_bo_explicit_sync(struct amdgpu_bo *bo)
+{
+return bo->flags & AMDGPU_GEM_CREATE_EXPLICIT_SYNC;
+}
+
  int amdgpu_bo_create(struct amdgpu_device *adev,
  unsigned long size, int byte_align,
  bool kernel, u32 domain, u64 flags,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index c586f44..a4bf21f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -169,14 +169,14 @@ int amdgpu_sync_fence(struct amdgpu_device *adev, struct 
amdgpu_sync *sync,
   *
   * @sync: sync object to add fences from reservation object to
   * @resv: reservation object with embedded fence
- * @shared: true if we should only sync to the exclusive fence
+ * @explicit_sync: true if we should only sync to the exclusive fence
   *
   * Sync to the fence
   */
  int amdgpu_sync_resv(struct amdgpu_device *adev,
   struct amdgpu_sync *sync,
   struct reservation_object *resv,
- void *owner)
+ void *owner, bool explicit_sync)
  {
  struct reservation_object_list *flist;
  struct dma_fence *f;
@@ -191,6 +191,9 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
  f = reservation_object_get_excl(resv);
 

Re: [PATCH] drm/amd/amdgpu: add support for iova_to_phys to replace TTM trace (v5)

2017-09-19 Thread Tom St Denis

Booting with IOMMU disabled in the bios breaks this.

I'll submit a fix in a minute.  Should be easy.

Tom

On 19/09/17 11:25 AM, Tom St Denis wrote:

On 19/09/17 11:17 AM, Deucher, Alexander wrote:

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Tom St Denis
Sent: Tuesday, September 19, 2017 8:13 AM
To: amd-gfx@lists.freedesktop.org
Cc: StDenis, Tom
Subject: [PATCH] drm/amd/amdgpu: add support for iova_to_phys to
replace TTM trace (v5)

Signed-off-by: Tom St Denis 

(v2): Add domain to iova debugfs
(v3): Add true read/write methods to access system memory of pages
   mapped to the device
(v4): Move get_domain call out of loop and return on error
(v5): Just use kmap/kunmap
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 99
+
  1 file changed, 99 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 50d20903de4f..c7f8e081a772 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
  #include 
  #include 
  #include 
+#include 


Just to double check, do you need any compile guards for the IOMMU 
disabled case?


I honestly don't know.  Is there an easy way to set up a cross build that 
won't automagically enable the iommu in the configuration?


(also, if the iommu team co-operated from the get go...).

Looking at the cavium thunder ethernet source (I grepped for 
iova_to_phys) it seems the return of dev_to_domain might be NULL if 
IOMMU is disabled in which case these read/writes will return -EFAULT...


Ideally iommu_iova_to_phys() would act as an identity function if the 
domain is NULL. It doesn't; it'll OOPS when domain==NULL.


So the fix might be to skip the iova_to_phys call if domain==NULL and 
act on the address as a physical address.
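Sketched as plain C, that fallback could look as follows (a minimal, illustrative sketch only: `iova_to_phys_or_identity()` and the stub are not the actual kernel API, and the real `iommu_iova_to_phys()` is replaced by a fake here):

```c
#include <stdint.h>
#include <stddef.h>

struct iommu_domain;  /* opaque stand-in for the kernel type */

/* Stub for iommu_iova_to_phys(); the real one OOPSes on a NULL domain.
 * Here it pretends the IOMMU remaps everything up by one page. */
static uint64_t stub_iova_to_phys(struct iommu_domain *dom, uint64_t iova)
{
	(void)dom;
	return iova + 0x1000;
}

/* Hypothetical fallback: with no IOMMU domain (IOMMU disabled in the
 * BIOS), treat the DMA address as the physical address directly. */
static uint64_t iova_to_phys_or_identity(struct iommu_domain *dom, uint64_t iova)
{
	if (!dom)
		return iova;
	return stub_iova_to_phys(dom, iova);
}
```

In the debugfs read/write loops, the lookup line would then become a conditional instead of an unconditional call.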


On my carrizo with IOMMU enabled I can read the gfx ring buffer directly 
with --vm-read for both the Carrizo and Polaris10 (the Carrizo doesn't 
do mapping, the Polaris10 does).  I can try a boot with iommu disabled 
and see what happens.


Tom




Alex


  #include "amdgpu.h"
  #include "amdgpu_trace.h"
  #include "bif/bif_4_1_d.h"
@@ -1809,7 +1810,104 @@ static const struct file_operations
amdgpu_ttm_gtt_fops = {

  #endif

+static ssize_t amdgpu_iova_to_phys_read(struct file *f, char __user 
*buf,

+   size_t size, loff_t *pos)
+{
+    struct amdgpu_device *adev = file_inode(f)->i_private;
+    ssize_t result, n;
+    int r;
+    uint64_t phys;
+    void *ptr;
+    struct iommu_domain *dom;
+
+    dom = iommu_get_domain_for_dev(adev->dev);
+    if (!dom)
+    return -EFAULT;
+
+    result = 0;
+    while (size) {
+    // get physical address and map
+    phys = iommu_iova_to_phys(dom, *pos);
+
+    // copy up to one page
+    if (size > PAGE_SIZE)
+    n = PAGE_SIZE;
+    else
+    n = size;
+
+    // to end of the page
+    if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+    n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+    ptr = kmap(pfn_to_page(PFN_DOWN(phys)));
+    if (!ptr)
+    return -EFAULT;
+
+    r = copy_to_user(buf, ptr, n);
+    kunmap(pfn_to_page(PFN_DOWN(phys)));
+    if (r)
+    return -EFAULT;
+
+    *pos += n;
+    size -= n;
+    result += n;
+    }
+
+    return result;
+}
+
+static ssize_t amdgpu_iova_to_phys_write(struct file *f, const char 
__user

*buf,
+   size_t size, loff_t *pos)
+{
+    struct amdgpu_device *adev = file_inode(f)->i_private;
+    ssize_t result, n;
+    int r;
+    uint64_t phys;
+    void *ptr;
+    struct iommu_domain *dom;
+
+    dom = iommu_get_domain_for_dev(adev->dev);
+    if (!dom)
+    return -EFAULT;
+
+    result = 0;
+    while (size) {
+    // get physical address and map
+    phys = iommu_iova_to_phys(dom, *pos);

+    // copy up to one page
+    if (size > PAGE_SIZE)
+    n = PAGE_SIZE;
+    else
+    n = size;
+
+    // to end of the page
+    if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+    n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+    ptr = kmap(pfn_to_page(PFN_DOWN(phys)));
+    if (!ptr)
+    return -EFAULT;
+
+    r = copy_from_user(ptr, buf, n);
+    kunmap(pfn_to_page(PFN_DOWN(phys)));
+    if (r)
+    return -EFAULT;
+
+    *pos += n;
+    size -= n;
+    result += n;
+    }
+
+    return result;
+}
+
+static const struct file_operations amdgpu_ttm_iova_fops = {
+    .owner = THIS_MODULE,
+    .read = amdgpu_iova_to_phys_read,
+    .write = amdgpu_iova_to_phys_write,
+    .llseek = default_llseek
+};

  static const struct {
  char *name;
@@ -1820,6 +1918,7 @@ static const struct {
  #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
  { "amdgpu_gtt", &amdgpu_ttm_gtt_fops, TTM_PL_TT },
  #endif
+    { "amdgpu_iova", 

Re: [PATCH] drm/amd/amdgpu: add support for iova_to_phys to replace TTM trace (v5)

2017-09-19 Thread Tom St Denis

On 19/09/17 11:17 AM, Deucher, Alexander wrote:

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Tom St Denis
Sent: Tuesday, September 19, 2017 8:13 AM
To: amd-gfx@lists.freedesktop.org
Cc: StDenis, Tom
Subject: [PATCH] drm/amd/amdgpu: add support for iova_to_phys to
replace TTM trace (v5)

Signed-off-by: Tom St Denis 

(v2): Add domain to iova debugfs
(v3): Add true read/write methods to access system memory of pages
   mapped to the device
(v4): Move get_domain call out of loop and return on error
(v5): Just use kmap/kunmap
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 99
+
  1 file changed, 99 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 50d20903de4f..c7f8e081a772 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
  #include 
  #include 
  #include 
+#include 


Just to double check, do you need any compile guards for the IOMMU disabled 
case?


I honestly don't know.  Is there an easy way to set up a cross build that 
won't automagically enable the iommu in the configuration?


(also, if the iommu team co-operated from the get go...).

Looking at the cavium thunder ethernet source (I grepped for 
iova_to_phys) it seems the return of dev_to_domain might be NULL if 
IOMMU is disabled in which case these read/writes will return -EFAULT...


Ideally iommu_iova_to_phys() would act as an identity function if the 
domain is NULL. It doesn't; it'll OOPS when domain==NULL.


So the fix might be to skip the iova_to_phys call if domain==NULL and 
act on the address as a physical address.


On my carrizo with IOMMU enabled I can read the gfx ring buffer directly 
with --vm-read for both the Carrizo and Polaris10 (the Carrizo doesn't 
do mapping, the Polaris10 does).  I can try a boot with iommu disabled 
and see what happens.


Tom




Alex


  #include "amdgpu.h"
  #include "amdgpu_trace.h"
  #include "bif/bif_4_1_d.h"
@@ -1809,7 +1810,104 @@ static const struct file_operations
amdgpu_ttm_gtt_fops = {

  #endif

+static ssize_t amdgpu_iova_to_phys_read(struct file *f, char __user *buf,
+  size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result, n;
+   int r;
+   uint64_t phys;
+   void *ptr;
+   struct iommu_domain *dom;
+
+   dom = iommu_get_domain_for_dev(adev->dev);
+   if (!dom)
+   return -EFAULT;
+
+   result = 0;
+   while (size) {
+   // get physical address and map
+   phys = iommu_iova_to_phys(dom, *pos);
+
+   // copy up to one page
+   if (size > PAGE_SIZE)
+   n = PAGE_SIZE;
+   else
+   n = size;
+
+   // to end of the page
+   if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+   n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+   ptr = kmap(pfn_to_page(PFN_DOWN(phys)));
+   if (!ptr)
+   return -EFAULT;
+
+   r = copy_to_user(buf, ptr, n);
+   kunmap(pfn_to_page(PFN_DOWN(phys)));
+   if (r)
+   return -EFAULT;
+
+   *pos += n;
+   size -= n;
+   result += n;
+   }
+
+   return result;
+}
+
+static ssize_t amdgpu_iova_to_phys_write(struct file *f, const char __user
*buf,
+  size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result, n;
+   int r;
+   uint64_t phys;
+   void *ptr;
+   struct iommu_domain *dom;
+
+   dom = iommu_get_domain_for_dev(adev->dev);
+   if (!dom)
+   return -EFAULT;
+
+   result = 0;
+   while (size) {
+   // get physical address and map
+   phys = iommu_iova_to_phys(dom, *pos);

+   // copy up to one page
+   if (size > PAGE_SIZE)
+   n = PAGE_SIZE;
+   else
+   n = size;
+
+   // to end of the page
+   if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+   n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+   ptr = kmap(pfn_to_page(PFN_DOWN(phys)));
+   if (!ptr)
+   return -EFAULT;
+
+   r = copy_from_user(ptr, buf, n);
+   kunmap(pfn_to_page(PFN_DOWN(phys)));
+   if (r)
+   return -EFAULT;
+
+   *pos += n;
+   size -= n;
+   result += n;
+   }
+
+   return result;
+}
+
+static const struct file_operations amdgpu_ttm_iova_fops = {
+   .owner = THIS_MODULE,
+   .read = amdgpu_iova_to_phys_read,
+   

Re: [PATCH] drm/amdgpu: clarify license in amdgpu_trace_points.c

2017-09-19 Thread Christian König

Am 19.09.2017 um 16:22 schrieb Alex Deucher:

It was not clear.  The rest of the driver is MIT/X11.

Signed-off-by: Alex Deucher 


Reviewed-by: Christian König 


---

Dave, any objections?

drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
index 385b7e1..9ec96b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
@@ -1,4 +1,23 @@
  /* Copyright Red Hat Inc 2010.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
   * Author : Dave Airlie 
   */
  #include 



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 3/3] drm/amdgpu:fix typo

2017-09-19 Thread Alex Deucher
On Tue, Sep 19, 2017 at 2:41 AM, Monk Liu  wrote:
> a previous patch had a typo; correct it

Please squash this into the previous broken patch before committing it.

Alex

>
> Change-Id: I91bcefd7148b5db1c7d957c868e13a46ca40ef74
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 4eee2ef..35cc5ee 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -256,7 +256,7 @@ static int psp_hw_start(struct psp_context *psp)
> struct amdgpu_device *adev = psp->adev;
> int ret;
>
> -   if (amdgpu_sriov_vf(adev) && !adev->in_sriov_reset) {
> +   if (!amdgpu_sriov_vf(adev) || !adev->in_sriov_reset) {
> ret = psp_bootloader_load_sysdrv(psp);
> if (ret)
> return ret;
> --
> 2.7.4
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
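The corrected check can be sanity-tested in isolation (a standalone sketch, not driver code): sysdrv/sos should be skipped only for an SR-IOV VF going through reset, so the condition `!sriov || !in_reset` must be De Morgan's negation of `sriov && in_reset`.

```c
#include <stdbool.h>

/* should_load_bootloader() mirrors the fixed condition in psp_hw_start():
 * the bootloader firmware is already resident only in the one case where
 * an SR-IOV VF is going through reset. */
static bool should_load_bootloader(bool sriov_vf, bool in_sriov_reset)
{
	return !sriov_vf || !in_sriov_reset;
}
```

The broken version (`&&` instead of `||` after negating each term) would have skipped the load on every SR-IOV boot, not just on reset.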


[PATCH] drm/amdgpu: clarify license in amdgpu_trace_points.c

2017-09-19 Thread Alex Deucher
It was not clear.  The rest of the driver is MIT/X11.

Signed-off-by: Alex Deucher 
---

Dave, any objections?

drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
index 385b7e1..9ec96b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
@@ -1,4 +1,23 @@
 /* Copyright Red Hat Inc 2010.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
  * Author : Dave Airlie 
  */
 #include 
-- 
2.5.5



Re: [PATCH] drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2

2017-09-19 Thread Andres Rodriguez

Correct.

The idea is to only set AMDGPU_GEM_CREATE_EXPLICIT_SYNC for buffers that 
are not associated with dri2/3 or PRIME.


Regards,
Andres
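
From the userspace side, that policy could be sketched as follows (illustrative only: the helper and the flag value here are assumptions, not the radv/libdrm code; check include/uapi/drm/amdgpu_drm.h for the authoritative flag):

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative copy of the UAPI flag value; the authoritative definition
 * lives in include/uapi/drm/amdgpu_drm.h. */
#define SKETCH_GEM_CREATE_EXPLICIT_SYNC (1ULL << 7)

/* Hypothetical policy helper: request explicit-sync semantics only for
 * buffers that will never reach an implicit-sync consumer (DRI2/DRI3
 * presentation or a PRIME export). */
static uint64_t bo_sync_flags(bool shared_via_dri_or_prime)
{
	return shared_via_dri_or_prime ? 0 : SKETCH_GEM_CREATE_EXPLICIT_SYNC;
}
```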

On 2017-09-19 10:10 AM, Mao, David wrote:

Hi Andres,
The explicit sync should not be used for DrI3  and DRI2 but for cross 
process memory sharing, right?
We still have to rely on implicit sync to guarantee the. Correct order 
of rendering and present.

Could you confirm?

Thanks.

Sent from my iPhone

On 19 Sep 2017, at 9:57 PM, Andres Rodriguez > wrote:





On 2017-09-19 09:24 AM, Christian König wrote:

Am 19.09.2017 um 14:59 schrieb Andres Rodriguez:

Introduce a flag to signal that access to a BO will be synchronized
through an external mechanism.

Currently all buffers shared between contexts are subject to implicit
synchronization. However, this is only required for protocols that
currently don't support an explicit synchronization mechanism (DRI2/3).

This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC, so that
users can specify when it is safe to disable implicit sync.

v2: only disable explicit sync in amdgpu_cs_ioctl

Signed-off-by: Andres Rodriguez >

---

Hey Christian,

I kept the amdgpu_bo_explicit_sync() function since it makes it easier
to maintain the 80-column line wrap in amdgpu_cs_sync_rings()
Looks good to me, but I would like to see the matching user space 
code as well.
Especially I have no idea how you want to have DRI3 compatibility 
with that?


No problem. I'm fixing the radv patch atm and I'll re-send it for your 
reference.


Regards,
Andres


Regards,
Christian.


Regards,
Andres

  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 4 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 8 
  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c   | 7 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h   | 3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 5 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 
  include/uapi/drm/amdgpu_drm.h  | 2 ++
  8 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c

index db97e78..bc8a403 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -704,7 +704,8 @@ static int amdgpu_cs_sync_rings(struct 
amdgpu_cs_parser *p)

  list_for_each_entry(e, &p->validated, tv.head) {
  struct reservation_object *resv = e->robj->tbo.resv;
-    r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, p->filp);
+    r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, p->filp,
+ amdgpu_bo_explicit_sync(e->robj));
  if (r)
  return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c

index b0d45c8..21e9936 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -212,7 +212,9 @@ int amdgpu_gem_create_ioctl(struct drm_device 
*dev, void *data,

    AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
    AMDGPU_GEM_CREATE_CPU_GTT_USWC |
    AMDGPU_GEM_CREATE_VRAM_CLEARED |
-  AMDGPU_GEM_CREATE_VM_ALWAYS_VALID))
+  AMDGPU_GEM_CREATE_VM_ALWAYS_VALID |
+  AMDGPU_GEM_CREATE_EXPLICIT_SYNC))
+
  return -EINVAL;
  /* reject invalid gem domains */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h

index c26ef53..428aae0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -193,6 +193,14 @@ static inline bool 
amdgpu_bo_gpu_accessible(struct amdgpu_bo *bo)

  }
  }
+/**
+ * amdgpu_bo_explicit_sync - return whether the bo is explicitly synced
+ */
+static inline bool amdgpu_bo_explicit_sync(struct amdgpu_bo *bo)
+{
+    return bo->flags & AMDGPU_GEM_CREATE_EXPLICIT_SYNC;
+}
+
  int amdgpu_bo_create(struct amdgpu_device *adev,
  unsigned long size, int byte_align,
  bool kernel, u32 domain, u64 flags,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c

index c586f44..a4bf21f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -169,14 +169,14 @@ int amdgpu_sync_fence(struct amdgpu_device 
*adev, struct amdgpu_sync *sync,

   *
   * @sync: sync object to add fences from reservation object to
   * @resv: reservation object with embedded fence
- * @shared: true if we should only sync to the exclusive fence
+ * @explicit_sync: true if we should only sync to the exclusive fence
   *
   * Sync to the fence
   */
  int amdgpu_sync_resv(struct amdgpu_device *adev,
   struct amdgpu_sync *sync,
   struct reservation_object *resv,
- void *owner)
+ void *owner, 

Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp invalidate

2017-09-19 Thread Alex Deucher
On Tue, Sep 19, 2017 at 4:30 AM, Christian König
 wrote:
> I don't know why, but the HDP is generally not part of the register spec.
>
> So you can neither find HDP_DEBUG0 nor HDP_READ_CACHE_INVALIDATE in it as
> far as I know.
>
> Point is that the HDP invalidates its read cache on any register write (the
> register itself doesn't matter). So far we used the HDP_DEBUG0 register
> because it is unused otherwise, but having a dedicated register just for
> this job is clearly a good idea.

Both show up in the register spec for me and the descriptions both say
writing 1 to the register invalidates the read cache.

Alex

>
> Regards,
> Christian.
>
>
> Am 19.09.2017 um 09:02 schrieb zhoucm1:
>>
>> >using this register to replace MM_HDP_DEBUG0 was suggested by an HDP HW
>> > guy
>>
>> I'm OK with this line.
>>
>> Thanks for explain.
>> David Zhou
>>
>> On 2017年09月19日 15:00, Liu, Monk wrote:
>>>
>>> First, I didn't check whether Windows does it this way, because I'm not
>>> sure Windows is always doing the right thing; e.g. for GFX preemption I
>>> didn't copy the Windows scheme, and we found a couple of bugs in Windows
>>> but not in Linux ...
>>> So please don't assume we should copy from Windows, unless it's solid
>>> like a dead bone
>>>
>>> Second, this register is originally comes from the definition file
>>> "hdp_4_0_offset.h", not recently introduced by me or someone else, and using
>>> this register to replace MM_HDP_DEBUG0 is suggested from a HDP HW guys when
>>> I was working on the PAL/VULKAN preemption hang issue in Orlando, sorry I
>>> missed that guy's name ...
>>>
>>> @Deucher, Alexander do you know who is on hdp hw ? we can confirm with
>>> him
>>>
>>>
>>> If you're feeling bad about this change, I can add "if sriov" condition
>>> to all of it, so bare-metal will keep still,  is that okay ?
>>>
>>> BR Monk
>>>
>>>
>>> -Original Message-
>>> From: Zhou, David(ChunMing)
>>> Sent: 2017年9月19日 14:51
>>> To: Liu, Monk ; Koenig, Christian
>>> ; amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp
>>> invalidate
>>>
>>>
>>> The change seems more proper, but where did you find
>>> mmHDP_READ_CACHE_INVALIDATE? Could you double-check whether the Windows
>>> driver has changed to use this?
>>> I'm confused by this, since the mmHDP_DEBUG0 implementation came from
>>> Windows as well.
>>> I even don't find mmHDP_READ_CACHE_INVALIDATE in register spec.
>>>
>>> Regards,
>>> David Zhou
>>> On 2017年09月19日 14:46, Liu, Monk wrote:

 What question ? please reply here

 -Original Message-
 From: Zhou, David(ChunMing)
 Sent: 2017年9月19日 12:25
 To: Liu, Monk ; Koenig, Christian
 ; amd-gfx@lists.freedesktop.org
 Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger
 hdp invalidate

 Please answer my question as I raised in another thread, otherwise I
 will give a NAK on this!

 Regards,
 David Zhou

 -Original Message-
 From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
 Of Liu, Monk
 Sent: Tuesday, September 19, 2017 12:04 PM
 To: Koenig, Christian ;
 amd-gfx@lists.freedesktop.org
 Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger
 hdp invalidate

Yeah, vcn_v1_0 and uvd_v7_0

 Thanks

 -Original Message-
 From: Koenig, Christian
 Sent: 2017年9月18日 19:39
 To: Liu, Monk ; amd-gfx@lists.freedesktop.org
 Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger
 hdp invalidate

 Yeah, but Vega10 has UVD7 and in uvd_v7_0.c we have:

> static void uvd_v7_0_ring_emit_hdp_invalidate(struct amdgpu_ring
> *ring) {
>   amdgpu_ring_write(ring, PACKET0(SOC15_REG_OFFSET(HDP, 0,
> mmHDP_DEBUG0), 0));
>   amdgpu_ring_write(ring, 1);
> }

 That should probably be fixed as well.

 Regards,
 Christian.

 Am 18.09.2017 um 13:03 schrieb Liu, Monk:
>
> Only vega10 has this register
>
> -Original Message-
> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
> Sent: 2017年9月18日 17:20
> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger
> hdp invalidate
>
> Am 18.09.2017 um 08:11 schrieb Monk Liu:
>>
>> Change-Id: I61dc02ea6a450f9acfa3bae07aa20244261f5369
>> Signed-off-by: Monk Liu 
>
> Reviewed-by: Christian König 
>
> Please scan the code once more, we most likely have used mmHDP_DEBUG0
> for this at even more places.
>
> Christian.
>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 2 +-
>> 

[PATCH] drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2

2017-09-19 Thread Andres Rodriguez
Introduce a flag to signal that access to a BO will be synchronized
through an external mechanism.

Currently all buffers shared between contexts are subject to implicit
synchronization. However, this is only required for protocols that
currently don't support an explicit synchronization mechanism (DRI2/3).

This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC, so that
users can specify when it is safe to disable implicit sync.

v2: only disable explicit sync in amdgpu_cs_ioctl

Signed-off-by: Andres Rodriguez 
---

Hey Christian,

I kept the amdgpu_bo_explicit_sync() function since it makes it easier
to maintain the 80-column line wrap in amdgpu_cs_sync_rings()

Regards,
Andres

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c   | 7 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h   | 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 5 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 
 include/uapi/drm/amdgpu_drm.h  | 2 ++
 8 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index db97e78..bc8a403 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -704,7 +704,8 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
 
	list_for_each_entry(e, &p->validated, tv.head) {
		struct reservation_object *resv = e->robj->tbo.resv;
-		r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, p->filp);
+		r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, p->filp,
+amdgpu_bo_explicit_sync(e->robj));
 
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index b0d45c8..21e9936 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -212,7 +212,9 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
*data,
  AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
  AMDGPU_GEM_CREATE_CPU_GTT_USWC |
  AMDGPU_GEM_CREATE_VRAM_CLEARED |
- AMDGPU_GEM_CREATE_VM_ALWAYS_VALID))
+ AMDGPU_GEM_CREATE_VM_ALWAYS_VALID |
+ AMDGPU_GEM_CREATE_EXPLICIT_SYNC))
+
return -EINVAL;
 
/* reject invalid gem domains */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index c26ef53..428aae0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -193,6 +193,14 @@ static inline bool amdgpu_bo_gpu_accessible(struct 
amdgpu_bo *bo)
}
 }
 
+/**
+ * amdgpu_bo_explicit_sync - return whether the bo is explicitly synced
+ */
+static inline bool amdgpu_bo_explicit_sync(struct amdgpu_bo *bo)
+{
+   return bo->flags & AMDGPU_GEM_CREATE_EXPLICIT_SYNC;
+}
+
 int amdgpu_bo_create(struct amdgpu_device *adev,
unsigned long size, int byte_align,
bool kernel, u32 domain, u64 flags,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index c586f44..a4bf21f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -169,14 +169,14 @@ int amdgpu_sync_fence(struct amdgpu_device *adev, struct 
amdgpu_sync *sync,
  *
  * @sync: sync object to add fences from reservation object to
  * @resv: reservation object with embedded fence
- * @shared: true if we should only sync to the exclusive fence
+ * @explicit_sync: true if we should only sync to the exclusive fence
  *
  * Sync to the fence
  */
 int amdgpu_sync_resv(struct amdgpu_device *adev,
 struct amdgpu_sync *sync,
 struct reservation_object *resv,
-void *owner)
+void *owner, bool explicit_sync)
 {
struct reservation_object_list *flist;
struct dma_fence *f;
@@ -191,6 +191,9 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
f = reservation_object_get_excl(resv);
r = amdgpu_sync_fence(adev, sync, f);
 
+   if (explicit_sync)
+   return r;
+
flist = reservation_object_get_list(resv);
if (!flist || r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
index dc76879..70d7e3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
@@ -45,7 +45,8 @@ int amdgpu_sync_fence(struct amdgpu_device *adev, struct 
amdgpu_sync *sync,
 int amdgpu_sync_resv(struct amdgpu_device *adev,
 struct amdgpu_sync *sync,
   

Re: [PATCH] drm/amd/amdgpu: add support for iova_to_phys to replace TTM trace (v5)

2017-09-19 Thread Christian König

Am 19.09.2017 um 14:13 schrieb Tom St Denis:

Signed-off-by: Tom St Denis 


Reviewed-by: Christian König 



(v2): Add domain to iova debugfs
(v3): Add true read/write methods to access system memory of pages
   mapped to the device
(v4): Move get_domain call out of loop and return on error
(v5): Just use kmap/kunmap
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 99 +
  1 file changed, 99 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 50d20903de4f..c7f8e081a772 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
  #include 
  #include 
  #include 
+#include 
  #include "amdgpu.h"
  #include "amdgpu_trace.h"
  #include "bif/bif_4_1_d.h"
@@ -1809,7 +1810,104 @@ static const struct file_operations amdgpu_ttm_gtt_fops 
= {
  
  #endif
  
+static ssize_t amdgpu_iova_to_phys_read(struct file *f, char __user *buf,

+  size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result, n;
+   int r;
+   uint64_t phys;
+   void *ptr;
+   struct iommu_domain *dom;
+
+   dom = iommu_get_domain_for_dev(adev->dev);
+   if (!dom)
+   return -EFAULT;
+
+   result = 0;
+   while (size) {
+   // get physical address and map
+   phys = iommu_iova_to_phys(dom, *pos);
+
+   // copy up to one page
+   if (size > PAGE_SIZE)
+   n = PAGE_SIZE;
+   else
+   n = size;
+
+   // to end of the page
+   if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+   n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+   ptr = kmap(pfn_to_page(PFN_DOWN(phys)));
+   if (!ptr)
+   return -EFAULT;
+
+   r = copy_to_user(buf, ptr, n);
+   kunmap(pfn_to_page(PFN_DOWN(phys)));
+   if (r)
+   return -EFAULT;
+
+   *pos += n;
+   size -= n;
+   result += n;
+   }
+
+   return result;
+}
+
+static ssize_t amdgpu_iova_to_phys_write(struct file *f, const char __user 
*buf,
+  size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result, n;
+   int r;
+   uint64_t phys;
+   void *ptr;
+   struct iommu_domain *dom;
+
+   dom = iommu_get_domain_for_dev(adev->dev);
+   if (!dom)
+   return -EFAULT;
+
+   result = 0;
+   while (size) {
+   // get physical address and map
+   phys = iommu_iova_to_phys(dom, *pos);
  
+		// copy up to one page

+   if (size > PAGE_SIZE)
+   n = PAGE_SIZE;
+   else
+   n = size;
+
+   // to end of the page
+   if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+   n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+   ptr = kmap(pfn_to_page(PFN_DOWN(phys)));
+   if (!ptr)
+   return -EFAULT;
+
+   r = copy_from_user(ptr, buf, n);
+   kunmap(pfn_to_page(PFN_DOWN(phys)));
+   if (r)
+   return -EFAULT;
+
+   *pos += n;
+   size -= n;
+   result += n;
+   }
+
+   return result;
+}
+
+static const struct file_operations amdgpu_ttm_iova_fops = {
+   .owner = THIS_MODULE,
+   .read = amdgpu_iova_to_phys_read,
+   .write = amdgpu_iova_to_phys_write,
+   .llseek = default_llseek
+};
  
  static const struct {

char *name;
@@ -1820,6 +1918,7 @@ static const struct {
  #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
{ "amdgpu_gtt", _ttm_gtt_fops, TTM_PL_TT },
  #endif
+   { "amdgpu_iova", _ttm_iova_fops, TTM_PL_SYSTEM },
  };
  
  #endif



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/amdgpu: add support for iova_to_phys to replace TTM trace (v5)

2017-09-19 Thread Tom St Denis
Signed-off-by: Tom St Denis 

(v2): Add domain to iova debugfs
(v3): Add true read/write methods to access system memory of pages
  mapped to the device
(v4): Move get_domain call out of loop and return on error
(v5): Just use kmap/kunmap
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 99 +
 1 file changed, 99 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 50d20903de4f..c7f8e081a772 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 #include "bif/bif_4_1_d.h"
@@ -1809,7 +1810,104 @@ static const struct file_operations amdgpu_ttm_gtt_fops = {
 
 #endif
 
+static ssize_t amdgpu_iova_to_phys_read(struct file *f, char __user *buf,
+  size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result, n;
+   int r;
+   uint64_t phys;
+   void *ptr;
+   struct iommu_domain *dom;
+
+   dom = iommu_get_domain_for_dev(adev->dev);
+   if (!dom)
+   return -EFAULT;
+
+   result = 0;
+   while (size) {
+   // get physical address and map
+   phys = iommu_iova_to_phys(dom, *pos);
+
+   // copy up to one page
+   if (size > PAGE_SIZE)
+   n = PAGE_SIZE;
+   else
+   n = size;
+
+   // to end of the page
+   if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+   n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+   ptr = kmap(pfn_to_page(PFN_DOWN(phys)));
+   if (!ptr)
+   return -EFAULT;
+
+   r = copy_to_user(buf, ptr, n);
+   kunmap(pfn_to_page(PFN_DOWN(phys)));
+   if (r)
+   return -EFAULT;
+
+   *pos += n;
+   size -= n;
+   result += n;
+   }
+
+   return result;
+}
+
+static ssize_t amdgpu_iova_to_phys_write(struct file *f, const char __user 
*buf,
+  size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result, n;
+   int r;
+   uint64_t phys;
+   void *ptr;
+   struct iommu_domain *dom;
+
+   dom = iommu_get_domain_for_dev(adev->dev);
+   if (!dom)
+   return -EFAULT;
+
+   result = 0;
+   while (size) {
+   // get physical address and map
+   phys = iommu_iova_to_phys(dom, *pos);
 
+   // copy up to one page
+   if (size > PAGE_SIZE)
+   n = PAGE_SIZE;
+   else
+   n = size;
+
+   // to end of the page
+   if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+   n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+   ptr = kmap(pfn_to_page(PFN_DOWN(phys)));
+   if (!ptr)
+   return -EFAULT;
+
+   r = copy_from_user(ptr, buf, n);
+   kunmap(pfn_to_page(PFN_DOWN(phys)));
+   if (r)
+   return -EFAULT;
+
+   *pos += n;
+   size -= n;
+   result += n;
+   }
+
+   return result;
+}
+
+static const struct file_operations amdgpu_ttm_iova_fops = {
+   .owner = THIS_MODULE,
+   .read = amdgpu_iova_to_phys_read,
+   .write = amdgpu_iova_to_phys_write,
+   .llseek = default_llseek
+};
 
 static const struct {
char *name;
@@ -1820,6 +1918,7 @@ static const struct {
 #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
	{ "amdgpu_gtt", &amdgpu_ttm_gtt_fops, TTM_PL_TT },
 #endif
+   { "amdgpu_iova", &amdgpu_ttm_iova_fops, TTM_PL_SYSTEM },
 };
 
 #endif
-- 
2.12.0
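
The chunk-length computation both handlers in this patch repeat — clamp the copy to one page, then trim it so it never crosses a page boundary and therefore never spans two kmap()'d pages — can be checked in isolation. A minimal userspace sketch; the helper name and the fixed 4 KiB PAGE_SIZE are illustrative assumptions, not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL	/* illustrative; the kernel macro is per-arch */

/* Mirrors the per-iteration length computation in
 * amdgpu_iova_to_phys_read()/_write(): at most one page per chunk,
 * and a chunk never crosses a page boundary. */
static size_t iova_chunk_len(uint64_t pos, size_t size)
{
	size_t n = size > PAGE_SIZE ? PAGE_SIZE : size;

	if (((pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
		n = PAGE_SIZE - (pos & (PAGE_SIZE - 1));
	return n;
}
```

After the first iteration the position is page aligned, so every subsequent chunk is a full page until the tail of the request.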

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 0/8] Retry page fault handling for Vega10

2017-09-19 Thread Christian König

Am 19.09.2017 um 03:05 schrieb Felix Kuehling:

Thanks for the reviews. I rebased this on amd-staging-drm-next, retested
and submitted.

Christian, do you want to do some graphics PASID and VMFault work on top
of that? I think I'll be working on more KFD upstreaming this week and
maybe look at this subject again next week.


Yeah, that's on my TODO list together with quite a bunch of other things.

Going to give that a try when I have time, but don't expect anything 
before xmas.


Regards,
Christian.



Regards,
   Felix


On 2017-09-12 07:05 PM, Felix Kuehling wrote:

Rebased on adeucher/amd-staging-4.13 and tested on Vega10 (graphics)
and Kaveri (KFD). Meaningful graphics tests with retry faults enabled
will only be possible after PASID support is added to amdgpu_cs.

The chash table was moved to drivers/gpu/drm/amd/lib for now but is
ready to move to lib if needed. I have not gotten any feedback on LKML
and I don't want that to hold up the patch series.

TODO:
* Finish upstreaming KFD
* Allocate PASIDs for graphics contexts
* Setup VMID-PASID mapping during graphics command submission
* Confirm that graphics page faults have the correct PASID in the IV


Felix Kuehling (8):
   drm/amdgpu: Fix error handling in amdgpu_vm_init
   drm/amdgpu: Add PASID management
   drm/radeon: Add PASID manager for KFD
   drm/amdkfd: Separate doorbell allocation from PASID
   drm/amdkfd: Use PASID manager from KGD
   drm/amdgpu: Add prescreening stage in IH processing
   drm/amd: Closed hash table with low overhead
   drm/amdgpu: Track pending retry faults in IH and VM (v2)

  drivers/gpu/drm/Kconfig   |   3 +
  drivers/gpu/drm/Makefile  |   1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c|  82 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h|  12 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |   2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  84 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  21 +-
  drivers/gpu/drm/amd/amdgpu/cik_ih.c   |  14 +
  drivers/gpu/drm/amd/amdgpu/cz_ih.c|  14 +
  drivers/gpu/drm/amd/amdgpu/iceland_ih.c   |  14 +
  drivers/gpu/drm/amd/amdgpu/si_ih.c|  14 +
  drivers/gpu/drm/amd/amdgpu/tonga_ih.c |  14 +
  drivers/gpu/drm/amd/amdgpu/vega10_ih.c|  90 
  drivers/gpu/drm/amd/amdkfd/kfd_device.c   |   7 -
  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c |  50 +-
  drivers/gpu/drm/amd/amdkfd/kfd_module.c   |   6 -
  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c|  90 ++--
  drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  10 +-
  drivers/gpu/drm/amd/amdkfd/kfd_process.c  |   6 +
  drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
  drivers/gpu/drm/amd/include/linux/chash.h | 358 +
  drivers/gpu/drm/amd/lib/Kconfig   |  27 +
  drivers/gpu/drm/amd/lib/Makefile  |  11 +
  drivers/gpu/drm/amd/lib/chash.c   | 622 ++
  drivers/gpu/drm/radeon/radeon_kfd.c   |  31 ++
  28 files changed, 1504 insertions(+), 91 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/include/linux/chash.h
  create mode 100644 drivers/gpu/drm/amd/lib/Kconfig
  create mode 100644 drivers/gpu/drm/amd/lib/Makefile
  create mode 100644 drivers/gpu/drm/amd/lib/chash.c





Re: [PATCH 3/3] drm/amdgpu:fix typo

2017-09-19 Thread Christian König

Am 19.09.2017 um 08:41 schrieb Monk Liu:

a previous patch had a typo error; correct it

Change-Id: I91bcefd7148b5db1c7d957c868e13a46ca40ef74
Signed-off-by: Monk Liu 


Acked-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 4eee2ef..35cc5ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -256,7 +256,7 @@ static int psp_hw_start(struct psp_context *psp)
struct amdgpu_device *adev = psp->adev;
int ret;
  
-	if (amdgpu_sriov_vf(adev) && !adev->in_sriov_reset) {

+   if (!amdgpu_sriov_vf(adev) || !adev->in_sriov_reset) {
ret = psp_bootloader_load_sysdrv(psp);
if (ret)
return ret;





Re: [PATCH 1/3] drm/amdgpu:make ctx_add_fence interruptible(v2)

2017-09-19 Thread Christian König

Am 19.09.2017 um 08:41 schrieb Monk Liu:

otherwise a gpu hang will make the application impossible to kill
under timedout=0 mode

v2:
Fix memory leak of job/job->s_fence
unlock mn
remove the ERROR msg after waiting being interrupted

Change-Id: I6051b5b3ae1188983f49325a2438c84a6c12374a
Signed-off-by: Monk Liu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  4 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 17 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 12 +++-
  3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index cc9a232..6ff2959 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -736,8 +736,8 @@ struct amdgpu_ctx_mgr {
  struct amdgpu_ctx *amdgpu_ctx_get(struct amdgpu_fpriv *fpriv, uint32_t id);
  int amdgpu_ctx_put(struct amdgpu_ctx *ctx);
  
-uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,

- struct dma_fence *fence);
+int amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
+ struct dma_fence *fence, uint64_t *seq);
  struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
   struct amdgpu_ring *ring, uint64_t seq);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c

index b59749d..9bd4834 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1043,6 +1043,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
	struct amd_sched_entity *entity = &p->ctx->rings[ring->idx].entity;
struct amdgpu_job *job;
unsigned i;
+   uint64_t seq;
+
int r;
  
  	amdgpu_mn_lock(p->mn);

@@ -1071,8 +1073,19 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
job->owner = p->filp;
job->fence_ctx = entity->fence_context;
	p->fence = dma_fence_get(&job->base.s_fence->finished);
-   cs->out.handle = amdgpu_ctx_add_fence(p->ctx, ring, p->fence);
-   job->uf_sequence = cs->out.handle;
+   r = amdgpu_ctx_add_fence(p->ctx, ring, p->fence, &seq);
+   if (r) {
+   /* release job include the sched fence as well */
+   dma_fence_put(&job->base.s_fence->finished);
+   dma_fence_put(&job->base.s_fence->scheduled);
+   amdgpu_job_free(job);
+   amdgpu_mn_unlock(p->mn);
+   dma_fence_put(p->fence);
+   return r;
+   }
+
+   cs->out.handle = seq;
+   job->uf_sequence = seq;
amdgpu_job_free_resources(job);
  
  	trace_amdgpu_cs_ioctl(job);

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index a11e443..551f114 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -246,8 +246,8 @@ int amdgpu_ctx_put(struct amdgpu_ctx *ctx)
return 0;
  }
  
-uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,

- struct dma_fence *fence)
+int amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
+ struct dma_fence *fence, uint64_t *handler)
  {
struct amdgpu_ctx_ring *cring = & ctx->rings[ring->idx];
uint64_t seq = cring->sequence;
@@ -258,9 +258,9 @@ uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
other = cring->fences[idx];
if (other) {
signed long r;
-   r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
+   r = dma_fence_wait_timeout(other, true, MAX_SCHEDULE_TIMEOUT);
if (r < 0)
-   DRM_ERROR("Error (%ld) waiting for fence!\n", r);
+   return -ERESTARTSYS;


Return the original error code here, e.g. "r".

With that fixed the patch is Reviewed-by: Christian König 



Regards,
Christian.


}
  
  	dma_fence_get(fence);

@@ -271,8 +271,10 @@ uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
	spin_unlock(&ctx->ring_lock);
  
  	dma_fence_put(other);

+   if (handler)
+   *handler = seq;
  
-	return seq;

+   return 0;
  }
  
  struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
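
The bookkeeping amdgpu_ctx_add_fence() performs — sequence numbers index a fixed power-of-two ring of in-flight fences, and installing a new fence first has to deal with the slot's previous occupant (the "other" fence the kernel code waits on) — can be modelled in plain C. The struct, names, and ring size below are illustrative, not the driver's:

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 16	/* power of two, so "seq & (RING_SIZE - 1)" wraps */

struct fence_ring {
	uint64_t sequence;		/* next sequence number to hand out */
	int	 fences[RING_SIZE];	/* 0 = slot free, else a fence id */
};

/* Install a new fence; returns the displaced fence id (0 if the slot
 * was free) -- the analogue of the "other" fence that must be waited
 * on -- and hands the assigned sequence number back via *seq. */
static int ring_add_fence(struct fence_ring *r, int fence, uint64_t *seq)
{
	uint64_t idx = r->sequence & (RING_SIZE - 1);
	int other = r->fences[idx];

	r->fences[idx] = fence;
	*seq = r->sequence++;
	return other;
}
```

Once the ring wraps, every insertion displaces the fence submitted RING_SIZE submissions earlier — which is why the wait in amdgpu_ctx_add_fence() must be interruptible when that old fence may never signal.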





Re: [PATCH] drm/amd/amdgpu: add support for iova_to_phys to replace TTM trace (v4)

2017-09-19 Thread Christian König

Am 19.09.2017 um 13:20 schrieb Tom St Denis:

Signed-off-by: Tom St Denis 

(v2): Add domain to iova debugfs
(v3): Add true read/write methods to access system memory of pages
   mapped to the device
(v4): Move get_domain call out of loop and return on error
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 114 
  1 file changed, 114 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 50d20903de4f..71a2fdc91a85 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,6 +43,7 @@
  #include 
  #include 
  #include 
+#include 
  #include "amdgpu.h"
  #include "amdgpu_trace.h"
  #include "bif/bif_4_1_d.h"
@@ -1810,6 +1811,118 @@ static const struct file_operations amdgpu_ttm_gtt_fops = {
  #endif
  
  
+static void *transform_page(uint64_t phys)
+{
+   if (PageHighMem(pfn_to_page(PFN_DOWN(phys))))
+   return kmap(pfn_to_page(PFN_DOWN(phys)));
+   else
+   return __va(phys);
+}
+
+static void untransform_page(uint64_t phys)
+{
+   if (PageHighMem(pfn_to_page(PFN_DOWN(phys))))
+   return kunmap(pfn_to_page(PFN_DOWN(phys)));
+}


I mentioned that in the previous patch as well: why not use 
kmap()/kunmap() here directly?


See kmap() for x86 for example:

void *kmap(struct page *page)
{
	might_sleep();
	if (!PageHighMem(page))
		return page_address(page);
	return kmap_high(page);
}


Both normal as well as highmem are correctly handled here.

Regards,
Christian.


+
+static ssize_t amdgpu_iova_to_phys_read(struct file *f, char __user *buf,
+  size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result, n;
+   int r;
+   uint64_t phys;
+   void *ptr;
+   struct iommu_domain *dom;
+
+   dom = iommu_get_domain_for_dev(adev->dev);
+   if (!dom)
+   return -EFAULT;
+
+   result = 0;
+   while (size) {
+   // get physical address and map
+   phys = iommu_iova_to_phys(dom, *pos);
+
+   // copy up to one page
+   if (size > PAGE_SIZE)
+   n = PAGE_SIZE;
+   else
+   n = size;
+
+   // to end of the page
+   if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+   n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+   ptr = transform_page(phys);
+   if (!ptr)
+   return -EFAULT;
+
+   r = copy_to_user(buf, ptr, n);
+   untransform_page(phys);
+   if (r)
+   return -EFAULT;
+
+   *pos += n;
+   size -= n;
+   result += n;
+   }
+
+   return result;
+}
+
+static ssize_t amdgpu_iova_to_phys_write(struct file *f, const char __user 
*buf,
+  size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result, n;
+   int r;
+   uint64_t phys;
+   void *ptr;
+   struct iommu_domain *dom;
+
+   dom = iommu_get_domain_for_dev(adev->dev);
+   if (!dom)
+   return -EFAULT;
+
+   result = 0;
+   while (size) {
+   // get physical address and map
+   phys = iommu_iova_to_phys(dom, *pos);
+
+   // copy up to one page
+   if (size > PAGE_SIZE)
+   n = PAGE_SIZE;
+   else
+   n = size;
+
+   // to end of the page
+   if (((*pos & (PAGE_SIZE - 1)) + n) >= PAGE_SIZE)
+   n = PAGE_SIZE - (*pos & (PAGE_SIZE - 1));
+
+   ptr = transform_page(phys);
+   if (!ptr)
+   return -EFAULT;
+
+   r = copy_from_user(ptr, buf, n);
+   untransform_page(phys);
+   if (r)
+   return -EFAULT;
+
+   *pos += n;
+   size -= n;
+   result += n;
+   }
+
+   return result;
+}
+
+static const struct file_operations amdgpu_ttm_iova_fops = {
+   .owner = THIS_MODULE,
+   .read = amdgpu_iova_to_phys_read,
+   .write = amdgpu_iova_to_phys_write,
+   .llseek = default_llseek
+};
  
  static const struct {

char *name;
@@ -1820,6 +1933,7 @@ static const struct {
  #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
{ "amdgpu_gtt", _ttm_gtt_fops, TTM_PL_TT },
  #endif
+   { "amdgpu_iova", &amdgpu_ttm_iova_fops, TTM_PL_SYSTEM },

RE: [PATCH 13/18] drm/amdgpu:fix driver unloading bug

2017-09-19 Thread Liu, Monk
> Well the question is why does the CPC still needs the MQD commands and how 
> can we prevent that?

You are right, I'll see what I can do.


-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] 
Sent: 2017年9月19日 16:27
To: Liu, Monk ; Koenig, Christian ; 
amd-gfx@lists.freedesktop.org
Cc: Chen, Horace 
Subject: Re: [PATCH 13/18] drm/amdgpu:fix driver unloading bug

Am 19.09.2017 um 06:14 schrieb Liu, Monk:
> Christian,
>
>> That sounds at least a bit better. But my question is why doesn't this work 
>> like it does on Tonga, e.g. correctly clean things up?
> Tonga also suffers from this issue, just that we fixed it in the branch for 
> CSP customer and staging code usually behind our private branch ...
>
>> Yeah, but keeping the GART mapping alive is complete nonsense. When the 
>> driver unloads all memory should be returned to the OS.
>> So we either keep a GART mapping to pages which are about to be reused and 
>> overwritten, or we leak memory on driver shutdown.
>> Neither options sounds very good,
> Unbinding the GART mapping makes the CPC hang if it runs MQD commands, and 
> the CPC must run MQD commands because RLCV always requires the CPC to do that 
> when RLCV is doing SAVE_VF commands,
>
> Do you have way to fix above circle ?

Well the question is why does the CPC still needs the MQD commands and how can 
we prevent that?

The point is when we need to keep the GART alive to avoid a crash after driver 
unload we also need to keep the pages alive where the GART points to.

This means that the pages are either overwritten or we get massive complains 
from the MM that we are leaking pages here.

If it's not possible to turn off the CPC on driver unload, the only alternative I 
can see is to reprogram it so that the MQD commands come from VRAM instead of 
GART.

Regards,
Christian.

>
>
> BR Monk
>
>
> -Original Message-
> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
> Sent: 2017年9月18日 19:54
> To: Liu, Monk ; Koenig, Christian 
> ; amd-gfx@lists.freedesktop.org
> Cc: Chen, Horace 
> Subject: Re: [PATCH 13/18] drm/amdgpu:fix driver unloading bug
>
> Am 18.09.2017 um 12:12 schrieb Liu, Monk:
>> Christian,
>>
>> Let's discuss this patch and the one follows which skip the KIQ MQD free to 
>> avoid SAVE_FAIL issue.
>>
>>
>> For skipping KIQ MQD deallocation patch, I think I will drop it and use a 
>> new way:
>> We allocate KIQ MQD in VRAM domain and this BO can be safely freed after 
>> driver unloaded, because after driver unloaded no one will change the data 
>> in this BO *usually*.
>> e.g. some root  app can map visible vram and alter the value in it
> That sounds at least a bit better. But my question is why doesn't this work 
> like it does on Tonga, e.g. correctly clean things up?
>
>> for this patch "to skipping unbind the GART mapping to keep KIQ MQD always 
>> valid":
>> Since hypervisor side always have couple hw component working, and they rely 
>> on GMC kept alive, so this is very different with BARE-METAL. That's to say 
>> we can only do like this way.
> Yeah, but keeping the GART mapping alive is complete nonsense. When the 
> driver unloads all memory should be returned to the OS.
>
> So we either keep a GART mapping to pages which are about to be reused and 
> overwritten, or we leak memory on driver shutdown.
>
> Neither options sounds very good,
> Christian.
>
>> Besides, we'll have more patches in future for L1 secure mode, which 
>> forbidden VF access GMC registers, so under L1 secure mode driver 
>> will always skip GMC programing under SRIOV both in init and fini, 
>> but that will come later
>>
>> BR Monk
>>
>>
>>
>> -Original Message-
>> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
>> Sent: 2017年9月18日 17:28
>> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
>> Cc: Chen, Horace 
>> Subject: Re: [PATCH 13/18] drm/amdgpu:fix driver unloading bug
>>
>> Am 18.09.2017 um 08:11 schrieb Monk Liu:
>>> [SWDEV-126631] - fix hypervisor save_vf fail that occurred after 
>>> driver
>>> removed:
>>> 1. Because the KIQ and KCQ were not ummapped, save_vf will fail if driver 
>>> freed mqd of KIQ and KCQ.
>>> 2. KIQ can't be unmapped since RLCV always need it, the bo_free on 
>>> KIQ should be skipped 3. KCQ can be unmapped, and should be unmapped 
>>> during hw_fini, 4. RLCV still need to access other mc address from some hw 
>>> even after driver unloaded,
>>>   So we should not unbind gart for VF.
>>>
>>> Change-Id: I320487a9a848f41484c5f8cc11be34aca807b424
>>> Signed-off-by: Horace Chen 
>>> Signed-off-by: Monk Liu 
>> I absolutely can't judge if this is correct or not, but keeping the GART and 
>> KIQ alive after the driver is unloaded sounds really fishy to me.
>>
>> Isn't there any other clean way of handling this?

Re: [PATCH] drm/amdgpu: add helper to convert a ttm bo to amdgpu_bo

2017-09-19 Thread Christian König

It actually looks like a valid cleanup to me.

Patch is Reviewed-by: Christian König .

Regards,
Christian.

Am 19.09.2017 um 05:20 schrieb Andres Rodriguez:
This is a small cleanup patch from my initial naive attempt at 
extracting a TTM bo in amdgpu_sync_resv(). It didn't end up being 
useful in that specific case, but I thought I'd send it out anyways in 
case you find it useful.


Regards,
Andres


On 2017-09-18 11:17 PM, Andres Rodriguez wrote:

Signed-off-by: Andres Rodriguez 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 5 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 9 +
  3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index 726a662..73eedd3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -40,9 +40,7 @@
  static void amdgpu_ttm_bo_destroy(struct ttm_buffer_object *tbo)
  {
  struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
-struct amdgpu_bo *bo;
-
-bo = container_of(tbo, struct amdgpu_bo, tbo);
+struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
amdgpu_bo_kunmap(bo);
@@ -891,7 +889,7 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,

  if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
  return;
  -abo = container_of(bo, struct amdgpu_bo, tbo);
+abo = ttm_to_amdgpu_bo(bo);
  amdgpu_vm_bo_invalidate(adev, abo, evict);
amdgpu_bo_kunmap(abo);
@@ -918,7 +916,7 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)

  if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
  return 0;
  -abo = container_of(bo, struct amdgpu_bo, tbo);
+abo = ttm_to_amdgpu_bo(bo);
/* Remember that this BO was accessed by the CPU */
  abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h

index 39b6bf6..c26ef53 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -94,6 +94,11 @@ struct amdgpu_bo {
  };
  };
+static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)

+{
+return container_of(tbo, struct amdgpu_bo, tbo);
+}
+
  /**
   * amdgpu_mem_type_to_domain - return domain corresponding to mem_type
   * @mem_type:ttm memory type
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index b2b11e1..c9c059d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -44,6 +44,7 @@
  #include 
  #include 
  #include "amdgpu.h"
+#include "amdgpu_object.h"
  #include "amdgpu_trace.h"
  #include "bif/bif_4_1_d.h"
@@ -209,7 +210,7 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,

  placement->num_busy_placement = 1;
  return;
  }
-abo = container_of(bo, struct amdgpu_bo, tbo);
+abo = ttm_to_amdgpu_bo(bo);
  switch (bo->mem.mem_type) {
  case TTM_PL_VRAM:
  if (adev->mman.buffer_funcs &&
@@ -257,7 +258,7 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
static int amdgpu_verify_access(struct ttm_buffer_object *bo, struct file *filp)

  {
-struct amdgpu_bo *abo = container_of(bo, struct amdgpu_bo, tbo);
+struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo);
if (amdgpu_ttm_tt_get_usermm(bo->ttm))
  return -EPERM;
@@ -484,7 +485,7 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo,

  int r;
/* Can't move a pinned BO */
-abo = container_of(bo, struct amdgpu_bo, tbo);
+abo = ttm_to_amdgpu_bo(bo);
  if (WARN_ON_ONCE(abo->pin_count > 0))
  return -EINVAL;
@@ -1172,7 +1173,7 @@ static int amdgpu_ttm_access_memory(struct ttm_buffer_object *bo,

  unsigned long offset,
  void *buf, int len, int write)
  {
-struct amdgpu_bo *abo = container_of(bo, struct amdgpu_bo, tbo);
+struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo);
  struct amdgpu_device *adev = amdgpu_ttm_adev(abo->tbo.bdev);
  struct drm_mm_node *nodes = abo->tbo.mem.mm_node;
  uint32_t value = 0;







Re: [PATCH 2/2] drm/amdgpu: Fix a bug in amdgpu_fill_buffer()

2017-09-19 Thread Christian König

Am 18.09.2017 um 21:18 schrieb Yong Zhao:

When max_bytes is not 8 bytes aligned and bo size is larger than
max_bytes, the last 8 bytes in a ttm node may be left unchanged.
For example, on pre SDMA 4.0, max_bytes = 0x1f, and the bo size
is 0x20, the problem will happen.

In order to fix the problem, we store the max nums of PTEs/PDEs
a single operation can set separately in amdgpu_vm_pte_funcs
structure.

Change-Id: I37c588a57cb63f1a8251fb5ead2eff4b39e047c9
Signed-off-by: Yong Zhao 


Looks really good to me, patch is Reviewed-by: Christian König 
.


Bonus points for finding the right lines in 
amdgpu_vm_bo_update_mapping() and replace the hard coded "10" there with 
the value from amdgpu_vm_pte_funcs as well.


Regards,
Christian.
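
The failure the commit message describes comes down to integer arithmetic: if the fill is split into chunks of at most max_bytes and each chunk is emitted as whole 8-byte PTE writes, a max_bytes that is not a multiple of 8 silently drops the sub-8-byte tail of every full chunk, leaving the end of the buffer unwritten. Deriving max_bytes from the PTE count (8 * set_max_nums_pte_pde) keeps each chunk a whole number of PTEs. A hedged model with illustrative sizes, not the real hardware limits:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes actually written when bo_size bytes are filled in chunks of at
 * most max_bytes, where each chunk is emitted as whole 8-byte PTE
 * writes (a chunk's sub-8-byte tail is never written). */
static uint64_t bytes_filled(uint64_t bo_size, uint64_t max_bytes)
{
	uint64_t written = 0;

	while (bo_size) {
		uint64_t chunk = bo_size > max_bytes ? max_bytes : bo_size;

		written += (chunk / 8) * 8;	/* whole PTEs only */
		bo_size -= chunk;
	}
	return written;
}
```

Rounding the limit down to a multiple of 8 up front, as the patch effectively does, makes every chunk except possibly the last a whole number of PTEs, so nothing is dropped.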


---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 6 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 
  drivers/gpu/drm/amd/amdgpu/cik_sdma.c   | 3 +++
  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c  | 3 +++
  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c  | 3 +++
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c  | 3 +++
  drivers/gpu/drm/amd/amdgpu/si_dma.c | 3 +++
  7 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index a34c4cb..91bb111 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -309,6 +309,12 @@ struct amdgpu_vm_pte_funcs {
uint64_t pe,
uint64_t addr, unsigned count,
uint32_t incr, uint64_t flags);
+
+   /* maximum number of PTEs/PDEs in a single operation */
+   uint32_tset_max_nums_pte_pde;
+
+   /* number of dw to reserve per operation */
+   unsignedset_pte_pde_num_dw;
  };
  
  /* provided by the gmc block */

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 2180ed3..8685b0c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1556,8 +1556,8 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
   struct dma_fence **fence)
  {
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
-   /* max_bytes applies to SDMA_OP_PTEPDE as well as SDMA_OP_CONST_FILL*/
-   uint32_t max_bytes = adev->mman.buffer_funcs->fill_max_bytes;
+   uint32_t max_bytes = 8 *
+   adev->vm_manager.vm_pte_funcs->set_max_nums_pte_pde;
struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
  
  	struct drm_mm_node *mm_node;

@@ -1589,8 +1589,8 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
++mm_node;
}
  
-	/* 10 double words for each SDMA_OP_PTEPDE cmd */

-   num_dw = num_loops * 10;
+   /* num of dwords for each SDMA_OP_PTEPDE cmd */
+   num_dw = num_loops * adev->vm_manager.vm_pte_funcs->set_pte_pde_num_dw;
  
  	/* for IB padding */

num_dw += 64;
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index f508f4d..ff59351 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -1390,6 +1390,9 @@ static const struct amdgpu_vm_pte_funcs cik_sdma_vm_pte_funcs = {
.copy_pte = cik_sdma_vm_copy_pte,
.write_pte = cik_sdma_vm_write_pte,
.set_pte_pde = cik_sdma_vm_set_pte_pde,
+
+   .set_max_nums_pte_pde = 0x1f >> 3,
+   .set_pte_pde_num_dw = 10,
  };
  
  static void cik_sdma_set_vm_pte_funcs(struct amdgpu_device *adev)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index f2d0710..aec3586 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -1327,6 +1327,9 @@ static const struct amdgpu_vm_pte_funcs sdma_v2_4_vm_pte_funcs = {
.copy_pte = sdma_v2_4_vm_copy_pte,
.write_pte = sdma_v2_4_vm_write_pte,
.set_pte_pde = sdma_v2_4_vm_set_pte_pde,
+
+   .set_max_nums_pte_pde = 0x1f >> 3,
+   .set_pte_pde_num_dw = 10,
  };
  
  static void sdma_v2_4_set_vm_pte_funcs(struct amdgpu_device *adev)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 4e7fe07..7610272 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -1751,6 +1751,9 @@ static const struct amdgpu_vm_pte_funcs sdma_v3_0_vm_pte_funcs = {
.copy_pte = sdma_v3_0_vm_copy_pte,
.write_pte = sdma_v3_0_vm_write_pte,
.set_pte_pde = sdma_v3_0_vm_set_pte_pde,
+
+   .set_max_nums_pte_pde = 0x3fffe0 >> 3,
+   .set_pte_pde_num_dw = 10,
  };
  
  static void sdma_v3_0_set_vm_pte_funcs(struct amdgpu_device *adev)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index fd7c72a..6e1e0c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ 

Re: [PATCH 4/4] drm/ttm: Remove TTM dma tracepoint since it's not required anymore

2017-09-19 Thread Christian König

Am 18.09.2017 um 19:33 schrieb Tom St Denis:

Signed-off-by: Tom St Denis 


Mhm, I sometimes have good use for those. But just adding a printk at 
the right place does the job as well.


So patch is Reviewed-by: Christian König .

Regards,
Christian.


---
  drivers/gpu/drm/ttm/Makefile  |  2 +-
  drivers/gpu/drm/ttm/ttm_debug.c   | 74 -
  drivers/gpu/drm/ttm/ttm_trace.h   | 87 ---
  drivers/gpu/drm/ttm/ttm_tracepoints.c | 45 --
  4 files changed, 1 insertion(+), 207 deletions(-)
  delete mode 100644 drivers/gpu/drm/ttm/ttm_debug.c
  delete mode 100644 drivers/gpu/drm/ttm/ttm_trace.h
  delete mode 100644 drivers/gpu/drm/ttm/ttm_tracepoints.c

diff --git a/drivers/gpu/drm/ttm/Makefile b/drivers/gpu/drm/ttm/Makefile
index ab2bef1219e5..4d0c938ff4b2 100644
--- a/drivers/gpu/drm/ttm/Makefile
+++ b/drivers/gpu/drm/ttm/Makefile
@@ -4,7 +4,7 @@
  ttm-y := ttm_memory.o ttm_tt.o ttm_bo.o \
ttm_bo_util.o ttm_bo_vm.o ttm_module.o \
ttm_object.o ttm_lock.o ttm_execbuf_util.o ttm_page_alloc.o \
-   ttm_bo_manager.o ttm_page_alloc_dma.o ttm_debug.o ttm_tracepoints.o
+   ttm_bo_manager.o ttm_page_alloc_dma.o
  ttm-$(CONFIG_AGP) += ttm_agp_backend.o
  
  obj-$(CONFIG_DRM_TTM) += ttm.o

diff --git a/drivers/gpu/drm/ttm/ttm_debug.c b/drivers/gpu/drm/ttm/ttm_debug.c
deleted file mode 100644
index ef5f0d090154..
--- a/drivers/gpu/drm/ttm/ttm_debug.c
+++ /dev/null
@@ -1,74 +0,0 @@
-/**
- *
- * Copyright (c) 2017 Advanced Micro Devices, Inc.
- * All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sub license, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- *
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial portions
- * of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
- * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
- * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
- * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
- * USE OR OTHER DEALINGS IN THE SOFTWARE.
- *
- **/
-/*
- * Authors: Tom St Denis 
- */
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include "ttm_trace.h"
-
-void ttm_trace_dma_map(struct device *dev, struct ttm_dma_tt *tt)
-{
-   unsigned i;
-
-   if (unlikely(trace_ttm_dma_map_enabled())) {
-   for (i = 0; i < tt->ttm.num_pages; i++) {
-   trace_ttm_dma_map(
-   dev,
-   tt->ttm.pages[i],
-   tt->dma_address[i]);
-   }
-   }
-}
-EXPORT_SYMBOL(ttm_trace_dma_map);
-
-void ttm_trace_dma_unmap(struct device *dev, struct ttm_dma_tt *tt)
-{
-   unsigned i;
-
-   if (unlikely(trace_ttm_dma_unmap_enabled())) {
-   for (i = 0; i < tt->ttm.num_pages; i++) {
-   trace_ttm_dma_unmap(
-   dev,
-   tt->ttm.pages[i],
-   tt->dma_address[i]);
-   }
-   }
-}
-EXPORT_SYMBOL(ttm_trace_dma_unmap);
-
diff --git a/drivers/gpu/drm/ttm/ttm_trace.h b/drivers/gpu/drm/ttm/ttm_trace.h
deleted file mode 100644
index 715ce68b7b33..
--- a/drivers/gpu/drm/ttm/ttm_trace.h
+++ /dev/null
@@ -1,87 +0,0 @@
-/**
- *
- * Copyright (c) 2017 Advanced Micro Devices, Inc.
- * All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sub license, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- *
- * The above copyright notice and this permission notice (including 

Re: [PATCH 1/4] drm/amd/amdgpu: Fold TTM debugfs entries into array (v2)

2017-09-19 Thread Christian König

Am 18.09.2017 um 19:33 schrieb Tom St Denis:

Signed-off-by: Tom St Denis 


Reviewed-by: Christian König 



(v2): add domains and avoid strcmp
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 54 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +--
  2 files changed, 32 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8ee16dfdb8af..50d20903de4f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1809,6 +1809,19 @@ static const struct file_operations amdgpu_ttm_gtt_fops 
= {
  
  #endif
  
+

+
+static const struct {
+   char *name;
+   const struct file_operations *fops;
+   int domain;
+} ttm_debugfs_entries[] = {
+   { "amdgpu_vram", _ttm_vram_fops, TTM_PL_VRAM },
+#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
+   { "amdgpu_gtt", _ttm_gtt_fops, TTM_PL_TT },
+#endif
+};
+
  #endif
  
  static int amdgpu_ttm_debugfs_init(struct amdgpu_device *adev)

@@ -1819,22 +1832,21 @@ static int amdgpu_ttm_debugfs_init(struct amdgpu_device 
*adev)
struct drm_minor *minor = adev->ddev->primary;
struct dentry *ent, *root = minor->debugfs_root;
  
-	ent = debugfs_create_file("amdgpu_vram", S_IFREG | S_IRUGO, root,

- adev, _ttm_vram_fops);
-   if (IS_ERR(ent))
-   return PTR_ERR(ent);
-   i_size_write(ent->d_inode, adev->mc.mc_vram_size);
-   adev->mman.vram = ent;
-
-#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
-   ent = debugfs_create_file("amdgpu_gtt", S_IFREG | S_IRUGO, root,
- adev, _ttm_gtt_fops);
-   if (IS_ERR(ent))
-   return PTR_ERR(ent);
-   i_size_write(ent->d_inode, adev->mc.gart_size);
-   adev->mman.gtt = ent;
+   for (count = 0; count < ARRAY_SIZE(ttm_debugfs_entries); count++) {
+   ent = debugfs_create_file(
+   ttm_debugfs_entries[count].name,
+   S_IFREG | S_IRUGO, root,
+   adev,
+   ttm_debugfs_entries[count].fops);
+   if (IS_ERR(ent))
+   return PTR_ERR(ent);
+   if (ttm_debugfs_entries[count].domain == TTM_PL_VRAM)
+   i_size_write(ent->d_inode, adev->mc.mc_vram_size);
+   else if (ttm_debugfs_entries[count].domain == TTM_PL_TT)
+   i_size_write(ent->d_inode, adev->mc.gart_size);
+   adev->mman.debugfs_entries[count] = ent;
+   }
  
-#endif

count = ARRAY_SIZE(amdgpu_ttm_debugfs_list);
  
  #ifdef CONFIG_SWIOTLB

@@ -1844,7 +1856,6 @@ static int amdgpu_ttm_debugfs_init(struct amdgpu_device 
*adev)
  
  	return amdgpu_debugfs_add_files(adev, amdgpu_ttm_debugfs_list, count);

  #else
-
return 0;
  #endif
  }
@@ -1852,14 +1863,9 @@ static int amdgpu_ttm_debugfs_init(struct amdgpu_device 
*adev)
  static void amdgpu_ttm_debugfs_fini(struct amdgpu_device *adev)
  {
  #if defined(CONFIG_DEBUG_FS)
+   unsigned i;
  
-	debugfs_remove(adev->mman.vram);

-   adev->mman.vram = NULL;
-
-#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
-   debugfs_remove(adev->mman.gtt);
-   adev->mman.gtt = NULL;
-#endif
-
+   for (i = 0; i < ARRAY_SIZE(ttm_debugfs_entries); i++)
+   debugfs_remove(adev->mman.debugfs_entries[i]);
  #endif
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 64709e041d5b..7abae6867339 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -24,6 +24,7 @@
  #ifndef __AMDGPU_TTM_H__
  #define __AMDGPU_TTM_H__
  
+#include "amdgpu.h"

  #include "gpu_scheduler.h"
  
  #define AMDGPU_PL_GDS		(TTM_PL_PRIV + 0)

@@ -45,8 +46,7 @@ struct amdgpu_mman {
boolinitialized;
  
  #if defined(CONFIG_DEBUG_FS)

-   struct dentry   *vram;
-   struct dentry   *gtt;
+   struct dentry   *debugfs_entries[8];
  #endif
  
  	/* buffer handling */



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp invalidate

2017-09-19 Thread Zhang, Jerry (Junwei)

On 09/19/2017 04:30 PM, Christian König wrote:

I don't know why, but the HDP is generally not part of the register spec.


AFAIK, these regs may be used by HW guys to debug some special cases.
Usually touching them is not formally recommended.
(e.g. GFX cannot access a PRT-unmapped range, but with the debug bit set it 
can; however, that's not the normal way.)


Jerry




So you can neither find HDP_DEBUG0 nor HDP_READ_CACHE_INVALIDATE in it as far as
I know.

Point is that the HDP invalidates its read cache on any register write (the
register itself doesn't matter). So far we used the HDP_DEBUG0 register because
it is unused otherwise, but having a dedicated register just for this job is
clearly a good idea.

Regards,
Christian.

Am 19.09.2017 um 09:02 schrieb zhoucm1:

>using this register to replace MM_HDP_DEBUG0 is suggested from a HDP HW guys

I'm OK with this line.

Thanks for explaining.
David Zhou

On 2017年09月19日 15:00, Liu, Monk wrote:

First, I didn't check if windows did this way or not, because I don't sure if
windows is always doing the right thing, e.g. for GFX preemption I didn't
copy windows scheme and we found couple bugs in windows but not in linux ...
So please don't assume we should copy from windows, unless it's solid like a
dead bone

Second, this register is originally comes from the definition file
"hdp_4_0_offset.h", not recently introduced by me or someone else, and using
this register to replace MM_HDP_DEBUG0 is suggested from a HDP HW guys when
I was working on the PAL/VULKAN preemption hang issue in Orlando, sorry I
missed that guy's name ...

@Deucher, Alexander do you know who is on hdp hw ? we can confirm with him


If you're feeling bad about this change, I can add "if sriov" condition to
all of it, so bare-metal will keep still,  is that okay ?

BR Monk


-Original Message-
From: Zhou, David(ChunMing)
Sent: 2017年9月19日 14:51
To: Liu, Monk ; Koenig, Christian
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp
invalidate


Seems the change is more proper, but where do you find
mmHDP_READ_CACHE_INVALIDATE? Could you double check if Windows driver has
changed to use this?
I'm confusing it, since mmHDP_DEBUG0 implementation is from windows as well.
I even don't find mmHDP_READ_CACHE_INVALIDATE in register spec.

Regards,
David Zhou
On 2017年09月19日 14:46, Liu, Monk wrote:

What question ? please reply here

-Original Message-
From: Zhou, David(ChunMing)
Sent: 2017年9月19日 12:25
To: Liu, Monk ; Koenig, Christian
; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Please answer my question as I raised in another thread, otherwise I will
give a NAK on this!

Regards,
David Zhou

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Liu, Monk
Sent: Tuesday, September 19, 2017 12:04 PM
To: Koenig, Christian ;
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Yeah, vnc1_0 and uvd_v7_0

Thanks

-Original Message-
From: Koenig, Christian
Sent: 2017年9月18日 19:39
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Yeah, but Vega10 has UVD7 and in uvd_v7_0.c we have:


static void uvd_v7_0_ring_emit_hdp_invalidate(struct amdgpu_ring
*ring) {
  amdgpu_ring_write(ring, PACKET0(SOC15_REG_OFFSET(HDP, 0,
mmHDP_DEBUG0), 0));
  amdgpu_ring_write(ring, 1);
}

That should probably be fixed as well.

Regards,
Christian.

Am 18.09.2017 um 13:03 schrieb Liu, Monk:

Only vega10 has this register

-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: 2017年9月18日 17:20
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Am 18.09.2017 um 08:11 schrieb Monk Liu:

Change-Id: I61dc02ea6a450f9acfa3bae07aa20244261f5369
Signed-off-by: Monk Liu 

Reviewed-by: Christian König 

Please scan the code once more, we most likely have used mmHDP_DEBUG0 for
this at even more places.

Christian.


---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index f201510..44960b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -3549,7 +3549,7 @@ static void gfx_v9_0_ring_emit_hdp_flush(struct
amdgpu_ring *ring)
 static void gfx_v9_0_ring_emit_hdp_invalidate(struct amdgpu_ring *ring)
 {
 gfx_v9_0_write_data_to_reg(ring, 0, true,
-

[PATCH] drm/amdgpu: simplify trace point code

2017-09-19 Thread Xue, Ken
drm/amdgpu: simplify trace point code

amdgpu_trace_points.c does not declare a license, and it can be replaced in a 
simpler way.
 
Signed-off-by: Ken Xue 
---
 drivers/gpu/drm/amd/amdgpu/Makefile  | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c   | 2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c | 9 -
 3 files changed, 3 insertions(+), 10 deletions(-)
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 454e6ef..9d7db60 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -26,7 +26,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_encoders.o amdgpu_display.o amdgpu_i2c.o \
amdgpu_fb.o amdgpu_gem.o amdgpu_ring.o \
amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \
-   amdgpu_pm.o atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \
+   amdgpu_pm.o atombios_dp.o amdgpu_afmt.o \
atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7d365fa..8881ab3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -37,7 +37,9 @@
 #include 
 #include 
 #include "amdgpu.h"
+#define CREATE_TRACE_POINTS
 #include "amdgpu_trace.h"
+#undef CREATE_TRACE_POINTS
 #include "amdgpu_i2c.h"
 #include "atom.h"
 #include "amdgpu_atombios.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
deleted file mode 100644
index 385b7e1..000
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace_points.c
+++ /dev/null
@@ -1,9 +0,0 @@
-/* Copyright Red Hat Inc 2010.
- * Author : Dave Airlie 
- */
-#include 
-#include 
-#include "amdgpu.h"
-
-#define CREATE_TRACE_POINTS
-#include "amdgpu_trace.h"
--
2.7.4




Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp invalidate

2017-09-19 Thread Christian König

I don't know why, but the HDP is generally not part of the register spec.

So you can neither find HDP_DEBUG0 nor HDP_READ_CACHE_INVALIDATE in it 
as far as I know.


Point is that the HDP invalidates its read cache on any register write 
(the register itself doesn't matter). So far we used the HDP_DEBUG0 
register because it is unused otherwise, but having a dedicated register 
just for this job is clearly a good idea.


Regards,
Christian.

Am 19.09.2017 um 09:02 schrieb zhoucm1:
>using this register to replace MM_HDP_DEBUG0 is suggested from a HDP 
HW guys


I'm OK with this line.

Thanks for explaining.
David Zhou

On 2017年09月19日 15:00, Liu, Monk wrote:
First, I didn't check if windows did this way or not, because I don't 
sure if windows is always doing the right thing, e.g. for GFX 
preemption I didn't copy windows scheme and we found couple bugs in 
windows but not in linux ...
So please don't assume we should copy from windows, unless it's solid 
like a dead bone


Second, this register is originally comes from the definition file 
"hdp_4_0_offset.h", not recently introduced by me or someone else, 
and using this register to replace MM_HDP_DEBUG0 is suggested from a 
HDP HW guys when
I was working on the PAL/VULKAN preemption hang issue in Orlando, 
sorry I missed that guy's name ...


@Deucher, Alexander do you know who is on hdp hw ? we can confirm 
with him



If you're feeling bad about this change, I can add "if sriov" 
condition to all of it, so bare-metal will keep still,  is that okay ?


BR Monk


-Original Message-
From: Zhou, David(ChunMing)
Sent: 2017年9月19日 14:51
To: Liu, Monk ; Koenig, Christian 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger 
hdp invalidate



Seems the change is more proper, but where do you find 
mmHDP_READ_CACHE_INVALIDATE? Could you double check if Windows driver 
has changed to use this?
I'm confusing it, since mmHDP_DEBUG0 implementation is from windows 
as well.

I even don't find mmHDP_READ_CACHE_INVALIDATE in register spec.

Regards,
David Zhou
On 2017年09月19日 14:46, Liu, Monk wrote:

What question ? please reply here

-Original Message-
From: Zhou, David(ChunMing)
Sent: 2017年9月19日 12:25
To: Liu, Monk ; Koenig, Christian
; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Please answer my question as I raised in another thread, otherwise I 
will give a NAK on this!


Regards,
David Zhou

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Liu, Monk
Sent: Tuesday, September 19, 2017 12:04 PM
To: Koenig, Christian ;
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Yeah, vnc1_0 and uvd_v7_0

Thanks

-Original Message-
From: Koenig, Christian
Sent: 2017年9月18日 19:39
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Yeah, but Vega10 has UVD7 and in uvd_v7_0.c we have:


static void uvd_v7_0_ring_emit_hdp_invalidate(struct amdgpu_ring
*ring) {
  amdgpu_ring_write(ring, PACKET0(SOC15_REG_OFFSET(HDP, 0,
mmHDP_DEBUG0), 0));
  amdgpu_ring_write(ring, 1);
}

That should probably be fixed as well.

Regards,
Christian.

Am 18.09.2017 um 13:03 schrieb Liu, Monk:

Only vega10 has this register

-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: 2017年9月18日 17:20
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Am 18.09.2017 um 08:11 schrieb Monk Liu:

Change-Id: I61dc02ea6a450f9acfa3bae07aa20244261f5369
Signed-off-by: Monk Liu 

Reviewed-by: Christian König 

Please scan the code once more, we most likely have used 
mmHDP_DEBUG0 for this at even more places.


Christian.


---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index f201510..44960b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -3549,7 +3549,7 @@ static void 
gfx_v9_0_ring_emit_hdp_flush(struct amdgpu_ring *ring)
 static void gfx_v9_0_ring_emit_hdp_invalidate(struct 
amdgpu_ring *ring)

 {
 gfx_v9_0_write_data_to_reg(ring, 0, true,
-   SOC15_REG_OFFSET(HDP, 0, mmHDP_DEBUG0), 1);
+   SOC15_REG_OFFSET(HDP, 0, 
mmHDP_READ_CACHE_INVALIDATE), 1);

 }
  static void gfx_v9_0_ring_emit_ib_gfx(struct amdgpu_ring 
*ring,

diff --git 

Re: [PATCH 13/18] drm/amdgpu:fix driver unloading bug

2017-09-19 Thread Christian König

Am 19.09.2017 um 06:14 schrieb Liu, Monk:

Christian,


That sounds at least a bit better. But my question is why doesn't this work 
like it does on Tonga, e.g. correctly clean things up?

Tonga also suffers from this issue; we just fixed it in the branch for the CSP 
customer, and the staging code is usually behind our private branch ...


Yeah, but keeping the GART mapping alive is complete nonsense. When the driver 
unloads all memory should be returned to the OS.
So we either keep a GART mapping to pages which are about to be reused and 
overwritten, or we leak memory on driver shutdown.
Neither options sounds very good,

Unbinding the GART mapping makes the CPC hang if it runs MQD commands, and the 
CPC must run MQD commands because RLCV always
requires the CPC to do that when RLCV is doing SAVE_VF commands.

Do you have way to fix above circle ?


Well, the question is why the CPC still needs the MQD commands and 
how we can prevent that.


The point is when we need to keep the GART alive to avoid a crash after 
driver unload we also need to keep the pages alive where the GART points to.


This means that the pages are either overwritten or we get massive 
complaints from the MM that we are leaking pages here.


If it's not possible to turn off the CPC on driver unload, the only 
alternative I can see is to reprogram it so that the MQD commands come 
from VRAM instead of GART.


Regards,
Christian.




BR Monk


-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: 2017年9月18日 19:54
To: Liu, Monk ; Koenig, Christian ; 
amd-gfx@lists.freedesktop.org
Cc: Chen, Horace 
Subject: Re: [PATCH 13/18] drm/amdgpu:fix driver unloading bug

Am 18.09.2017 um 12:12 schrieb Liu, Monk:

Christian,

Let's discuss this patch and the one follows which skip the KIQ MQD free to 
avoid SAVE_FAIL issue.


For the patch skipping KIQ MQD deallocation, I think I will drop it and use a 
new approach:
we allocate the KIQ MQD in the VRAM domain, and this BO can be safely freed 
after the driver is unloaded, because after unload no one will change the data 
in this BO *usually*
(e.g. some root app could still map visible VRAM and alter the values in it).

That sounds at least a bit better. But my question is why doesn't this work 
like it does on Tonga, e.g. correctly clean things up?


For this patch, "skip unbinding the GART mapping to keep the KIQ MQD always 
valid":
since the hypervisor side always has a couple of hw components working, and 
they rely on the GMC being kept alive, this is very different from BARE-METAL. 
That is to say, we can only do it this way.

Yeah, but keeping the GART mapping alive is complete nonsense. When the driver 
unloads all memory should be returned to the OS.

So we either keep a GART mapping to pages which are about to be reused and 
overwritten, or we leak memory on driver shutdown.

Neither options sounds very good,
Christian.


Besides, we'll have more patches in the future for L1 secure mode, which
forbids VF access to GMC registers, so under L1 secure mode the driver will
always skip GMC programming under SRIOV, both in init and fini, but that
will come later

BR Monk



-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: 2017年9月18日 17:28
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace 
Subject: Re: [PATCH 13/18] drm/amdgpu:fix driver unloading bug

Am 18.09.2017 um 08:11 schrieb Monk Liu:

[SWDEV-126631] - fix hypervisor save_vf failure that occurred after the
driver was
removed:
1. Because the KIQ and KCQ were not unmapped, save_vf will fail if the driver 
freed the mqd of KIQ and KCQ.
2. KIQ can't be unmapped since RLCV always needs it; the bo_free on
KIQ should be skipped. 3. KCQ can be unmapped, and should be unmapped
during hw_fini. 4. RLCV still needs to access other mc addresses from some hw 
even after the driver is unloaded,
  so we should not unbind the gart for a VF.

Change-Id: I320487a9a848f41484c5f8cc11be34aca807b424
Signed-off-by: Horace Chen 
Signed-off-by: Monk Liu 

I absolutely can't judge if this is correct or not, but keeping the GART and 
KIQ alive after the driver is unloaded sounds really fishy to me.

Isn't there any other clean way of handling this?

Christian.


---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c |  3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  |  5 +++
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 60 
+++-
3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index f437008..2fee071 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -394,7 +394,8 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
 */
void amdgpu_gart_fini(struct amdgpu_device *adev)
{
-   if (adev->gart.ready) {
+   /* gart is still used by other hw 

Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp invalidate

2017-09-19 Thread zhoucm1
>using this register to replace MM_HDP_DEBUG0 is suggested from a HDP 
HW guys


I'm OK with this line.

Thanks for explaining.
David Zhou

On 2017年09月19日 15:00, Liu, Monk wrote:

First, I didn't check if windows did this way or not, because I don't sure if 
windows is always doing the right thing, e.g. for GFX preemption I didn't copy 
windows scheme and we found couple bugs in windows but not in linux ...
So please don't assume we should copy from windows, unless it's solid like a 
dead bone

Second, this register is originally comes from the definition file 
"hdp_4_0_offset.h", not recently introduced by me or someone else, and using 
this register to replace MM_HDP_DEBUG0 is suggested from a HDP HW guys when
I was working on the PAL/VULKAN preemption hang issue in Orlando, sorry I 
missed that guy's name ...

@Deucher, Alexander do you know who is on hdp hw ? we can confirm with him


If you're feeling bad about this change, I can add "if sriov" condition to all 
of it, so bare-metal will keep still,  is that okay ?

BR Monk


-Original Message-
From: Zhou, David(ChunMing)
Sent: 2017年9月19日 14:51
To: Liu, Monk ; Koenig, Christian ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp 
invalidate


Seems the change is more proper, but where do you find 
mmHDP_READ_CACHE_INVALIDATE? Could you double check if Windows driver has 
changed to use this?
I'm confusing it, since mmHDP_DEBUG0 implementation is from windows as well.
I even don't find mmHDP_READ_CACHE_INVALIDATE in register spec.

Regards,
David Zhou
On 2017年09月19日 14:46, Liu, Monk wrote:

What question ? please reply here

-Original Message-
From: Zhou, David(ChunMing)
Sent: 2017年9月19日 12:25
To: Liu, Monk ; Koenig, Christian
; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Please answer my question as I raised in another thread, otherwise I will give 
a NAK on this!

Regards,
David Zhou

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Liu, Monk
Sent: Tuesday, September 19, 2017 12:04 PM
To: Koenig, Christian ;
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Yeah, vnc1_0 and uvd_v7_0

Thanks

-Original Message-
From: Koenig, Christian
Sent: 2017年9月18日 19:39
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Yeah, but Vega10 has UVD7 and in uvd_v7_0.c we have:


static void uvd_v7_0_ring_emit_hdp_invalidate(struct amdgpu_ring
*ring) {
  amdgpu_ring_write(ring, PACKET0(SOC15_REG_OFFSET(HDP, 0,
mmHDP_DEBUG0), 0));
  amdgpu_ring_write(ring, 1);
}

That should probably be fixed as well.

Regards,
Christian.

Am 18.09.2017 um 13:03 schrieb Liu, Monk:

Only vega10 has this register

-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: 2017年9月18日 17:20
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger
hdp invalidate

Am 18.09.2017 um 08:11 schrieb Monk Liu:

Change-Id: I61dc02ea6a450f9acfa3bae07aa20244261f5369
Signed-off-by: Monk Liu 

Reviewed-by: Christian König 

Please scan the code once more, we most likely have used mmHDP_DEBUG0 for this 
at even more places.

Christian.


---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index f201510..44960b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -3549,7 +3549,7 @@ static void gfx_v9_0_ring_emit_hdp_flush(struct 
amdgpu_ring *ring)
 static void gfx_v9_0_ring_emit_hdp_invalidate(struct amdgpu_ring *ring)
 {
gfx_v9_0_write_data_to_reg(ring, 0, true,
-  SOC15_REG_OFFSET(HDP, 0, mmHDP_DEBUG0), 1);
+  SOC15_REG_OFFSET(HDP, 0, 
mmHDP_READ_CACHE_INVALIDATE), 1);
 }
 
 static void gfx_v9_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index fd7c72a..d5f3848 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -398,7 +398,7 @@ static void sdma_v4_0_ring_emit_hdp_invalidate(struct 
amdgpu_ring *ring)
 {
amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_SRBM_WRITE) |
  SDMA_PKT_SRBM_WRITE_HEADER_BYTE_EN(0xf));
-   amdgpu_ring_write(ring, 

RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp invalidate

2017-09-19 Thread Liu, Monk
First, I didn't check whether Windows does it this way or not, because I'm not 
sure Windows always does the right thing; e.g. for GFX preemption I didn't copy 
the Windows scheme, and we found a couple of bugs in Windows but not in Linux ...
So please don't assume we should copy from Windows unless it's rock solid.

Second, this register originally comes from the definition file 
"hdp_4_0_offset.h"; it was not recently introduced by me or someone else, and 
using this register to replace MM_HDP_DEBUG0 was suggested by an HDP HW guy when
I was working on the PAL/VULKAN preemption hang issue in Orlando; sorry, I 
missed that guy's name ...

@Deucher, Alexander, do you know who is on HDP HW? We can confirm with him.


If you're uncomfortable with this change, I can add an "if sriov" condition to 
all of it, so bare-metal will stay unchanged. Is that okay?

BR Monk


-Original Message-
From: Zhou, David(ChunMing) 
Sent: 2017年9月19日 14:51
To: Liu, Monk ; Koenig, Christian ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp 
invalidate


The change seems more proper, but where did you find 
mmHDP_READ_CACHE_INVALIDATE? Could you double-check whether the Windows driver 
has changed to use this?
I'm confused by it, since the mmHDP_DEBUG0 implementation is from Windows as 
well.
I can't even find mmHDP_READ_CACHE_INVALIDATE in the register spec.

Regards,
David Zhou
On 2017年09月19日 14:46, Liu, Monk wrote:
> What question ? please reply here
>
> -Original Message-
> From: Zhou, David(ChunMing)
> Sent: 2017年9月19日 12:25
> To: Liu, Monk ; Koenig, Christian 
> ; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger 
> hdp invalidate
>
> Please answer my question as I raised in another thread, otherwise I will 
> give a NAK on this!
>
> Regards,
> David Zhou
>
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf 
> Of Liu, Monk
> Sent: Tuesday, September 19, 2017 12:04 PM
> To: Koenig, Christian ; 
> amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger 
> hdp invalidate
>
> Yeah, vnc1_0 and uvd_v7_0
>
> Thanks
>
> -Original Message-
> From: Koenig, Christian
> Sent: 2017年9月18日 19:39
> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger 
> hdp invalidate
>
> Yeah, but Vega10 has UVD7 and in uvd_v7_0.c we have:
>
>> static void uvd_v7_0_ring_emit_hdp_invalidate(struct amdgpu_ring
>> *ring) {
>>  amdgpu_ring_write(ring, PACKET0(SOC15_REG_OFFSET(HDP, 0, 
>> mmHDP_DEBUG0), 0));
>>  amdgpu_ring_write(ring, 1);
>> }
> That should probably be fixed as well.
>
> Regards,
> Christian.
>
> Am 18.09.2017 um 13:03 schrieb Liu, Monk:
>> Only vega10 has this register
>>
>> -Original Message-
>> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
>> Sent: 2017年9月18日 17:20
>> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger 
>> hdp invalidate
>>
>> Am 18.09.2017 um 08:11 schrieb Monk Liu:
>>> Change-Id: I61dc02ea6a450f9acfa3bae07aa20244261f5369
>>> Signed-off-by: Monk Liu 
>> Reviewed-by: Christian König 
>>
>> Please scan the code once more, we most likely have used mmHDP_DEBUG0 for 
>> this at even more places.
>>
>> Christian.
>>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 2 +-
>>> drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +-
>>> 2 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> index f201510..44960b3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> @@ -3549,7 +3549,7 @@ static void gfx_v9_0_ring_emit_hdp_flush(struct amdgpu_ring *ring)
>>> static void gfx_v9_0_ring_emit_hdp_invalidate(struct amdgpu_ring *ring)
>>> {
>>> 	gfx_v9_0_write_data_to_reg(ring, 0, true,
>>> -			   SOC15_REG_OFFSET(HDP, 0, mmHDP_DEBUG0), 1);
>>> +			   SOC15_REG_OFFSET(HDP, 0, mmHDP_READ_CACHE_INVALIDATE), 1);
>>> }
>>> 
>>> static void gfx_v9_0_ring_emit_ib_gfx(struct amdgpu_ring *ring, 
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>> index fd7c72a..d5f3848 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>> @@ -398,7 +398,7 @@ static void sdma_v4_0_ring_emit_hdp_invalidate(struct amdgpu_ring *ring)
>>> {
>>> 	amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_SRBM_WRITE) |
>>>   

RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp invalidate

2017-09-19 Thread Liu, Monk
What question? Please reply here.

-Original Message-
From: Zhou, David(ChunMing) 
Sent: 2017年9月19日 12:25
To: Liu, Monk ; Koenig, Christian ; 
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp 
invalidate

Please answer the question I raised in another thread, otherwise I will give 
this a NAK!

Regards,
David Zhou

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Liu, 
Monk
Sent: Tuesday, September 19, 2017 12:04 PM
To: Koenig, Christian ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp 
invalidate

Yeah, vcn_v1_0 and uvd_v7_0

Thanks 

-Original Message-
From: Koenig, Christian
Sent: 2017年9月18日 19:39
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger hdp 
invalidate

Yeah, but Vega10 has UVD7 and in uvd_v7_0.c we have:

> static void uvd_v7_0_ring_emit_hdp_invalidate(struct amdgpu_ring *ring)
> {
> 	amdgpu_ring_write(ring, PACKET0(SOC15_REG_OFFSET(HDP, 0, mmHDP_DEBUG0), 0));
> 	amdgpu_ring_write(ring, 1);
> }

That should probably be fixed as well.

Regards,
Christian.

Am 18.09.2017 um 13:03 schrieb Liu, Monk:
> Only vega10 has this register
>
> -Original Message-
> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
> Sent: 2017年9月18日 17:20
> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 12/18] drm/amdgpu:use formal register to trigger 
> hdp invalidate
>
> Am 18.09.2017 um 08:11 schrieb Monk Liu:
>> Change-Id: I61dc02ea6a450f9acfa3bae07aa20244261f5369
>> Signed-off-by: Monk Liu 
> Reviewed-by: Christian König 
>
> Please scan the code once more; we most likely used mmHDP_DEBUG0 for this in 
> even more places.
>
> Christian.
>
>> ---
>>drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 2 +-
>>drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +-
>>2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> index f201510..44960b3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> @@ -3549,7 +3549,7 @@ static void gfx_v9_0_ring_emit_hdp_flush(struct amdgpu_ring *ring)
>>  static void gfx_v9_0_ring_emit_hdp_invalidate(struct amdgpu_ring *ring)
>>  {
>>  	gfx_v9_0_write_data_to_reg(ring, 0, true,
>> -				   SOC15_REG_OFFSET(HDP, 0, mmHDP_DEBUG0), 1);
>> +				   SOC15_REG_OFFSET(HDP, 0, mmHDP_READ_CACHE_INVALIDATE), 1);
>>  }
>>
>>static void gfx_v9_0_ring_emit_ib_gfx(struct amdgpu_ring *ring, 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>> index fd7c72a..d5f3848 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>> @@ -398,7 +398,7 @@ static void sdma_v4_0_ring_emit_hdp_invalidate(struct amdgpu_ring *ring)
>>  {
>>  	amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_SRBM_WRITE) |
>>  			  SDMA_PKT_SRBM_WRITE_HEADER_BYTE_EN(0xf));
>> -	amdgpu_ring_write(ring, SOC15_REG_OFFSET(HDP, 0, mmHDP_DEBUG0));
>> +	amdgpu_ring_write(ring, SOC15_REG_OFFSET(HDP, 0, mmHDP_READ_CACHE_INVALIDATE));
>>  	amdgpu_ring_write(ring, 1);
>>  }
>>
>

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 3/3] drm/amdgpu:fix typo

2017-09-19 Thread Monk Liu
a previous patch introduced a typo; correct it

Change-Id: I91bcefd7148b5db1c7d957c868e13a46ca40ef74
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 4eee2ef..35cc5ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -256,7 +256,7 @@ static int psp_hw_start(struct psp_context *psp)
struct amdgpu_device *adev = psp->adev;
int ret;
 
-   if (amdgpu_sriov_vf(adev) && !adev->in_sriov_reset) {
+   if (!amdgpu_sriov_vf(adev) || !adev->in_sriov_reset) {
ret = psp_bootloader_load_sysdrv(psp);
if (ret)
return ret;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/3] drm/amdgpu/sriov:fix memory leak after gpu reset

2017-09-19 Thread Monk Liu
GPU reset reruns hw_init for every hw block, so ucode_init_bo is
invoked again, which leads to a memory leak.

Skip the fw_buf allocation during SR-IOV GPU reset to avoid the leak.

Change-Id: I31131eda1bd45ea2f5bdc50c5da5fc5a9fe9027d
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 64 +++
 2 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6ff2959..3d0c633 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1185,6 +1185,9 @@ struct amdgpu_firmware {
 
/* gpu info firmware data pointer */
const struct firmware *gpu_info_fw;
+
+   void *fw_buf_ptr;
+   uint64_t fw_buf_mc;
 };
 
 /*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index f306374..6564902 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -360,8 +360,6 @@ static int amdgpu_ucode_patch_jt(struct amdgpu_firmware_info *ucode,
 int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
 {
 	struct amdgpu_bo **bo = &adev->firmware.fw_buf;
-	uint64_t fw_mc_addr;
-	void *fw_buf_ptr = NULL;
 	uint64_t fw_offset = 0;
 	int i, err;
 	struct amdgpu_firmware_info *ucode = NULL;
@@ -372,37 +370,39 @@ int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
 		return 0;
 	}
 
-	err = amdgpu_bo_create(adev, adev->firmware.fw_size, PAGE_SIZE, true,
-			amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
-			AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-			NULL, NULL, 0, bo);
-	if (err) {
-		dev_err(adev->dev, "(%d) Firmware buffer allocate failed\n", err);
-		goto failed;
-	}
+	if (!amdgpu_sriov_vf(adev) || !adev->in_sriov_reset) {
+		err = amdgpu_bo_create(adev, adev->firmware.fw_size, PAGE_SIZE, true,
+				amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
+				AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
+				NULL, NULL, 0, bo);
+		if (err) {
+			dev_err(adev->dev, "(%d) Firmware buffer allocate failed\n", err);
+			goto failed;
+		}
 
-	err = amdgpu_bo_reserve(*bo, false);
-	if (err) {
-		dev_err(adev->dev, "(%d) Firmware buffer reserve failed\n", err);
-		goto failed_reserve;
-	}
+		err = amdgpu_bo_reserve(*bo, false);
+		if (err) {
+			dev_err(adev->dev, "(%d) Firmware buffer reserve failed\n", err);
+			goto failed_reserve;
+		}
 
-	err = amdgpu_bo_pin(*bo, amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
-			&fw_mc_addr);
-	if (err) {
-		dev_err(adev->dev, "(%d) Firmware buffer pin failed\n", err);
-		goto failed_pin;
-	}
+		err = amdgpu_bo_pin(*bo, amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
+				&adev->firmware.fw_buf_mc);
+		if (err) {
+			dev_err(adev->dev, "(%d) Firmware buffer pin failed\n", err);
+			goto failed_pin;
+		}
 
-	err = amdgpu_bo_kmap(*bo, &fw_buf_ptr);
-	if (err) {
-		dev_err(adev->dev, "(%d) Firmware buffer kmap failed\n", err);
-		goto failed_kmap;
-	}
+		err = amdgpu_bo_kmap(*bo, &adev->firmware.fw_buf_ptr);
+		if (err) {
+			dev_err(adev->dev, "(%d) Firmware buffer kmap failed\n", err);
+			goto failed_kmap;
+		}
 
-	amdgpu_bo_unreserve(*bo);
+		amdgpu_bo_unreserve(*bo);
+	}
 
-	memset(fw_buf_ptr, 0, adev->firmware.fw_size);
+	memset(adev->firmware.fw_buf_ptr, 0, adev->firmware.fw_size);
 
 	/*
 	 * if SMU loaded firmware, it needn't add SMC, UVD, and VCE
@@ -421,14 +421,14 @@ int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
 		ucode = &adev->firmware.ucode[i];
 		if (ucode->fw) {
 			header = (const struct common_firmware_header *)ucode->fw->data;
-			amdgpu_ucode_init_single_fw(adev, ucode, fw_mc_addr + fw_offset,
-						    (void *)((uint8_t *)fw_buf_ptr + fw_offset));
+			amdgpu_ucode_init_single_fw(adev, ucode, adev->firmware.fw_buf_mc + fw_offset,
+

[PATCH 1/3] drm/amdgpu:make ctx_add_fence interruptible(v2)

2017-09-19 Thread Monk Liu
otherwise a gpu hang makes the application impossible to kill
under timedout=0 mode

v2:
Fix memory leak of job/job->s_fence
unlock mn
remove the ERROR msg after the wait is interrupted

Change-Id: I6051b5b3ae1188983f49325a2438c84a6c12374a
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 17 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 12 +++-
 3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index cc9a232..6ff2959 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -736,8 +736,8 @@ struct amdgpu_ctx_mgr {
 struct amdgpu_ctx *amdgpu_ctx_get(struct amdgpu_fpriv *fpriv, uint32_t id);
 int amdgpu_ctx_put(struct amdgpu_ctx *ctx);
 
-uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
- struct dma_fence *fence);
+int amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
+ struct dma_fence *fence, uint64_t *seq);
 struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
   struct amdgpu_ring *ring, uint64_t seq);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index b59749d..9bd4834 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1043,6 +1043,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 	struct amd_sched_entity *entity = &p->ctx->rings[ring->idx].entity;
 	struct amdgpu_job *job;
 	unsigned i;
+	uint64_t seq;
+
 	int r;
 
 	amdgpu_mn_lock(p->mn);
@@ -1071,8 +1073,19 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 	job->owner = p->filp;
 	job->fence_ctx = entity->fence_context;
 	p->fence = dma_fence_get(&job->base.s_fence->finished);
-	cs->out.handle = amdgpu_ctx_add_fence(p->ctx, ring, p->fence);
-	job->uf_sequence = cs->out.handle;
+	r = amdgpu_ctx_add_fence(p->ctx, ring, p->fence, &seq);
+	if (r) {
+		/* release job include the sched fence as well */
+		dma_fence_put(&job->base.s_fence->finished);
+		dma_fence_put(&job->base.s_fence->scheduled);
+		amdgpu_job_free(job);
+		amdgpu_mn_unlock(p->mn);
+		dma_fence_put(p->fence);
+		return r;
+	}
+
+	cs->out.handle = seq;
+	job->uf_sequence = seq;
 	amdgpu_job_free_resources(job);
 
 	trace_amdgpu_cs_ioctl(job);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index a11e443..551f114 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -246,8 +246,8 @@ int amdgpu_ctx_put(struct amdgpu_ctx *ctx)
 	return 0;
 }
 
-uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
-			      struct dma_fence *fence)
+int amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
+			 struct dma_fence *fence, uint64_t *handler)
 {
 	struct amdgpu_ctx_ring *cring = &ctx->rings[ring->idx];
 	uint64_t seq = cring->sequence;
@@ -258,9 +258,9 @@ uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
 	other = cring->fences[idx];
 	if (other) {
 		signed long r;
-		r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
+		r = dma_fence_wait_timeout(other, true, MAX_SCHEDULE_TIMEOUT);
 		if (r < 0)
-			DRM_ERROR("Error (%ld) waiting for fence!\n", r);
+			return -ERESTARTSYS;
 	}
 
 	dma_fence_get(fence);
@@ -271,8 +271,10 @@ uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
 	spin_unlock(&cring->ring_lock);
 
 	dma_fence_put(other);
+	if (handler)
+		*handler = seq;
 
-	return seq;
+	return 0;
 }
 
 struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx