Re: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings

2019-04-28 Thread Deucher, Alexander
maybe just:
amdgpu.lockup_timeout=<non-compute jobs timeout>[,<compute jobs timeout>]
I don't think we really need separate timeouts for all the different video 
related engines.

Alex

From: Quan, Evan
Sent: Sunday, April 28, 2019 1:37 AM
To: Deucher, Alexander; Michel Dänzer; Koenig, Christian
Cc: Xu, Feifei; Cui, Flora; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings


How about amdgpu.lockup_timeout=non-compute-jobs[, gfx, sdma, decode, encode][: compute-jobs] ?

This will not break backward compatibility.



And I’m not sure how to map “decode” and “encode” to the uvd/vce/vcn rings,

since there are many rings related to these IPs (uvd, uvd_enc, vce, vcn_dec,
vcn_enc, vcn_jpeg).

Maybe we should use the IP names (uvd, vce or vcn) instead of “decode/encode”?



Regards,

Evan

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Deucher, Alexander
Sent: April 26, 2019 22:24
To: Michel Dänzer; Quan, Evan; Koenig, Christian
Cc: Xu, Feifei; Cui, Flora; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings



How about an interface to change the timeout on a per engine (gfx, compute, 
dma, etc.) basis?

amdgpu.lockup_timeout=<global>[,<gfx>,<compute>,<dma>,...]

if only one parameter is given, we change it globally.  If more are given, we
override the global one.  Could also do a sysfs interface to change it on the
fly.
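
For illustration, a minimal sketch of what such a multi-value parameter
could look like as a module_param_cb() callback. All names and defaults
here are hypothetical, not the actual amdgpu implementation:

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/string.h>

static int lockup_timeout_ms = 10000;	/* non-compute jobs (hypothetical) */
static int compute_timeout_ms = 10000;	/* compute jobs (hypothetical) */

/* Parse "lockup_timeout=<global>[,<compute>]"; one value applies to all. */
static int lockup_timeout_set(const char *val, const struct kernel_param *kp)
{
	char buf[64], *s = buf, *tok;
	int v[2], n = 0;

	strscpy(buf, val, sizeof(buf));
	while ((tok = strsep(&s, ",")) && n < 2) {
		if (kstrtoint(tok, 0, &v[n]))
			return -EINVAL;
		n++;
	}
	if (!n)
		return -EINVAL;

	lockup_timeout_ms = v[0];
	compute_timeout_ms = (n == 2) ? v[1] : v[0];
	return 0;
}

static const struct kernel_param_ops lockup_timeout_ops = {
	.set = lockup_timeout_set,
	.get = param_get_int,
};
module_param_cb(lockup_timeout, &lockup_timeout_ops, &lockup_timeout_ms, 0444);
MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms: <global>[,<compute>]");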



Alex



From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Michel Dänzer <mic...@daenzer.net>
Sent: Friday, April 26, 2019 4:35 AM
To: Quan, Evan; Koenig, Christian
Cc: Xu, Feifei; Cui, Flora; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings



On 2019-04-26 10:20 a.m., Quan, Evan wrote:
> My concern is there is already one module parameter "lockup_timeout".
> parm:   lockup_timeout:GPU lockup timeout in ms > 0 (default 10000) (int)
>
> Adding one more "timeout" seems redundant.
> And that will make the description of "lockup_timeout" (which seems to apply to all
> jobs) not match its real effect (it affects only non-compute jobs).
>
> A better way would be to rename "lockup_timeout" to "non-compute lockup_timeout".
> But I do not think we can change an existing module parameter. Right?

Right. Also, there are already too many amdgpu module parameters, we
should try to remove some rather than adding new ones for every little
thing that could be tweaked. :)

One possibility might be to optionally allow passing multiple values to
lockup_timeout, e.g.

 amdgpu.lockup_timeout=10000,0

The first value would need to have the same meaning as now for backwards
compatibility.


--
Earthling Michel Dänzer   |  https://www.amd.com
Libre software enthusiast | Mesa and X developer

[PATCH][next] drm/amd/display: fix incorrect null check on pointer

2019-04-28 Thread Colin King
From: Colin Ian King 

Currently an allocation is being made but the allocation failure
check is being performed on another pointer. Fix this by checking
the correct pointer. Also use the normal kernel idiom for null
pointer checks.

Addresses-Coverity: ("Resource leak")
Fixes: 43e3ac8389ef ("drm/amd/display: Add function to copy DC streams")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 6200df3edcd0..96e97d25d639 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -168,7 +168,7 @@ struct dc_stream_state *dc_copy_stream(const struct 
dc_stream_state *stream)
struct dc_stream_state *new_stream;
 
new_stream = kzalloc(sizeof(struct dc_stream_state), GFP_KERNEL);
-   if (stream == NULL)
+   if (!new_stream)
return NULL;
 
memcpy(new_stream, stream, sizeof(struct dc_stream_state));
-- 
2.20.1


Bug Report: [PowerPlay] MCLK can't be set above 1107MHz on Vega 64

2019-04-28 Thread Yanik Yiannakis

Hello,

I experience a bug that prevents me from setting the MCLK of my Vega 64 
LC above 1107MHz.


I am using Unigine Superposition 1.1 in "Game"-mode to check the 
performance by watching the FPS.



*Behaviour with a single monitor:*

First I set the MCLK to a known stable value below 1108MHz:

$ echo "m 3 1100 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage


In Unigine Superposition the FPS increase as expected.

pp_dpm_mclk also confirms the change.

$ watch cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_dpm_mclk


   0: 167Mhz
   1: 500Mhz
   2: 800Mhz
   3: 1100Mhz *


After that I set the MCLK to a stable value above 1107MHz:

$ echo "m 3 1200 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage


In Unigine Superposition the FPS drop drastically.

pp_dpm_mclk indicates that the MCLK is stuck in state 0 (167MHz):

   0: 167Mhz *
   1: 500Mhz
   2: 800Mhz
   3: 1200Mhz


*Behaviour with multiple monitors that have different refresh rates:*

My monitors have different refresh rates. This causes the MCLK to stay 
in state 3 (945MHz stock) which is the expected behaviour as I 
understand it.



Now I try to set the MCLK to a value above 1107MHz:

$ echo "m 3 1200 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage


The FPS in Unigine Superposition remain the same as they were with 945MHz.

pp_dpm_mclk shows however that the value was set:

   0: 167Mhz
   1: 500Mhz
   2: 800Mhz
   3: 1200Mhz *


Then I set the MCLK to a value of 1107MHz or lower:

$ echo "m 3 1100 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage


The FPS in Unigine Superposition *increase*.

pp_dpm_mclk again confirms the set value:

   0: 167Mhz
   1: 500Mhz
   2: 800Mhz
   3: 1100Mhz *


Finally I increase MCLK to a known unstable value:

$ echo "m 3 1300 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage


The FPS in Unigine Superposition remain the same. I therefore believe 
the value was not actually applied.


However pp_dpm_mclk shows that it was:

   0: 167Mhz
   1: 500Mhz
   2: 800Mhz
   3: 1300Mhz *


amdgpu_pm_info also claims that the value was set:

$ sudo watch cat /sys/kernel/debug/dri/1/amdgpu_pm_info

   GFX Clocks and Power:
    1300 MHz (MCLK)
    27 MHz (SCLK)
    1348 MHz (PSTATE_SCLK)
    800 MHz (PSTATE_MCLK)
    825 mV (VDDGFX)
    4.0 W (average GPU)

Again, I think the displayed MCLK is false and the memory still runs at 
1100MHz because the performance in Unigine Superposition indicates this 
and 1300MHz would cause a crash immediately.


A stable value (e.g. 1200MHz) causes the same behaviour. I just chose 
1300MHz to be sure.




Tested on these Kernels:

   Arch-Linux 5.0.9 (Arch)

   Linux 5.1-rc6 (Ubuntu)

   Linux 5.0 with amd-staging-drm-next (Ubuntu)
   (https://github.com/M-Bab/linux-kernel-amdgpu-binaries)

(Same behaviour on every kernel.)


Tested on this hardware:

   CPU: Intel i7-8700k

   Motherboard: MSI Z370 Gaming Pro Carbon

   GPU: Powercolor Vega 64 Liquid Cooled (Memory stable below 1220MHz,
   tested on Windows 10 with Wattman and Unigine Superposition)


Unigine Superposition "Game"-Mode settings:

   Preset: Custom

   Fullscreen: Disabled

   Resolution: 3840x2160 (4K UHD)

   Shaders Quality: Extreme

   Textures Quality: High

   Vsync: Off

   Depth of Field: On

   Motion Blur: On


I hope this helps.

Yanik Yiannakis


[PATCH 01/12] dma-buf: add dynamic caching of sg_table

2019-04-28 Thread Liam Mark
On Tue, 16 Apr 2019, Christian König wrote:

> To allow a smooth transition from pinning buffer objects to dynamic
> invalidation we first start to cache the sg_table for an attachment
> unless the driver explicitly says to not do so.
> 
> ---
>  drivers/dma-buf/dma-buf.c | 24 
>  include/linux/dma-buf.h   | 11 +++
>  2 files changed, 35 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 7c858020d14b..65161a82d4d5 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -573,6 +573,20 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf 
> *dmabuf,
>   list_add(&attach->node, &dmabuf->attachments);
>  
>   mutex_unlock(&dmabuf->lock);
> +
> + if (!dmabuf->ops->dynamic_sgt_mapping) {
> + struct sg_table *sgt;
> +
> + sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
> + if (!sgt)
> + sgt = ERR_PTR(-ENOMEM);
> + if (IS_ERR(sgt)) {
> + dma_buf_detach(dmabuf, attach);
> + return ERR_CAST(sgt);
> + }
> + attach->sgt = sgt;
> + }
> +
>   return attach;
>  
>  err_attach:
> @@ -595,6 +609,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct 
> dma_buf_attachment *attach)
>   if (WARN_ON(!dmabuf || !attach))
>   return;
>  
> + if (attach->sgt)
> + dmabuf->ops->unmap_dma_buf(attach, attach->sgt,
> +DMA_BIDIRECTIONAL);
> +
>   mutex_lock(&dmabuf->lock);
>   list_del(&attach->node);
>   if (dmabuf->ops->detach)
> @@ -630,6 +648,9 @@ struct sg_table *dma_buf_map_attachment(struct 
> dma_buf_attachment *attach,
>   if (WARN_ON(!attach || !attach->dmabuf))
>   return ERR_PTR(-EINVAL);
>  
> + if (attach->sgt)
> + return attach->sgt;
> +

I am concerned by this change making caching of the sg_table the default
behavior, as it will result in the exporter's map_dma_buf/unmap_dma_buf
calls no longer being called in
dma_buf_map_attachment/dma_buf_unmap_attachment.

This seems concerning to me as it appears to ignore the cache maintenance
aspect of the map_dma_buf/unmap_dma_buf calls.
For example, won't this potentially cause issues for clients of ION?

If we had the following
- #1 dma_buf_attach coherent_device
- #2 dma_buf attach non_coherent_device
- #3 dma_buf_map_attachment non_coherent_device
- #4 non_coherent_device writes to buffer
- #5 dma_buf_unmap_attachment non_coherent_device
- #6 dma_buf_map_attachment coherent_device
- #7 coherent_device reads buffer
- #8 dma_buf_unmap_attachment coherent_device   

There wouldn't be any CMO at step #5 anymore (specifically no invalidate) 
so now at step #7 the coherent_device could read a stale cache line.

Also, by default dma_buf_unmap_attachment now no longer removes the
mappings from the iommu, so dma_buf_unmap_attachment is not doing what
I would expect, and clients are losing the potential sandboxing
benefits of removing the mappings.
Shouldn't this caching behavior be something that clients opt into instead
of being the default?
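
With the patch as quoted, an exporter that depends on per-map cache
maintenance would apparently have to opt out via the new
dynamic_sgt_mapping flag. A rough, illustrative sketch: the exporter
callbacks here are stubs I made up, and only the dynamic_sgt_mapping
field comes from the quoted patch:

#include <linux/dma-buf.h>

/* Placeholder map/unmap that would do the IOMMU mapping plus cache
 * maintenance on every call, as ION-style heaps expect. */
static struct sg_table *my_map_dma_buf(struct dma_buf_attachment *attach,
				       enum dma_data_direction dir)
{
	return NULL; /* a real exporter returns a mapped sg_table here */
}

static void my_unmap_dma_buf(struct dma_buf_attachment *attach,
			     struct sg_table *sgt,
			     enum dma_data_direction dir)
{
	/* a real exporter invalidates caches and unmaps from the IOMMU here */
}

/* Setting dynamic_sgt_mapping keeps map_dma_buf/unmap_dma_buf running on
 * every dma_buf_map_attachment/dma_buf_unmap_attachment instead of the
 * new cached-sg_table path. Other mandatory dma_buf_ops callbacks are
 * omitted for brevity. */
static const struct dma_buf_ops my_exporter_ops = {
	.dynamic_sgt_mapping = true,
	.map_dma_buf	= my_map_dma_buf,
	.unmap_dma_buf	= my_unmap_dma_buf,
};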

Liam

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH] drm/amdgpu: Unmap CSA under SR-IOV in KFD path

2019-04-28 Thread Trigger Huang
In the amdgpu open path, the CSA will be mapped in the VM, so when opening
KFD, calling amdgpu_vm_make_compute will fail because it finds that this
VM is not a clean VM (it already has mappings); as a result, creating the
process VM object fails.

The fix is to unmap the CSA; the CSA is actually not needed for
compute VF world switch.

Signed-off-by: Trigger Huang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 10 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c  |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 697b8ef..e0bc457 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -956,6 +956,16 @@ int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct kgd_dev 
*kgd,
if (avm->process_info)
return -EINVAL;
 
+   /* Delete CSA mapping to make sure this VM is a clean VM  before
+*  converting VM
+*/
+   if (amdgpu_sriov_vf(adev) && drv_priv->csa_va) {
+   amdgpu_bo_reserve(adev->virt.csa_obj, true);
+   amdgpu_vm_bo_rmv(adev, drv_priv->csa_va);
+   drv_priv->csa_va = NULL;
+   amdgpu_bo_unreserve(adev->virt.csa_obj);
+   }
+
/* Convert VM into a compute VM */
ret = amdgpu_vm_make_compute(adev, avm, pasid);
if (ret)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index da7b4fe..361c2e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -1069,7 +1069,7 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 
amdgpu_vm_bo_rmv(adev, fpriv->prt_va);
 
-   if (amdgpu_sriov_vf(adev)) {
+   if (amdgpu_sriov_vf(adev) && fpriv->csa_va) {
/* TODO: how to handle reserve failure */
BUG_ON(amdgpu_bo_reserve(adev->virt.csa_obj, true));
amdgpu_vm_bo_rmv(adev, fpriv->csa_va);
-- 
2.7.4


[PATCH 20/27] drm/amdkfd: Fix gfx8 MEM_VIOL exception handler

2019-04-28 Thread Kuehling, Felix
From: Jay Cornwall 

When MEM_VIOL is asserted the context save handler rewinds the
program counter. This is incorrect for any source of the exception.
MEM_VIOL may be raised in normal operation by out-of-bounds access
to LDS or GDS and does not require special handling.

Remove PC adjustment when MEM_VIOL has been raised.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h  |  9 ++---
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm   | 13 -
 2 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 3621efbd5759..ec9a9a99f808 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -21,7 +21,7 @@
  */
 
 static const uint32_t cwsr_trap_gfx8_hex[] = {
-   0xbf820001, 0xbf82012b,
+   0xbf820001, 0xbf820121,
0xb8f4f802, 0x89748674,
0xb8f5f803, 0x8675ff75,
0x0400, 0xbf850017,
@@ -36,12 +36,7 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
0x8671ff71, 0x,
0x8f728374, 0xb972e0c2,
0xbf82, 0xb9740002,
-   0xbe801f70, 0xb8f5f803,
-   0x8675ff75, 0x0100,
-   0xbf840006, 0xbefa0080,
-   0xb97a0203, 0x8671ff71,
-   0x, 0x80f08870,
-   0x82f18071, 0xbefa0080,
+   0xbe801f70, 0xbefa0080,
0xb97a0283, 0xbef60068,
0xbef70069, 0xb8fa1c07,
0x8e7a9c7a, 0x87717a71,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
index abe1a5da29fb..a47f5b933120 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
@@ -282,19 +282,6 @@ if G8SR_DEBUG_TIMESTAMP
 s_waitcnt lgkmcnt(0) //FIXME, will cause xnack??
 end
 
-//check whether there is mem_viol
-s_getreg_b32    s_save_trapsts, hwreg(HW_REG_TRAPSTS)
-s_and_b32       s_save_trapsts, s_save_trapsts, SQ_WAVE_TRAPSTS_MEM_VIOL_MASK
-s_cbranch_scc0  L_NO_PC_REWIND
-
-//if so, need rewind PC assuming GDS operation gets NACKed
-s_mov_b32       s_save_tmp, 0                                        //clear mem_viol bit
-s_setreg_b32    hwreg(HW_REG_TRAPSTS, SQ_WAVE_TRAPSTS_MEM_VIOL_SHIFT, 1), s_save_tmp    //clear mem_viol bit
-s_and_b32       s_save_pc_hi, s_save_pc_hi, 0x0000ffff    //pc[47:32]
-s_sub_u32       s_save_pc_lo, s_save_pc_lo, 8             //pc[31:0]-8
-s_subb_u32      s_save_pc_hi, s_save_pc_hi, 0x0           // -scc
-
-L_NO_PC_REWIND:
 s_mov_b32       s_save_tmp, 0                                        //clear saveCtx bit
 s_setreg_b32    hwreg(HW_REG_TRAPSTS, SQ_WAVE_TRAPSTS_SAVECTX_SHIFT, 1), s_save_tmp     //clear saveCtx bit
 
-- 
2.17.1


[PATCH 19/27] drm/amdkfd: Fix a circular lock dependency

2019-04-28 Thread Kuehling, Felix
Fix a circular lock dependency exposed under userptr memory pressure.
The DQM lock is the only one taken inside the MMU notifier. We need
to make sure that no reclaim is done under this lock, and that
no other locks are taken under which reclaim is possible.
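
Schematically, the rule being applied here looks like this (illustrative
names, not the actual KFD code):

#include <linux/mutex.h>

struct mgr {
	struct mutex lock;	/* also taken from the MMU notifier */
	int count;
};

/* May allocate memory (and therefore trigger reclaim); stub for the sketch. */
void notify_active(struct mgr *m);

int register_thing(struct mgr *m)
{
	mutex_lock(&m->lock);
	m->count++;		/* only lock-safe state updates under the lock */
	mutex_unlock(&m->lock);

	/* Anything that can reclaim, or take locks that reclaim paths also
	 * take, happens only after the notifier-visible lock is dropped. */
	notify_active(m);
	return 0;
}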

Signed-off-by: Felix Kuehling 
Reviewed-by: Philip Yang 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 33 ---
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 1562590d837e..0bfdb141b6e7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -794,10 +794,14 @@ static int register_process(struct device_queue_manager 
*dqm,
retval = dqm->asic_ops.update_qpd(dqm, qpd);
 
dqm->processes_count++;
-   kfd_inc_compute_active(dqm->dev);
 
dqm_unlock(dqm);
 
+   /* Outside the DQM lock because under the DQM lock we can't do
+* reclaim or take other locks that others hold while reclaiming.
+*/
+   kfd_inc_compute_active(dqm->dev);
+
return retval;
 }
 
@@ -818,7 +822,6 @@ static int unregister_process(struct device_queue_manager 
*dqm,
		list_del(&cur->list);
kfree(cur);
dqm->processes_count--;
-   kfd_dec_compute_active(dqm->dev);
goto out;
}
}
@@ -826,6 +829,13 @@ static int unregister_process(struct device_queue_manager 
*dqm,
retval = 1;
 out:
dqm_unlock(dqm);
+
+   /* Outside the DQM lock because under the DQM lock we can't do
+* reclaim or take other locks that others hold while reclaiming.
+*/
+   if (!retval)
+   kfd_dec_compute_active(dqm->dev);
+
return retval;
 }
 
@@ -1519,6 +1529,7 @@ static int process_termination_nocpsch(struct 
device_queue_manager *dqm,
struct queue *q, *next;
struct device_process_node *cur, *next_dpn;
int retval = 0;
+   bool found = false;
 
dqm_lock(dqm);
 
@@ -1537,12 +1548,19 @@ static int process_termination_nocpsch(struct 
device_queue_manager *dqm,
		list_del(&cur->list);
kfree(cur);
dqm->processes_count--;
-   kfd_dec_compute_active(dqm->dev);
+   found = true;
break;
}
}
 
dqm_unlock(dqm);
+
+   /* Outside the DQM lock because under the DQM lock we can't do
+* reclaim or take other locks that others hold while reclaiming.
+*/
+   if (found)
+   kfd_dec_compute_active(dqm->dev);
+
return retval;
 }
 
@@ -1588,6 +1606,7 @@ static int process_termination_cpsch(struct 
device_queue_manager *dqm,
struct device_process_node *cur, *next_dpn;
enum kfd_unmap_queues_filter filter =
KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES;
+   bool found = false;
 
retval = 0;
 
@@ -1624,7 +1643,7 @@ static int process_termination_cpsch(struct 
device_queue_manager *dqm,
		list_del(&cur->list);
kfree(cur);
dqm->processes_count--;
-   kfd_dec_compute_active(dqm->dev);
+   found = true;
break;
}
}
@@ -1638,6 +1657,12 @@ static int process_termination_cpsch(struct 
device_queue_manager *dqm,
 
dqm_unlock(dqm);
 
+   /* Outside the DQM lock because under the DQM lock we can't do
+* reclaim or take other locks that others hold while reclaiming.
+*/
+   if (found)
+   kfd_dec_compute_active(dqm->dev);
+
/* Lastly, free mqd resources.
 * Do uninit_mqd() after dqm_unlock to avoid circular locking.
 */
-- 
2.17.1


[PATCH 13/27] drm/amdkfd: Move sdma_queue_id calculation into allocate_sdma_queue()

2019-04-28 Thread Kuehling, Felix
From: Yong Zhao 

This avoids duplicated code.

Signed-off-by: Yong Zhao 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 29 +++
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index e2de246d681b..38c66b8ffd31 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -883,7 +883,7 @@ static int stop_nocpsch(struct device_queue_manager *dqm)
 }
 
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
-   unsigned int *sdma_id)
+   struct queue *q)
 {
int bit;
 
@@ -892,7 +892,14 @@ static int allocate_sdma_queue(struct device_queue_manager 
*dqm,
 
bit = __ffs64(dqm->sdma_bitmap);
dqm->sdma_bitmap &= ~(1ULL << bit);
-   *sdma_id = bit;
+   q->sdma_id = bit;
+
+   q->properties.sdma_engine_id = q->sdma_id % get_num_sdma_engines(dqm);
+   q->properties.sdma_queue_id = q->sdma_id / get_num_sdma_engines(dqm);
+
+   pr_debug("SDMA id is:%d\n", q->sdma_id);
+   pr_debug("SDMA engine id: %d\n", q->properties.sdma_engine_id);
+   pr_debug("SDMA queue id: %d\n", q->properties.sdma_queue_id);
 
return 0;
 }
@@ -914,21 +921,14 @@ static int create_sdma_queue_nocpsch(struct 
device_queue_manager *dqm,
 
mqd_mgr = dqm->mqd_mgrs[KFD_MQD_TYPE_SDMA];
 
-   retval = allocate_sdma_queue(dqm, &q->sdma_id);
+   retval = allocate_sdma_queue(dqm, q);
if (retval)
return retval;
 
-   q->properties.sdma_queue_id = q->sdma_id / get_num_sdma_engines(dqm);
-   q->properties.sdma_engine_id = q->sdma_id % get_num_sdma_engines(dqm);
-
retval = allocate_doorbell(qpd, q);
if (retval)
goto out_deallocate_sdma_queue;
 
-   pr_debug("SDMA id is:%d\n", q->sdma_id);
-   pr_debug("SDMA queue id: %d\n", q->properties.sdma_queue_id);
-   pr_debug("SDMA engine id: %d\n", q->properties.sdma_engine_id);
-
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
	retval = mqd_mgr->init_mqd(mqd_mgr, &q->mqd, &q->mqd_mem_obj,
			&q->gart_mqd_addr, &q->properties);
@@ -1129,16 +1129,9 @@ static int create_queue_cpsch(struct 
device_queue_manager *dqm, struct queue *q,
}
 
if (q->properties.type == KFD_QUEUE_TYPE_SDMA) {
-   retval = allocate_sdma_queue(dqm, &q->sdma_id);
+   retval = allocate_sdma_queue(dqm, q);
if (retval)
goto out;
-   q->properties.sdma_queue_id =
-   q->sdma_id / get_num_sdma_engines(dqm);
-   q->properties.sdma_engine_id =
-   q->sdma_id % get_num_sdma_engines(dqm);
-   pr_debug("SDMA id is:%d\n", q->sdma_id);
-   pr_debug("SDMA queue id: %d\n", q->properties.sdma_queue_id);
-   pr_debug("SDMA engine id: %d\n", q->properties.sdma_engine_id);
}
 
retval = allocate_doorbell(qpd, q);
-- 
2.17.1


[PATCH 23/27] drm/amdkfd: Preserve ttmp[4:5] instead of ttmp[14:15]

2019-04-28 Thread Kuehling, Felix
From: Jay Cornwall 

ttmp[4:5] is initialized by the SPI with SPI_GDBG_TRAP_DATA* values.
These values are more useful to the debugger than ttmp[14:15], which
carries dispatch_scratch_base*. There are too few registers to
preserve both.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 466 +-
 .../drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm |  45 +-
 2 files changed, 253 insertions(+), 258 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index eed845b4e9a7..e413d4a71fa3 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -274,12 +274,12 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
 
 
 static const uint32_t cwsr_trap_gfx9_hex[] = {
-   0xbf820001, 0xbf820161,
+   0xbf820001, 0xbf82015e,
0xb8f8f802, 0x89788678,
-   0xb8f1f803, 0x866eff71,
+   0xb8fbf803, 0x866eff7b,
0x0400, 0xbf85003b,
-   0x866eff71, 0x0800,
-   0xbf850003, 0x866eff71,
+   0x866eff7b, 0x0800,
+   0xbf850003, 0x866eff7b,
0x0100, 0xbf84000c,
0x866eff78, 0x2000,
0xbf840005, 0xbf8e0010,
@@ -292,13 +292,13 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
0x8977ff77, 0xfc00,
0x87776f77, 0x896eff6e,
0x001f8000, 0xb96ef807,
-   0xb8f0f812, 0xb8f1f813,
-   0x8ef08870, 0xc0071bb8,
+   0xb8faf812, 0xb8fbf813,
+   0x8efa887a, 0xc0071bbd,
0x, 0xbf8cc07f,
-   0xc0071c38, 0x0008,
+   0xc0071ebd, 0x0008,
0xbf8cc07f, 0x86ee6e6e,
0xbf840001, 0xbe801d6e,
-   0xb8f1f803, 0x8671ff71,
+   0xb8fbf803, 0x867bff7b,
0x01ff, 0xbf850002,
0x806c846c, 0x826d806d,
0x866dff6d, 0x,
@@ -308,258 +308,256 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
0x8f6e8378, 0xb96ee0c2,
0xbf82, 0xb9780002,
0xbe801f6c, 0x866dff6d,
-   0x, 0xbef00080,
-   0xb9700283, 0xb8f02407,
-   0x8e709b70, 0x876d706d,
-   0xb8f003c7, 0x8e709a70,
-   0x876d706d, 0xb8f0f807,
-   0x8670ff70, 0x7fff,
-   0xb970f807, 0xbeee007e,
+   0x, 0xbefa0080,
+   0xb97a0283, 0xb8fa2407,
+   0x8e7a9b7a, 0x876d7a6d,
+   0xb8fa03c7, 0x8e7a9a7a,
+   0x876d7a6d, 0xb8faf807,
+   0x867aff7a, 0x7fff,
+   0xb97af807, 0xbeee007e,
0xbeef007f, 0xbefe0180,
-   0xbf94, 0x87708478,
-   0xb970f802, 0xbf8e0002,
-   0xbf88fffe, 0xb8f02a05,
+   0xbf94, 0x877a8478,
+   0xb97af802, 0xbf8e0002,
+   0xbf88fffe, 0xb8fa2a05,
+   0x807a817a, 0x8e7a8a7a,
+   0xb8fb1605, 0x807b817b,
+   0x8e7b867b, 0x807a7b7a,
+   0x807a7e7a, 0x827b807f,
+   0x867bff7b, 0x,
+   0xc04b1c3d, 0x0050,
+   0xbf8cc07f, 0xc04b1d3d,
+   0x0060, 0xbf8cc07f,
+   0xc0431e7d, 0x0074,
+   0xbf8cc07f, 0xbef4007e,
+   0x8675ff7f, 0x,
+   0x8775ff75, 0x0004,
+   0xbef60080, 0xbef700ff,
+   0x00807fac, 0x867aff7f,
+   0x0800, 0x8f7a837a,
+   0x8a77, 0x867aff7f,
+   0x7000, 0x8f7a817a,
+   0x8a77, 0xbef1007c,
+   0xbef00080, 0xb8f02a05,
0x80708170, 0x8e708a70,
-   0xb8f11605, 0x80718171,
-   0x8e718671, 0x80707170,
-   0x80707e70, 0x8271807f,
-   0x8671ff71, 0x,
-   0xc0471cb8, 0x0040,
-   0xbf8cc07f, 0xc04b1d38,
-   0x0048, 0xbf8cc07f,
-   0xc0431e78, 0x0058,
-   0xbf8cc07f, 0xc0471eb8,
-   0x005c, 0xbf8cc07f,
-   0xbef4007e, 0x8675ff7f,
-   0x, 0x8775ff75,
-   0x0004, 0xbef60080,
-   0xbef700ff, 0x00807fac,
-   0x8670ff7f, 0x0800,
-   0x8f708370, 0x8077,
-   0x8670ff7f, 0x7000,
-   0x8f708170, 0x8077,
-   0xbefb007c, 0xbefa0080,
-   0xb8fa2a05, 0x807a817a,
-   0x8e7a8a7a, 0xb8f01605,
-   0x80708170, 0x8e708670,
-   0x807a707a, 0xbef60084,
-   0xbef600ff, 0x0100,
-   0xbefe007c, 0xbefc007a,
-   0xc0611efa, 0x007c,
-   0xbf8cc07f, 0x807a847a,
-   0xbefc007e, 0xbefe007c,
-   0xbefc007a, 0xc0611b3a,
+   0xb8fa1605, 0x807a817a,
+   0x8e7a867a, 0x80707a70,
+   0xbef60084, 0xbef600ff,
+   0x0100, 0xbefe007c,
+   0xbefc0070, 0xc0611c7a,
0x007c, 0xbf8cc07f,
-   0x807a847a, 0xbefc007e,
-   0xbefe007c, 0xbefc007a,
-   0xc0611b7a, 0x007c,
-   0xbf8cc07f, 0x807a847a,
+   0x80708470, 0xbefc007e,
+   0xbefe007c, 0xbefc0070,
+   0xc0611b3a, 0x007c,
+   0xbf8cc07f, 0x80708470,
0xbefc007e, 0xbefe007c,
-   0xbefc007a, 0xc0611bba,
+   0xbefc0070, 0xc0611b7a,
0x007c, 0xbf8cc07f,
-   0x807a847a, 0xbefc007e,
-   0xbefe007c, 0xbefc007a,
-   0xc0611bfa, 0x007c,

[PATCH 26/27] drm/amdgpu: Use heavy weight for tlb invalidation on xgmi configuration

2019-04-28 Thread Kuehling, Felix
From: shaoyunl 

There is a bug found in vml2 xgmi logic:
mtype is always sent as NC on the VMC to TC interface for a page walk,
regardless of whether the request is being sent to the local or a remote GPU.
NC means non-coherent and will cause the VMC return data to be cached
in the TCC (versus UC, uncached, which will not cache the data). Since the
page table updates are done by SDMA/HDP, the TCC will never be
updated and the GC VML2 will continue to hit in the TCC and never get
the updated page tables, resulting in a fault.
Heavy-weight tlb invalidation does a WB/INVAL of the L1/L2 GL data
caches so the TCC will not be hit on the next request.

Signed-off-by: shaoyunl 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 53 +--
 1 file changed, 26 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index ef3d93b995b2..7ec97e903a1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -726,29 +726,8 @@ static uint16_t get_atc_vmid_pasid_mapping_pasid(struct 
kgd_dev *kgd,
return reg & ATC_VMID0_PASID_MAPPING__PASID_MASK;
 }
 
-static void write_vmid_invalidate_request(struct kgd_dev *kgd, uint8_t vmid)
-{
-   struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
-
-   /* Use legacy mode tlb invalidation.
-*
-* Currently on Raven the code below is broken for anything but
-* legacy mode due to a MMHUB power gating problem. A workaround
-* is for MMHUB to wait until the condition PER_VMID_INVALIDATE_REQ
-* == PER_VMID_INVALIDATE_ACK instead of simply waiting for the ack
-* bit.
-*
-* TODO 1: agree on the right set of invalidation registers for
-* KFD use. Use the last one for now. Invalidate both GC and
-* MMHUB.
-*
-* TODO 2: support range-based invalidation, requires kfg2kgd
-* interface change
-*/
-   amdgpu_gmc_flush_gpu_tlb(adev, vmid, 0);
-}
-
-static int invalidate_tlbs_with_kiq(struct amdgpu_device *adev, uint16_t pasid)
+static int invalidate_tlbs_with_kiq(struct amdgpu_device *adev, uint16_t pasid,
+   uint32_t flush_type)
 {
signed long r;
uint32_t seq;
@@ -761,7 +740,7 @@ static int invalidate_tlbs_with_kiq(struct amdgpu_device 
*adev, uint16_t pasid)
PACKET3_INVALIDATE_TLBS_DST_SEL(1) |
PACKET3_INVALIDATE_TLBS_ALL_HUB(1) |
PACKET3_INVALIDATE_TLBS_PASID(pasid) |
-   PACKET3_INVALIDATE_TLBS_FLUSH_TYPE(0)); /* legacy */
+   PACKET3_INVALIDATE_TLBS_FLUSH_TYPE(flush_type));
	amdgpu_fence_emit_polling(ring, &seq);
	amdgpu_ring_commit(ring);
	spin_unlock(&adev->gfx.kiq.ring_lock);
@@ -780,12 +759,16 @@ static int invalidate_tlbs(struct kgd_dev *kgd, uint16_t 
pasid)
struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
int vmid;
	struct amdgpu_ring *ring = &adev->gfx.kiq.ring;
+   uint32_t flush_type = 0;
 
if (adev->in_gpu_reset)
return -EIO;
+   if (adev->gmc.xgmi.num_physical_nodes &&
+   adev->asic_type == CHIP_VEGA20)
+   flush_type = 2;
 
if (ring->sched.ready)
-   return invalidate_tlbs_with_kiq(adev, pasid);
+   return invalidate_tlbs_with_kiq(adev, pasid, flush_type);
 
for (vmid = 0; vmid < 16; vmid++) {
if (!amdgpu_amdkfd_is_kfd_vmid(adev, vmid))
@@ -793,7 +776,8 @@ static int invalidate_tlbs(struct kgd_dev *kgd, uint16_t 
pasid)
if (get_atc_vmid_pasid_mapping_valid(kgd, vmid)) {
if (get_atc_vmid_pasid_mapping_pasid(kgd, vmid)
== pasid) {
-   write_vmid_invalidate_request(kgd, vmid);
+   amdgpu_gmc_flush_gpu_tlb(adev, vmid,
+flush_type);
break;
}
}
@@ -811,7 +795,22 @@ static int invalidate_tlbs_vmid(struct kgd_dev *kgd, 
uint16_t vmid)
return 0;
}
 
-   write_vmid_invalidate_request(kgd, vmid);
+   /* Use legacy mode tlb invalidation.
+*
+* Currently on Raven the code below is broken for anything but
+* legacy mode due to a MMHUB power gating problem. A workaround
+* is for MMHUB to wait until the condition PER_VMID_INVALIDATE_REQ
+* == PER_VMID_INVALIDATE_ACK instead of simply waiting for the ack
+* bit.
+*
+* TODO 1: agree on the right set of invalidation registers for
+* KFD use. Use the last one for now. Invalidate both GC and
+* MMHUB.
+*
+* TODO 2: support 

[PATCH 16/27] drm/amdkfd: Introduce XGMI SDMA queue type

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

Existing QUEUE_TYPE_SDMA means PCIe-optimized SDMA queues.
Introduce a new QUEUE_TYPE_SDMA_XGMI, which is optimized
for non-PCIe transfers such as XGMI.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  15 +++
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 123 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |   3 +
 .../gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c  |   2 +
 .../gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c  |   2 +
 .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   4 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  10 +-
 include/uapi/linux/kfd_ioctl.h|   7 +-
 10 files changed, 132 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index d795e5018270..c731126ada22 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -213,6 +213,8 @@ static int set_queue_properties_from_user(struct 
queue_properties *q_properties,
q_properties->type = KFD_QUEUE_TYPE_COMPUTE;
else if (args->queue_type == KFD_IOC_QUEUE_TYPE_SDMA)
q_properties->type = KFD_QUEUE_TYPE_SDMA;
+   else if (args->queue_type == KFD_IOC_QUEUE_TYPE_SDMA_XGMI)
+   q_properties->type = KFD_QUEUE_TYPE_SDMA_XGMI;
else
return -ENOTSUPP;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 8202a5db3a35..1368b41cb92b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -54,6 +54,7 @@ static const struct kfd_device_info kaveri_device_info = {
.needs_iommu_device = true,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -71,6 +72,7 @@ static const struct kfd_device_info carrizo_device_info = {
.needs_iommu_device = true,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -87,6 +89,7 @@ static const struct kfd_device_info raven_device_info = {
.needs_iommu_device = true,
.needs_pci_atomics = true,
.num_sdma_engines = 1,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 #endif
@@ -105,6 +108,7 @@ static const struct kfd_device_info hawaii_device_info = {
.needs_iommu_device = false,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -121,6 +125,7 @@ static const struct kfd_device_info tonga_device_info = {
.needs_iommu_device = false,
.needs_pci_atomics = true,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -137,6 +142,7 @@ static const struct kfd_device_info fiji_device_info = {
.needs_iommu_device = false,
.needs_pci_atomics = true,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -153,6 +159,7 @@ static const struct kfd_device_info fiji_vf_device_info = {
.needs_iommu_device = false,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -170,6 +177,7 @@ static const struct kfd_device_info polaris10_device_info = 
{
.needs_iommu_device = false,
.needs_pci_atomics = true,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -186,6 +194,7 @@ static const struct kfd_device_info 
polaris10_vf_device_info = {
.needs_iommu_device = false,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -202,6 +211,7 @@ static const struct kfd_device_info polaris11_device_info = 
{
.needs_iommu_device = false,
.needs_pci_atomics = true,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -218,6 +228,7 @@ static const struct kfd_device_info polaris12_device_info = 
{
.needs_iommu_device = false,
.needs_pci_atomics = true,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
 
@@ -234,6 +245,7 @@ static const struct kfd_device_info vega10_device_info = {
.needs_iommu_device = false,
.needs_pci_atomics = false,
.num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine 

[PATCH 10/27] drm/amdkfd: Allocate MQD trunk for HIQ and SDMA

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

MEC FW for some new ASICs requires all SDMA MQDs to be in a continuous
trunk of memory right after the HIQ MQD. Add a field in the device queue manager
to hold the HIQ/SDMA MQD memory object and allocate the MQD trunk on device
queue manager initialization.
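
For clarity, the layout this describes (and which allocate_sdma_mqd() in a
later patch of this series indexes into) is one HIQ MQD followed by all SDMA
MQDs. A small sketch of the arithmetic, with hypothetical helper names:

#include <stddef.h>

/* Total trunk size: one HIQ MQD, then every SDMA MQD packed behind it. */
size_t hiq_sdma_trunk_size(size_t hiq_mqd_size, size_t sdma_mqd_size,
			   unsigned int num_engines,
			   unsigned int queues_per_engine)
{
	return hiq_mqd_size +
	       sdma_mqd_size * num_engines * queues_per_engine;
}

/* Byte offset of one SDMA MQD inside the trunk; matches the offset
 * computed in allocate_sdma_mqd() later in this series. */
size_t sdma_mqd_offset(size_t hiq_mqd_size, size_t sdma_mqd_size,
		       unsigned int engine_id, unsigned int queue_id,
		       unsigned int queues_per_engine)
{
	return hiq_mqd_size +
	       (engine_id * queues_per_engine + queue_id) * sdma_mqd_size;
}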

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 32 +++
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  1 +
 2 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 063625c3646b..e2de246d681b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1620,6 +1620,25 @@ static int init_mqd_managers(struct device_queue_manager 
*dqm)
 
return -ENOMEM;
 }
+
+/* Allocate one hiq mqd (HWS) and all SDMA mqd in a continuous trunk*/
+static int allocate_hiq_sdma_mqd(struct device_queue_manager *dqm)
+{
+   int retval;
+   struct kfd_dev *dev = dqm->dev;
+   struct kfd_mem_obj *mem_obj = &dqm->hiq_sdma_mqd;
+   uint32_t size = dqm->mqd_mgrs[KFD_MQD_TYPE_SDMA]->mqd_size *
+   dev->device_info->num_sdma_engines *
+   dev->device_info->num_sdma_queues_per_engine +
+   dqm->mqd_mgrs[KFD_MQD_TYPE_HIQ]->mqd_size;
+
+   retval = amdgpu_amdkfd_alloc_gtt_mem(dev->kgd, size,
+   &(mem_obj->gtt_mem), &(mem_obj->gpu_addr),
+   (void *)&(mem_obj->cpu_ptr), true);
+
+   return retval;
+}
+
 struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 {
struct device_queue_manager *dqm;
@@ -1729,6 +1748,11 @@ struct device_queue_manager 
*device_queue_manager_init(struct kfd_dev *dev)
if (init_mqd_managers(dqm))
goto out_free;
 
+   if (allocate_hiq_sdma_mqd(dqm)) {
+   pr_err("Failed to allocate hiq sdma mqd trunk buffer\n");
+   goto out_free;
+   }
+
if (!dqm->ops.initialize(dqm))
return dqm;
 
@@ -1737,9 +1761,17 @@ struct device_queue_manager 
*device_queue_manager_init(struct kfd_dev *dev)
return NULL;
 }
 
+void deallocate_hiq_sdma_mqd(struct kfd_dev *dev, struct kfd_mem_obj *mqd)
+{
+   WARN(!mqd, "No hiq sdma mqd trunk to free");
+
+   amdgpu_amdkfd_free_gtt_mem(dev->kgd, mqd->gtt_mem);
+}
+
 void device_queue_manager_uninit(struct device_queue_manager *dqm)
 {
dqm->ops.uninitialize(dqm);
+   deallocate_hiq_sdma_mqd(dqm->dev, &dqm->hiq_sdma_mqd);
kfree(dqm);
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index a5ef7a6650a5..3742fd340ec3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -197,6 +197,7 @@ struct device_queue_manager {
/* hw exception  */
boolis_hws_hang;
struct work_struct  hw_exception_work;
+   struct kfd_mem_obj  hiq_sdma_mqd;
 };
 
 void device_queue_manager_init_cik(
-- 
2.17.1


[PATCH 21/27] drm/amdkfd: Preserve wave state after instruction fetch MEM_VIOL

2019-04-28 Thread Kuehling, Felix
From: Jay Cornwall 

If instruction fetch fails the wave cannot be halted and returned to
the shader without raising MEM_VIOL again. Currently the wave is
terminated if this occurs, but this loses information about the cause
of the fault. The debugger would prefer the faulting wave state to be
context-saved.

Poll inside the trap handler until TRAPSTS.SAVECTX indicates context
save is ready. Exit the poll loop and complete the remainder of the
exception handler, then return to the shader. The next instruction
fetch will be from the trap handler and not the faulting PC. Context
save will then deschedule the wave and save its state.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 10 ++
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 10 --
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index ec9a9a99f808..097da0dd3b04 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -274,15 +274,17 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
 
 
 static const uint32_t cwsr_trap_gfx9_hex[] = {
-   0xbf820001, 0xbf82015d,
+   0xbf820001, 0xbf820161,
0xb8f8f802, 0x89788678,
0xb8f1f803, 0x866eff71,
-   0x0400, 0xbf850037,
+   0x0400, 0xbf85003b,
0x866eff71, 0x0800,
0xbf850003, 0x866eff71,
-   0x0100, 0xbf840008,
+   0x0100, 0xbf84000c,
0x866eff78, 0x2000,
-   0xbf840001, 0xbf81,
+   0xbf840005, 0xbf8e0010,
+   0xb8eef803, 0x866eff6e,
+   0x0400, 0xbf84fffb,
0x8778ff78, 0x2000,
0x80ec886c, 0x82ed806d,
0xb8eef807, 0x866fff6e,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index 0bb9c577b3a2..6a010c9e55de 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -266,10 +266,16 @@ if (!EMU_RUN_HACK)
 
 L_HALT_WAVE:
 // If STATUS.HALT is set then this fault must come from SQC instruction 
fetch.
-// We cannot prevent further faults so just terminate the wavefront.
+// We cannot prevent further faults. Spin wait until context saved.
 s_and_b32   ttmp2, s_save_status, SQ_WAVE_STATUS_HALT_MASK
 s_cbranch_scc0  L_NOT_ALREADY_HALTED
-s_endpgm
+
+L_WAIT_CTX_SAVE:
+s_sleep 0x10
+s_getreg_b32ttmp2, hwreg(HW_REG_TRAPSTS)
+s_and_b32   ttmp2, ttmp2, SQ_WAVE_TRAPSTS_SAVECTX_MASK
+s_cbranch_scc0  L_WAIT_CTX_SAVE
+
 L_NOT_ALREADY_HALTED:
 s_or_b32s_save_status, s_save_status, SQ_WAVE_STATUS_HALT_MASK
 
-- 
2.17.1


[PATCH 12/27] drm/amdkfd: Allocate hiq and sdma mqd from mqd trunk

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

Instead of allocating HIQ and SDMA MQDs from the sub-allocator, allocate
them from an MQD trunk pool. This is done for all ASICs.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  | 49 +++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  7 +++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 20 +++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 22 +++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 22 +++--
 5 files changed, 80 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index eeb2b60a36b5..9307811bc427 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -23,6 +23,55 @@
 
 #include "kfd_mqd_manager.h"
 #include "amdgpu_amdkfd.h"
+#include "kfd_device_queue_manager.h"
+
+struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_dev *dev)
+{
+   struct kfd_mem_obj *mqd_mem_obj = NULL;
+
+   mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);
+   if (!mqd_mem_obj)
+   return NULL;
+
+   mqd_mem_obj->gtt_mem = dev->dqm->hiq_sdma_mqd.gtt_mem;
+   mqd_mem_obj->gpu_addr = dev->dqm->hiq_sdma_mqd.gpu_addr;
+   mqd_mem_obj->cpu_ptr = dev->dqm->hiq_sdma_mqd.cpu_ptr;
+
+   return mqd_mem_obj;
+}
+
+struct kfd_mem_obj *allocate_sdma_mqd(struct kfd_dev *dev,
+   struct queue_properties *q)
+{
+   struct kfd_mem_obj *mqd_mem_obj = NULL;
+   uint64_t offset;
+
+   mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);
+   if (!mqd_mem_obj)
+   return NULL;
+
+   offset = (q->sdma_engine_id *
+   dev->device_info->num_sdma_queues_per_engine +
+   q->sdma_queue_id) *
+   dev->dqm->mqd_mgrs[KFD_MQD_TYPE_SDMA]->mqd_size;
+
+   offset += dev->dqm->mqd_mgrs[KFD_MQD_TYPE_HIQ]->mqd_size;
+
+   mqd_mem_obj->gtt_mem = (void *)((uint64_t)dev->dqm->hiq_sdma_mqd.gtt_mem
+   + offset);
+   mqd_mem_obj->gpu_addr = dev->dqm->hiq_sdma_mqd.gpu_addr + offset;
+   mqd_mem_obj->cpu_ptr = (uint32_t *)((uint64_t)
+   dev->dqm->hiq_sdma_mqd.cpu_ptr + offset);
+
+   return mqd_mem_obj;
+}
+
+void uninit_mqd_hiq_sdma(struct mqd_manager *mm, void *mqd,
+   struct kfd_mem_obj *mqd_mem_obj)
+{
+   WARN_ON(!mqd_mem_obj->gtt_mem);
+   kfree(mqd_mem_obj);
+}
 
 void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
const uint32_t *cu_mask, uint32_t cu_mask_count,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
index 009d232fb60b..56af256a191b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
@@ -102,6 +102,13 @@ struct mqd_manager {
uint32_t mqd_size;
 };
 
+struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_dev *dev);
+
+struct kfd_mem_obj *allocate_sdma_mqd(struct kfd_dev *dev,
+   struct queue_properties *q);
+void uninit_mqd_hiq_sdma(struct mqd_manager *mm, void *mqd,
+   struct kfd_mem_obj *mqd_mem_obj);
+
 void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
const uint32_t *cu_mask, uint32_t cu_mask_count,
uint32_t *se_mask);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index a00402077e34..6e8509ec29d9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -71,6 +71,9 @@ static struct kfd_mem_obj *allocate_mqd(struct kfd_dev *kfd,
 {
struct kfd_mem_obj *mqd_mem_obj;
 
+   if (q->type == KFD_QUEUE_TYPE_HIQ)
+   return allocate_hiq_mqd(kfd);
+
if (kfd_gtt_sa_allocate(kfd, sizeof(struct cik_mqd),
			&mqd_mem_obj))
return NULL;
@@ -148,12 +151,10 @@ static int init_mqd_sdma(struct mqd_manager *mm, void 
**mqd,
 {
int retval;
struct cik_sdma_rlc_registers *m;
+   struct kfd_dev *dev = mm->dev;
 
-   retval = kfd_gtt_sa_allocate(mm->dev,
-   sizeof(struct cik_sdma_rlc_registers),
-   mqd_mem_obj);
-
-   if (retval != 0)
+   *mqd_mem_obj = allocate_sdma_mqd(dev, q);
+   if (!*mqd_mem_obj)
return -ENOMEM;
 
m = (struct cik_sdma_rlc_registers *) (*mqd_mem_obj)->cpu_ptr;
@@ -175,11 +176,6 @@ static void uninit_mqd(struct mqd_manager *mm, void *mqd,
kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
 }
 
-static void uninit_mqd_sdma(struct mqd_manager *mm, void *mqd,
-   struct kfd_mem_obj *mqd_mem_obj)
-{
-   kfd_gtt_sa_free(mm->dev, 

[PATCH 03/27] drm/amdkfd: Differentiate b/t sdma_id and sdma_queue_id

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

sdma_queue_id is sdma queue index inside one sdma engine.
sdma_id is sdma queue index among all sdma engines. Use
those two names properly.
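
Concretely, the relationship between the two indices, as computed elsewhere
in this series, is:

/* sdma_id enumerates SDMA queues across all engines; the pair
 * (sdma_engine_id, sdma_queue_id) locates the queue on one engine. */
unsigned int sdma_engine_of(unsigned int sdma_id, unsigned int num_engines)
{
	return sdma_id % num_engines;	/* sdma_engine_id */
}

unsigned int sdma_queue_of(unsigned int sdma_id, unsigned int num_engines)
{
	return sdma_id / num_engines;	/* sdma_queue_id */
}
/* e.g. with 2 engines, sdma_id 0..3 maps to (engine 0, queue 0),
 * (engine 1, queue 0), (engine 0, queue 1), (engine 1, queue 1). */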

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 937ed1a7050d..7e5ead042dc0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -922,7 +922,7 @@ static int stop_nocpsch(struct device_queue_manager *dqm)
 }
 
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
-   unsigned int *sdma_queue_id)
+   unsigned int *sdma_id)
 {
int bit;
 
@@ -931,17 +931,17 @@ static int allocate_sdma_queue(struct 
device_queue_manager *dqm,
 
bit = __ffs64(dqm->sdma_bitmap);
dqm->sdma_bitmap &= ~(1ULL << bit);
-   *sdma_queue_id = bit;
+   *sdma_id = bit;
 
return 0;
 }
 
 static void deallocate_sdma_queue(struct device_queue_manager *dqm,
-   unsigned int sdma_queue_id)
+   unsigned int sdma_id)
 {
-   if (sdma_queue_id >= get_num_sdma_queues(dqm))
+   if (sdma_id >= get_num_sdma_queues(dqm))
return;
-   dqm->sdma_bitmap |= (1ULL << sdma_queue_id);
+   dqm->sdma_bitmap |= (1ULL << sdma_id);
 }
 
 static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
-- 
2.17.1


[PATCH 01/27] drm/amdkfd: Use 64 bit sdma_bitmap

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

Support a maximum of 64 SDMA queues.
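
The essential detail is the 1ULL in the shifts below: with a plain 32-bit
int, shifting 1 into bit 31 or beyond is undefined, so a 64-bit bitmap
needs 64-bit shifts throughout. A tiny illustrative helper, not from the
patch itself:

#include <stdint.h>

uint64_t mark_queue_free(uint64_t bitmap, unsigned int bit)
{
	/* 1ULL keeps the shift well-defined for bit values up to 63 */
	return bitmap | (1ULL << bit);
}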

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 1d6b15788ebf..0b1044dea765 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -891,7 +891,7 @@ static int initialize_nocpsch(struct device_queue_manager 
*dqm)
}
 
dqm->vmid_bitmap = (1 << dqm->dev->vm_info.vmid_num_kfd) - 1;
-   dqm->sdma_bitmap = (1 << get_num_sdma_queues(dqm)) - 1;
+   dqm->sdma_bitmap = (1ULL << get_num_sdma_queues(dqm)) - 1;
 
return 0;
 }
@@ -929,8 +929,8 @@ static int allocate_sdma_queue(struct device_queue_manager 
*dqm,
if (dqm->sdma_bitmap == 0)
return -ENOMEM;
 
-   bit = ffs(dqm->sdma_bitmap) - 1;
-   dqm->sdma_bitmap &= ~(1 << bit);
+   bit = __ffs64(dqm->sdma_bitmap);
+   dqm->sdma_bitmap &= ~(1ULL << bit);
*sdma_queue_id = bit;
 
return 0;
@@ -941,7 +941,7 @@ static void deallocate_sdma_queue(struct 
device_queue_manager *dqm,
 {
if (sdma_queue_id >= get_num_sdma_queues(dqm))
return;
-   dqm->sdma_bitmap |= (1 << sdma_queue_id);
+   dqm->sdma_bitmap |= (1ULL << sdma_queue_id);
 }
 
 static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
@@ -1047,7 +1047,7 @@ static int initialize_cpsch(struct device_queue_manager 
*dqm)
dqm->queue_count = dqm->processes_count = 0;
dqm->sdma_queue_count = 0;
dqm->active_runlist = false;
-   dqm->sdma_bitmap = (1 << get_num_sdma_queues(dqm)) - 1;
+   dqm->sdma_bitmap = (1ULL << get_num_sdma_queues(dqm)) - 1;
 
	INIT_WORK(&dqm->hw_exception_work, kfd_process_hw_exception);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index 70e38a2e23b9..2770f3ece89f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -188,7 +188,7 @@ struct device_queue_manager {
unsigned inttotal_queue_count;
unsigned intnext_pipe_to_allocate;
unsigned int*allocated_queues;
-   unsigned intsdma_bitmap;
+   uint64_tsdma_bitmap;
unsigned intvmid_bitmap;
uint64_tpipelines_addr;
struct kfd_mem_obj  *pipeline_mem;
-- 
2.17.1


[PATCH 07/27] drm/amdkfd: Introduce DIQ type mqd manager

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

With the introduction of the new MQD allocation scheme for HIQ,
DIQ and HIQ use different MQD allocation schemes, so DIQ
can't reuse the HIQ MQD manager.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |  3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 11 +++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 11 +++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 11 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  1 +
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c|  1 -
 6 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index f1596881f20a..58bb3ad233a1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -58,6 +58,9 @@ static bool initialize(struct kernel_queue *kq, struct 
kfd_dev *dev,
kq->nop_packet = nop.u32all;
switch (type) {
case KFD_QUEUE_TYPE_DIQ:
+   kq->mqd_mgr = dev->dqm->ops.get_mqd_manager(dev->dqm,
+   KFD_MQD_TYPE_DIQ);
+   break;
case KFD_QUEUE_TYPE_HIQ:
kq->mqd_mgr = dev->dqm->ops.get_mqd_manager(dev->dqm,
KFD_MQD_TYPE_HIQ);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index ae90a99909ef..e69bb4d3c3a9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -413,6 +413,17 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE 
type,
mqd->is_occupied = is_occupied;
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
+#endif
+   break;
+   case KFD_MQD_TYPE_DIQ:
+   mqd->init_mqd = init_mqd_hiq;
+   mqd->uninit_mqd = uninit_mqd;
+   mqd->load_mqd = load_mqd;
+   mqd->update_mqd = update_mqd_hiq;
+   mqd->destroy_mqd = destroy_mqd;
+   mqd->is_occupied = is_occupied;
+#if defined(CONFIG_DEBUG_FS)
+   mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
break;
case KFD_MQD_TYPE_SDMA:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 8fe74b821b32..273aad4f59c8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -475,6 +475,17 @@ struct mqd_manager *mqd_manager_init_v9(enum KFD_MQD_TYPE 
type,
mqd->is_occupied = is_occupied;
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
+#endif
+   break;
+   case KFD_MQD_TYPE_DIQ:
+   mqd->init_mqd = init_mqd_hiq;
+   mqd->uninit_mqd = uninit_mqd;
+   mqd->load_mqd = load_mqd;
+   mqd->update_mqd = update_mqd_hiq;
+   mqd->destroy_mqd = destroy_mqd;
+   mqd->is_occupied = is_occupied;
+#if defined(CONFIG_DEBUG_FS)
+   mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
break;
case KFD_MQD_TYPE_SDMA:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index 6469b3456f00..67bd590a82fc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -472,6 +472,17 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE 
type,
mqd->is_occupied = is_occupied;
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
+#endif
+   break;
+   case KFD_MQD_TYPE_DIQ:
+   mqd->init_mqd = init_mqd_hiq;
+   mqd->uninit_mqd = uninit_mqd;
+   mqd->load_mqd = load_mqd;
+   mqd->update_mqd = update_mqd_hiq;
+   mqd->destroy_mqd = destroy_mqd;
+   mqd->is_occupied = is_occupied;
+#if defined(CONFIG_DEBUG_FS)
+   mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
break;
case KFD_MQD_TYPE_SDMA:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 3dd48da0e2d6..d1d60336172a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -472,6 +472,7 @@ enum KFD_MQD_TYPE {
KFD_MQD_TYPE_HIQ,   /* for hiq */
KFD_MQD_TYPE_CP,/* for cp queues and diq */
KFD_MQD_TYPE_SDMA,  /* for sdma queues */
+   KFD_MQD_TYPE_DIQ,   /* for diq */
KFD_MQD_TYPE_MAX
 };
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 

[PATCH 05/27] drm/amdkfd: Fix a potential memory leak

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

Free mqd_mem_obj if GTT buffer allocation for MQD+control stack fails.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 9dbba609450e..8fe74b821b32 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -76,6 +76,7 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
struct v9_mqd *m;
struct kfd_dev *kfd = mm->dev;
 
+   *mqd_mem_obj = NULL;
/* From V9,  for CWSR, the control stack is located on the next page
 * boundary after the mqd, we will use the gtt allocation function
 * instead of sub-allocation function.
@@ -93,8 +94,10 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
} else
retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct v9_mqd),
mqd_mem_obj);
-   if (retval != 0)
+   if (retval) {
+   kfree(*mqd_mem_obj);
return -ENOMEM;
+   }
 
m = (struct v9_mqd *) (*mqd_mem_obj)->cpu_ptr;
addr = (*mqd_mem_obj)->gpu_addr;
-- 
2.17.1


[PATCH 02/27] drm/amdkfd: Add sdma allocation debug message

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

Add debug messages during SDMA queue allocation.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 0b1044dea765..937ed1a7050d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1177,6 +1177,9 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
q->sdma_id / get_num_sdma_engines(dqm);
q->properties.sdma_engine_id =
q->sdma_id % get_num_sdma_engines(dqm);
+   pr_debug("SDMA id is:%d\n", q->sdma_id);
+   pr_debug("SDMA queue id: %d\n", q->properties.sdma_queue_id);
+   pr_debug("SDMA engine id: %d\n", q->properties.sdma_engine_id);
}
 
retval = allocate_doorbell(qpd, q);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 27/27] drm/amdgpu: Fix GTT size calculation

2019-04-28 Thread Kuehling, Felix
From: Kent Russell 

GTT size is currently limited to the minimum of VRAM size or 3/4 of
system memory. This severely limits the quantity of system memory
that can be used by ROCm applications.

Increase GTT size to the maximum of VRAM size or system memory size.

Signed-off-by: Kent Russell 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
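For a concrete feel of the change, a small standalone example with
hypothetical sizes (16 GiB VRAM, 32 GiB system RAM, 3 GiB default GTT;
none of these numbers come from the patch):

#include <stdint.h>
#include <stdio.h>

#define GiB (1ULL << 30)

static uint64_t max2(uint64_t a, uint64_t b) { return a > b ? a : b; }
static uint64_t min2(uint64_t a, uint64_t b) { return a < b ? a : b; }

int main(void)
{
	uint64_t def = 3 * GiB, vram = 16 * GiB, sysram = 32 * GiB;

	/* old: min(max(default, VRAM), 3/4 of system RAM) -> 16 GiB */
	uint64_t old_gtt = min2(max2(def, vram), sysram * 3 / 4);
	/* new: max3(default, VRAM, system RAM) -> 32 GiB */
	uint64_t new_gtt = max2(max2(def, vram), sysram);

	printf("old %llu GiB, new %llu GiB\n",
	       (unsigned long long)(old_gtt / GiB),
	       (unsigned long long)(new_gtt / GiB));
	return 0;
}
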
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c14198737dcd..e9ecc3953673 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1740,11 +1740,10 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
struct sysinfo si;
 
si_meminfo();
-   gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-  adev->gmc.mc_vram_size),
-  ((uint64_t)si.totalram * si.mem_unit * 3/4));
-   }
-   else
+   gtt_size = max3((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
+   adev->gmc.mc_vram_size,
+   ((uint64_t)si.totalram * si.mem_unit));
+   } else
gtt_size = (uint64_t)amdgpu_gtt_size << 20;
 
/* Initialize GTT memory pool */
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 25/27] drm/amdkfd: Add domain number into gpu_id

2019-04-28 Thread Kuehling, Felix
From: Amber Lin 

A multi-socket server can have multiple PCIe segments, so the BDF alone is
not enough to distinguish each GPU. Also take the domain number into account
when generating the gpu_id.

Signed-off-by: Amber Lin 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
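The packing idea, reduced to a standalone sketch (the helper and its
arguments are illustrative; the real code reads these fields from struct
pci_dev and feeds all the words into a checksum to form the gpu_id):

#include <stdint.h>

/* Two 16-bit subsystem IDs now share one hash-input word, which frees
 * a word for the PCI domain, so otherwise-identical GPUs on different
 * PCIe segments hash to different gpu_ids. */
static void fill_id_words(uint32_t buf[3], uint16_t subsys_vendor,
			  uint16_t subsys_device, uint32_t domain)
{
	buf[0] = subsys_vendor | ((uint32_t)subsys_device << 16);
	buf[1] = domain;  /* e.g. 0x0000 vs 0x0001 on a two-segment server */
	buf[2] = 0;       /* devfn/device/bus words omitted in this sketch */
}
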
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 64099a8494e1..2c06d6c16eab 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1082,8 +1082,9 @@ static uint32_t kfd_generate_gpu_id(struct kfd_dev *gpu)
local_mem_info.local_mem_size_public;
 
buf[0] = gpu->pdev->devfn;
-   buf[1] = gpu->pdev->subsystem_vendor;
-   buf[2] = gpu->pdev->subsystem_device;
+   buf[1] = gpu->pdev->subsystem_vendor |
+   (gpu->pdev->subsystem_device << 16);
+   buf[2] = pci_domain_nr(gpu->pdev->bus);
buf[3] = gpu->pdev->device;
buf[4] = gpu->pdev->bus->number;
buf[5] = lower_32_bits(local_mem_size);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 18/27] drm/amdkfd: Delete alloc_format field from map_queue struct

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

The alloc format was never really supported by the MEC FW; the FW always
does one-per-pipe allocation.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c | 2 --
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 2 --
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h  | 7 +--
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h  | 7 +--
 4 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
index 604570bea6bd..3dd731c69b5d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
@@ -153,8 +153,6 @@ static int pm_map_queues_v9(struct packet_manager *pm, 
uint32_t *buffer,
 
packet->header.u32All = pm_build_pm4_header(IT_MAP_QUEUES,
sizeof(struct pm4_mes_map_queues));
-   packet->bitfields2.alloc_format =
-   alloc_format__mes_map_queues__one_per_pipe_vi;
packet->bitfields2.num_queues = 1;
packet->bitfields2.queue_sel =
queue_sel__mes_map_queues__map_to_hws_determined_queue_slots_vi;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
index 3cdb19826927..2adaf40027eb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
@@ -190,8 +190,6 @@ static int pm_map_queues_vi(struct packet_manager *pm, 
uint32_t *buffer,
 
packet->header.u32All = pm_build_pm4_header(IT_MAP_QUEUES,
sizeof(struct pm4_mes_map_queues));
-   packet->bitfields2.alloc_format =
-   alloc_format__mes_map_queues__one_per_pipe_vi;
packet->bitfields2.num_queues = 1;
packet->bitfields2.queue_sel =
queue_sel__mes_map_queues__map_to_hws_determined_queue_slots_vi;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
index f2bcf5c092ea..0661339071f0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
@@ -255,11 +255,6 @@ enum mes_map_queues_queue_type_enum {
 queue_type__mes_map_queues__low_latency_static_queue_vi = 3
 };
 
-enum mes_map_queues_alloc_format_enum {
-   alloc_format__mes_map_queues__one_per_pipe_vi = 0,
-alloc_format__mes_map_queues__all_on_one_pipe_vi = 1
-};
-
 enum mes_map_queues_engine_sel_enum {
engine_sel__mes_map_queues__compute_vi = 0,
engine_sel__mes_map_queues__sdma0_vi = 2,
@@ -279,7 +274,7 @@ struct pm4_mes_map_queues {
enum mes_map_queues_queue_sel_enum queue_sel:2;
uint32_t reserved2:15;
enum mes_map_queues_queue_type_enum queue_type:3;
-   enum mes_map_queues_alloc_format_enum alloc_format:2;
+   uint32_t reserved3:2;
enum mes_map_queues_engine_sel_enum engine_sel:3;
uint32_t num_queues:3;
} bitfields2;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
index 7c8d9b357749..5466cfe1c3cc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
@@ -216,11 +216,6 @@ enum mes_map_queues_queue_type_vi_enum {
 queue_type__mes_map_queues__low_latency_static_queue_vi = 3
 };
 
-enum mes_map_queues_alloc_format_vi_enum {
-   alloc_format__mes_map_queues__one_per_pipe_vi = 0,
-alloc_format__mes_map_queues__all_on_one_pipe_vi = 1
-};
-
 enum mes_map_queues_engine_sel_vi_enum {
engine_sel__mes_map_queues__compute_vi = 0,
engine_sel__mes_map_queues__sdma0_vi = 2,
@@ -240,7 +235,7 @@ struct pm4_mes_map_queues {
enum mes_map_queues_queue_sel_vi_enum queue_sel:2;
uint32_t reserved2:15;
enum mes_map_queues_queue_type_vi_enum queue_type:3;
-   enum mes_map_queues_alloc_format_vi_enum alloc_format:2;
+   uint32_t reserved3:2;
enum mes_map_queues_engine_sel_vi_enum engine_sel:3;
uint32_t num_queues:3;
} bitfields2;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 11/27] drm/amdkfd: Move non-sdma mqd allocation out of init_mqd

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

This is preparation work to introduce more MQD allocation
schemes.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 20 ++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 51 ---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 18 +--
 3 files changed, 64 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index eec131b801b0..a00402077e34 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -66,6 +66,19 @@ static void update_cu_mask(struct mqd_manager *mm, void *mqd,
m->compute_static_thread_mgmt_se3);
 }
 
+static struct kfd_mem_obj *allocate_mqd(struct kfd_dev *kfd,
+   struct queue_properties *q)
+{
+   struct kfd_mem_obj *mqd_mem_obj;
+
+   if (kfd_gtt_sa_allocate(kfd, sizeof(struct cik_mqd),
+   &mqd_mem_obj))
+   return NULL;
+
+   return mqd_mem_obj;
+}
+
+
 static int init_mqd(struct mqd_manager *mm, void **mqd,
struct kfd_mem_obj **mqd_mem_obj, uint64_t *gart_addr,
struct queue_properties *q)
@@ -73,11 +86,10 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
uint64_t addr;
struct cik_mqd *m;
int retval;
+   struct kfd_dev *kfd = mm->dev;
 
-   retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
-   mqd_mem_obj);
-
-   if (retval != 0)
+   *mqd_mem_obj = allocate_mqd(kfd, q);
+   if (!*mqd_mem_obj)
return -ENOMEM;
 
m = (struct cik_mqd *) (*mqd_mem_obj)->cpu_ptr;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 15274a880ea2..8f8166189fd5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -67,38 +67,53 @@ static void update_cu_mask(struct mqd_manager *mm, void 
*mqd,
m->compute_static_thread_mgmt_se3);
 }
 
-static int init_mqd(struct mqd_manager *mm, void **mqd,
-   struct kfd_mem_obj **mqd_mem_obj, uint64_t *gart_addr,
-   struct queue_properties *q)
+static struct kfd_mem_obj *allocate_mqd(struct kfd_dev *kfd,
+   struct queue_properties *q)
 {
int retval;
-   uint64_t addr;
-   struct v9_mqd *m;
-   struct kfd_dev *kfd = mm->dev;
+   struct kfd_mem_obj *mqd_mem_obj = NULL;
 
-   *mqd_mem_obj = NULL;
/* From V9,  for CWSR, the control stack is located on the next page
 * boundary after the mqd, we will use the gtt allocation function
 * instead of sub-allocation function.
 */
if (kfd->cwsr_enabled && (q->type == KFD_QUEUE_TYPE_COMPUTE)) {
-   *mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);
-   if (!*mqd_mem_obj)
-   return -ENOMEM;
+   mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_NOIO);
+   if (!mqd_mem_obj)
+   return NULL;
retval = amdgpu_amdkfd_alloc_gtt_mem(kfd->kgd,
ALIGN(q->ctl_stack_size, PAGE_SIZE) +
ALIGN(sizeof(struct v9_mqd), PAGE_SIZE),
-   &((*mqd_mem_obj)->gtt_mem),
-   &((*mqd_mem_obj)->gpu_addr),
-   (void *)&((*mqd_mem_obj)->cpu_ptr), true);
-   } else
-   retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct v9_mqd),
-   mqd_mem_obj);
+   &(mqd_mem_obj->gtt_mem),
+   &(mqd_mem_obj->gpu_addr),
+   (void *)&(mqd_mem_obj->cpu_ptr), true);
+   } else {
+   retval = kfd_gtt_sa_allocate(kfd, sizeof(struct v9_mqd),
+   &mqd_mem_obj);
+   }
+
if (retval) {
-   kfree(*mqd_mem_obj);
-   return -ENOMEM;
+   kfree(mqd_mem_obj);
+   return NULL;
}
 
+   return mqd_mem_obj;
+
+}
+
+static int init_mqd(struct mqd_manager *mm, void **mqd,
+   struct kfd_mem_obj **mqd_mem_obj, uint64_t *gart_addr,
+   struct queue_properties *q)
+{
+   int retval;
+   uint64_t addr;
+   struct v9_mqd *m;
+   struct kfd_dev *kfd = mm->dev;
+
+   *mqd_mem_obj = allocate_mqd(kfd, q);
+   if (!*mqd_mem_obj)
+   return -ENOMEM;
+
m = (struct v9_mqd *) (*mqd_mem_obj)->cpu_ptr;
addr = (*mqd_mem_obj)->gpu_addr;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index ad9dc9a678f2..3296ffbde6ac 100644
--- 

[PATCH 00/27] KFD upstreaming

2019-04-28 Thread Kuehling, Felix
Assorted KFD changes that have been accumulating on amd-kfd-staging. New
features and fixes included:
* Support for VegaM
* Support for systems with multiple PCI domains
* New SDMA queue type that's optimized for XGMI links
* SDMA MQD allocation changes to support future ASICs with more SDMA queues
* Fix for compute profile switching at process termination
* Fix for a circular lock dependency in MMU notifiers
* Fix for TLB flushing bug with XGMI enabled
* Fix for artificial GTT system memory limitation
* Trap handler updates

Amber Lin (1):
  drm/amdkfd: Add domain number into gpu_id

Felix Kuehling (1):
  drm/amdkfd: Fix a circular lock dependency

Harish Kasiviswanathan (1):
  drm/amdkfd: Fix compute profile switching

Jay Cornwall (4):
  drm/amdkfd: Fix gfx8 MEM_VIOL exception handler
  drm/amdkfd: Preserve wave state after instruction fetch MEM_VIOL
  drm/amdkfd: Fix gfx9 XNACK state save/restore
  drm/amdkfd: Preserve ttmp[4:5] instead of ttmp[14:15]

Kent Russell (2):
  drm/amdkfd: Add VegaM support
  drm/amdgpu: Fix GTT size calculation

Oak Zeng (16):
  drm/amdkfd: Use 64 bit sdma_bitmap
  drm/amdkfd: Add sdma allocation debug message
  drm/amdkfd: Differentiate b/t sdma_id and sdma_queue_id
  drm/amdkfd: Shift sdma_engine_id and sdma_queue_id in mqd
  drm/amdkfd: Fix a potential memory leak
  drm/amdkfd: Introduce asic-specific mqd_manager_init function
  drm/amdkfd: Introduce DIQ type mqd manager
  drm/amdkfd: Init mqd managers in device queue manager init
  drm/amdkfd: Add mqd size in mqd manager struct
  drm/amdkfd: Allocate MQD trunk for HIQ and SDMA
  drm/amdkfd: Move non-sdma mqd allocation out of init_mqd
  drm/amdkfd: Allocate hiq and sdma mqd from mqd trunk
  drm/amdkfd: Fix sdma queue map issue
  drm/amdkfd: Introduce XGMI SDMA queue type
  drm/amdkfd: Expose sdma engine numbers to topology
  drm/amdkfd: Delete alloc_format field from map_queue struct

Yong Zhao (1):
  drm/amdkfd: Move sdma_queue_id calculation into allocate_sdma_queue()

shaoyunl (1):
  drm/amdgpu: Use heavy weight for tlb invalidation on xgmi
configuration

 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  53 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   9 +-
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 483 +-
 .../drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm |  13 -
 .../drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm |  63 +--
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c |   5 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  51 ++
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 354 -
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  14 +-
 .../amd/amdkfd/kfd_device_queue_manager_cik.c |   2 +
 .../amd/amdkfd/kfd_device_queue_manager_v9.c  |   1 +
 .../amd/amdkfd/kfd_device_queue_manager_vi.c  |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c  |   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |   6 +-
 .../gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c  |   4 +-
 .../gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c  |   4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  |  70 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |   8 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  53 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  85 +--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  53 +-
 .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   |   4 +-
 .../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h   |   7 +-
 .../gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h   |   7 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  14 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  14 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  13 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |   2 +
 drivers/gpu/drm/amd/include/cik_structs.h |   3 +-
 drivers/gpu/drm/amd/include/v9_structs.h  |   3 +-
 drivers/gpu/drm/amd/include/vi_structs.h  |   3 +-
 include/uapi/linux/kfd_ioctl.h|   7 +-
 33 files changed, 826 insertions(+), 587 deletions(-)

-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 04/27] drm/amdkfd: Shift sdma_engine_id and sdma_queue_id in mqd

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

The FW of some new ASICs requires the SDMA MQD size to be no more than
128 dwords. Repurpose the last two reserved fields of the SDMA MQD for
driver-internal use, so the total MQD size is no bigger than 128
dwords.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
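A build-time guard along these lines (hypothetical, not part of the patch)
would catch any future field addition that pushes the layout past the FW
limit, assuming the struct is meant to stay at exactly 128 dwords:

#include <stdint.h>

/* opaque stand-in for the real v9_sdma_mqd layout */
struct sdma_mqd_sketch { uint32_t regs[128]; };

_Static_assert(sizeof(struct sdma_mqd_sketch) == 128 * sizeof(uint32_t),
	       "SDMA MQD must not exceed 128 dwords");
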
 drivers/gpu/drm/amd/include/cik_structs.h | 3 +--
 drivers/gpu/drm/amd/include/v9_structs.h  | 3 +--
 drivers/gpu/drm/amd/include/vi_structs.h  | 3 +--
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/cik_structs.h 
b/drivers/gpu/drm/amd/include/cik_structs.h
index 749eab94e335..699e658c3cec 100644
--- a/drivers/gpu/drm/amd/include/cik_structs.h
+++ b/drivers/gpu/drm/amd/include/cik_structs.h
@@ -282,8 +282,7 @@ struct cik_sdma_rlc_registers {
uint32_t reserved_123;
uint32_t reserved_124;
uint32_t reserved_125;
-   uint32_t reserved_126;
-   uint32_t reserved_127;
+   /* reserved_126,127: repurposed for driver-internal use */
uint32_t sdma_engine_id;
uint32_t sdma_queue_id;
 };
diff --git a/drivers/gpu/drm/amd/include/v9_structs.h 
b/drivers/gpu/drm/amd/include/v9_structs.h
index ceaf4932258d..8b383dbe1cda 100644
--- a/drivers/gpu/drm/amd/include/v9_structs.h
+++ b/drivers/gpu/drm/amd/include/v9_structs.h
@@ -151,8 +151,7 @@ struct v9_sdma_mqd {
uint32_t reserved_123;
uint32_t reserved_124;
uint32_t reserved_125;
-   uint32_t reserved_126;
-   uint32_t reserved_127;
+   /* reserved_126,127: repurposed for driver-internal use */
uint32_t sdma_engine_id;
uint32_t sdma_queue_id;
 };
diff --git a/drivers/gpu/drm/amd/include/vi_structs.h 
b/drivers/gpu/drm/amd/include/vi_structs.h
index 717fbae1d362..c17613287cd0 100644
--- a/drivers/gpu/drm/amd/include/vi_structs.h
+++ b/drivers/gpu/drm/amd/include/vi_structs.h
@@ -151,8 +151,7 @@ struct vi_sdma_mqd {
uint32_t reserved_123;
uint32_t reserved_124;
uint32_t reserved_125;
-   uint32_t reserved_126;
-   uint32_t reserved_127;
+   /* reserved_126,127: repurposed for driver-internal use */
uint32_t sdma_engine_id;
uint32_t sdma_queue_id;
 };
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 09/27] drm/amdkfd: Add mqd size in mqd manager struct

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

Also initialize the MQD size during MQD manager initialization.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
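Recording the size in the manager is what lets a later patch allocate MQD
memory generically instead of hard-coding sizeof(struct ...) per ASIC; a
rough sketch of the pattern this enables (sketch only, types as in the
driver):

/* sketch: one allocation helper can serve every ASIC once the
 * manager carries its own MQD size */
static int allocate_mqd_generic(struct kfd_dev *dev,
				struct mqd_manager *mm,
				struct kfd_mem_obj **obj)
{
	return kfd_gtt_sa_allocate(dev, mm->mqd_size, obj);
}
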
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 4 
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c  | 4 
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c  | 4 
 4 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
index f8261313ae7b..009d232fb60b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
@@ -99,6 +99,7 @@ struct mqd_manager {
 
struct mutexmqd_mutex;
struct kfd_dev  *dev;
+   uint32_t mqd_size;
 };
 
 void mqd_symmetrically_map_cu_mask(struct mqd_manager *mm,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index e69bb4d3c3a9..eec131b801b0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -400,6 +400,7 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE 
type,
mqd->update_mqd = update_mqd;
mqd->destroy_mqd = destroy_mqd;
mqd->is_occupied = is_occupied;
+   mqd->mqd_size = sizeof(struct cik_mqd);
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
@@ -411,6 +412,7 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE 
type,
mqd->update_mqd = update_mqd_hiq;
mqd->destroy_mqd = destroy_mqd;
mqd->is_occupied = is_occupied;
+   mqd->mqd_size = sizeof(struct cik_mqd);
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
@@ -422,6 +424,7 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE 
type,
mqd->update_mqd = update_mqd_hiq;
mqd->destroy_mqd = destroy_mqd;
mqd->is_occupied = is_occupied;
+   mqd->mqd_size = sizeof(struct cik_mqd);
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
@@ -433,6 +436,7 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE 
type,
mqd->update_mqd = update_mqd_sdma;
mqd->destroy_mqd = destroy_mqd_sdma;
mqd->is_occupied = is_occupied_sdma;
+   mqd->mqd_size = sizeof(struct cik_sdma_rlc_registers);
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd_sdma;
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 273aad4f59c8..15274a880ea2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -462,6 +462,7 @@ struct mqd_manager *mqd_manager_init_v9(enum KFD_MQD_TYPE 
type,
mqd->destroy_mqd = destroy_mqd;
mqd->is_occupied = is_occupied;
mqd->get_wave_state = get_wave_state;
+   mqd->mqd_size = sizeof(struct v9_mqd);
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
@@ -473,6 +474,7 @@ struct mqd_manager *mqd_manager_init_v9(enum KFD_MQD_TYPE 
type,
mqd->update_mqd = update_mqd_hiq;
mqd->destroy_mqd = destroy_mqd;
mqd->is_occupied = is_occupied;
+   mqd->mqd_size = sizeof(struct v9_mqd);
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
@@ -484,6 +486,7 @@ struct mqd_manager *mqd_manager_init_v9(enum KFD_MQD_TYPE 
type,
mqd->update_mqd = update_mqd_hiq;
mqd->destroy_mqd = destroy_mqd;
mqd->is_occupied = is_occupied;
+   mqd->mqd_size = sizeof(struct v9_mqd);
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd;
 #endif
@@ -495,6 +498,7 @@ struct mqd_manager *mqd_manager_init_v9(enum KFD_MQD_TYPE 
type,
mqd->update_mqd = update_mqd_sdma;
mqd->destroy_mqd = destroy_mqd_sdma;
mqd->is_occupied = is_occupied_sdma;
+   mqd->mqd_size = sizeof(struct v9_sdma_mqd);
 #if defined(CONFIG_DEBUG_FS)
mqd->debugfs_show_mqd = debugfs_show_mqd_sdma;
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index 67bd590a82fc..ad9dc9a678f2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -459,6 +459,7 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE 
type,
mqd->destroy_mqd = destroy_mqd;
mqd->is_occupied = is_occupied;
mqd->get_wave_state = 

[PATCH 24/27] drm/amdkfd: Add VegaM support

2019-04-28 Thread Kuehling, Felix
From: Kent Russell 

Add the VegaM information to KFD

Signed-off-by: Kent Russell 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c |  5 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 20 +++
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c  |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |  1 +
 .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  1 +
 7 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 1714900035d7..59f8ca4297db 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -134,6 +134,7 @@ static struct kfd_gpu_cache_info carrizo_cache_info[] = {
 #define polaris10_cache_info carrizo_cache_info
 #define polaris11_cache_info carrizo_cache_info
 #define polaris12_cache_info carrizo_cache_info
+#define vegam_cache_info carrizo_cache_info
 /* TODO - check & update Vega10 cache details */
 #define vega10_cache_info carrizo_cache_info
 #define raven_cache_info carrizo_cache_info
@@ -652,6 +653,10 @@ static int kfd_fill_gpu_cache_info(struct kfd_dev *kdev,
pcache_info = polaris12_cache_info;
num_of_cache_types = ARRAY_SIZE(polaris12_cache_info);
break;
+   case CHIP_VEGAM:
+   pcache_info = vegam_cache_info;
+   num_of_cache_types = ARRAY_SIZE(vegam_cache_info);
+   break;
case CHIP_VEGA10:
case CHIP_VEGA12:
case CHIP_VEGA20:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 1368b41cb92b..a53dda9071b1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -232,6 +232,23 @@ static const struct kfd_device_info polaris12_device_info 
= {
.num_sdma_queues_per_engine = 2,
 };
 
+static const struct kfd_device_info vegam_device_info = {
+   .asic_family = CHIP_VEGAM,
+   .max_pasid_bits = 16,
+   .max_no_of_hqd  = 24,
+   .doorbell_size  = 4,
+   .ih_ring_entry_size = 4 * sizeof(uint32_t),
+   .event_interrupt_class = &event_interrupt_class_cik,
+   .num_of_watch_points = 4,
+   .mqd_size_aligned = MQD_SIZE_ALIGNED,
+   .supports_cwsr = true,
+   .needs_iommu_device = false,
+   .needs_pci_atomics = true,
+   .num_sdma_engines = 2,
+   .num_xgmi_sdma_engines = 0,
+   .num_sdma_queues_per_engine = 2,
+};
+
 static const struct kfd_device_info vega10_device_info = {
.asic_family = CHIP_VEGA10,
.max_pasid_bits = 16,
@@ -387,6 +404,9 @@ static const struct kfd_deviceid supported_devices[] = {
{ 0x6995, &polaris12_device_info }, /* Polaris12 */
{ 0x6997, &polaris12_device_info }, /* Polaris12 */
{ 0x699F, &polaris12_device_info }, /* Polaris12 */
+   { 0x694C, &vegam_device_info }, /* VegaM */
+   { 0x694E, &vegam_device_info }, /* VegaM */
+   { 0x694F, &vegam_device_info }, /* VegaM */
{ 0x6860, &vega10_device_info },/* Vega10 */
{ 0x6861, &vega10_device_info },/* Vega10 */
{ 0x6862, &vega10_device_info },/* Vega10 */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 0bfdb141b6e7..ece35c7a77b5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1811,6 +1811,7 @@ struct device_queue_manager 
*device_queue_manager_init(struct kfd_dev *dev)
case CHIP_POLARIS10:
case CHIP_POLARIS11:
case CHIP_POLARIS12:
+   case CHIP_VEGAM:
device_queue_manager_init_vi_tonga(&dqm->asic_ops);
break;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
index 213ea5454d11..dc7339825b5c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
@@ -398,6 +398,7 @@ int kfd_init_apertures(struct kfd_process *process)
case CHIP_POLARIS10:
case CHIP_POLARIS11:
case CHIP_POLARIS12:
+   case CHIP_VEGAM:
kfd_init_apertures_vi(pdd, id);
break;
case CHIP_VEGA10:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index 7a737b50bed4..1cc03b3ddbb9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -315,6 +315,7 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
case CHIP_POLARIS10:
case CHIP_POLARIS11:
case CHIP_POLARIS12:
+   case CHIP_VEGAM:

[PATCH 06/27] drm/amdkfd: Introduce asic-specific mqd_manager_init function

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

The global function mqd_manager_init just calls asic-specific functions and
is not necessary. Delete it and introduce a mqd_manager_init interface in
the dqm for asic-specific MQD manager init. Call the mqd_manager_init
interface directly to initialize the MQD manager.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
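This is the usual kernel ops-table pattern; a standalone sketch of the
shape (names illustrative, not the driver types): each ASIC's own init
file installs its hook once, and callers dispatch through the table with
no central switch, so adding an ASIC touches only that ASIC's file.

struct mgr;
struct dev;

struct asic_ops {
	struct mgr *(*mgr_init)(int type, struct dev *d);
};

static struct mgr *mgr_init_vi(int type, struct dev *d)
{
	(void)type; (void)d;
	return 0;   /* would build the VI-specific manager */
}

/* per-ASIC init installs its own hook once... */
static void init_vi(struct asic_ops *ops)
{
	ops->mgr_init = mgr_init_vi;
}

/* ...and callers dispatch without knowing the ASIC: */
static struct mgr *get_mgr(struct asic_ops *ops, int type, struct dev *d)
{
	return ops->mgr_init(type, d);
}
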
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  2 ++
 .../amd/amdkfd/kfd_device_queue_manager_cik.c |  2 ++
 .../amd/amdkfd/kfd_device_queue_manager_v9.c  |  1 +
 .../amd/amdkfd/kfd_device_queue_manager_vi.c  |  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  | 29 ---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 --
 7 files changed, 8 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 7e5ead042dc0..a5a8643c04fc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -583,7 +583,7 @@ static struct mqd_manager *get_mqd_manager(
 
mqd_mgr = dqm->mqd_mgrs[type];
if (!mqd_mgr) {
-   mqd_mgr = mqd_manager_init(type, dqm->dev);
+   mqd_mgr = dqm->asic_ops.mqd_manager_init(type, dqm->dev);
if (!mqd_mgr)
pr_err("mqd manager is NULL");
dqm->mqd_mgrs[type] = mqd_mgr;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index 2770f3ece89f..a5d83ec1c6a8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -158,6 +158,8 @@ struct device_queue_manager_asic_ops {
void(*init_sdma_vm)(struct device_queue_manager *dqm,
struct queue *q,
struct qcm_process_device *qpd);
+   struct mqd_manager *(*mqd_manager_init)(enum KFD_MQD_TYPE type,
+struct kfd_dev *dev);
 };
 
 /**
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
index aed4c21417bf..0d26506798cf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
@@ -48,6 +48,7 @@ void device_queue_manager_init_cik(
asic_ops->set_cache_memory_policy = set_cache_memory_policy_cik;
asic_ops->update_qpd = update_qpd_cik;
asic_ops->init_sdma_vm = init_sdma_vm;
+   asic_ops->mqd_manager_init = mqd_manager_init_cik;
 }
 
 void device_queue_manager_init_cik_hawaii(
@@ -56,6 +57,7 @@ void device_queue_manager_init_cik_hawaii(
asic_ops->set_cache_memory_policy = set_cache_memory_policy_cik;
asic_ops->update_qpd = update_qpd_cik_hawaii;
asic_ops->init_sdma_vm = init_sdma_vm_hawaii;
+   asic_ops->mqd_manager_init = mqd_manager_init_cik_hawaii;
 }
 
 static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c
index 417515332c35..e9fe39382371 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c
@@ -37,6 +37,7 @@ void device_queue_manager_init_v9(
 {
asic_ops->update_qpd = update_qpd_v9;
asic_ops->init_sdma_vm = init_sdma_vm_v9;
+   asic_ops->mqd_manager_init = mqd_manager_init_v9;
 }
 
 static uint32_t compute_sh_mem_bases_64bit(struct kfd_process_device *pdd)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
index c3a5dcfe877a..3a7cb2f88366 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
@@ -54,6 +54,7 @@ void device_queue_manager_init_vi(
asic_ops->set_cache_memory_policy = set_cache_memory_policy_vi;
asic_ops->update_qpd = update_qpd_vi;
asic_ops->init_sdma_vm = init_sdma_vm;
+   asic_ops->mqd_manager_init = mqd_manager_init_vi;
 }
 
 void device_queue_manager_init_vi_tonga(
@@ -62,6 +63,7 @@ void device_queue_manager_init_vi_tonga(
asic_ops->set_cache_memory_policy = set_cache_memory_policy_vi_tonga;
asic_ops->update_qpd = update_qpd_vi_tonga;
asic_ops->init_sdma_vm = init_sdma_vm_tonga;
+   asic_ops->mqd_manager_init = mqd_manager_init_vi_tonga;
 }
 
 static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index aed9b9b82213..eeb2b60a36b5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ 

[PATCH 17/27] drm/amdkfd: Expose sdma engine numbers to topology

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

Expose the available number of SDMA engines of both types in the topology.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
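A userspace consumer can pick the new fields up from the node's sysfs
properties file; a small standalone reader (the path follows the usual KFD
topology layout, node index chosen for illustration):

#include <stdio.h>
#include <string.h>

int main(void)
{
	/* each line of `properties` is "<name> <value>" */
	char name[64];
	unsigned long long val;
	FILE *f = fopen("/sys/class/kfd/kfd/topology/nodes/0/properties", "r");

	if (!f)
		return 1;
	while (fscanf(f, "%63s %llu", name, &val) == 2)
		if (strncmp(name, "num_sdma", 8) == 0)
			printf("%s = %llu\n", name, val);
	fclose(f);
	return 0;
}
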
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 7 +++
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 2cb09e088dce..e536f4b6698f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -476,6 +476,10 @@ static ssize_t node_show(struct kobject *kobj, struct 
attribute *attr,
dev->node_props.drm_render_minor);
sysfs_show_64bit_prop(buffer, "hive_id",
dev->node_props.hive_id);
+   sysfs_show_32bit_prop(buffer, "num_sdma_engines",
+   dev->node_props.num_sdma_engines);
+   sysfs_show_32bit_prop(buffer, "num_sdma_xgmi_engines",
+   dev->node_props.num_sdma_xgmi_engines);
 
if (dev->gpu) {
log_max_watch_addr =
@@ -1282,6 +1286,9 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
gpu->shared_resources.drm_render_minor;
 
dev->node_props.hive_id = gpu->hive_id;
+   dev->node_props.num_sdma_engines = gpu->device_info->num_sdma_engines;
+   dev->node_props.num_sdma_xgmi_engines =
+   gpu->device_info->num_xgmi_sdma_engines;
 
kfd_fill_mem_clk_max_info(dev);
kfd_fill_iolink_non_crat_info(dev);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
index 84710cfd23c2..949e885dfb53 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
@@ -78,6 +78,8 @@ struct kfd_node_properties {
uint32_t max_engine_clk_fcompute;
uint32_t max_engine_clk_ccompute;
int32_t  drm_render_minor;
+   uint32_t num_sdma_engines;
+   uint32_t num_sdma_xgmi_engines;
uint16_t marketing_name[KFD_TOPOLOGY_PUBLIC_NAME_SIZE];
 };
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 08/27] drm/amdkfd: Init mqd managers in device queue manager init

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

Previously, MQD managers were initialized on demand. As there
are only a few types of MQD managers, the on-demand initialization
doesn't save much memory. Initialize them at device queue manager
initialization instead and delete the get_mqd_manager
interface. This makes the code more organized for future changes.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
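Upfront initialization typically reduces to a loop like the following at
dqm init time (a sketch, not the exact hunk; unwinding simplified), after
which every lookup is a plain array index with no NULL checks:

	for (i = 0; i < KFD_MQD_TYPE_MAX; i++) {
		dqm->mqd_mgrs[i] = dqm->asic_ops.mqd_manager_init(i, dqm->dev);
		if (!dqm->mqd_mgrs[i])
			goto out_free;  /* unwind managers created so far */
	}
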
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 127 ++
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |   6 -
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |   6 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|   3 +-
 4 files changed, 47 insertions(+), 95 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index a5a8643c04fc..063625c3646b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -368,9 +368,7 @@ static int create_compute_queue_nocpsch(struct 
device_queue_manager *dqm,
struct mqd_manager *mqd_mgr;
int retval;
 
-   mqd_mgr = dqm->ops.get_mqd_manager(dqm, KFD_MQD_TYPE_COMPUTE);
-   if (!mqd_mgr)
-   return -ENOMEM;
+   mqd_mgr = dqm->mqd_mgrs[KFD_MQD_TYPE_COMPUTE];
 
retval = allocate_hqd(dqm, q);
if (retval)
@@ -425,10 +423,8 @@ static int destroy_queue_nocpsch_locked(struct 
device_queue_manager *dqm,
int retval;
struct mqd_manager *mqd_mgr;
 
-   mqd_mgr = dqm->ops.get_mqd_manager(dqm,
-   get_mqd_type_from_queue_type(q->properties.type));
-   if (!mqd_mgr)
-   return -ENOMEM;
+   mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
+   q->properties.type)];
 
if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE) {
deallocate_hqd(dqm, q);
@@ -501,12 +497,8 @@ static int update_queue(struct device_queue_manager *dqm, 
struct queue *q)
retval = -ENODEV;
goto out_unlock;
}
-   mqd_mgr = dqm->ops.get_mqd_manager(dqm,
-   get_mqd_type_from_queue_type(q->properties.type));
-   if (!mqd_mgr) {
-   retval = -ENOMEM;
-   goto out_unlock;
-   }
+   mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
+   q->properties.type)];
/*
 * Eviction state logic: we only mark active queues as evicted
 * to avoid the overhead of restoring inactive queues later
@@ -571,27 +563,6 @@ static int update_queue(struct device_queue_manager *dqm, 
struct queue *q)
return retval;
 }
 
-static struct mqd_manager *get_mqd_manager(
-   struct device_queue_manager *dqm, enum KFD_MQD_TYPE type)
-{
-   struct mqd_manager *mqd_mgr;
-
-   if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
-   return NULL;
-
-   pr_debug("mqd type %d\n", type);
-
-   mqd_mgr = dqm->mqd_mgrs[type];
-   if (!mqd_mgr) {
-   mqd_mgr = dqm->asic_ops.mqd_manager_init(type, dqm->dev);
-   if (!mqd_mgr)
-   pr_err("mqd manager is NULL");
-   dqm->mqd_mgrs[type] = mqd_mgr;
-   }
-
-   return mqd_mgr;
-}
-
 static int evict_process_queues_nocpsch(struct device_queue_manager *dqm,
struct qcm_process_device *qpd)
 {
@@ -612,13 +583,8 @@ static int evict_process_queues_nocpsch(struct 
device_queue_manager *dqm,
list_for_each_entry(q, &qpd->queues_list, list) {
if (!q->properties.is_active)
continue;
-   mqd_mgr = dqm->ops.get_mqd_manager(dqm,
-   get_mqd_type_from_queue_type(q->properties.type));
-   if (!mqd_mgr) { /* should not be here */
-   pr_err("Cannot evict queue, mqd mgr is NULL\n");
-   retval = -ENOMEM;
-   goto out;
-   }
+   mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
+   q->properties.type)];
q->properties.is_evicted = true;
q->properties.is_active = false;
retval = mqd_mgr->destroy_mqd(mqd_mgr, q->mqd,
@@ -717,13 +683,8 @@ static int restore_process_queues_nocpsch(struct 
device_queue_manager *dqm,
list_for_each_entry(q, >queues_list, list) {
if (!q->properties.is_evicted)
continue;
-   mqd_mgr = dqm->ops.get_mqd_manager(dqm,
-   get_mqd_type_from_queue_type(q->properties.type));
-   if (!mqd_mgr) { /* should not be here */
-   pr_err("Cannot restore queue, mqd mgr is NULL\n");
-   retval = -ENOMEM;
-   goto out;
-   }
+   mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
+  

[PATCH 22/27] drm/amdkfd: Fix gfx9 XNACK state save/restore

2019-04-28 Thread Kuehling, Felix
From: Jay Cornwall 

SQ_WAVE_IB_STS.RCNT grew from 4 bits to 5 in gfx9. Do not truncate
when saving in the high bits of TTMP1.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
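In plain C, the save-side packing after the change looks like this
(standalone sketch; the shift/mask values mirror the .asm constants in the
hunk below):

#include <stdint.h>

/* gfx9: RCNT is 5 bits, so it now lives in PC_HI[31:27] and
 * FIRST_REPLAY moves down to bit 26 (previously [31:28] and bit 27). */
#define RCNT_SHIFT        27
#define RCNT_MASK         0xF8000000u
#define FIRST_REPLAY_BIT  26

static uint32_t pack_pc_hi(uint32_t pc_hi, uint32_t rcnt, uint32_t replay)
{
	pc_hi &= ~(RCNT_MASK | (1u << FIRST_REPLAY_BIT));
	pc_hi |= (rcnt << RCNT_SHIFT) & RCNT_MASK; /* keeps all 5 RCNT bits */
	pc_hi |= (replay & 1u) << FIRST_REPLAY_BIT;
	return pc_hi;
}
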
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h   | 12 ++--
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm|  8 
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index 097da0dd3b04..eed845b4e9a7 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -310,8 +310,8 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
0xbe801f6c, 0x866dff6d,
0x, 0xbef00080,
0xb9700283, 0xb8f02407,
-   0x8e709c70, 0x876d706d,
-   0xb8f003c7, 0x8e709b70,
+   0x8e709b70, 0x876d706d,
+   0xb8f003c7, 0x8e709a70,
0x876d706d, 0xb8f0f807,
0x8670ff70, 0x7fff,
0xb970f807, 0xbeee007e,
@@ -549,11 +549,11 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
0x0048, 0xc0031e77,
0x0058, 0xc0071eb7,
0x005c, 0xbf8cc07f,
-   0x866fff6d, 0xf000,
-   0x8f6f9c6f, 0x8e6f906f,
+   0x866fff6d, 0xf800,
+   0x8f6f9b6f, 0x8e6f906f,
0xbeee0080, 0x876e6f6e,
-   0x866fff6d, 0x0800,
-   0x8f6f9b6f, 0x8e6f8f6f,
+   0x866fff6d, 0x0400,
+   0x8f6f9a6f, 0x8e6f8f6f,
0x876e6f6e, 0x866fff70,
0x0080, 0x8f6f976f,
0xb96ef807, 0x866dff6d,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index 6a010c9e55de..e1ac34517642 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -150,10 +150,10 @@ var S_SAVE_SPI_INIT_MTYPE_SHIFT   =   28
 var S_SAVE_SPI_INIT_FIRST_WAVE_MASK=   0x0400  //bit[26]: 
FirstWaveInTG
 var S_SAVE_SPI_INIT_FIRST_WAVE_SHIFT   =   26
 
-var S_SAVE_PC_HI_RCNT_SHIFT=   28  //FIXME  check 
with Brian to ensure all fields other than PC[47:0] can be used
-var S_SAVE_PC_HI_RCNT_MASK =   0xF000  //FIXME
-var S_SAVE_PC_HI_FIRST_REPLAY_SHIFT=   27  //FIXME
-var S_SAVE_PC_HI_FIRST_REPLAY_MASK =   0x0800  //FIXME
+var S_SAVE_PC_HI_RCNT_SHIFT=   27  //FIXME  check 
with Brian to ensure all fields other than PC[47:0] can be used
+var S_SAVE_PC_HI_RCNT_MASK =   0xF800  //FIXME
+var S_SAVE_PC_HI_FIRST_REPLAY_SHIFT=   26  //FIXME
+var S_SAVE_PC_HI_FIRST_REPLAY_MASK =   0x0400  //FIXME
 
 var s_save_spi_init_lo =   exec_lo
 var s_save_spi_init_hi =   exec_hi
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 15/27] drm/amdkfd: Fix sdma queue map issue

2019-04-28 Thread Kuehling, Felix
From: Oak Zeng 

The previous code assumed there are two SDMA engines.
This is not true everywhere; e.g., Raven only has one SDMA engine.
Fix the issue by using the SDMA engine count from
device_info.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 21 +++
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index bac1f36d38a2..d41045d3fc3a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1212,12 +1212,17 @@ int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
return 0;
 }
 
-static int unmap_sdma_queues(struct device_queue_manager *dqm,
-   unsigned int sdma_engine)
+static int unmap_sdma_queues(struct device_queue_manager *dqm)
 {
-   return pm_send_unmap_queue(&dqm->packets, KFD_QUEUE_TYPE_SDMA,
-   KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0, false,
-   sdma_engine);
+   int i, retval = 0;
+
+   for (i = 0; i < dqm->dev->device_info->num_sdma_engines; i++) {
+   retval = pm_send_unmap_queue(&dqm->packets, KFD_QUEUE_TYPE_SDMA,
+   KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0, false, i);
+   if (retval)
+   return retval;
+   }
+   return retval;
 }
 
 /* dqm->lock mutex has to be locked before calling this function */
@@ -1256,10 +1261,8 @@ static int unmap_queues_cpsch(struct 
device_queue_manager *dqm,
pr_debug("Before destroying queues, sdma queue count is : %u\n",
dqm->sdma_queue_count);
 
-   if (dqm->sdma_queue_count > 0) {
-   unmap_sdma_queues(dqm, 0);
-   unmap_sdma_queues(dqm, 1);
-   }
+   if (dqm->sdma_queue_count > 0)
+   unmap_sdma_queues(dqm);
 
retval = pm_send_unmap_queue(&dqm->packets, KFD_QUEUE_TYPE_COMPUTE,
filter, filter_param, false, 0);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 14/27] drm/amdkfd: Fix compute profile switching

2019-04-28 Thread Kuehling, Felix
From: Harish Kasiviswanathan 

Fix compute profile switching on process termination.

Add a dedicated reference counter to keep track of entry into and exit from
the compute profile. This enables switching compute profiles for reasons
other than process creation or termination.

Signed-off-by: Harish Kasiviswanathan 
Signed-off-by: Eric Huang 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
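With the counter decoupled from processes_count, any code path can pin the
profile; a hypothetical future caller (illustrative only, not part of this
series):

	/* hold the compute profile across some activity unrelated to
	 * process creation or termination */
	kfd_inc_compute_active(kfd);
	run_deferred_compute_work();   /* hypothetical placeholder */
	kfd_dec_compute_active(kfd);
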
 drivers/gpu/drm/amd/amdkfd/kfd_device.c  | 16 
 .../drm/amd/amdkfd/kfd_device_queue_manager.c| 11 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  7 +++
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index c1e4d44d6137..8202a5db3a35 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -462,6 +462,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
kfd->pdev = pdev;
kfd->init_complete = false;
kfd->kfd2kgd = f2g;
+   atomic_set(&kfd->compute_profile, 0);
 
mutex_init(>doorbell_mutex);
memset(&kfd->doorbell_available_index, 0,
@@ -1036,6 +1037,21 @@ void kgd2kfd_set_sram_ecc_flag(struct kfd_dev *kfd)
atomic_inc(&kfd->sram_ecc_flag);
 }
 
+void kfd_inc_compute_active(struct kfd_dev *kfd)
+{
+   if (atomic_inc_return(&kfd->compute_profile) == 1)
+   amdgpu_amdkfd_set_compute_idle(kfd->kgd, false);
+}
+
+void kfd_dec_compute_active(struct kfd_dev *kfd)
+{
+   int count = atomic_dec_return(&kfd->compute_profile);
+
+   if (count == 0)
+   amdgpu_amdkfd_set_compute_idle(kfd->kgd, true);
+   WARN_ONCE(count < 0, "Compute profile ref. count error");
+}
+
 #if defined(CONFIG_DEBUG_FS)
 
 /* This function will send a package to HIQ to hang the HWS
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 38c66b8ffd31..bac1f36d38a2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -772,8 +772,8 @@ static int register_process(struct device_queue_manager 
*dqm,
 
retval = dqm->asic_ops.update_qpd(dqm, qpd);
 
-   if (dqm->processes_count++ == 0)
-   amdgpu_amdkfd_set_compute_idle(dqm->dev->kgd, false);
+   dqm->processes_count++;
+   kfd_inc_compute_active(dqm->dev);
 
dqm_unlock(dqm);
 
@@ -796,9 +796,8 @@ static int unregister_process(struct device_queue_manager 
*dqm,
if (qpd == cur->qpd) {
list_del(&cur->list);
kfree(cur);
-   if (--dqm->processes_count == 0)
-   amdgpu_amdkfd_set_compute_idle(
-   dqm->dev->kgd, true);
+   dqm->processes_count--;
+   kfd_dec_compute_active(dqm->dev);
goto out;
}
}
@@ -1479,6 +1478,7 @@ static int process_termination_nocpsch(struct 
device_queue_manager *dqm,
list_del(&cur->list);
kfree(cur);
dqm->processes_count--;
+   kfd_dec_compute_active(dqm->dev);
break;
}
}
@@ -1562,6 +1562,7 @@ static int process_termination_cpsch(struct 
device_queue_manager *dqm,
list_del(&cur->list);
kfree(cur);
dqm->processes_count--;
+   kfd_dec_compute_active(dqm->dev);
break;
}
}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index d1d60336172a..87328c96b0f1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -279,6 +279,9 @@ struct kfd_dev {
 
/* SRAM ECC flag */
atomic_t sram_ecc_flag;
+
+   /* Compute Profile ref. count */
+   atomic_t compute_profile;
 };
 
 enum kfd_mempool {
@@ -977,6 +980,10 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, 
struct kfd_process *p);
 
 bool kfd_is_locked(void);
 
+/* Compute profile */
+void kfd_inc_compute_active(struct kfd_dev *dev);
+void kfd_dec_compute_active(struct kfd_dev *dev);
+
 /* Debugfs */
 #if defined(CONFIG_DEBUG_FS)
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx