Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2)

2017-11-21 Thread Christian König

Hi Frank,

thanks, the patch looks much better now.


The masks are used to program the stack and data parts for the VCE firmware,
and this part of the code is borrowed from the non-SRIOV sequence.


In this case, Leo, can you explain these strange masks used for the
VCE_VCPU_CACHE_OFFSET* registers?



MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0),
-			    offset & 0x7FFFFFFF);
+			    offset & ~0x0f000000);

...

MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET1),
-			    offset & 0x7FFFFFFF);
+			    (offset & ~0x0f000000) | (1 << 24));


Using ~0x0f000000 looks really odd here, and what is the "| (1 << 24)" part
supposed to be about?


Thanks,
Christian.

On 22.11.2017 at 06:11, Min, Frank wrote:

Hi Christian,
Patch updated according to your suggestions.
The masks are used to program the stack and data parts for the VCE firmware,
and this part of the code is borrowed from the non-SRIOV sequence.
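
For reference, a rough sketch of the layout those masks program (illustrative
only; the region-select reading of bits 26:24 is an assumption based on the
(offset & ~0x0f000000) | (1 << 24) pattern, not confirmed documentation):

	/* The VCPU cache holds one window per firmware section; offsets accumulate. */
	offset = AMDGPU_VCE_FIRMWARE_OFFSET;
	/* OFFSET0 =  offset & ~0x0f000000               -> region 0: firmware */
	offset += VCE_V4_0_FW_SIZE;
	/* OFFSET1 = (offset & ~0x0f000000) | (1 << 24)  -> region 1: stack    */
	offset += VCE_V4_0_STACK_SIZE;
	/* OFFSET2 = (offset & ~0x0f000000) | (2 << 24)  -> region 2: data (assumed) */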

Best Regards,
Frank

1. program vce 4.0 fw with 48 bit address
2. correct vce 4.0 fw stack and data offset

Change-Id: Ic1bc49c21d3a90c477d11162f9d6d9e2073fbbd3
Signed-off-by: Frank Min 
---
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 38 +++++++++++++++++++++++++-------------
  1 file changed, 25 insertions(+), 13 deletions(-)
  mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..024a1be
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,37 +243,49 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VM_CTRL), 0);
 
 		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+						    mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
 						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+						    mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+						    (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 0xff);
 		} else {
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+						    mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
 						    adev->vce.gpu_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+						    mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+						    (adev->vce.gpu_addr >> 40) & 0xff);
+		}
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+					    mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
 					    adev->vce.gpu_addr >> 8);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+					    mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+					    (adev->vce.gpu_addr >> 40) & 0xff);
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+					    mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
 					    adev->vce.gpu_addr >> 8);
-		}
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+					    mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+					    (adev->vce.gpu_addr >> 40) & 0xff);
 
 		offset = AMDGPU_VCE_FIRMWARE_OFFSET;
 		size = VCE_V4_0_FW_SIZE;
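
The 48-bit address programming used in the hunk above boils down to the
following split (a sketch; variable names are illustrative):

	uint64_t addr = adev->vce.gpu_addr;          /* up to 48-bit GPU address  */
	uint32_t bar_40bit = (uint32_t)(addr >> 8);  /* bits 39:8, 256-byte units */
	uint32_t bar_64bit = (addr >> 40) & 0xff;    /* bits 47:40 extension      */

Each 40BIT_BAR register takes bar_40bit and the matching 64BIT_BAR register
takes bar_64bit.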

Re: FW: [PATCH] drm/amd/vce: correct vce fw data and stack size config for sriov

2017-11-21 Thread Christian König

1) the coding style is correct on the patch itself but looks incorrect in mail

I use a Thunderbird plugin to show the code as it would apply to the file,
and sorry, at least in this version the coding style is completely incorrect.


Frank's V2 patch looked much better, so I think that was just an issue
created by forwarding the patch.



2) can you point out what you mean by the sentence "programing 0 to register"?

That was just me being confused because the coding style looked so odd.

But I've already taken a look at newer versions of the patch, and the
masks applied to the offsets still looked really strange.


Going to follow up on the newest version of the patch.

Regards,
Christian.

On 22.11.2017 at 07:30, Liu, Monk wrote:

Hi Christian

This patch can fix a VCE world switch hang bug: a couple of registers were
wrongly programmed with the same address, so the hardware units fight with
each other. We have verified this patch.

Regarding your comments:
1) the coding style is correct on the patch itself but looks incorrect in mail
2) can you point out what you mean by the sentence "programing 0 to register"?

Since Frank hasn't applied for membership of amd-gfx, his patch cannot go to
the amd-gfx list directly.


BR Monk

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Frank 
Min
Sent: 21 November 2017 16:34
To: amd-gfx@lists.freedesktop.org
Cc: Min, Frank 
Subject: [PATCH] drm/amd/vce: correct vce fw data and stack size config for 
sriov

Signed-off-by: Frank Min 
---
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 30 +++++++++++++++++-------------
  1 file changed, 17 insertions(+), 13 deletions(-)
  mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..4a92530
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,37 +243,41 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VM_CTRL), 0);
 
 		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+						    mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
 						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+						    (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 0xff);
 		} else {
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+						    mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
 						    adev->vce.gpu_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+						    (adev->vce.gpu_addr >> 40) & 0xff);
+		}
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+					    mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
 					    adev->vce.gpu_addr >> 8);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+					    (adev->vce.gpu_addr >> 40) & 0xff);
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+					    mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
 					    adev->vce.gpu_addr >> 8);
-		}
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+					    (adev->vce.gpu_addr >> 40) & 0xff);
 
 		offset = AMDGPU_VCE_FIRMWARE_OFFSET;
 		size = VCE_V4_0_FW_SIZE;
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0),
-					    offset & 0x7FFFFFFF);
+					    offset & ~0x0f000000);

Re: [PATCH 1/4] drm/ttm: add page order in page pool

2017-11-21 Thread Christian König

On 22.11.2017 at 06:36, Roger He wrote:

to indicate page order for each element in the pool

Change-Id: Ic609925ca5d2a5d4ad49d6becf505388ce3624cf
Signed-off-by: Roger He 
---
  drivers/gpu/drm/ttm/ttm_page_alloc.c | 42 +++++++++++++++++++++++++++++++-----------
  1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 72ea037..0a0c653 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -81,6 +81,7 @@ struct ttm_page_pool {
char*name;
unsigned long   nfrees;
unsigned long   nrefills;
+   unsigned intorder;
  };
  
  /**

@@ -412,6 +413,7 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 	struct ttm_page_pool *pool;
 	int shrink_pages = sc->nr_to_scan;
 	unsigned long freed = 0;
+	unsigned int nr_free_pool;
 
 	if (!mutex_trylock(&lock))
 		return SHRINK_STOP;
@@ -421,10 +423,15 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 		unsigned nr_free = shrink_pages;
 		if (shrink_pages == 0)
 			break;
+
 		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
 		/* OK to use static buffer since global mutex is held. */
-		shrink_pages = ttm_page_pool_free(pool, nr_free, true);
-		freed += nr_free - shrink_pages;
+		nr_free_pool = (nr_free >> pool->order);
+		if (nr_free_pool == 0)
+			continue;
+
+		shrink_pages = ttm_page_pool_free(pool, nr_free_pool, true);
+		freed += ((nr_free_pool - shrink_pages) << pool->order);
 	}
 	mutex_unlock(&lock);
 	return freed;
@@ -436,9 +443,12 @@ ttm_pool_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 {
 	unsigned i;
 	unsigned long count = 0;
+	struct ttm_page_pool *pool;
 
-	for (i = 0; i < NUM_POOLS; ++i)
-		count += _manager->pools[i].npages;
+	for (i = 0; i < NUM_POOLS; ++i) {
+		pool = &_manager->pools[i];
+		count += (pool->npages << pool->order);
+	}
 
 	return count;
 }
@@ -932,7 +942,7 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
 }
 
 static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
-				      char *name)
+				      char *name, unsigned int order)
 {
 	spin_lock_init(&pool->lock);
 	pool->fill_lock = false;
@@ -940,8 +950,18 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
 	pool->npages = pool->nfrees = 0;
 	pool->gfp_flags = flags;
 	pool->name = name;
+	pool->order = order;
 }
 
+/**
+ * If TRANSPARENT_HUGEPAGE is not enabled, we will not use wc_pool_huge
+ * and uc_pool_huge, so the page order of those two pools does not matter.
+ */
+#ifndef CONFIG_TRANSPARENT_HUGEPAGE
+#define HPAGE_PMD_ORDER	9
+#endif
+


That still won't work, and sorry, I wasn't 100% clear in the last mail.

When CONFIG_TRANSPARENT_HUGEPAGE isn't set, HPAGE_PMD_ORDER is defined as
BUILD_BUG().


So you will still run into problems when that config option isn't set.


  int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
  {
int ret;


I suggest just handling it here, like this:

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
    unsigned order = HPAGE_PMD_ORDER;
#else
    unsigned order = 0;
#endif

Apart from that the patch looks good to me,
Christian.
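
Put together, the init would then look roughly like this (a sketch of the
suggestion above, untested):

	int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
	{
	#ifdef CONFIG_TRANSPARENT_HUGEPAGE
		unsigned order = HPAGE_PMD_ORDER;
	#else
		unsigned order = 0;	/* no THP: huge pools fall back to single pages */
	#endif
		...
		ttm_page_pool_init_locked(&_manager->wc_pool_huge,
					  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP),
					  "wc huge", order);

		ttm_page_pool_init_locked(&_manager->uc_pool_huge,
					  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP),
					  "uc huge", order);
		...
	}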


@@ -952,23 +972,23 @@ int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
 
 	_manager = kzalloc(sizeof(*_manager), GFP_KERNEL);
 
-	ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc");
+	ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc", 0);
 
-	ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc");
+	ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc", 0);
 
 	ttm_page_pool_init_locked(&_manager->wc_pool_dma32,
-				  GFP_USER | GFP_DMA32, "wc dma");
+				  GFP_USER | GFP_DMA32, "wc dma", 0);
 
 	ttm_page_pool_init_locked(&_manager->uc_pool_dma32,
-				  GFP_USER | GFP_DMA32, "uc dma");
+				  GFP_USER | GFP_DMA32, "uc dma", 0);
 
 	ttm_page_pool_init_locked(&_manager->wc_pool_huge,
 				  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP),
-				  "wc huge");
+				  "wc huge", HPAGE_PMD_ORDER);
 
 	ttm_page_pool_init_locked(&_manager->uc_pool_huge,
 				  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP)
-				  , "uc huge");
+				  , "uc huge", HPAGE_PMD_ORDER);

[PATCH] drm/amd/display: check plane state before validating fbc

2017-11-21 Thread S, Shirish
From: Shirish S 

While validating fbc, the array_mode of the pipe is accessed without checking
that a plane_state exists for it, causing a null pointer dereference followed
by a reboot when a crtc associated with an external display (not connected) is
page flipped.

This patch adds a check for plane_state before using it to validate fbc.

Signed-off-by: Shirish S 
Reviewed-by: Roman Li 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index ee3b944..a6cd63a 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1724,6 +1724,10 @@ static enum dc_status validate_fbc(struct dc *dc,
if (pipe_ctx->stream->sink->link->psr_enabled)
return DC_ERROR_UNEXPECTED;
 
+   /* Nothing to compress */
+   if (!pipe_ctx->plane_state)
+   return DC_ERROR_UNEXPECTED;
+
/* Only for non-linear tiling */
	if (pipe_ctx->plane_state->tiling_info.gfx8.array_mode == DC_ARRAY_LINEAR_GENERAL)
		return DC_ERROR_UNEXPECTED;
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Kernel crash/Null pointer dereference on vblank

2017-11-21 Thread Martin Babutzka

Dear AMD Developers,

At first, congratulations on the DC code submission to the 4.15 kernel.
Unfortunately the major regression which I reported on 29.09., 06.10., 02.11.
and 05.11. still exists. But this time I got additional debugging information;
maybe this helps to fix it.

Summary: I am running Xubuntu 17.10 with the amd-staging-drm-next kernel
patched to 4.14.0. The latest build which I tested includes all commits up to
now (including the 2017-11-17 19:51:57 (GMT) commit
85d09ce5e5039644487e9508d6359f9f4cf64427).

Some vblank operations make the kernel crash and hang up the whole system.
The error is reproducible by enabling the screen lock or the suspend mode.
The system can not return to a proper state from either of these (after all,
I am not 100% sure it is the same error). Debugging is easier with screen
lock. Attached you can find the kernel crash and the dce110_vblank_set
function modified by some kernel prints. It looks like the function is called
twice and does not work the second time. The whole code around
dce110_vblank_set also looks interrupt-ish - could this be a race condition
or timing problem? Objects being cleared from memory and then accessed by
dce110_vblank_set?

Bug reports on this issue:
https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/37
https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/29

Many regards,
Martin (M-bab)
bool dce110_vblank_set(
	struct irq_service *irq_service,
	const struct irq_source_info *info,
	bool enable)
{
	printk(KERN_ALERT "DEBUG: Passed %s %d \n", __FUNCTION__, __LINE__);
	struct dc_context *dc_ctx = irq_service->ctx;
	printk(KERN_ALERT "DEBUG: Passed %s %d \n", __FUNCTION__, __LINE__);
	struct dc *core_dc = irq_service->ctx->dc;
	printk(KERN_ALERT "DEBUG: Passed %s %d \n", __FUNCTION__, __LINE__);
	enum dc_irq_source dal_irq_src = dc_interrupt_to_irq_source(
						irq_service->ctx->dc,
						info->src_id,
						info->ext_id);
	uint8_t pipe_offset = dal_irq_src - IRQ_TYPE_VBLANK;
	printk(KERN_ALERT "DEBUG: Passed %s %d \n", __FUNCTION__, __LINE__);

	struct timing_generator *tg =
		core_dc->current_state->res_ctx.pipe_ctx[pipe_offset].stream_res.tg;
	printk(KERN_ALERT "DEBUG: Passed %s %d \n", __FUNCTION__, __LINE__);

	if (enable) {
		if (!tg->funcs->arm_vert_intr(tg, 2)) {
			DC_ERROR("Failed to get VBLANK!\n");
			return false;
		}
	}
	printk(KERN_ALERT "DEBUG: Passed %s %d \n", __FUNCTION__, __LINE__);

	dal_irq_service_set_generic(irq_service, info, enable);
	printk(KERN_ALERT "DEBUG: Passed %s %d \n", __FUNCTION__, __LINE__);
	return true;
}
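
For what it is worth, the crash log below stops between the prints at source
lines 222 and 230, i.e. at the tg->funcs->arm_vert_intr() call, which suggests
tg is NULL at that point. A minimal defensive sketch (assuming a torn-down
pipe is the failure mode; not a confirmed fix):

	struct timing_generator *tg =
		core_dc->current_state->res_ctx.pipe_ctx[pipe_offset].stream_res.tg;

	if (enable) {
		/* The pipe may have been torn down (screen lock/suspend)
		 * between requesting the interrupt and servicing it here. */
		if (!tg || !tg->funcs->arm_vert_intr(tg, 2)) {
			DC_ERROR("Failed to get VBLANK!\n");
			return false;
		}
	}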


"normal" vblank during boot:
Nov 19 22:33:10 Main-PC kernel: [   17.605100] DEBUG: Passed dce110_vblank_set 
208 
Nov 19 22:33:10 Main-PC kernel: [   17.605102] DEBUG: Passed dce110_vblank_set 
210 
Nov 19 22:33:10 Main-PC kernel: [   17.605103] DEBUG: Passed dce110_vblank_set 
212 
Nov 19 22:33:10 Main-PC kernel: [   17.605104] DEBUG: Passed dce110_vblank_set 
218 
Nov 19 22:33:10 Main-PC kernel: [   17.605104] DEBUG: Passed dce110_vblank_set 
222 
Nov 19 22:33:10 Main-PC kernel: [   17.605108] DEBUG: Passed dce110_vblank_set 
230 
Nov 19 22:33:10 Main-PC kernel: [   17.605110] DEBUG: Passed dce110_vblank_set 
233 

vblank on screen lock in kernel.log/syslog:
Nov 19 22:34:10 Main-PC kernel: [   78.664890] DEBUG: Passed dce110_vblank_set 
208 
Nov 19 22:34:10 Main-PC kernel: [   78.664892] DEBUG: Passed dce110_vblank_set 
210 
Nov 19 22:34:10 Main-PC kernel: [   78.664893] DEBUG: Passed dce110_vblank_set 
212 
Nov 19 22:34:10 Main-PC kernel: [   78.664894] DEBUG: Passed dce110_vblank_set 
218 
Nov 19 22:34:10 Main-PC kernel: [   78.664894] DEBUG: Passed dce110_vblank_set 
222 
Nov 19 22:34:10 Main-PC kernel: [   78.664895] DEBUG: Passed dce110_vblank_set 
230 
Nov 19 22:34:10 Main-PC kernel: [   78.664896] DEBUG: Passed dce110_vblank_set 
233 
Nov 19 22:34:27 Main-PC kernel: [   96.113426] DEBUG: Passed dce110_vblank_set 
208 
Nov 19 22:34:27 Main-PC kernel: [   96.113433] DEBUG: Passed dce110_vblank_set 
210 
Nov 19 22:34:27 Main-PC kernel: [   96.113435] DEBUG: Passed dce110_vblank_set 
212 
Nov 19 22:34:27 Main-PC kernel: [   96.113438] DEBUG: Passed dce110_vblank_set 
218 
Nov 19 22:34:27 Main-PC kernel: [   96.113440] DEBUG: Passed dce110_vblank_set 
222 
Nov 19 22:34:27 Main-PC kernel: [   96.113448] BUG: unable to handle kernel 
NULL pointer dereference at   (null)
Nov 19 22:34:27 Main-PC kernel: [   96.113521] IP: dce110_vblank_set+0xe2/0x160 
[amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113524] PGD 0 P4D 0 
Nov 19 22:34:27 Main-PC kernel: [   96.113531] Oops:  [#1] SMP
Nov 19 22:34:27 

Re: [PATCH 1/4] drm/ttm: add page order in page pool

2017-11-21 Thread Chunming Zhou



On 2017-11-22 13:36, Roger He wrote:

to indicate page order for each element in the pool

Change-Id: Ic609925ca5d2a5d4ad49d6becf505388ce3624cf
Signed-off-by: Roger He 
---
  drivers/gpu/drm/ttm/ttm_page_alloc.c | 42 +++++++++++++++++++++++++++++++-----------
  1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 72ea037..0a0c653 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -81,6 +81,7 @@ struct ttm_page_pool {
char*name;
unsigned long   nfrees;
unsigned long   nrefills;
+   unsigned intorder;
  };
  
  /**

@@ -412,6 +413,7 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 	struct ttm_page_pool *pool;
 	int shrink_pages = sc->nr_to_scan;
 	unsigned long freed = 0;
+	unsigned int nr_free_pool;
 
 	if (!mutex_trylock(&lock))
 		return SHRINK_STOP;
@@ -421,10 +423,15 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 		unsigned nr_free = shrink_pages;
 		if (shrink_pages == 0)
 			break;
+
 		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
 		/* OK to use static buffer since global mutex is held. */
-		shrink_pages = ttm_page_pool_free(pool, nr_free, true);
-		freed += nr_free - shrink_pages;
+		nr_free_pool = (nr_free >> pool->order);

How about nr_free_pool = roundup(nr_free >> pool->order, 1)?
This way, your patch #4 is not needed.

Regards,
David Zhou

+		if (nr_free_pool == 0)
+			continue;
+
+		shrink_pages = ttm_page_pool_free(pool, nr_free_pool, true);
+		freed += ((nr_free_pool - shrink_pages) << pool->order);
 	}
 	mutex_unlock(&lock);
 	return freed;
@@ -436,9 +443,12 @@ ttm_pool_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 {
 	unsigned i;
 	unsigned long count = 0;
+	struct ttm_page_pool *pool;
 
-	for (i = 0; i < NUM_POOLS; ++i)
-		count += _manager->pools[i].npages;
+	for (i = 0; i < NUM_POOLS; ++i) {
+		pool = &_manager->pools[i];
+		count += (pool->npages << pool->order);
+	}
 
 	return count;
 }
@@ -932,7 +942,7 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
 }
 
 static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
-				      char *name)
+				      char *name, unsigned int order)
 {
 	spin_lock_init(&pool->lock);
 	pool->fill_lock = false;
@@ -940,8 +950,18 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
 	pool->npages = pool->nfrees = 0;
 	pool->gfp_flags = flags;
 	pool->name = name;
+	pool->order = order;
 }
 
+/**
+ * If TRANSPARENT_HUGEPAGE is not enabled, we will not use wc_pool_huge
+ * and uc_pool_huge, so the page order of those two pools does not matter.
+ */
+#ifndef CONFIG_TRANSPARENT_HUGEPAGE
+#define HPAGE_PMD_ORDER	9
+#endif
+
  int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
  {
int ret;
@@ -952,23 +972,23 @@ int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
 
 	_manager = kzalloc(sizeof(*_manager), GFP_KERNEL);
 
-	ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc");
+	ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc", 0);
 
-	ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc");
+	ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc", 0);
 
 	ttm_page_pool_init_locked(&_manager->wc_pool_dma32,
-				  GFP_USER | GFP_DMA32, "wc dma");
+				  GFP_USER | GFP_DMA32, "wc dma", 0);
 
 	ttm_page_pool_init_locked(&_manager->uc_pool_dma32,
-				  GFP_USER | GFP_DMA32, "uc dma");
+				  GFP_USER | GFP_DMA32, "uc dma", 0);
 
 	ttm_page_pool_init_locked(&_manager->wc_pool_huge,
 				  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP),
-				  "wc huge");
+				  "wc huge", HPAGE_PMD_ORDER);
 
 	ttm_page_pool_init_locked(&_manager->uc_pool_huge,
 				  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP)
-				  , "uc huge");
+				  , "uc huge", HPAGE_PMD_ORDER);
  
  	_manager->options.max_size = max_pages;

_manager->options.small = SMALL_ALLOCATION;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


FW: [PATCH] drm/amd/vce: correct vce fw data and stack size config for sriov

2017-11-21 Thread Liu, Monk
Hi Christian

This patch can fix a VCE world switch hang bug: a couple of registers were
wrongly programmed with the same address, so the hardware units fight with
each other. We have verified this patch.

Regarding your comments:
1) the coding style is correct on the patch itself but looks incorrect in mail
2) can you point out what you mean by the sentence "programing 0 to register"?

Since Frank hasn't applied for membership of amd-gfx, his patch cannot go to
the amd-gfx list directly.


BR Monk

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Frank 
Min
Sent: 21 November 2017 16:34
To: amd-gfx@lists.freedesktop.org
Cc: Min, Frank 
Subject: [PATCH] drm/amd/vce: correct vce fw data and stack size config for 
sriov

Signed-off-by: Frank Min 
---
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)
 mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..4a92530
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,37 +243,41 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VM_CTRL), 0);
 
 		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+						    mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
 						    adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+						    (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 0xff);
 		} else {
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+						    mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
 						    adev->vce.gpu_addr >> 8);
-			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+						    (adev->vce.gpu_addr >> 40) & 0xff);
+		}
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+					    mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
 					    adev->vce.gpu_addr >> 8);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+					    (adev->vce.gpu_addr >> 40) & 0xff);
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+					    mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
 					    adev->vce.gpu_addr >> 8);
-		}
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+					    (adev->vce.gpu_addr >> 40) & 0xff);
 
 		offset = AMDGPU_VCE_FIRMWARE_OFFSET;
 		size = VCE_V4_0_FW_SIZE;
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0),
-					    offset & 0x7FFFFFFF);
+					    offset & ~0x0f000000);
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE0), size);
 
-		offset += size;
+		offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? offset + size : 0;
 		size = VCE_V4_0_STACK_SIZE;
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET1),
-					    offset & 0x7FFFFFFF);
+					    (offset & ~0x0f000000) | (1 << 24));
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE1), size);
 
 		offset += size;
 		size

[PATCH 4/4] drm/ttm: free one in huge pool even shrink request less than one element

2017-11-21 Thread Roger He
Change-Id: Id8bd4d1ecff9f3ab14355e2dbd1c59b9fe824e01
Signed-off-by: Roger He 
---
 drivers/gpu/drm/ttm/ttm_page_alloc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 37c2f2f..f80fc5b 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -463,11 +463,13 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
/* OK to use static buffer since global mutex is held. */
nr_free_pool = (nr_free >> pool->order);
-   if (nr_free_pool == 0)
-   continue;
+   if (!nr_free_pool && pool->order)
+   nr_free_pool = 1;
 
shrink_pages = ttm_page_pool_free(pool, nr_free_pool, true);
freed += ((nr_free_pool - shrink_pages) << pool->order);
+   if (freed >= sc->nr_to_scan)
+   break;
}
mutex_unlock();
return freed;
-- 
2.7.4
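
A worked example of the order arithmetic across patches 1 and 4 (an
illustrative user-space sketch, not kernel code):

	#include <stdio.h>

	int main(void)
	{
		unsigned order = 9;         /* huge pool: 512 pages per element */
		unsigned nr_to_scan = 100;  /* pages the shrinker asks for */
		unsigned nr_free_pool = nr_to_scan >> order;   /* 100 >> 9 == 0 */

		/* Patch 1 alone would skip the huge pool here; this patch bumps
		 * the request to one element, so one huge entry (512 pages)
		 * can actually be reclaimed. */
		if (!nr_free_pool && order)
			nr_free_pool = 1;

		printf("free %u element(s) = %u pages\n",
		       nr_free_pool, nr_free_pool << order);
		return 0;
	}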

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/4] drm/ttm: add page order in page pool

2017-11-21 Thread Roger He
to indicate page order for each element in the pool

Change-Id: Ic609925ca5d2a5d4ad49d6becf505388ce3624cf
Signed-off-by: Roger He 
---
 drivers/gpu/drm/ttm/ttm_page_alloc.c | 42 +++++++++++++++++++++++++++++++-----------
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 72ea037..0a0c653 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -81,6 +81,7 @@ struct ttm_page_pool {
char*name;
unsigned long   nfrees;
unsigned long   nrefills;
+   unsigned intorder;
 };
 
 /**
@@ -412,6 +413,7 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
struct ttm_page_pool *pool;
int shrink_pages = sc->nr_to_scan;
unsigned long freed = 0;
+   unsigned int nr_free_pool;
 
	if (!mutex_trylock(&lock))
		return SHRINK_STOP;
@@ -421,10 +423,15 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
unsigned nr_free = shrink_pages;
if (shrink_pages == 0)
break;
+
pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
/* OK to use static buffer since global mutex is held. */
-   shrink_pages = ttm_page_pool_free(pool, nr_free, true);
-   freed += nr_free - shrink_pages;
+   nr_free_pool = (nr_free >> pool->order);
+   if (nr_free_pool == 0)
+   continue;
+
+   shrink_pages = ttm_page_pool_free(pool, nr_free_pool, true);
+   freed += ((nr_free_pool - shrink_pages) << pool->order);
}
	mutex_unlock(&lock);
return freed;
@@ -436,9 +443,12 @@ ttm_pool_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 {
unsigned i;
unsigned long count = 0;
+   struct ttm_page_pool *pool;
 
-   for (i = 0; i < NUM_POOLS; ++i)
-   count += _manager->pools[i].npages;
+   for (i = 0; i < NUM_POOLS; ++i) {
+   pool = &_manager->pools[i];
+   count += (pool->npages << pool->order);
+   }
 
return count;
 }
@@ -932,7 +942,7 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
 }
 
 static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
-   char *name)
+   char *name, unsigned int order)
 {
	spin_lock_init(&pool->lock);
pool->fill_lock = false;
@@ -940,8 +950,18 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
pool->npages = pool->nfrees = 0;
pool->gfp_flags = flags;
pool->name = name;
+   pool->order = order;
 }
 
+/**
+ * If TRANSPARENT_HUGEPAGE is not enabled, we will not use wc_pool_huge
+ * and uc_pool_huge, so the page order of those two pools does not matter.
+ */
+#ifndef CONFIG_TRANSPARENT_HUGEPAGE
+#define HPAGE_PMD_ORDER	9
+#endif
+
 int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
 {
int ret;
@@ -952,23 +972,23 @@ int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
 
_manager = kzalloc(sizeof(*_manager), GFP_KERNEL);
 
-   ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc");
+   ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc", 0);
 
-   ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc");
+   ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc", 0);
 
ttm_page_pool_init_locked(&_manager->wc_pool_dma32,
- GFP_USER | GFP_DMA32, "wc dma");
+ GFP_USER | GFP_DMA32, "wc dma", 0);
 
ttm_page_pool_init_locked(&_manager->uc_pool_dma32,
- GFP_USER | GFP_DMA32, "uc dma");
+ GFP_USER | GFP_DMA32, "uc dma", 0);
 
ttm_page_pool_init_locked(&_manager->wc_pool_huge,
  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP),
- "wc huge");
+ "wc huge", HPAGE_PMD_ORDER);
 
ttm_page_pool_init_locked(&_manager->uc_pool_huge,
  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP)
- , "uc huge");
+ , "uc huge", HPAGE_PMD_ORDER);
 
_manager->options.max_size = max_pages;
_manager->options.small = SMALL_ALLOCATION;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5

2017-11-21 Thread Boris Ostrovsky
On 11/21/2017 08:34 AM, Christian König wrote:
> Hi Boris,
>
> attached are two patches.
>
> The first one is a trivial fix for the infinite loop issue, it now
> correctly aborts the fixup when it can't find address space for the
> root window.
>
> The second is a workaround for your board. It simply checks if there
> is exactly one Processor Function to apply this fix on.
>
> Both are based on linus current master branch. Please test if they fix
> your issue.


Yes, they do fix it but that's because the feature is disabled.

Do you know what the actual problem was (on Xen)?

Thanks.
-boris

>
> Thanks for the help,
> Christian.
>
> Am 20.11.2017 um 17:33, Boris Ostrovsky wrote:
>> On 11/20/2017 11:07 AM, Christian König wrote:
>>> On 20.11.2017 at 16:51, Boris Ostrovsky wrote:
 (and then it breaks differently as a Xen guest --- we hung on the last
 pci_read_config_dword(), I haven't looked at this at all yet)
>>> Huh? How does this fix apply to a Xen guest in the first place?
>>>
>>> Please provide the output of "lspci -nn" and explain further what your
>>> config with Xen is.
>>>
>>>
>>
>> This is dom0.
>>
>> -bash-4.1# lspci -nn
>> 00:00.0 Host bridge [0600]: ATI Technologies Inc RD890 Northbridge only
>> dual slot (2x16) PCI-e GFX Hydra part [1002:5a10] (rev 02)
>> 00:00.2 Generic system peripheral [0806]: ATI Technologies Inc Device
>> [1002:5a23]
>> 00:0d.0 PCI bridge [0604]: ATI Technologies Inc RD890 PCI to PCI bridge
>> (external gfx1 port B) [1002:5a1e]
>> 00:11.0 SATA controller [0106]: ATI Technologies Inc SB700/SB800 SATA
>> Controller [AHCI mode] [1002:4391]
>> 00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>> OHCI0 Controller [1002:4397]
>> 00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>> Controller [1002:4398]
>> 00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
>> Controller [1002:4396]
>> 00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>> OHCI0 Controller [1002:4397]
>> 00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>> Controller [1002:4398]
>> 00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
>> Controller [1002:4396]
>> 00:14.0 SMBus [0c05]: ATI Technologies Inc SBx00 SMBus Controller
>> [1002:4385] (rev 3d)
>> 00:14.3 ISA bridge [0601]: ATI Technologies Inc SB700/SB800 LPC host
>> controller [1002:439d]
>> 00:14.4 PCI bridge [0604]: ATI Technologies Inc SBx00 PCI to PCI Bridge
>> [1002:4384]
>> 00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>> OHCI2 Controller [1002:4399]
>> 00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1600]
>> 00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1601]
>> 00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1602]
>> 00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1603]
>> 00:18.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1604]
>> 00:18.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1605]
>> 00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1600]
>> 00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1601]
>> 00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1602]
>> 00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1603]
>> 00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1604]
>> 00:19.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1605]
>> 01:04.0 VGA compatible controller [0300]: Matrox Graphics, Inc. MGA
>> G200eW WPCM450 [102b:0532] (rev 0a)
>> 02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>> Network Connection [8086:10c9] (rev 01)
>> 02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>> Network Connection [8086:10c9] (rev 01)
>> -bash-4.1#
>>
>>
>> -boris
>
>

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: used cached gca values for cik_read_register

2017-11-21 Thread Christian König

On 21.11.2017 at 19:56, Alex Deucher wrote:

Using the cached values has less latency for bare metal and
prevents reading back bogus values if the engine is powergated.

This was implemented for VI and SI, but somehow CIK got missed.

Signed-off-by: Alex Deucher 


Reviewed-by: Christian König 

At some point I actually wanted to unify the logic between generations.
In other words, instead of a *_read_register callback, provide a list of
uncached and cached registers and let the common code handle the details.

But yeah, two hands, one head problem.

Christian.
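
For the record, such a unified scheme could look roughly like this (purely a
sketch of the idea, not existing amdgpu API):

	/* One table per generation instead of a per-generation callback. */
	struct amdgpu_reg_read_entry {
		u32 offset;		/* e.g. mmGB_ADDR_CONFIG */
		bool grbm_indexed;	/* needs se/sh selection if read from HW */
		/* NULL means read from HW; otherwise return the cached value. */
		u32 (*get_cached)(struct amdgpu_device *adev,
				  u32 se, u32 sh, u32 reg);
	};

The common code would walk the table, taking grbm_idx_mutex and selecting
se/sh only for uncached indexed entries, exactly as the per-ASIC functions
do today.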


---
  drivers/gpu/drm/amd/amdgpu/cik.c | 111 +--
  1 file changed, 95 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c
index 6128080ff662..8ba056a2a5da 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik.c
@@ -1023,22 +1023,101 @@ static const struct amdgpu_allowed_register_entry cik_allowed_read_registers[] =
 	{mmPA_SC_RASTER_CONFIG_1, true},
 };
 
-static uint32_t cik_read_indexed_register(struct amdgpu_device *adev,
-					  u32 se_num, u32 sh_num,
-					  u32 reg_offset)
+
+static uint32_t cik_get_register_value(struct amdgpu_device *adev,
+				       bool indexed, u32 se_num,
+				       u32 sh_num, u32 reg_offset)
 {
-	uint32_t val;
+	if (indexed) {
+		uint32_t val;
+		unsigned se_idx = (se_num == 0xffffffff) ? 0 : se_num;
+		unsigned sh_idx = (sh_num == 0xffffffff) ? 0 : sh_num;
+
+		switch (reg_offset) {
+		case mmCC_RB_BACKEND_DISABLE:
+			return adev->gfx.config.rb_config[se_idx][sh_idx].rb_backend_disable;
+		case mmGC_USER_RB_BACKEND_DISABLE:
+			return adev->gfx.config.rb_config[se_idx][sh_idx].user_rb_backend_disable;
+		case mmPA_SC_RASTER_CONFIG:
+			return adev->gfx.config.rb_config[se_idx][sh_idx].raster_config;
+		case mmPA_SC_RASTER_CONFIG_1:
+			return adev->gfx.config.rb_config[se_idx][sh_idx].raster_config_1;
+		}
 
-	mutex_lock(&adev->grbm_idx_mutex);
-	if (se_num != 0xffffffff || sh_num != 0xffffffff)
-		amdgpu_gfx_select_se_sh(adev, se_num, sh_num, 0xffffffff);
+		mutex_lock(&adev->grbm_idx_mutex);
+		if (se_num != 0xffffffff || sh_num != 0xffffffff)
+			amdgpu_gfx_select_se_sh(adev, se_num, sh_num, 0xffffffff);
 
-	val = RREG32(reg_offset);
+		val = RREG32(reg_offset);
 
-	if (se_num != 0xffffffff || sh_num != 0xffffffff)
-		amdgpu_gfx_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
-	mutex_unlock(&adev->grbm_idx_mutex);
-	return val;
+		if (se_num != 0xffffffff || sh_num != 0xffffffff)
+			amdgpu_gfx_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
+		mutex_unlock(&adev->grbm_idx_mutex);
+		return val;
+	} else {
+		unsigned idx;
+
+   switch (reg_offset) {
+   case mmGB_ADDR_CONFIG:
+   return adev->gfx.config.gb_addr_config;
+   case mmMC_ARB_RAMCFG:
+   return adev->gfx.config.mc_arb_ramcfg;
+   case mmGB_TILE_MODE0:
+   case mmGB_TILE_MODE1:
+   case mmGB_TILE_MODE2:
+   case mmGB_TILE_MODE3:
+   case mmGB_TILE_MODE4:
+   case mmGB_TILE_MODE5:
+   case mmGB_TILE_MODE6:
+   case mmGB_TILE_MODE7:
+   case mmGB_TILE_MODE8:
+   case mmGB_TILE_MODE9:
+   case mmGB_TILE_MODE10:
+   case mmGB_TILE_MODE11:
+   case mmGB_TILE_MODE12:
+   case mmGB_TILE_MODE13:
+   case mmGB_TILE_MODE14:
+   case mmGB_TILE_MODE15:
+   case mmGB_TILE_MODE16:
+   case mmGB_TILE_MODE17:
+   case mmGB_TILE_MODE18:
+   case mmGB_TILE_MODE19:
+   case mmGB_TILE_MODE20:
+   case mmGB_TILE_MODE21:
+   case mmGB_TILE_MODE22:
+   case mmGB_TILE_MODE23:
+   case mmGB_TILE_MODE24:
+   case mmGB_TILE_MODE25:
+   case mmGB_TILE_MODE26:
+   case mmGB_TILE_MODE27:
+   case mmGB_TILE_MODE28:
+   case mmGB_TILE_MODE29:
+   case mmGB_TILE_MODE30:
+   case mmGB_TILE_MODE31:
+   idx = (reg_offset - mmGB_TILE_MODE0);
+   return adev->gfx.config.tile_mode_array[idx];
+   case mmGB_MACROTILE_MODE0:
+   case mmGB_MACROTILE_MODE1:
+   case 

Re: [PATCH] drm/amd/display: Fix description of module parameter dc_log

2017-11-21 Thread Harry Wentland
On 2017-11-21 12:30 PM, Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> It was incorrectly referencing the dc parameter, resulting in an empty
> description of the dc_log parameter.
> 
> Signed-off-by: Michel Dänzer 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index eaccd4bd12a4..31383e004947 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -216,7 +216,7 @@ module_param_named(exp_hw_support, amdgpu_exp_hw_support, 
> int, 0444);
>  MODULE_PARM_DESC(dc, "Display Core driver (1 = enable, 0 = disable, -1 = 
> auto (default))");
>  module_param_named(dc, amdgpu_dc, int, 0444);
>  
> -MODULE_PARM_DESC(dc, "Display Core Log Level (0 = minimal (default), 1 = 
> chatty");
> +MODULE_PARM_DESC(dc_log, "Display Core Log Level (0 = minimal (default), 1 = 
> chatty");
>  module_param_named(dc_log, amdgpu_dc_log, int, 0444);
>  
>  MODULE_PARM_DESC(sched_jobs, "the max number of jobs supported in the sw 
> queue (default 32)");
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: used cached gca values for cik_read_register

2017-11-21 Thread Alex Deucher
Using the cached values has less latency for bare metal and
prevents reading back bogus values if the engine is powergated.

This was implemented for VI and SI, but somehow CIK got missed.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/cik.c | 111 +--
 1 file changed, 95 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c
index 6128080ff662..8ba056a2a5da 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik.c
@@ -1023,22 +1023,101 @@ static const struct amdgpu_allowed_register_entry cik_allowed_read_registers[] =
 	{mmPA_SC_RASTER_CONFIG_1, true},
 };
 
-static uint32_t cik_read_indexed_register(struct amdgpu_device *adev,
-					  u32 se_num, u32 sh_num,
-					  u32 reg_offset)
+
+static uint32_t cik_get_register_value(struct amdgpu_device *adev,
+				       bool indexed, u32 se_num,
+				       u32 sh_num, u32 reg_offset)
 {
-	uint32_t val;
+	if (indexed) {
+		uint32_t val;
+		unsigned se_idx = (se_num == 0xffffffff) ? 0 : se_num;
+		unsigned sh_idx = (sh_num == 0xffffffff) ? 0 : sh_num;
+
+		switch (reg_offset) {
+		case mmCC_RB_BACKEND_DISABLE:
+			return adev->gfx.config.rb_config[se_idx][sh_idx].rb_backend_disable;
+		case mmGC_USER_RB_BACKEND_DISABLE:
+			return adev->gfx.config.rb_config[se_idx][sh_idx].user_rb_backend_disable;
+		case mmPA_SC_RASTER_CONFIG:
+			return adev->gfx.config.rb_config[se_idx][sh_idx].raster_config;
+		case mmPA_SC_RASTER_CONFIG_1:
+			return adev->gfx.config.rb_config[se_idx][sh_idx].raster_config_1;
+		}
 
-	mutex_lock(&adev->grbm_idx_mutex);
-	if (se_num != 0xffffffff || sh_num != 0xffffffff)
-		amdgpu_gfx_select_se_sh(adev, se_num, sh_num, 0xffffffff);
+		mutex_lock(&adev->grbm_idx_mutex);
+		if (se_num != 0xffffffff || sh_num != 0xffffffff)
+			amdgpu_gfx_select_se_sh(adev, se_num, sh_num, 0xffffffff);
 
-	val = RREG32(reg_offset);
+		val = RREG32(reg_offset);
 
-	if (se_num != 0xffffffff || sh_num != 0xffffffff)
-		amdgpu_gfx_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
-	mutex_unlock(&adev->grbm_idx_mutex);
-	return val;
+		if (se_num != 0xffffffff || sh_num != 0xffffffff)
+			amdgpu_gfx_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
+		mutex_unlock(&adev->grbm_idx_mutex);
+		return val;
+	} else {
+		unsigned idx;
+
+   switch (reg_offset) {
+   case mmGB_ADDR_CONFIG:
+   return adev->gfx.config.gb_addr_config;
+   case mmMC_ARB_RAMCFG:
+   return adev->gfx.config.mc_arb_ramcfg;
+   case mmGB_TILE_MODE0:
+   case mmGB_TILE_MODE1:
+   case mmGB_TILE_MODE2:
+   case mmGB_TILE_MODE3:
+   case mmGB_TILE_MODE4:
+   case mmGB_TILE_MODE5:
+   case mmGB_TILE_MODE6:
+   case mmGB_TILE_MODE7:
+   case mmGB_TILE_MODE8:
+   case mmGB_TILE_MODE9:
+   case mmGB_TILE_MODE10:
+   case mmGB_TILE_MODE11:
+   case mmGB_TILE_MODE12:
+   case mmGB_TILE_MODE13:
+   case mmGB_TILE_MODE14:
+   case mmGB_TILE_MODE15:
+   case mmGB_TILE_MODE16:
+   case mmGB_TILE_MODE17:
+   case mmGB_TILE_MODE18:
+   case mmGB_TILE_MODE19:
+   case mmGB_TILE_MODE20:
+   case mmGB_TILE_MODE21:
+   case mmGB_TILE_MODE22:
+   case mmGB_TILE_MODE23:
+   case mmGB_TILE_MODE24:
+   case mmGB_TILE_MODE25:
+   case mmGB_TILE_MODE26:
+   case mmGB_TILE_MODE27:
+   case mmGB_TILE_MODE28:
+   case mmGB_TILE_MODE29:
+   case mmGB_TILE_MODE30:
+   case mmGB_TILE_MODE31:
+   idx = (reg_offset - mmGB_TILE_MODE0);
+   return adev->gfx.config.tile_mode_array[idx];
+   case mmGB_MACROTILE_MODE0:
+   case mmGB_MACROTILE_MODE1:
+   case mmGB_MACROTILE_MODE2:
+   case mmGB_MACROTILE_MODE3:
+   case mmGB_MACROTILE_MODE4:
+   case mmGB_MACROTILE_MODE5:
+   case mmGB_MACROTILE_MODE6:
+   case mmGB_MACROTILE_MODE7:
+   case mmGB_MACROTILE_MODE8:
+   case mmGB_MACROTILE_MODE9:
+   case mmGB_MACROTILE_MODE10:
+

Re: [PATCH] amdgpu: Downgrade DRM_ERROR to DRM_DEBUG in amdgpu_queue_mgr_map

2017-11-21 Thread Christian König

On 21.11.2017 at 18:29, Michel Dänzer wrote:

From: Michel Dänzer 

Prevent buggy userspace from spamming dmesg.

Signed-off-by: Michel Dänzer 


Once more the subject line; apart from that, Reviewed-by: Christian König




---
  drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 10 +++++-----
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
index 93d86619e802..262c1267249e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
@@ -225,7 +225,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
  
  	/* Right now all IPs have only one instance - multiple rings. */

if (instance != 0) {
-   DRM_ERROR("invalid ip instance: %d\n", instance);
+   DRM_DEBUG("invalid ip instance: %d\n", instance);
return -EINVAL;
}
  
@@ -255,13 +255,13 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,

ip_num_rings = adev->vcn.num_enc_rings;
break;
default:
-   DRM_ERROR("unknown ip type: %d\n", hw_ip);
+   DRM_DEBUG("unknown ip type: %d\n", hw_ip);
return -EINVAL;
}
  
  	if (ring >= ip_num_rings) {

-   DRM_ERROR("Ring index:%d exceeds maximum:%d for ip:%d\n",
-   ring, ip_num_rings, hw_ip);
+   DRM_DEBUG("Ring index:%d exceeds maximum:%d for ip:%d\n",
+ ring, ip_num_rings, hw_ip);
return -EINVAL;
}
  
@@ -292,7 +292,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,

default:
*out_ring = NULL;
r = -EINVAL;
-   DRM_ERROR("unknown HW IP type: %d\n", mapper->hw_ip);
+   DRM_DEBUG("unknown HW IP type: %d\n", mapper->hw_ip);
}
  
  out_unlock:



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] amdgpu: Use unsigned HW IP/instance/ring indices in amdgpu_queue_mgr_map

2017-11-21 Thread Christian König

On 21.11.2017 at 18:29, Michel Dänzer wrote:

From: Michel Dänzer 

This matches the corresponding UAPI fields. Treating the ring index as
signed could result in accessing random unrelated memory if the MSB was
set.

Fixes: effd924d2f3b ("drm/amdgpu: untie user ring ids from kernel ring
   ids v6")
Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 


Subject line is off, with that fixed Reviewed-by: Christian König.



---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h   | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 86f91789de6d..f8657c37ba9d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -722,7 +722,7 @@ int amdgpu_queue_mgr_fini(struct amdgpu_device *adev,
  struct amdgpu_queue_mgr *mgr);
  int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
 struct amdgpu_queue_mgr *mgr,
-int hw_ip, int instance, int ring,
+u32 hw_ip, u32 instance, u32 ring,
 struct amdgpu_ring **out_ring);
  
  /*

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
index 190e28cb827e..93d86619e802 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
@@ -63,7 +63,7 @@ static int amdgpu_update_cached_map(struct amdgpu_queue_mapper *mapper,
  
  static int amdgpu_identity_map(struct amdgpu_device *adev,

   struct amdgpu_queue_mapper *mapper,
-  int ring,
+  u32 ring,
   struct amdgpu_ring **out_ring)
  {
switch (mapper->hw_ip) {
@@ -121,7 +121,7 @@ static enum amdgpu_ring_type amdgpu_hw_ip_to_ring_type(int hw_ip)
  
  static int amdgpu_lru_map(struct amdgpu_device *adev,

  struct amdgpu_queue_mapper *mapper,
- int user_ring, bool lru_pipe_order,
+ u32 user_ring, bool lru_pipe_order,
  struct amdgpu_ring **out_ring)
  {
int r, i, j;
@@ -208,7 +208,7 @@ int amdgpu_queue_mgr_fini(struct amdgpu_device *adev,
   */
  int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
 struct amdgpu_queue_mgr *mgr,
-int hw_ip, int instance, int ring,
+u32 hw_ip, u32 instance, u32 ring,
 struct amdgpu_ring **out_ring)
  {
int r, ip_num_rings;



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] amdgpu: Set adev->vcn.irq.num_types for VCN

2017-11-21 Thread Christian König

On 21.11.2017 at 18:28, Michel Dänzer wrote:

From: Michel Dänzer 

We were setting adev->uvd.irq.num_types instead.

Fixes: 9b257116e784 ("drm/amdgpu: add vcn enc irq support")
Signed-off-by: Michel Dänzer 


Subject line is wrong :) Apart from that Reviewed-by: Christian König.



---
  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 1eb4d79d6e30..0450ac5ba6b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -1175,7 +1175,7 @@ static const struct amdgpu_irq_src_funcs vcn_v1_0_irq_funcs = {
  
  static void vcn_v1_0_set_irq_funcs(struct amdgpu_device *adev)

  {
-   adev->uvd.irq.num_types = adev->vcn.num_enc_rings + 1;
+   adev->vcn.irq.num_types = adev->vcn.num_enc_rings + 1;
 	adev->vcn.irq.funcs = &vcn_v1_0_irq_funcs;
  }
  



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: move UVD/VCE and VCN structure out from union

2017-11-21 Thread Leo Liu



On 11/21/2017 12:17 PM, Michel Dänzer wrote:

On 2017-11-21 03:28 PM, Leo Liu wrote:

With the enablement of VCN Dec and Enc from user space, user space queries the
kernel for the IP information. If the HW has UVD/VCE, the info comes from
those IP blocks, but this could end up being mis-interpreted for VCN when they
are in the union, and the other way around when the HW has a VCN block.

Signed-off-by: Leo Liu 

Please add:

Fixes: 95d0906f8506 ("drm/amdgpu: add initial vcn support and decode
   tests")
Cc: sta...@vger.kernel.org
Reviewed-and-Tested-by: Michel Dänzer 

Thanks for the comment.
Sorry that I have already pushed it to the internal branch, since my
Polaris10 fails to light up with newer Mesa.


Leo

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
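
The aliasing problem in a nutshell (an illustrative sketch, not the actual
amdgpu structures):

	struct ip_blocks {
		union {				/* before: shared storage */
			struct { int uvd_rings; int vce_rings; } legacy;
			struct { int vcn_rings; } vcn;
		};
	};

	/* Writing blocks.vcn.vcn_rings also changes blocks.legacy.uvd_rings,
	 * so an IP-info query through the wrong view returns garbage; moving
	 * UVD/VCE and VCN out of the union gives each block its own fields. */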


RE: [PATCH] amdgpu: Downgrade DRM_ERROR to DRM_DEBUG in amdgpu_queue_mgr_map

2017-11-21 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Michel Dänzer
> Sent: Tuesday, November 21, 2017 12:30 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] amdgpu: Downgrade DRM_ERROR to DRM_DEBUG in
> amdgpu_queue_mgr_map
> 
> From: Michel Dänzer 
> 
> Prevent buggy userspace from spamming dmesg.
> 
> Signed-off-by: Michel Dänzer 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
> index 93d86619e802..262c1267249e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
> @@ -225,7 +225,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
> 
>   /* Right now all IPs have only one instance - multiple rings. */
>   if (instance != 0) {
> - DRM_ERROR("invalid ip instance: %d\n", instance);
> + DRM_DEBUG("invalid ip instance: %d\n", instance);
>   return -EINVAL;
>   }
> 
> @@ -255,13 +255,13 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
>   ip_num_rings = adev->vcn.num_enc_rings;
>   break;
>   default:
> - DRM_ERROR("unknown ip type: %d\n", hw_ip);
> + DRM_DEBUG("unknown ip type: %d\n", hw_ip);
>   return -EINVAL;
>   }
> 
>   if (ring >= ip_num_rings) {
> - DRM_ERROR("Ring index:%d exceeds maximum:%d for
> ip:%d\n",
> - ring, ip_num_rings, hw_ip);
> + DRM_DEBUG("Ring index:%d exceeds maximum:%d for
> ip:%d\n",
> +   ring, ip_num_rings, hw_ip);
>   return -EINVAL;
>   }
> 
> @@ -292,7 +292,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
>   default:
>   *out_ring = NULL;
>   r = -EINVAL;
> - DRM_ERROR("unknown HW IP type: %d\n", mapper-
> >hw_ip);
> + DRM_DEBUG("unknown HW IP type: %d\n", mapper-
> >hw_ip);
>   }
> 
>  out_unlock:
> --
> 2.15.0
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] amdgpu: Set adev->vcn.irq.num_types for VCN

2017-11-21 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Michel Dänzer
> Sent: Tuesday, November 21, 2017 12:29 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Liu, Leo
> Subject: [PATCH] amdgpu: Set adev->vcn.irq.num_types for VCN
> 
> From: Michel Dänzer 
> 
> We were setting adev->uvd.irq.num_types instead.
> 
> Fixes: 9b257116e784 ("drm/amdgpu: add vcn enc irq support")
> Signed-off-by: Michel Dänzer 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> index 1eb4d79d6e30..0450ac5ba6b6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> @@ -1175,7 +1175,7 @@ static const struct amdgpu_irq_src_funcs vcn_v1_0_irq_funcs = {
> 
>  static void vcn_v1_0_set_irq_funcs(struct amdgpu_device *adev)
>  {
> - adev->uvd.irq.num_types = adev->vcn.num_enc_rings + 1;
> + adev->vcn.irq.num_types = adev->vcn.num_enc_rings + 1;
>  	adev->vcn.irq.funcs = &vcn_v1_0_irq_funcs;
>  }
> 
> --
> 2.15.0
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] amdgpu: Downgrade DRM_ERROR to DRM_DEBUG in amdgpu_queue_mgr_map

2017-11-21 Thread Michel Dänzer
From: Michel Dänzer 

Prevent buggy userspace from spamming dmesg.

Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
index 93d86619e802..262c1267249e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
@@ -225,7 +225,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
 
/* Right now all IPs have only one instance - multiple rings. */
if (instance != 0) {
-   DRM_ERROR("invalid ip instance: %d\n", instance);
+   DRM_DEBUG("invalid ip instance: %d\n", instance);
return -EINVAL;
}
 
@@ -255,13 +255,13 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
ip_num_rings = adev->vcn.num_enc_rings;
break;
default:
-   DRM_ERROR("unknown ip type: %d\n", hw_ip);
+   DRM_DEBUG("unknown ip type: %d\n", hw_ip);
return -EINVAL;
}
 
if (ring >= ip_num_rings) {
-   DRM_ERROR("Ring index:%d exceeds maximum:%d for ip:%d\n",
-   ring, ip_num_rings, hw_ip);
+   DRM_DEBUG("Ring index:%d exceeds maximum:%d for ip:%d\n",
+ ring, ip_num_rings, hw_ip);
return -EINVAL;
}
 
@@ -292,7 +292,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
default:
*out_ring = NULL;
r = -EINVAL;
-   DRM_ERROR("unknown HW IP type: %d\n", mapper->hw_ip);
+   DRM_DEBUG("unknown HW IP type: %d\n", mapper->hw_ip);
}
 
 out_unlock:
-- 
2.15.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] amdgpu: Use unsigned HW IP/instance/ring indices in amdgpu_queue_mgr_map

2017-11-21 Thread Michel Dänzer
From: Michel Dänzer 

This matches the corresponding UAPI fields. Treating the ring index as
signed could result in accessing random unrelated memory if the MSB was
set.

Fixes: effd924d2f3b ("drm/amdgpu: untie user ring ids from kernel ring
  ids v6")
Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 86f91789de6d..f8657c37ba9d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -722,7 +722,7 @@ int amdgpu_queue_mgr_fini(struct amdgpu_device *adev,
  struct amdgpu_queue_mgr *mgr);
 int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
 struct amdgpu_queue_mgr *mgr,
-int hw_ip, int instance, int ring,
+u32 hw_ip, u32 instance, u32 ring,
 struct amdgpu_ring **out_ring);
 
 /*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
index 190e28cb827e..93d86619e802 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
@@ -63,7 +63,7 @@ static int amdgpu_update_cached_map(struct 
amdgpu_queue_mapper *mapper,
 
 static int amdgpu_identity_map(struct amdgpu_device *adev,
   struct amdgpu_queue_mapper *mapper,
-  int ring,
+  u32 ring,
   struct amdgpu_ring **out_ring)
 {
switch (mapper->hw_ip) {
@@ -121,7 +121,7 @@ static enum amdgpu_ring_type amdgpu_hw_ip_to_ring_type(int 
hw_ip)
 
 static int amdgpu_lru_map(struct amdgpu_device *adev,
  struct amdgpu_queue_mapper *mapper,
- int user_ring, bool lru_pipe_order,
+ u32 user_ring, bool lru_pipe_order,
  struct amdgpu_ring **out_ring)
 {
int r, i, j;
@@ -208,7 +208,7 @@ int amdgpu_queue_mgr_fini(struct amdgpu_device *adev,
  */
 int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
 struct amdgpu_queue_mgr *mgr,
-int hw_ip, int instance, int ring,
+u32 hw_ip, u32 instance, u32 ring,
 struct amdgpu_ring **out_ring)
 {
int r, ip_num_rings;
-- 
2.15.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
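
A stand-alone sketch of the failure mode the patch describes, with made-up
values (not kernel code): an index with the MSB set is negative as an int,
slips past the "ring >= ip_num_rings" bound check, and would then be used
as a wildly out-of-range array index; as u32 the same value is rejected.

#include <stdio.h>

int main(void)
{
	int ip_num_rings = 4;
	int ring = (int)0x80000001u;	/* MSB set: negative on common ABIs */

	if (!(ring >= ip_num_rings))	/* signed check wrongly passes */
		printf("signed: would index rings[%d]\n", ring);

	unsigned int uring = 0x80000001u;
	if (uring >= (unsigned int)ip_num_rings)	/* unsigned check catches it */
		printf("unsigned: rejected with -EINVAL\n");
	return 0;
}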


[PATCH] amdgpu: Set adev->vcn.irq.num_types for VCN

2017-11-21 Thread Michel Dänzer
From: Michel Dänzer 

We were setting adev->uvd.irq.num_types instead.

Fixes: 9b257116e784 ("drm/amdgpu: add vcn enc irq support")
Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 1eb4d79d6e30..0450ac5ba6b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -1175,7 +1175,7 @@ static const struct amdgpu_irq_src_funcs 
vcn_v1_0_irq_funcs = {
 
 static void vcn_v1_0_set_irq_funcs(struct amdgpu_device *adev)
 {
-   adev->uvd.irq.num_types = adev->vcn.num_enc_rings + 1;
+   adev->vcn.irq.num_types = adev->vcn.num_enc_rings + 1;
adev->vcn.irq.funcs = _v1_0_irq_funcs;
 }
 
-- 
2.15.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: move UVD/VCE and VCN structure out from union

2017-11-21 Thread Michel Dänzer
On 2017-11-21 03:28 PM, Leo Liu wrote:
> With the enablement of VCN Dec and Enc from user space, user space queries
> the kernel for the IP information. If the HW has UVD/VCE, the info comes
> from those IP blocks, but it could end up misinterpreted for VCN when they
> share a union; the same happens the other way around on HW with a VCN block.
> 
> Signed-off-by: Leo Liu 

Please add:

Fixes: 95d0906f8506 ("drm/amdgpu: add initial vcn support and decode
  tests")
Cc: sta...@vger.kernel.org
Reviewed-and-Tested-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/2] drm/radeon: Add dpm quirk for Jet PRO

2017-11-21 Thread Alex Deucher
Fixes stability issues.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=103370
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/radeon/si_dpm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index ee3e74266a13..bd4e9638b744 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -2984,6 +2984,11 @@ static void si_apply_state_adjust_rules(struct 
radeon_device *rdev,
(rdev->pdev->device == 0x6667)) {
max_sclk = 75000;
}
+   if ((rdev->pdev->revision == 0xC3) ||
+   (rdev->pdev->device == 0x6665)) {
+   max_sclk = 65000;
+   max_mclk = 80000;
+   }
} else if (rdev->family == CHIP_OLAND) {
if ((rdev->pdev->revision == 0xC7) ||
(rdev->pdev->revision == 0x80) ||
-- 
2.13.6

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: move UVD/VCE and VCN structure out from union

2017-11-21 Thread Leo Liu
With the enablement of VCN Dec and Enc from user space, user space queries
the kernel for the IP information. If the HW has UVD/VCE, the info comes
from those IP blocks, but it could end up misinterpreted for VCN when they
share a union; the same happens the other way around on HW with a VCN block.

Signed-off-by: Leo Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 750336dce0e9..86f91789de6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1590,18 +1590,14 @@ struct amdgpu_device {
/* sdma */
struct amdgpu_sdma  sdma;
 
-   union {
-   struct {
-   /* uvd */
-   struct amdgpu_uvd   uvd;
-
-   /* vce */
-   struct amdgpu_vce   vce;
-   };
-
-   /* vcn */
-   struct amdgpu_vcn   vcn;
-   };
+   /* uvd */
+   struct amdgpu_uvd   uvd;
+
+   /* vce */
+   struct amdgpu_vce   vce;
+
+   /* vcn */
+   struct amdgpu_vcn   vcn;
 
/* firmwares */
struct amdgpu_firmware  firmware;
-- 
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
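
A toy illustration of the aliasing (hypothetical struct members, not the
real amdgpu structures): with the union, initializing a UVD/VCE member can
leave vcn.num_enc_rings nonzero on hardware that has no VCN at all, which
is exactly what the IP query then trips over.

#include <stdio.h>

struct ex_uvd { unsigned int num_rings; unsigned int fw_version; };
struct ex_vcn { unsigned int num_dec_rings; unsigned int num_enc_rings; };

int main(void)
{
	union {
		struct ex_uvd uvd;
		struct ex_vcn vcn;
	} ip = { { 0, 0 } };

	ip.uvd.num_rings = 1;
	ip.uvd.fw_version = 42;	/* occupies the same bytes as num_enc_rings */

	/* On UVD/VCE hardware this should read 0, but the union aliases it: */
	printf("vcn.num_enc_rings = %u\n", ip.vcn.num_enc_rings);	/* prints 42 */
	return 0;
}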


Re: [PATCH] drm/amdgpu: move UVD/VCE and VCN structure out from union

2017-11-21 Thread Christian König

Am 21.11.2017 um 17:22 schrieb Alex Deucher:

On Tue, Nov 21, 2017 at 11:13 AM, Leo Liu  wrote:

With the current upstream Mesa, the kernel will throw errors and break when
starting the display manager, at least on my Polaris10 card.

This is because Mesa queries the VCN enc IP info, and
"adev->vcn.num_enc_rings" is not zero because of the union.

For some reason the patch doesn't seem to have come through properly
on the mailing list.


Yeah, don't see that either.



Acked-by: Alex Deucher 


Reviewed-by: Christian König 

Regards,
Christian.




[ 9066.794232] BUG: unable to handle kernel paging request at
99b641ec
[ 9066.794293] IP: amdgpu_info_ioctl+0xee2/0x1070 [amdgpu]
[ 9066.794295] PGD 44229067 P4D 44229067 PUD 0
[ 9066.794300] Oops:  [#1] SMP
[ 9066.794302] Modules linked in: fuse amdgpu(OE) mfd_core chash ttm k10temp
i2c_piix4
[ 9066.794310] CPU: 3 PID: 24999 Comm: Xorg Tainted: G   OE
4.14.0-rc3+ #4
[ 9066.794311] Hardware name: Gigabyte Technology Co., Ltd.
GA-880GMA-UD2H/GA-880GMA-UD2H, BIOS F5 09/30/2010
[ 9066.794313] task: 99b62a930040 task.stack: adf280c6
[ 9066.794339] RIP: 0010:amdgpu_info_ioctl+0xee2/0x1070 [amdgpu]
[ 9066.794340] RSP: 0018:adf280c63b70 EFLAGS: 00010217
[ 9066.794342] RAX:  RBX: adf280c63db0 RCX:
00063c59
[ 9066.794344] RDX: 99b641ec RSI:  RDI:

[ 9066.794345] RBP: adf280c63d28 R08: e200 R09:
0001
[ 9066.794346] R10:  R11:  R12:
99b61e7b
[ 9066.794348] R13: c0127730 R14: 0020 R15:
7ffd852f0410
[ 9066.794350] FS:  7f0265507a00() GS:99b63fcc()
knlGS:
[ 9066.794351] CS:  0010 DS:  ES:  CR0: 80050033
[ 9066.794352] CR2: 99b641ec CR3: 00013763d000 CR4:
06e0
[ 9066.794354] Call Trace:
[ 9066.794359]  ? kernel_text_address+0x69/0xc0
[ 9066.794362]  ? rcu_read_lock_sched_held+0x1d/0x60
[ 9066.794365]  ? module_assert_mutex_or_preempt+0x13/0x40
[ 9066.794366]  ? __module_address+0x27/0xf0
[ 9066.794391]  ? amdgpu_drm_ioctl+0x32/0x80 [amdgpu]
[ 9066.794395]  ? entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 9066.794396]  ? __kernel_text_address+0xd/0x40
[ 9066.794399]  ? unwind_get_return_address+0x1a/0x30
[ 9066.794402]  ? __save_stack_trace+0x61/0xd0
[ 9066.794404]  ? entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 9066.794430]  ? amdgpu_debugfs_firmware_info+0x290/0x290 [amdgpu]
[ 9066.794433]  drm_ioctl_kernel+0x64/0xb0
[ 9066.794435]  drm_ioctl+0x30a/0x3d0
[ 9066.794461]  ? amdgpu_debugfs_firmware_info+0x290/0x290 [amdgpu]
[ 9066.794464]  ? trace_hardirqs_on_caller+0x11f/0x190
[ 9066.794466]  ? trace_hardirqs_on+0xd/0x10
[ 9066.794492]  amdgpu_drm_ioctl+0x47/0x80 [amdgpu]
[ 9066.794495]  do_vfs_ioctl+0x8e/0x640
[ 9066.794497]  ? trace_hardirqs_on+0xd/0x10
[ 9066.794500]  ? security_file_ioctl+0x3e/0x60


On 11/21/2017 09:28 AM, Leo Liu wrote:

With the enablement of VCN Dec and Enc from user space, user space queries
the kernel for the IP information. If the HW has UVD/VCE, the info comes
from those IP blocks, but it could end up misinterpreted for VCN when they
share a union; the same happens the other way around on HW with a VCN block.

Signed-off-by: Leo Liu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 20 
  1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 750336dce0e9..86f91789de6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1590,18 +1590,14 @@ struct amdgpu_device {
   /* sdma */
   struct amdgpu_sdma sdma;

- union {
- struct {
- /* uvd */
- struct amdgpu_uvd uvd;
-
- /* vce */
- struct amdgpu_vce vce;
- };
-
- /* vcn */
- struct amdgpu_vcn vcn;
- };
+ /* uvd */
+ struct amdgpu_uvd uvd;
+
+ /* vce */
+ struct amdgpu_vce vce;
+
+ /* vcn */
+ struct amdgpu_vcn vcn;

   /* firmwares */
   struct amdgpu_firmware firmware;



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation

2017-11-21 Thread Felix Kuehling

On 2017-11-21 06:44 AM, Oded Gabbay wrote:
> Thanks Felix for catching that. For some reason I remembered the EOP
> buffer should be the same size as the queue.

The EOP queue size is hard-coded to prop.eop_ring_buffer_size =
PAGE_SIZE for kernel queues in initialize in kfd_kernel_queue.c. I'm not
too familiar with the HW/FW details. But I see this comment in
kfd_mqd_manager_vi.c:

/*
 * HW does not clamp this field correctly. Maximum EOP queue size
 * is constrained by per-SE EOP done signal count, which is 8-bit.
 * Limit is 0xFF EOP entries (= 0x7F8 dwords). CP will not submit
 * more than (EOP entry count - 1) so a queue size of 0x800 dwords
 * is safe, giving a maximum field value of 0xA.
 */

With that the maximum possible EOP queue size would be two pages,
regardless of the queue size.

> Then can we remove the queue size parameter from that function?

Not the way the code is currently organized. Currently struct
kernel_queue_ops is shared for ASIC-independent and ASIC-specific queue
ops. The ASIC-independent initialize function in kfd_kernel_queue.c
still needs this parameter.

That said, the kernel_queue stuff could be cleaned up a bit in general.
IMO the hardware-independent functions don't really need to be called
through function pointers. The ASIC-specific function pointers don't
need to be in the kernel_queue structure, they could be in kfd_dev.

Regards,
  Felix

>
> On Mon, Nov 20, 2017 at 9:22 PM, Felix Kuehling  
> wrote:
>> I think this patch is not correct. The EOP-mem is not associated with
>> the queue size. The EOP buffer is a separate buffer used by the firmware
>> to handle command completion. As I understand it, this allows more
>> concurrency, while still making it look like all commands in the queue
>> are completing in order.
>>
>> Regards,
>>   Felix
>>
>>
>> On 2017-11-19 03:19 AM, Oded Gabbay wrote:
>>> On Thu, Nov 16, 2017 at 11:36 PM, Jan Vesely  wrote:
 Signed-off-by: Jan Vesely 
 ---
  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

 diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c 
 b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
 index f1d48281e322..b3bee39661ab 100644
 --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
 +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
 @@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue *kq, 
 struct kfd_dev *dev,
 enum kfd_queue_type type, unsigned int queue_size)
  {
 int retval;
 +   unsigned int size = ALIGN(queue_size, PAGE_SIZE);

 -   retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, &kq->eop_mem);
 +   retval = kfd_gtt_sa_allocate(dev, size, &kq->eop_mem);
 if (retval != 0)
 return false;

 kq->eop_gpu_addr = kq->eop_mem->gpu_addr;
 kq->eop_kernel_addr = kq->eop_mem->cpu_ptr;

 -   memset(kq->eop_kernel_addr, 0, PAGE_SIZE);
 +   memset(kq->eop_kernel_addr, 0, size);

 return true;
  }
 --
 2.13.6

 ___
 amd-gfx mailing list
 amd-gfx@lists.freedesktop.org
 https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>> Thanks!
>>> Applied to -next tree
>>> Oded
>>> ___
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: move UVD/VCE and VCN structure out from union

2017-11-21 Thread Alex Deucher
On Tue, Nov 21, 2017 at 11:13 AM, Leo Liu  wrote:
> With the current upstream Mesa, the kernel will throw errors and break when
> starting the display manager, at least on my Polaris10 card.
>
> This is because Mesa queries the VCN enc IP info, and
> "adev->vcn.num_enc_rings" is not zero because of the union.

For some reason the patch doesn't seem to have come through properly
on the mailing list.

Acked-by: Alex Deucher 

>
> [ 9066.794232] BUG: unable to handle kernel paging request at
> 99b641ec
> [ 9066.794293] IP: amdgpu_info_ioctl+0xee2/0x1070 [amdgpu]
> [ 9066.794295] PGD 44229067 P4D 44229067 PUD 0
> [ 9066.794300] Oops:  [#1] SMP
> [ 9066.794302] Modules linked in: fuse amdgpu(OE) mfd_core chash ttm k10temp
> i2c_piix4
> [ 9066.794310] CPU: 3 PID: 24999 Comm: Xorg Tainted: G   OE
> 4.14.0-rc3+ #4
> [ 9066.794311] Hardware name: Gigabyte Technology Co., Ltd.
> GA-880GMA-UD2H/GA-880GMA-UD2H, BIOS F5 09/30/2010
> [ 9066.794313] task: 99b62a930040 task.stack: adf280c6
> [ 9066.794339] RIP: 0010:amdgpu_info_ioctl+0xee2/0x1070 [amdgpu]
> [ 9066.794340] RSP: 0018:adf280c63b70 EFLAGS: 00010217
> [ 9066.794342] RAX:  RBX: adf280c63db0 RCX:
> 00063c59
> [ 9066.794344] RDX: 99b641ec RSI:  RDI:
> 
> [ 9066.794345] RBP: adf280c63d28 R08: e200 R09:
> 0001
> [ 9066.794346] R10:  R11:  R12:
> 99b61e7b
> [ 9066.794348] R13: c0127730 R14: 0020 R15:
> 7ffd852f0410
> [ 9066.794350] FS:  7f0265507a00() GS:99b63fcc()
> knlGS:
> [ 9066.794351] CS:  0010 DS:  ES:  CR0: 80050033
> [ 9066.794352] CR2: 99b641ec CR3: 00013763d000 CR4:
> 06e0
> [ 9066.794354] Call Trace:
> [ 9066.794359]  ? kernel_text_address+0x69/0xc0
> [ 9066.794362]  ? rcu_read_lock_sched_held+0x1d/0x60
> [ 9066.794365]  ? module_assert_mutex_or_preempt+0x13/0x40
> [ 9066.794366]  ? __module_address+0x27/0xf0
> [ 9066.794391]  ? amdgpu_drm_ioctl+0x32/0x80 [amdgpu]
> [ 9066.794395]  ? entry_SYSCALL_64_fastpath+0x1f/0xbe
> [ 9066.794396]  ? __kernel_text_address+0xd/0x40
> [ 9066.794399]  ? unwind_get_return_address+0x1a/0x30
> [ 9066.794402]  ? __save_stack_trace+0x61/0xd0
> [ 9066.794404]  ? entry_SYSCALL_64_fastpath+0x1f/0xbe
> [ 9066.794430]  ? amdgpu_debugfs_firmware_info+0x290/0x290 [amdgpu]
> [ 9066.794433]  drm_ioctl_kernel+0x64/0xb0
> [ 9066.794435]  drm_ioctl+0x30a/0x3d0
> [ 9066.794461]  ? amdgpu_debugfs_firmware_info+0x290/0x290 [amdgpu]
> [ 9066.794464]  ? trace_hardirqs_on_caller+0x11f/0x190
> [ 9066.794466]  ? trace_hardirqs_on+0xd/0x10
> [ 9066.794492]  amdgpu_drm_ioctl+0x47/0x80 [amdgpu]
> [ 9066.794495]  do_vfs_ioctl+0x8e/0x640
> [ 9066.794497]  ? trace_hardirqs_on+0xd/0x10
> [ 9066.794500]  ? security_file_ioctl+0x3e/0x60
>
>
> On 11/21/2017 09:28 AM, Leo Liu wrote:
>
> With the enablement of VCN Dec and Enc from user space, user space queries
> the kernel for the IP information. If the HW has UVD/VCE, the info comes
> from those IP blocks, but it could end up misinterpreted for VCN when they
> share a union; the same happens the other way around on HW with a VCN block.
>
> Signed-off-by: Leo Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 20 
>  1 file changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 750336dce0e9..86f91789de6d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1590,18 +1590,14 @@ struct amdgpu_device {
>   /* sdma */
>   struct amdgpu_sdma sdma;
>
> - union {
> - struct {
> - /* uvd */
> - struct amdgpu_uvd uvd;
> -
> - /* vce */
> - struct amdgpu_vce vce;
> - };
> -
> - /* vcn */
> - struct amdgpu_vcn vcn;
> - };
> + /* uvd */
> + struct amdgpu_uvd uvd;
> +
> + /* vce */
> + struct amdgpu_vce vce;
> +
> + /* vcn */
> + struct amdgpu_vcn vcn;
>
>   /* firmwares */
>   struct amdgpu_firmware firmware;
>
>
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[pull] amdgpu drm-next-4.15

2017-11-21 Thread Alex Deucher
Hi Dave,

A few more misc fixes for 4.15.  It doesn't look like you pulled
my request from last week.  Those fixes are in this branch as well.

The following changes since commit 451cc55dd17fa5130f05629ac8d90e32facf27f6:

  drm/amd/pp: fix dpm randomly failed on Vega10 (2017-11-15 14:03:45 -0500)

are available in the git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-next-4.15

for you to fetch changes up to 446947b44fb8cabc0213ff4efd706931e36b1963:

  drm/amdgpu: fix rmmod KCQ disable failed error (2017-11-21 10:45:05 -0500)


Alex Deucher (2):
  Revert "drm/radeon: dont switch vt on suspend"
  drm/amdgpu: don't skip attributes when powerplay is enabled

Eric Huang (1):
  drm/amd/powerplay: fix unfreeze level smc message for smu7

Monk Liu (2):
  drm/amdgpu:fix memleak in takedown
  drm/amdgpu:fix memleak

Rex Zhu (1):
  drm/amd/pp: fix typecast error in powerplay.

Roger He (1):
  drm/amd/amdgpu: fix over-bound accessing in amdgpu_cs_wait_any_fence

Wang Hongcheng (1):
  drm/amdgpu: fix rmmod KCQ disable failed error

Xiangliang.Yu (1):
  drm/amdgpu: fix kernel hang when starting VNC server

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c  | 4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c   | 3 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c| 5 -
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   | 8 
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   | 9 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 8 
 drivers/gpu/drm/amd/powerplay/hwmgr/process_pptables_v1_0.c | 4 ++--
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c| 2 +-
 drivers/gpu/drm/radeon/radeon_fb.c  | 1 -
 14 files changed, 37 insertions(+), 25 deletions(-)
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: move UVD/VCE and VCN structure out from union

2017-11-21 Thread Leo Liu
With the current upstream Mesa, the kernel will throw errors and break when
starting the display manager, at least on my Polaris10 card.


This is because Mesa queries the VCN enc IP info, and
"adev->vcn.num_enc_rings" is not zero because of the union.


[ 9066.794232] BUG: unable to handle kernel paging request at 
99b641ec

[ 9066.794293] IP: amdgpu_info_ioctl+0xee2/0x1070 [amdgpu]
[ 9066.794295] PGD 44229067 P4D 44229067 PUD 0
[ 9066.794300] Oops:  [#1] SMP
[ 9066.794302] Modules linked in: fuse amdgpu(OE) mfd_core chash ttm 
k10temp i2c_piix4

[ 9066.794310] CPU: 3 PID: 24999 Comm: Xorg Tainted: G OE   4.14.0-rc3+ #4
[ 9066.794311] Hardware name: Gigabyte Technology Co., Ltd. 
GA-880GMA-UD2H/GA-880GMA-UD2H, BIOS F5 09/30/2010

[ 9066.794313] task: 99b62a930040 task.stack: adf280c6
[ 9066.794339] RIP: 0010:amdgpu_info_ioctl+0xee2/0x1070 [amdgpu]
[ 9066.794340] RSP: 0018:adf280c63b70 EFLAGS: 00010217
[ 9066.794342] RAX:  RBX: adf280c63db0 RCX: 
00063c59
[ 9066.794344] RDX: 99b641ec RSI:  RDI: 

[ 9066.794345] RBP: adf280c63d28 R08: e200 R09: 
0001
[ 9066.794346] R10:  R11:  R12: 
99b61e7b
[ 9066.794348] R13: c0127730 R14: 0020 R15: 
7ffd852f0410
[ 9066.794350] FS:  7f0265507a00() GS:99b63fcc() 
knlGS:

[ 9066.794351] CS:  0010 DS:  ES:  CR0: 80050033
[ 9066.794352] CR2: 99b641ec CR3: 00013763d000 CR4: 
06e0

[ 9066.794354] Call Trace:
[ 9066.794359]  ? kernel_text_address+0x69/0xc0
[ 9066.794362]  ? rcu_read_lock_sched_held+0x1d/0x60
[ 9066.794365]  ? module_assert_mutex_or_preempt+0x13/0x40
[ 9066.794366]  ? __module_address+0x27/0xf0
[ 9066.794391]  ? amdgpu_drm_ioctl+0x32/0x80 [amdgpu]
[ 9066.794395]  ? entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 9066.794396]  ? __kernel_text_address+0xd/0x40
[ 9066.794399]  ? unwind_get_return_address+0x1a/0x30
[ 9066.794402]  ? __save_stack_trace+0x61/0xd0
[ 9066.794404]  ? entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 9066.794430]  ? amdgpu_debugfs_firmware_info+0x290/0x290 [amdgpu]
[ 9066.794433]  drm_ioctl_kernel+0x64/0xb0
[ 9066.794435]  drm_ioctl+0x30a/0x3d0
[ 9066.794461]  ? amdgpu_debugfs_firmware_info+0x290/0x290 [amdgpu]
[ 9066.794464]  ? trace_hardirqs_on_caller+0x11f/0x190
[ 9066.794466]  ? trace_hardirqs_on+0xd/0x10
[ 9066.794492]  amdgpu_drm_ioctl+0x47/0x80 [amdgpu]
[ 9066.794495]  do_vfs_ioctl+0x8e/0x640
[ 9066.794497]  ? trace_hardirqs_on+0xd/0x10
[ 9066.794500]  ? security_file_ioctl+0x3e/0x60


On 11/21/2017 09:28 AM, Leo Liu wrote:

With the enablement of VCN Dec and Enc from user space, user space queries
the kernel for the IP information. If the HW has UVD/VCE, the info comes
from those IP blocks, but it could end up misinterpreted for VCN when they
share a union; the same happens the other way around on HW with a VCN block.

Signed-off-by: Leo Liu
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 20 
  1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 750336dce0e9..86f91789de6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1590,18 +1590,14 @@ struct amdgpu_device {
/* sdma */
struct amdgpu_sdma  sdma;
  
-	union {

-   struct {
-   /* uvd */
-   struct amdgpu_uvd   uvd;
-
-   /* vce */
-   struct amdgpu_vce   vce;
-   };
-
-   /* vcn */
-   struct amdgpu_vcn   vcn;
-   };
+   /* uvd */
+   struct amdgpu_uvd   uvd;
+
+   /* vce */
+   struct amdgpu_vce   vce;
+
+   /* vcn */
+   struct amdgpu_vcn   vcn;
  
  	/* firmwares */

struct amdgpu_firmware  firmware;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/5] drm/amd/powerplay: Minor fixes in processpptables.c

2017-11-21 Thread Alex Deucher
On Tue, Nov 21, 2017 at 10:49 AM, Ernst Sjöstrand  wrote:
> I think my idea here was that if it's not updated later for some
> reason we shouldn't silently return success.
> But I'm guessing this can't happen with current hardware at least.

Right, we shouldn't actually hit this case, but if we did, it's fine to
return success.

Alex

>
> Regards
> //Ernst
>
> 2017-11-21 16:15 GMT+01:00 Alex Deucher :
>> On Sun, Nov 19, 2017 at 12:52 PM, Ernst Sjöstrand  wrote:
>>> Reported by smatch:
>>> init_overdrive_limits() error: uninitialized symbol 'result'.
>>> get_clock_voltage_dependency_table() warn: inconsistent indenting
>>>
>>> Signed-off-by: Ernst Sjöstrand 
>>> ---
>>>  drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c | 6 +++---
>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c 
>>> b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>>> index afae32ee2b0d..7c5b426320f1 100644
>>> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>>> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>>> @@ -394,8 +394,8 @@ static int get_clock_voltage_dependency_table(struct 
>>> pp_hwmgr *hwmgr,
>>> dep_table->entries[i].clk =
>>> ((unsigned long)table->entries[i].ucClockHigh << 
>>> 16) |
>>> le16_to_cpu(table->entries[i].usClockLow);
>>> -   dep_table->entries[i].v =
>>> -   (unsigned 
>>> long)le16_to_cpu(table->entries[i].usVoltage);
>>> +   dep_table->entries[i].v =
>>> +   (unsigned 
>>> long)le16_to_cpu(table->entries[i].usVoltage);
>>> }
>>>
>>> *ptable = dep_table;
>>> @@ -1042,7 +1042,7 @@ static int init_overdrive_limits_V2_1(struct pp_hwmgr 
>>> *hwmgr,
>>>  static int init_overdrive_limits(struct pp_hwmgr *hwmgr,
>>> const ATOM_PPLIB_POWERPLAYTABLE *powerplay_table)
>>>  {
>>> -   int result;
>>> +   int result = 1;
>>
>> I think this should probably be initialized to 0.
>>
>> Alex
>>
>>> uint8_t frev, crev;
>>> uint16_t size;
>>>
>>> --
>>> 2.14.1
>>>
>>> ___
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/5] drm/amd/powerplay: Minor fixes in processpptables.c

2017-11-21 Thread Alex Deucher
On Tue, Nov 21, 2017 at 10:15 AM, Alex Deucher  wrote:
> On Sun, Nov 19, 2017 at 12:52 PM, Ernst Sjöstrand  wrote:
>> Reported by smatch:
>> init_overdrive_limits() error: uninitialized symbol 'result'.
>> get_clock_voltage_dependency_table() warn: inconsistent indenting
>>
>> Signed-off-by: Ernst Sjöstrand 
>> ---
>>  drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c 
>> b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>> index afae32ee2b0d..7c5b426320f1 100644
>> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>> @@ -394,8 +394,8 @@ static int get_clock_voltage_dependency_table(struct 
>> pp_hwmgr *hwmgr,
>> dep_table->entries[i].clk =
>> ((unsigned long)table->entries[i].ucClockHigh << 16) 
>> |
>> le16_to_cpu(table->entries[i].usClockLow);
>> -   dep_table->entries[i].v =
>> -   (unsigned 
>> long)le16_to_cpu(table->entries[i].usVoltage);
>> +   dep_table->entries[i].v =
>> +   (unsigned 
>> long)le16_to_cpu(table->entries[i].usVoltage);
>> }
>>
>> *ptable = dep_table;
>> @@ -1042,7 +1042,7 @@ static int init_overdrive_limits_V2_1(struct pp_hwmgr 
>> *hwmgr,
>>  static int init_overdrive_limits(struct pp_hwmgr *hwmgr,
>> const ATOM_PPLIB_POWERPLAYTABLE *powerplay_table)
>>  {
>> -   int result;
>> +   int result = 1;
>
> I think this should probably be initialized to 0.

Applied the series with that fixed up locally.

Thanks!

Alex

>
> Alex
>
>> uint8_t frev, crev;
>> uint16_t size;
>>
>> --
>> 2.14.1
>>
>> ___
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/5] drm/amd/powerplay: Minor fixes in processpptables.c

2017-11-21 Thread Alex Deucher
On Sun, Nov 19, 2017 at 12:52 PM, Ernst Sjöstrand  wrote:
> Reported by smatch:
> init_overdrive_limits() error: uninitialized symbol 'result'.
> get_clock_voltage_dependency_table() warn: inconsistent indenting
>
> Signed-off-by: Ernst Sjöstrand 
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> index afae32ee2b0d..7c5b426320f1 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> @@ -394,8 +394,8 @@ static int get_clock_voltage_dependency_table(struct 
> pp_hwmgr *hwmgr,
> dep_table->entries[i].clk =
> ((unsigned long)table->entries[i].ucClockHigh << 16) |
> le16_to_cpu(table->entries[i].usClockLow);
> -   dep_table->entries[i].v =
> -   (unsigned 
> long)le16_to_cpu(table->entries[i].usVoltage);
> +   dep_table->entries[i].v =
> +   (unsigned 
> long)le16_to_cpu(table->entries[i].usVoltage);
> }
>
> *ptable = dep_table;
> @@ -1042,7 +1042,7 @@ static int init_overdrive_limits_V2_1(struct pp_hwmgr 
> *hwmgr,
>  static int init_overdrive_limits(struct pp_hwmgr *hwmgr,
> const ATOM_PPLIB_POWERPLAYTABLE *powerplay_table)
>  {
> -   int result;
> +   int result = 1;

I think this should probably be initialized to 0.

Alex

> uint8_t frev, crev;
> uint16_t size;
>
> --
> 2.14.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2)

2017-11-21 Thread Christian König

Am 21.11.2017 um 11:23 schrieb Frank Min:

1. program vce 4.0 fw with 48 bit address
2. correct vce 4.0 fw stack and data offset

Change-Id: I835f3f52f3b29f996812a3948aabede9f2d9b056
Signed-off-by: Frank Min 
---
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 97 ++-
  1 file changed, 62 insertions(+), 35 deletions(-)
  mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..dc7b615
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,59 +243,86 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
*adev)
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VM_CTRL), 0);
  
  		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {

-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),

adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   
(adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 0xff);
} else {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
adev->vce.gpu_addr >> 8);
-   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
  
  		offset = AMDGPU_VCE_FIRMWARE_OFFSET;

size = VCE_V4_0_FW_SIZE;
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET0),
-   offset & 0x7FFFFFFF);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE0), size);
-
-   offset += size;
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   mmVCE_VCPU_CACHE_OFFSET0),
+   offset & ~0x0f000000);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   mmVCE_VCPU_CACHE_SIZE0), size);
+
+   offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ?
+   offset + size : 0;
size = VCE_V4_0_STACK_SIZE;
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET1),
-   offset & 0x7FFFFFFF);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE1), size);
+   

Re: [PATCH] drm/amd/vce: correct vce fw data and stack size config for sriov

2017-11-21 Thread Alex Deucher
On Tue, Nov 21, 2017 at 3:33 AM, Frank Min  wrote:

Please provide a better patch description.  What was the problem and
how did you fix it?

Alex

> Signed-off-by: Frank Min 
> ---
>  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 30 +-
>  1 file changed, 17 insertions(+), 13 deletions(-)
>  mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> old mode 100644
> new mode 100755
> index 7574554..4a92530
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -243,37 +243,41 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
> *adev)
> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VM_CTRL), 0);
>
> if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> -   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
> -   
> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> -   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
> -   
> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> -   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
> +   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
> 
> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> +   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
> +   
> (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 0xff);
> } else {
> -   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
> +   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
> adev->vce.gpu_addr >> 8);
> -   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
> +   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
> +   (adev->vce.gpu_addr >> 40) & 
> 0xff);
> +   }
> +   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
> adev->vce.gpu_addr >> 8);
> -   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
> +   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
> +   (adev->vce.gpu_addr >> 40) & 
> 0xff);
> +   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
> adev->vce.gpu_addr >> 8);
> -   }
> +   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
> +   (adev->vce.gpu_addr >> 40) & 
> 0xff);
>
> offset = AMDGPU_VCE_FIRMWARE_OFFSET;
> size = VCE_V4_0_FW_SIZE;
> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_OFFSET0),
> -   offset & 0x7FFFFFFF);
> +   offset & ~0x0f000000);
> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_SIZE0), size);
>
> -   offset += size;
> +   offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? 
> offset + size : 0;
> size = VCE_V4_0_STACK_SIZE;
> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_OFFSET1),
> -   offset & 0x7FFFFFFF);
> +   (offset & ~0x0f000000) | (1 << 24));
> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_SIZE1), size);
>
> offset += size;
> size = VCE_V4_0_DATA_SIZE;
> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_OFFSET2),
> -   offset & 0x7FFFFFFF);
> +   (offset & ~0x0f000000) | (2 << 24));
> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_SIZE2), size);
>
> MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 

Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5

2017-11-21 Thread Christian König

Hi Boris,

attached are two patches.

The first one is a trivial fix for the infinite loop issue; it now 
correctly aborts the fixup when it can't find address space for the root 
window.


The second is a workaround for your board. It simply checks if there is 
exactly one Processor Function to apply this fix on.


Both are based on linus current master branch. Please test if they fix 
your issue.


Thanks for the help,
Christian.

Am 20.11.2017 um 17:33 schrieb Boris Ostrovsky:

On 11/20/2017 11:07 AM, Christian König wrote:

Am 20.11.2017 um 16:51 schrieb Boris Ostrovsky:

(and then it breaks differently as a Xen guest --- we hung on the last
pci_read_config_dword(), I haven't looked at this at all yet)

Huh? How does this fix apply to a Xen guest in the first place?

Please provide the output of "lspci -nn" and explain further what is
your config with Xen.




This is dom0.

-bash-4.1# lspci -nn
00:00.0 Host bridge [0600]: ATI Technologies Inc RD890 Northbridge only
dual slot (2x16) PCI-e GFX Hydra part [1002:5a10] (rev 02)
00:00.2 Generic system peripheral [0806]: ATI Technologies Inc Device
[1002:5a23]
00:0d.0 PCI bridge [0604]: ATI Technologies Inc RD890 PCI to PCI bridge
(external gfx1 port B) [1002:5a1e]
00:11.0 SATA controller [0106]: ATI Technologies Inc SB700/SB800 SATA
Controller [AHCI mode] [1002:4391]
00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
OHCI0 Controller [1002:4397]
00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
Controller [1002:4398]
00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
Controller [1002:4396]
00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
OHCI0 Controller [1002:4397]
00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
Controller [1002:4398]
00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
Controller [1002:4396]
00:14.0 SMBus [0c05]: ATI Technologies Inc SBx00 SMBus Controller
[1002:4385] (rev 3d)
00:14.3 ISA bridge [0601]: ATI Technologies Inc SB700/SB800 LPC host
controller [1002:439d]
00:14.4 PCI bridge [0604]: ATI Technologies Inc SBx00 PCI to PCI Bridge
[1002:4384]
00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
OHCI2 Controller [1002:4399]
00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1600]
00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1601]
00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1602]
00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1603]
00:18.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1604]
00:18.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1605]
00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1600]
00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1601]
00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1602]
00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1603]
00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1604]
00:19.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1605]
01:04.0 VGA compatible controller [0300]: Matrox Graphics, Inc. MGA
G200eW WPCM450 [102b:0532] (rev 0a)
02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
Network Connection [8086:10c9] (rev 01)
02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
Network Connection [8086:10c9] (rev 01)
-bash-4.1#


-boris



>From 9b59f5919b31f1a869ef634481331ef325a992a7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= 
Date: Tue, 21 Nov 2017 11:20:00 +0100
Subject: [PATCH 1/2] x86/PCI: fix infinite loop in search for 64bit BAR
 placement
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Break the loop if we can't find some address space for a 64bit BAR.

Signed-off-by: Christian König 
---
 arch/x86/pci/fixup.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 1e996df687a3..5328e86f73eb 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -696,8 +696,13 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
	res->end = 0xfd00000000ull - 1;
 
 	/* Just grab the free area behind system memory for this */
-	while ((conflict = request_resource_conflict(&iomem_resource, res)))
+	while ((conflict = request_resource_conflict(&iomem_resource, res))) {
+		if (conflict->end >= res->end) {
+			kfree(res);
+			return;
+		}
 		res->start = conflict->end + 1;
+	}
 
 	dev_info(&dev->dev, "adding root bus resource %pR\n", res);
 
-- 
2.11.0
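
A toy model of the loop in that patch (plain C with a fabricated busy list;
request_resource_conflict() is reduced to a linear overlap scan): walking
past each conflict only terminates if a free range exists below res->end,
so when a conflict already covers the end of the candidate window the new
check gives up instead of looping forever.

#include <stdio.h>

struct range { unsigned long long start, end; };

/* stand-in for request_resource_conflict(): first overlap, or NULL */
static struct range *find_conflict(struct range *res, struct range *busy, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (busy[i].start <= res->end && busy[i].end >= res->start)
			return &busy[i];
	return NULL;
}

int main(void)
{
	struct range busy[] = { { 0x0ull, 0xfcffffffffull } };	/* fabricated */
	struct range res = { 0x100000000ull, 0xfd00000000ull - 1 };
	struct range *conflict;

	while ((conflict = find_conflict(&res, busy, 1))) {
		if (conflict->end >= res.end) {	/* the added bailout */
			printf("no room for a root window, aborting fixup\n");
			return 0;
		}
		res.start = conflict->end + 1;
	}
	printf("root window: [0x%llx-0x%llx]\n", res.start, res.end);
	return 0;
}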

>From 2dc4461ba8ec1eb54a49e1e166de9a554556e572 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= 
Date: Tue, 21 Nov 2017 11:08:33 +0100
Subject: [PATCH 2/2] x86/PCI: only enable a 64bit BAR on single socket AMD
 Family 15h 

Re: [PATCH 4/4] drm/amdkfd: Add support for user-mode trap handlers

2017-11-21 Thread Oded Gabbay
Hi Felix,
I added all 4 patches to -next.
Oded

On Tue, Nov 14, 2017 at 11:41 PM, Felix Kuehling  wrote:
> A second-level user mode trap handler can be installed. The CWSR trap
> handler jumps to the secondary trap handler conditionally for any
> conditions not handled by it. This can be used e.g. for debugging or
> catching math exceptions.
>
> When CWSR is disabled, the user mode trap handler is installed as
> first level trap handler.
>
> Signed-off-by: Shaoyun.liu 
> Signed-off-by: Jay Cornwall 
> Signed-off-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   | 37 
> +-
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 22 +
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  5 +++
>  include/uapi/linux/kfd_ioctl.h | 12 ++-
>  4 files changed, 74 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 2a4612d..cc61ec2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -432,6 +432,38 @@ static int kfd_ioctl_set_memory_policy(struct file 
> *filep,
> return err;
>  }
>
> +static int kfd_ioctl_set_trap_handler(struct file *filep,
> +   struct kfd_process *p, void *data)
> +{
> +   struct kfd_ioctl_set_trap_handler_args *args = data;
> +   struct kfd_dev *dev;
> +   int err = 0;
> +   struct kfd_process_device *pdd;
> +
> +   dev = kfd_device_by_id(args->gpu_id);
> +   if (dev == NULL)
> +   return -EINVAL;
> +
> +   mutex_lock(&p->mutex);
> +
> +   pdd = kfd_bind_process_to_device(dev, p);
> +   if (IS_ERR(pdd)) {
> +   err = -ESRCH;
> +   goto out;
> +   }
> +
> +   if (dev->dqm->ops.set_trap_handler(dev->dqm,
> +   &pdd->qpd,
> +   args->tba_addr,
> +   args->tma_addr))
> +   err = -EINVAL;
> +
> +out:
> +   mutex_unlock(&p->mutex);
> +
> +   return err;
> +}
> +
>  static int kfd_ioctl_dbg_register(struct file *filep,
> struct kfd_process *p, void *data)
>  {
> @@ -980,7 +1012,10 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = 
> {
> kfd_ioctl_set_scratch_backing_va, 0),
>
> AMDKFD_IOCTL_DEF(AMDKFD_IOC_GET_TILE_CONFIG,
> -   kfd_ioctl_get_tile_config, 0)
> +   kfd_ioctl_get_tile_config, 0),
> +
> +   AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_TRAP_HANDLER,
> +   kfd_ioctl_set_trap_handler, 0),
>  };
>
>  #define AMDKFD_CORE_IOCTL_COUNTARRAY_SIZE(amdkfd_ioctls)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 5c06502..8447810 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1116,6 +1116,26 @@ static bool set_cache_memory_policy(struct 
> device_queue_manager *dqm,
> return retval;
>  }
>
> +static int set_trap_handler(struct device_queue_manager *dqm,
> +   struct qcm_process_device *qpd,
> +   uint64_t tba_addr,
> +   uint64_t tma_addr)
> +{
> +   uint64_t *tma;
> +
> +   if (dqm->dev->cwsr_enabled) {
> +   /* Jump from CWSR trap handler to user trap */
> +   tma = (uint64_t *)(qpd->cwsr_kaddr + KFD_CWSR_TMA_OFFSET);
> +   tma[0] = tba_addr;
> +   tma[1] = tma_addr;
> +   } else {
> +   qpd->tba_addr = tba_addr;
> +   qpd->tma_addr = tma_addr;
> +   }
> +
> +   return 0;
> +}
> +
>  static int process_termination_nocpsch(struct device_queue_manager *dqm,
> struct qcm_process_device *qpd)
>  {
> @@ -1247,6 +1267,7 @@ struct device_queue_manager 
> *device_queue_manager_init(struct kfd_dev *dev)
> dqm->ops.create_kernel_queue = create_kernel_queue_cpsch;
> dqm->ops.destroy_kernel_queue = destroy_kernel_queue_cpsch;
> dqm->ops.set_cache_memory_policy = set_cache_memory_policy;
> +   dqm->ops.set_trap_handler = set_trap_handler;
> dqm->ops.process_termination = process_termination_cpsch;
> break;
> case KFD_SCHED_POLICY_NO_HWS:
> @@ -1262,6 +1283,7 @@ struct device_queue_manager 
> *device_queue_manager_init(struct kfd_dev *dev)
> dqm->ops.initialize = initialize_nocpsch;
> dqm->ops.uninitialize = uninitialize;
> dqm->ops.set_cache_memory_policy = set_cache_memory_policy;
> +   

Re: [PATCH 3/4] drm/amdkfd: Add CWSR support

2017-11-21 Thread Oded Gabbay
Hi Liu,
Thanks for the explanation.
I added the patch.

On Mon, Nov 20, 2017 at 6:30 PM, Liu, Shaoyun  wrote:
> The save/restore memory is allocated per queue in user mode, and the address
> and size are passed in the create_queue ioctl. The memory you noticed being
> allocated in kfd_process_init_cwsr is the memory for the CWSR shader code
> itself. This shader code is executed in the user's address space (from the
> GPU's point of view), similar to signal handler code on the CPU side. The
> CWSR shader code has a fixed size: one page, plus another page for extra CWSR
> usage (e.g. parameters for the second-level handler), so we can easily manage
> it in the kernel. The save/restore data is quite big (around 22M for Vega10),
> so we don't want to allocate it in the kernel.
> Other comments in line .
>
> Regards
> Shaoyun.liu
>
> -Original Message-
> From: Oded Gabbay [mailto:oded.gab...@gmail.com]
> Sent: Sunday, November 19, 2017 8:38 AM
> To: Kuehling, Felix
> Cc: amd-gfx list; Liu, Shaoyun; Zhao, Yong
> Subject: Re: [PATCH 3/4] drm/amdkfd: Add CWSR support
>
> On Tue, Nov 14, 2017 at 11:41 PM, Felix Kuehling  
> wrote:
>> This hardware feature allows the GPU to preempt shader execution in
>> the middle of a compute wave, save the state and restore it later to
>> resume execution.
>>
>> Memory for saving the state is allocated per queue in user mode and
>> the address and size passed to the create_queue ioctl. The size
> Is this a correct description?
> It seems to me the memory is allocated at kfd_process_init_cwsr() and the 
> address is saved internally and not passed in the create_ioctl.
> Which raises the question: why is it not allocated by the user and then
> passed through the create_ioctl function?
>
>
>> depends on the number of waves that can be in flight simultaneously on
>> a given ASIC.
>>
>> Signed-off-by: Shaoyun.liu 
>> Signed-off-by: Yong Zhao 
>> Signed-off-by: Felix Kuehling 
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   |  7 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c| 20 -
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  6 ++
>>  drivers/gpu/drm/amd/amdkfd/kfd_module.c|  4 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c| 27 +++
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 31 +++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c   | 87 
>> +-
>>  include/uapi/linux/kfd_ioctl.h |  3 +-
>>  8 files changed, 179 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> index 505d391..2a4612d 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> @@ -117,7 +117,7 @@ static int kfd_open(struct inode *inode, struct file 
>> *filep)
>> return -EPERM;
>> }
>>
>> -   process = kfd_create_process(current);
>> +   process = kfd_create_process(filep);
>> if (IS_ERR(process))
>> return PTR_ERR(process);
>>
>> @@ -206,6 +206,7 @@ static int set_queue_properties_from_user(struct 
>> queue_properties *q_properties,
>> q_properties->ctx_save_restore_area_address =
>> args->ctx_save_restore_address;
>> q_properties->ctx_save_restore_area_size =
>> args->ctx_save_restore_size;
>> +   q_properties->ctl_stack_size = args->ctl_stack_size;
>> if (args->queue_type == KFD_IOC_QUEUE_TYPE_COMPUTE ||
>> args->queue_type == KFD_IOC_QUEUE_TYPE_COMPUTE_AQL)
>> q_properties->type = KFD_QUEUE_TYPE_COMPUTE; @@
>> -1088,6 +1089,10 @@ static int kfd_mmap(struct file *filp, struct 
>> vm_area_struct *vma)
>> KFD_MMAP_EVENTS_MASK) {
>> vma->vm_pgoff = vma->vm_pgoff ^ KFD_MMAP_EVENTS_MASK;
>> return kfd_event_mmap(process, vma);
>> +   } else if ((vma->vm_pgoff & KFD_MMAP_RESERVED_MEM_MASK) ==
>> +   KFD_MMAP_RESERVED_MEM_MASK) {
>> +   vma->vm_pgoff = vma->vm_pgoff ^ KFD_MMAP_RESERVED_MEM_MASK;
>> +   return kfd_reserved_mem_mmap(process, vma);
>> }
>>
>> return -EFAULT;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 621a3b5..4f05eac 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -27,6 +27,7 @@
>>  #include "kfd_priv.h"
>>  #include "kfd_device_queue_manager.h"
>>  #include "kfd_pm4_headers_vi.h"
>> +#include "cwsr_trap_handler_gfx8.asm"
>>
>>  #define MQD_SIZE_ALIGNED 768
>>
>> @@ -38,7 +39,8 @@ static const struct kfd_device_info kaveri_device_info = {
>> .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> .event_interrupt_class = 

Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation

2017-11-21 Thread Oded Gabbay
Thanks Felix for catching that. For some reason I remembered the EOP
buffer should be the same size as the queue.
Then can we remove the queue size parameter from that function?

On Mon, Nov 20, 2017 at 9:22 PM, Felix Kuehling  wrote:
> I think this patch is not correct. The EOP-mem is not associated with
> the queue size. The EOP buffer is a separate buffer used by the firmware
> to handle command completion. As I understand it, this allows more
> concurrency, while still making it look like all commands in the queue
> are completing in order.
>
> Regards,
>   Felix
>
>
> On 2017-11-19 03:19 AM, Oded Gabbay wrote:
>> On Thu, Nov 16, 2017 at 11:36 PM, Jan Vesely  wrote:
>>> Signed-off-by: Jan Vesely 
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c 
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
>>> index f1d48281e322..b3bee39661ab 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
>>> @@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue *kq, 
>>> struct kfd_dev *dev,
>>> enum kfd_queue_type type, unsigned int queue_size)
>>>  {
>>> int retval;
>>> +   unsigned int size = ALIGN(queue_size, PAGE_SIZE);
>>>
>>> -   retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, >eop_mem);
>>> +   retval = kfd_gtt_sa_allocate(dev, size, >eop_mem);
>>> if (retval != 0)
>>> return false;
>>>
>>> kq->eop_gpu_addr = kq->eop_mem->gpu_addr;
>>> kq->eop_kernel_addr = kq->eop_mem->cpu_ptr;
>>>
>>> -   memset(kq->eop_kernel_addr, 0, PAGE_SIZE);
>>> +   memset(kq->eop_kernel_addr, 0, size);
>>>
>>> return true;
>>>  }
>>> --
>>> 2.13.6
>>>
>>> ___
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>> Thanks!
>> Applied to -next tree
>> Oded
>> ___
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV

2017-11-21 Thread Christian König

Am 21.11.2017 um 10:52 schrieb Frank Min:

1. program vce 4.0 fw with 48 bit address
2. correct vce 4.0 fw stack and data offset

Change-Id: I835f3f52f3b29f996812a3948aabede9f2d9b056
Signed-off-by: Frank Min 
---
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 30 +-
  1 file changed, 17 insertions(+), 13 deletions(-)
  mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..4a92530
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,37 +243,41 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
*adev)
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VM_CTRL), 0);
  
  		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {

-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),

adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   
(adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 0xff);
} else {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
adev->vce.gpu_addr >> 8);
-   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
  
  		offset = AMDGPU_VCE_FIRMWARE_OFFSET;

size = VCE_V4_0_FW_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET0),
-   offset & 0x7FFFFFFF);
+   offset & ~0x0f000000);


That doesn't look like the right indentation to me, and the ~0x0f000000 
value results in mask 0xf0ffffff. Is that intended?


Regards,
Christian.


MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE0), size);
  
-		offset += size;

+   offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? 
offset + size : 0;
size = VCE_V4_0_STACK_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET1),
-   offset & 0x7FFFFFFF);
+   (offset & ~0x0f000000) | (1 << 24));
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE1), size);
  
  		offset += size;

size = VCE_V4_0_DATA_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET2),
-   offset & 0x7FFFFFFF);
+   (offset & ~0x0f000000) | (2 << 24));
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE2), size);
  
  		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_CTRL2), ~0x100, 0);
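
Numerically, the two maskings differ like this (offset value hypothetical):
0x7FFFFFFF keeps bits 0-30, while ~0x0f000000 clears only bits 24-27, the
same field the (1 << 24) and (2 << 24) ORs then write, seemingly as a
region index for the stack and data caches:

#include <stdio.h>

int main(void)
{
	unsigned int offset = 0x00012345;	/* hypothetical firmware offset */

	printf("old mask:   0x%08x\n", offset & 0x7FFFFFFF);
	printf("new, stack: 0x%08x\n", (offset & ~0x0f000000u) | (1u << 24));
	printf("new, data:  0x%08x\n", (offset & ~0x0f000000u) | (2u << 24));
	return 0;
}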




[PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2)

2017-11-21 Thread Frank Min
1. program vce 4.0 fw with 48 bit address
2. correct vce 4.0 fw stack and data offset

Change-Id: I835f3f52f3b29f996812a3948aabede9f2d9b056
Signed-off-by: Frank Min 
---
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 97 ++-
 1 file changed, 62 insertions(+), 35 deletions(-)
 mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..dc7b615
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,59 +243,86 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
*adev)
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VM_CTRL), 0);
 
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),

adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   
(adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 0xff);
} else {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
adev->vce.gpu_addr >> 8);
-   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   
mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
 
offset = AMDGPU_VCE_FIRMWARE_OFFSET;
size = VCE_V4_0_FW_SIZE;
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET0),
-   offset & 0x7FFFFFFF);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE0), size);
-
-   offset += size;
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   mmVCE_VCPU_CACHE_OFFSET0),
+   offset & ~0x0f000000);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   mmVCE_VCPU_CACHE_SIZE0), size);
+
+   offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ?
+   offset + size : 0;
size = VCE_V4_0_STACK_SIZE;
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET1),
-   offset & 0x7FFFFFFF);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE1), size);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+   

Re: [PATCH 1/4] drm/ttm: add page order in page pool

2017-11-21 Thread Christian König

On 21.11.2017 at 10:32, Roger He wrote:

to indicate page order for each element in the pool

Change-Id: Ic609925ca5d2a5d4ad49d6becf505388ce3624cf
Signed-off-by: Roger He 
---
  drivers/gpu/drm/ttm/ttm_page_alloc.c | 33 ++---
  1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 316f831..2b83c52 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -81,6 +81,7 @@ struct ttm_page_pool {
char*name;
unsigned long   nfrees;
unsigned long   nrefills;
+   unsigned intorder;
  };
  
  /**

@@ -412,6 +413,7 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct 
shrink_control *sc)
struct ttm_page_pool *pool;
int shrink_pages = sc->nr_to_scan;
unsigned long freed = 0;
+   unsigned int nr_free_pool;
  
	if (!mutex_trylock(&lock))

return SHRINK_STOP;
@@ -421,10 +423,15 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct 
shrink_control *sc)
unsigned nr_free = shrink_pages;
if (shrink_pages == 0)
break;
+
pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
/* OK to use static buffer since global mutex is held. */
-   shrink_pages = ttm_page_pool_free(pool, nr_free, true);
-   freed += nr_free - shrink_pages;
+   nr_free_pool = (nr_free >> pool->order);
+   if (nr_free_pool == 0)
+   continue;
+
+   shrink_pages = ttm_page_pool_free(pool, nr_free_pool, true);
+   freed += ((nr_free_pool - shrink_pages) << pool->order);
}
	mutex_unlock(&lock);
return freed;
@@ -436,9 +443,12 @@ ttm_pool_shrink_count(struct shrinker *shrink, struct 
shrink_control *sc)
  {
unsigned i;
unsigned long count = 0;
+   struct ttm_page_pool *pool;
  
-	for (i = 0; i < NUM_POOLS; ++i)

-   count += _manager->pools[i].npages;
+   for (i = 0; i < NUM_POOLS; ++i) {
+   pool = &_manager->pools[i];
+   count += (pool->npages << pool->order);
+   }
  
  	return count;

  }
@@ -933,7 +943,7 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
  }
  
  static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,

-   char *name)
+   char *name, unsigned int order)
  {
	spin_lock_init(&pool->lock);
pool->fill_lock = false;
@@ -941,6 +951,7 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool 
*pool, gfp_t flags,
pool->npages = pool->nfrees = 0;
pool->gfp_flags = flags;
pool->name = name;
+   pool->order = order;
  }
  
  int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)

@@ -953,23 +964,23 @@ int ttm_page_alloc_init(struct ttm_mem_global *glob, 
unsigned max_pages)
  
  	_manager = kzalloc(sizeof(*_manager), GFP_KERNEL);
  
-	ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc");

+   ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc", 0);
  
-	ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc");

+   ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc", 0);
  
  	ttm_page_pool_init_locked(&_manager->wc_pool_dma32,

- GFP_USER | GFP_DMA32, "wc dma");
+ GFP_USER | GFP_DMA32, "wc dma", 0);
  
  	ttm_page_pool_init_locked(&_manager->uc_pool_dma32,

- GFP_USER | GFP_DMA32, "uc dma");
+ GFP_USER | GFP_DMA32, "uc dma", 0);
  
  	ttm_page_pool_init_locked(&_manager->wc_pool_huge,

  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP),
- "wc huge");
+ "wc huge", HPAGE_PMD_ORDER);
  
  	ttm_page_pool_init_locked(&_manager->uc_pool_huge,

  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP)
- , "uc huge");
+ , "uc huge", HPAGE_PMD_ORDER);


HPAGE_PMD_ORDER isn't defined when huge page support isn't enabled.
That's why I avoided using it here.

Christian.

  
  	_manager->options.max_size = max_pages;

_manager->options.small = SMALL_ALLOCATION;





Re: [PATCH 3/4] drm/ttm: add page order support in ttm_pages_put

2017-11-21 Thread Christian König

On 21.11.2017 at 10:32, Roger He wrote:

Change-Id: Ia55b206d95812c5afcfd6cec29f580758d1f50f0
Signed-off-by: Roger He 
---
  drivers/gpu/drm/ttm/ttm_page_alloc.c | 42 +---
  1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 27b2402..90546fd 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -285,13 +285,39 @@ static struct ttm_page_pool *ttm_get_pool(int flags, bool 
huge,
  }
  
  /* set memory back to wb and free the pages. */

-static void ttm_pages_put(struct page *pages[], unsigned npages)
+static void ttm_pages_put(struct page *pages[], unsigned npages,
+   unsigned int order)
  {
-   unsigned i;
-   if (set_pages_array_wb(pages, npages))
-   pr_err("Failed to set %d pages to wb!\n", npages);
-   for (i = 0; i < npages; ++i)
-   __free_page(pages[i]);
+   struct page **pages_to_free = NULL;
+   struct page **pages_array = NULL;
+   struct page *p = NULL;
+   unsigned int i, j, pages_nr = 1 << order;
+
+   if (order > 0) {
+   pages_to_free = kmalloc_array(pages_nr, sizeof(struct page *),
+   GFP_KERNEL);


We can't call kmalloc here; when this function is called by the shrinker 
we are tight on memory anyway.


That's also the reason we have this static buffer dance in 
ttm_page_pool_free() as well.


Christian.


+   if (!pages_to_free) {
+   pr_err("Failed to allocate memory for ttm pages put operation\n");
+   return;
+   }
+   }
+
+   for (i = 0; i < npages; ++i) {
+   if (order) {
+   p = pages[i];
+   for (j = 0; j < pages_nr; ++j)
+   pages_to_free[j] = p++;
+
+   pages_array = pages_to_free;
+   } else
+   pages_array = pages;
+
+   if (set_pages_array_wb(pages_array, pages_nr))
+   pr_err("Failed to set %d pages to wb!\n", pages_nr);
+   __free_pages(pages[i], order);
+   }
+
+   kfree(pages_to_free);
  }
  
  static void ttm_pool_update_free_locked(struct ttm_page_pool *pool,

@@ -354,7 +380,7 @@ static int ttm_page_pool_free(struct ttm_page_pool *pool, 
unsigned nr_free,
 */
spin_unlock_irqrestore(&pool->lock, irq_flags);
  
-			ttm_pages_put(pages_to_free, freed_pages);

+   ttm_pages_put(pages_to_free, freed_pages, pool->order);
if (likely(nr_free != FREE_ALL_PAGES))
nr_free -= freed_pages;
  
@@ -389,7 +415,7 @@ static int ttm_page_pool_free(struct ttm_page_pool *pool, unsigned nr_free,

spin_unlock_irqrestore(&pool->lock, irq_flags);
  
  	if (freed_pages)

-   ttm_pages_put(pages_to_free, freed_pages);
+   ttm_pages_put(pages_to_free, freed_pages, pool->order);
  out:
if (pages_to_free != static_buf)
kfree(pages_to_free);
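
The "static buffer dance" Christian refers to can be modeled in isolation
like this (a userspace sketch with illustrative names and sizes, not the
driver code): a static scratch array is safe to reuse because the global
pool mutex serializes callers, so the reclaim path never has to allocate.

#include <stdlib.h>
#include <string.h>

#define SCRATCH_SLOTS 512			/* illustrative batch size */

static void *static_buf[SCRATCH_SLOTS];	/* protected by the global lock */

/* Free up to SCRATCH_SLOTS items per call via the static scratch array,
 * mirroring the pattern in ttm_page_pool_free(). */
static void free_batch(void **items, size_t n)
{
	size_t batch = n < SCRATCH_SLOTS ? n : SCRATCH_SLOTS;

	memcpy(static_buf, items, batch * sizeof(*items));
	for (size_t i = 0; i < batch; i++)
		free(static_buf[i]);
}

int main(void)
{
	void *items[4];

	for (int i = 0; i < 4; i++)
		items[i] = malloc(16);
	free_batch(items, 4);	/* no allocation on this path */
	return 0;
}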





[PATCH 4/4] drm/ttm: free one in huge pool even shrink request less than one element

2017-11-21 Thread Roger He
Change-Id: Id8bd4d1ecff9f3ab14355e2dbd1c59b9fe824e01
Signed-off-by: Roger He 
---
 drivers/gpu/drm/ttm/ttm_page_alloc.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 90546fd..c194a51 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -453,11 +453,13 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct 
shrink_control *sc)
pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
/* OK to use static buffer since global mutex is held. */
nr_free_pool = (nr_free >> pool->order);
-   if (nr_free_pool == 0)
-   continue;
+   if (!nr_free_pool && pool->order)
+   nr_free_pool = 1;
 
shrink_pages = ttm_page_pool_free(pool, nr_free_pool, true);
freed += ((nr_free_pool - shrink_pages) << pool->order);
+   if (freed > sc->nr_to_scan)
+   break;
}
	mutex_unlock(&lock);
return freed;
-- 
2.7.4
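
A worked example of what this fixes, assuming a huge pool of order 9 (512
pages per element): with sc->nr_to_scan = 100, nr_free_pool = 100 >> 9 = 0,
so the previous code skipped the huge pool entirely and it could never be
shrunk. With this patch at least one element (512 pages) is freed, and the
loop breaks once freed exceeds the requested amount.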



Re: [PATCH 2/4] drm/ttm: use NUM_PAGES_TO_ALLOC always

2017-11-21 Thread Christian König

On 21.11.2017 at 10:32, Roger He wrote:

Change-Id: Ide96a1ccad9bb44b0bb0d80e123c2d810ba618ed
Signed-off-by: Roger He 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/ttm/ttm_page_alloc.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 2b83c52..27b2402 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -520,8 +520,7 @@ static int ttm_alloc_new_pages(struct list_head *pages, 
gfp_t gfp_flags,
int r = 0;
unsigned i, j, cpages;
unsigned npages = 1 << order;
-   unsigned max_cpages = min(count,
-   (unsigned)(PAGE_SIZE/sizeof(struct page *)));
+   unsigned max_cpages = min(count, (unsigned)NUM_PAGES_TO_ALLOC);
  
  	/* allocate array for page caching change */

caching_array = kmalloc(max_cpages*sizeof(struct page *), GFP_KERNEL);
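
For context, the macro this cleanup reuses is already defined near the top
of ttm_page_alloc.c (in this era of the file) as

#define NUM_PAGES_TO_ALLOC	(PAGE_SIZE/sizeof(struct page *))

so the open-coded expression and the macro evaluate to the same value and
the change is purely cosmetic.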





[PATCH 3/4] drm/ttm: add page order support in ttm_pages_put

2017-11-21 Thread Roger He
Change-Id: Ia55b206d95812c5afcfd6cec29f580758d1f50f0
Signed-off-by: Roger He 
---
 drivers/gpu/drm/ttm/ttm_page_alloc.c | 42 +---
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 27b2402..90546fd 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -285,13 +285,39 @@ static struct ttm_page_pool *ttm_get_pool(int flags, bool 
huge,
 }
 
 /* set memory back to wb and free the pages. */
-static void ttm_pages_put(struct page *pages[], unsigned npages)
+static void ttm_pages_put(struct page *pages[], unsigned npages,
+   unsigned int order)
 {
-   unsigned i;
-   if (set_pages_array_wb(pages, npages))
-   pr_err("Failed to set %d pages to wb!\n", npages);
-   for (i = 0; i < npages; ++i)
-   __free_page(pages[i]);
+   struct page **pages_to_free = NULL;
+   struct page **pages_array = NULL;
+   struct page *p = NULL;
+   unsigned int i, j, pages_nr = 1 << order;
+
+   if (order > 0) {
+   pages_to_free = kmalloc_array(pages_nr, sizeof(struct page *),
+   GFP_KERNEL);
+   if (!pages_to_free) {
+   pr_err("Failed to allocate memory for ttm pages put operation\n");
+   return;
+   }
+   }
+
+   for (i = 0; i < npages; ++i) {
+   if (order) {
+   p = pages[i];
+   for (j = 0; j < pages_nr; ++j)
+   pages_to_free[j] = p++;
+
+   pages_array = pages_to_free;
+   } else
+   pages_array = pages;
+
+   if (set_pages_array_wb(pages_array, pages_nr))
+   pr_err("Failed to set %d pages to wb!\n", pages_nr);
+   __free_pages(pages[i], order);
+   }
+
+   kfree(pages_to_free);
 }
 
 static void ttm_pool_update_free_locked(struct ttm_page_pool *pool,
@@ -354,7 +380,7 @@ static int ttm_page_pool_free(struct ttm_page_pool *pool, 
unsigned nr_free,
 */
spin_unlock_irqrestore(&pool->lock, irq_flags);
 
-   ttm_pages_put(pages_to_free, freed_pages);
+   ttm_pages_put(pages_to_free, freed_pages, pool->order);
if (likely(nr_free != FREE_ALL_PAGES))
nr_free -= freed_pages;
 
@@ -389,7 +415,7 @@ static int ttm_page_pool_free(struct ttm_page_pool *pool, 
unsigned nr_free,
spin_unlock_irqrestore(&pool->lock, irq_flags);
 
if (freed_pages)
-   ttm_pages_put(pages_to_free, freed_pages);
+   ttm_pages_put(pages_to_free, freed_pages, pool->order);
 out:
if (pages_to_free != static_buf)
kfree(pages_to_free);
-- 
2.7.4



[PATCH 1/4] drm/ttm: add page order in page pool

2017-11-21 Thread Roger He
to indicate page order for each element in the pool

Change-Id: Ic609925ca5d2a5d4ad49d6becf505388ce3624cf
Signed-off-by: Roger He 
---
 drivers/gpu/drm/ttm/ttm_page_alloc.c | 33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 316f831..2b83c52 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -81,6 +81,7 @@ struct ttm_page_pool {
char*name;
unsigned long   nfrees;
unsigned long   nrefills;
+   unsigned intorder;
 };
 
 /**
@@ -412,6 +413,7 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct 
shrink_control *sc)
struct ttm_page_pool *pool;
int shrink_pages = sc->nr_to_scan;
unsigned long freed = 0;
+   unsigned int nr_free_pool;
 
	if (!mutex_trylock(&lock))
return SHRINK_STOP;
@@ -421,10 +423,15 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct 
shrink_control *sc)
unsigned nr_free = shrink_pages;
if (shrink_pages == 0)
break;
+
pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
/* OK to use static buffer since global mutex is held. */
-   shrink_pages = ttm_page_pool_free(pool, nr_free, true);
-   freed += nr_free - shrink_pages;
+   nr_free_pool = (nr_free >> pool->order);
+   if (nr_free_pool == 0)
+   continue;
+
+   shrink_pages = ttm_page_pool_free(pool, nr_free_pool, true);
+   freed += ((nr_free_pool - shrink_pages) << pool->order);
}
	mutex_unlock(&lock);
return freed;
@@ -436,9 +443,12 @@ ttm_pool_shrink_count(struct shrinker *shrink, struct 
shrink_control *sc)
 {
unsigned i;
unsigned long count = 0;
+   struct ttm_page_pool *pool;
 
-   for (i = 0; i < NUM_POOLS; ++i)
-   count += _manager->pools[i].npages;
+   for (i = 0; i < NUM_POOLS; ++i) {
+   pool = &_manager->pools[i];
+   count += (pool->npages << pool->order);
+   }
 
return count;
 }
@@ -933,7 +943,7 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
 }
 
 static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, gfp_t flags,
-   char *name)
+   char *name, unsigned int order)
 {
	spin_lock_init(&pool->lock);
pool->fill_lock = false;
@@ -941,6 +951,7 @@ static void ttm_page_pool_init_locked(struct ttm_page_pool 
*pool, gfp_t flags,
pool->npages = pool->nfrees = 0;
pool->gfp_flags = flags;
pool->name = name;
+   pool->order = order;
 }
 
 int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
@@ -953,23 +964,23 @@ int ttm_page_alloc_init(struct ttm_mem_global *glob, 
unsigned max_pages)
 
_manager = kzalloc(sizeof(*_manager), GFP_KERNEL);
 
-   ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc");
+   ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc", 0);
 
-   ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc");
+   ttm_page_pool_init_locked(&_manager->uc_pool, GFP_HIGHUSER, "uc", 0);
 
ttm_page_pool_init_locked(&_manager->wc_pool_dma32,
- GFP_USER | GFP_DMA32, "wc dma");
+ GFP_USER | GFP_DMA32, "wc dma", 0);
 
ttm_page_pool_init_locked(&_manager->uc_pool_dma32,
- GFP_USER | GFP_DMA32, "uc dma");
+ GFP_USER | GFP_DMA32, "uc dma", 0);
 
ttm_page_pool_init_locked(&_manager->wc_pool_huge,
  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP),
- "wc huge");
+ "wc huge", HPAGE_PMD_ORDER);
 
ttm_page_pool_init_locked(&_manager->uc_pool_huge,
  GFP_TRANSHUGE & ~(__GFP_MOVABLE | __GFP_COMP)
- , "uc huge");
+ , "uc huge", HPAGE_PMD_ORDER);
 
_manager->options.max_size = max_pages;
_manager->options.small = SMALL_ALLOCATION;
-- 
2.7.4
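
The page/element conversion this patch introduces, as a standalone sketch
(order 9 is an assumption matching HPAGE_PMD_ORDER with 4 KiB pages on
x86-64):

#include <stdio.h>

int main(void)
{
	unsigned int order = 9;		/* assumed HPAGE_PMD_ORDER */
	unsigned int nr_free = 1024;	/* page budget from the shrinker */

	unsigned int nr_free_pool = nr_free >> order;	/* 2 pool elements */
	unsigned int freed = nr_free_pool << order;	/* 1024 pages reported */

	printf("%u pages -> %u elements -> %u pages\n",
	       nr_free, nr_free_pool, freed);
	return 0;
}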



[PATCH] drm/amd/vce: correct vce fw data and stack size config for sriov

2017-11-21 Thread Frank Min
Signed-off-by: Frank Min 
---
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)
 mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..4a92530
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,37 +243,41 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
*adev)
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VM_CTRL), 0);
 
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),

adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   
(adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 0xff);
} else {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
adev->vce.gpu_addr >> 8);
-   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
 
offset = AMDGPU_VCE_FIRMWARE_OFFSET;
size = VCE_V4_0_FW_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET0),
-   offset & 0x7FFFFFFF);
+   offset & ~0x0f000000);
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE0), size);
 
-   offset += size;
+   offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? 
offset + size : 0;
size = VCE_V4_0_STACK_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET1),
-   offset & 0x7FFFFFFF);
+   (offset & ~0x0f000000) | (1 << 24));
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE1), size);
 
offset += size;
size = VCE_V4_0_DATA_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET2),
-   offset & 0x7FFFFFFF);
+   (offset & ~0x0f000000) | (2 << 24));
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE2), size);
 
MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_CTRL2), ~0x100, 0);
-- 
1.9.1
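
The 48-bit address split this patch programs, shown standalone (the address
is illustrative; bits 39:8 go to the 40-bit BAR, bits 47:40 to the 64-bit
BAR extension):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t gpu_addr = 0x0000123456789A00ULL;	/* illustrative */

	uint32_t bar40 = (uint32_t)(gpu_addr >> 8);		/* bits 39:8 */
	uint32_t bar64 = (uint32_t)((gpu_addr >> 40) & 0xff);	/* bits 47:40 */

	printf("40bit BAR: 0x%08x, 64bit BAR: 0x%02x\n", bar40, bar64);
	return 0;
}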



RE: [PATCH] drm/amd/vce: correct vce fw data and stack size config for sriov

2017-11-21 Thread Liu, Monk
Forward to Frank who is the author of this patch

-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] 
Sent: 2017-11-21 16:49
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/vce: correct vce fw data and stack size config for 
sriov

On 21.11.2017 at 09:37, Liu, Monk wrote:
> Subject: [PATCH] drm/amd/vce: correct vce fw data and stack size 
> config for sriov
>
> Signed-off-by: Frank Min 

Well first of all please fix the coding style. It looks like some elements of a 
line now start on the next line.

Apart from that just writing 0 into the registers doesn't look even remotely 
correct to me.

Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 30 +-
>   1 file changed, 17 insertions(+), 13 deletions(-)  mode change 
> 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> old mode 100644
> new mode 100755
> index 7574554..4a92530
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -243,37 +243,41 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
> *adev)
>   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VM_CTRL), 0);
>   
>   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
> - 
> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
> - 
> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> +mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
>   
> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
> + 
> (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 
> +0xff);
>   } else {
> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> +mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
>   adev->vce.gpu_addr >> 8);
> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
> + (adev->vce.gpu_addr >> 40) & 
> 0xff);
> + }
> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> +mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
>   adev->vce.gpu_addr >> 8);
> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
> + (adev->vce.gpu_addr >> 40) & 
> 0xff);
> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> +mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
>   adev->vce.gpu_addr >> 8);
> - }
> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
> + (adev->vce.gpu_addr >> 40) & 
> 0xff);
>   
>   offset = AMDGPU_VCE_FIRMWARE_OFFSET;
>   size = VCE_V4_0_FW_SIZE;
>   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_OFFSET0),
> - offset & 0x7FFFFFFF);
> + offset & ~0x0f000000);
>   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_SIZE0), size);
>   
> - offset += size;
> + offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? 
> offset 
> ++ size : 0;
>   size = VCE_V4_0_STACK_SIZE;
>   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_OFFSET1),
> - offset & 0x7FFFFFFF);
> + (offset & ~0x0f000000) | (1 << 24));
>   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_SIZE1), size);
>   
>   offset += size;
>   size = 

Re: [PATCH] drm/amd/vce: correct vce fw data and stack size config for sriov

2017-11-21 Thread Christian König

On 21.11.2017 at 09:37, Liu, Monk wrote:

Subject: [PATCH] drm/amd/vce: correct vce fw data and stack size config for 
sriov

Signed-off-by: Frank Min 


Well first of all please fix the coding style. It looks like some 
elements of a line now start on the next line.


Apart from that just writing 0 into the registers doesn't look even 
remotely correct to me.


Christian.


---
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 30 +-
  1 file changed, 17 insertions(+), 13 deletions(-)  mode change 100644 => 
100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..4a92530
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,37 +243,41 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
*adev)
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VM_CTRL), 0);
  
  		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {

-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),

adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   
(adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) &
+0xff);
} else {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
+mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
adev->vce.gpu_addr >> 8);
-   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
  
  		offset = AMDGPU_VCE_FIRMWARE_OFFSET;

size = VCE_V4_0_FW_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET0),
-   offset & 0x7FFFFFFF);
+   offset & ~0x0f000000);
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE0), size);
  
-		offset += size;

+   offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? 
offset +
+size : 0;
size = VCE_V4_0_STACK_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET1),
-   offset & 0x7FFFFFFF);
+   (offset & ~0x0f000000) | (1 << 24));
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE1), size);
  
  		offset += size;

size = VCE_V4_0_DATA_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET2),
-   offset & 0x7FFFFFFF);
+   (offset & ~0x0f000000) | (2 << 24));
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE2), size);
  
  		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_CTRL2), ~0x100, 0);

--
1.9.1
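
To make the objection concrete, here is the offset sequence the patch
produces in both load paths, with illustrative sizes (the real values are
AMDGPU_VCE_FIRMWARE_OFFSET and the VCE_V4_0_*_SIZE constants):

#include <stdio.h>

int main(void)
{
	const unsigned fw_off = 0x100, fw_size = 0x40000, stack_size = 0x10000;

	for (int psp = 0; psp <= 1; psp++) {
		unsigned off0 = fw_off;
		unsigned off1 = psp ? 0 : off0 + fw_size;  /* the questioned 0 */
		unsigned off2 = off1 + stack_size;

		printf("%s: OFFSET0=%#x OFFSET1=%#x OFFSET2=%#x\n",
		       psp ? "psp" : "direct", off0, off1, off2);
	}
	return 0;
}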


[PATCH] drm/amd/vce: correct vce fw data and stack size config for sriov

2017-11-21 Thread Liu, Monk
Subject: [PATCH] drm/amd/vce: correct vce fw data and stack size config for 
sriov

Signed-off-by: Frank Min 
---
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)  mode change 100644 => 
100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
old mode 100644
new mode 100755
index 7574554..4a92530
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -243,37 +243,41 @@ static int vce_v4_0_sriov_start(struct amdgpu_device 
*adev)
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VM_CTRL), 0);
 
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
-   
adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
+mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),

adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   
(adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 
+0xff);
} else {
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
+mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
+mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
adev->vce.gpu_addr >> 8);
-   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
+mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
adev->vce.gpu_addr >> 8);
-   }
+   MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
+   (adev->vce.gpu_addr >> 40) & 
0xff);
 
offset = AMDGPU_VCE_FIRMWARE_OFFSET;
size = VCE_V4_0_FW_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET0),
-   offset & 0x7FFFFFFF);
+   offset & ~0x0f000000);
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE0), size);
 
-   offset += size;
+   offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? 
offset + 
+size : 0;
size = VCE_V4_0_STACK_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET1),
-   offset & 0x7FFFFFFF);
+   (offset & ~0x0f000000) | (1 << 24));
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE1), size);
 
offset += size;
size = VCE_V4_0_DATA_SIZE;
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_OFFSET2),
-   offset & 0x7FFFFFFF);
+   (offset & ~0x0f000000) | (2 << 24));
MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_VCPU_CACHE_SIZE2), size);
 
MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, 
mmVCE_LMI_CTRL2), ~0x100, 0);
--
1.9.1



Re: [PATCH] drm/amdgpu:partially revert 1cfd8e237f0318e330190ac21d63c58ae6a1f66c

2017-11-21 Thread Christian König

On 21.11.2017 at 06:29, Monk Liu wrote:

Found a RING0 test failure after S3 resume, a regression
introduced by 1cfd8e237f0318e330190ac21d63c58ae6a1f66c.

Because VRAM is cleared across suspend, the driver must unpin
the GART table (resident in VRAM) during suspend so it can be
evicted to system RAM, and must correspondingly pin it during
resume so the GART table is restored to VRAM.

Change-Id: I0c0f9dc89f38a903caf114094433fb00bd6c93fb
Signed-off-by: Monk Liu 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 79 +---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h |  2 +
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c|  7 ++-
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c|  7 ++-
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c|  7 ++-
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c|  4 ++
  6 files changed, 94 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 707f858..1f51897 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -68,9 +68,75 @@
   */
  int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev)
  {
-   return amdgpu_bo_create_kernel(adev, adev->gart.table_size, PAGE_SIZE,
-   AMDGPU_GEM_DOMAIN_VRAM, &adev->gart.robj,
-   &adev->gart.table_addr, &adev->gart.ptr);
+   int r;
+
+   if (adev->gart.robj == NULL) {
+   r = amdgpu_bo_create(adev, adev->gart.table_size,
+PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_VRAM,
+AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
+AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
+NULL, NULL, 0, &adev->gart.robj);
+   if (r) {
+   return r;
+   }
+   }
+   return 0;
+}
+
+/**
+ * amdgpu_gart_table_vram_pin - pin gart page table in vram
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Pin the GART page table in vram so it will not be moved
+ * by the memory manager (pcie r4xx, r5xx+).  These asics require the
+ * gart table to be in video memory.
+ * Returns 0 for success, error for failure.
+ */
+int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev)
+{
+   uint64_t gpu_addr;
+   int r;
+
+   r = amdgpu_bo_reserve(adev->gart.robj, false);
+   if (unlikely(r != 0))
+   return r;
+   r = amdgpu_bo_pin(adev->gart.robj,
+   AMDGPU_GEM_DOMAIN_VRAM, &gpu_addr);
+   if (r) {
+   amdgpu_bo_unreserve(adev->gart.robj);
+   return r;
+   }
+   r = amdgpu_bo_kmap(adev->gart.robj, &adev->gart.ptr);
+   if (r)
+   amdgpu_bo_unpin(adev->gart.robj);
+   amdgpu_bo_unreserve(adev->gart.robj);
+   adev->gart.table_addr = gpu_addr;
+   return r;
+}
+
+/**
+ * amdgpu_gart_table_vram_unpin - unpin gart page table in vram
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Unpin the GART page table in vram (pcie r4xx, r5xx+).
+ * These asics require the gart table to be in video memory.
+ */
+void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev)
+{
+   int r;
+
+   if (adev->gart.robj == NULL) {
+   return;
+   }
+   r = amdgpu_bo_reserve(adev->gart.robj, true);
+   if (likely(r == 0)) {
+   amdgpu_bo_kunmap(adev->gart.robj);
+   amdgpu_bo_unpin(adev->gart.robj);
+   amdgpu_bo_unreserve(adev->gart.robj);
+   adev->gart.ptr = NULL;
+   }
  }
  
  /**

@@ -84,9 +150,10 @@ int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev)
   */
  void amdgpu_gart_table_vram_free(struct amdgpu_device *adev)
  {
-   amdgpu_bo_free_kernel(&adev->gart.robj,
-   &adev->gart.table_addr,
-   &adev->gart.ptr);
+   if (adev->gart.robj == NULL) {
+   return;
+   }
+   amdgpu_bo_unref(&adev->gart.robj);
  }
  
  /*

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
index f15e319..fe2e76b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
@@ -58,6 +58,8 @@ struct amdgpu_gart {
  
  int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);

  void amdgpu_gart_table_vram_free(struct amdgpu_device *adev);
+int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
+void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
  int amdgpu_gart_init(struct amdgpu_device *adev);
  void amdgpu_gart_fini(struct amdgpu_device *adev);
  int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 9c672ec..aea3d1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ 
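
The intended flow of the new helpers across S3, modeled standalone (the
amdgpu internals are stubbed here; the real wiring is in the gmc_v6/7/8/9
suspend and resume paths touched by the patch):

#include <stdio.h>

struct amdgpu_device { int gart_pinned; };	/* stand-in for the real struct */

static int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev)
{
	adev->gart_pinned = 1;	/* table resident in VRAM again */
	return 0;
}

static void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev)
{
	adev->gart_pinned = 0;	/* TTM may now evict it to system RAM */
}

int main(void)
{
	struct amdgpu_device adev = { 0 };

	amdgpu_gart_table_vram_pin(&adev);	/* hw init / resume */
	amdgpu_gart_table_vram_unpin(&adev);	/* suspend: VRAM will be lost */
	amdgpu_gart_table_vram_pin(&adev);	/* resume: restore table to VRAM */
	printf("pinned=%d\n", adev.gart_pinned);
	return 0;
}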

Re: [PATCH] drm/radeon: fix possible memory leak in radeon_bo_create

2017-11-21 Thread Christian König

On 21.11.2017 at 05:36, Alex Deucher wrote:

if ttm_bo_init fails, don't leak the bo object.


NAK, see ttm_bo_init_reserved once more.

When the function fails, it calls the destroy callback, which releases the 
memory.


Christian.



Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/radeon/radeon_object.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
b/drivers/gpu/drm/radeon/radeon_object.c
index 093594976126..53c5bb6c25e4 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -262,6 +262,8 @@ int radeon_bo_create(struct radeon_device *rdev,
acc_size, sg, resv, &radeon_ttm_bo_destroy);
up_read(&rdev->pm.mclk_lock);
if (unlikely(r != 0)) {
+   drm_gem_object_release(>gem_base);
+   kfree(bo);
return r;
}
*bo_ptr = bo;
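
The ownership rule Christian describes, as a runnable toy model (names are
illustrative): init() invokes the destroy callback itself on failure, so a
caller that frees again would double-free.

#include <stdio.h>
#include <stdlib.h>

struct obj { int dummy; };

static void destroy(struct obj *o) { free(o); }

static int init(struct obj *o, int fail, void (*dtor)(struct obj *))
{
	if (fail) {
		dtor(o);	/* object is gone after this point */
		return -1;
	}
	return 0;
}

int main(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (init(o, 1, destroy))
		return 1;	/* no free(o) here: that would double-free */
	free(o);
	return 0;
}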


