[pull] ttm, amdgpu drm-next-5.1

2019-01-31 Thread Alex Deucher
Hi Dave, Daniel,

More stuff for 5.1.  Mostly bug fixes.
ttm:
- Replace ref/unref naming with get/put

amdgpu:
- Revert DC clang fix, causes a segfault with some compiler versions
- SR-IOV fix
- PCIE fix for vega20
- Misc DC fixes

The following changes since commit 10117450735c7a7c0858095fb46a860e7037cb9a:

  drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines (2019-01-25 16:15:37 -0500)

are available in the Git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-next-5.1

for you to fetch changes up to 47dd8048a1bf5b2fb96e5abe99b4f1dcd208ea4d:

  drm/amdgpu: Show XGMI node and hive message per device only once (2019-01-29 15:16:18 -0500)


Alex Deucher (1):
  Revert "drm/amd/display: add -msse2 to prevent Clang from emitting 
libcalls to undefined SW FP routines"

Eric Huang (1):
  drm/amd/powerplay: add override pcie parameters for Vega20

Fatemeh Darbehani (1):
  drm/amd/display: Add Vline1 interrupt source to InterruptManager

Martin Tsai (1):
  drm/amd/display: Poll pending down rep before clear payload allocation table

Nicholas Kazlauskas (3):
  drm/amd/display: Enable vblank interrupt during CRC capture
  drm/amd/display: Re-enable CRC capture following modeset
  drm/amd/display: Don't leak memory when updating streams

Thomas Zimmermann (7):
  drm/ast: Replace ttm_bo_unref with ttm_bo_put
  drm/nouveau: Replace ttm_bo_reference with ttm_bo_get
  drm/nouveau: Replace ttm_bo_unref with ttm_bo_put
  drm/vmwgfx: Replace ttm_bo_reference with ttm_bo_get
  drm/vmwgfx: Replace ttm_bo_unref with ttm_bo_put
  drm/mgag200: Replace ttm_bo_unref with ttm_bo_put
  drm/ttm: Remove ttm_bo_reference and ttm_bo_unref

shaoyunl (1):
  drm/amdgpu: Show XGMI node and hive message per device only once

wentalou (2):
  drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START
  drm/amdgpu: sriov restrict max_pfn below AMDGPU_GMC_HOLE

 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   |  7 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  6 +-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 29 -
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c  | 48 ---
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c  |  7 +++
 drivers/gpu/drm/amd/display/dc/calcs/Makefile  |  2 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c   | 10 +--
 drivers/gpu/drm/amd/display/dc/core/dc_link.c  |  5 ++
 drivers/gpu/drm/amd/display/dc/dc_stream.h | 14 -
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c  | 72 +-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h  | 12 +++-
 drivers/gpu/drm/amd/display/dc/dm_helpers.h|  7 +++
 drivers/gpu/drm/amd/display/dc/dml/Makefile|  2 +-
 .../drm/amd/display/dc/inc/hw/timing_generator.h   | 13 +++-
 drivers/gpu/drm/amd/display/dc/irq_types.h |  8 +++
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 46 ++
 drivers/gpu/drm/ast/ast_main.c |  6 +-
 drivers/gpu/drm/mgag200/mgag200_main.c |  8 +--
 drivers/gpu/drm/nouveau/nouveau_bo.h   | 12 ++--
 drivers/gpu/drm/nouveau/nouveau_gem.c  |  3 +-
 drivers/gpu/drm/ttm/ttm_bo.c   |  9 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 11 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c | 11 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h|  9 +--
 drivers/gpu/drm/vmwgfx/vmwgfx_mob.c| 21 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c   |  9 ++-
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.c |  6 +-
 include/drm/ttm/ttm_bo_api.h   | 28 -
 28 files changed, 239 insertions(+), 182 deletions(-)
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: amd-staging-drm-next: Oops - BUG: unable to handle kernel NULL pointer dereference, bisected.

2019-01-31 Thread Przemek Socha
On Thursday, 31 January 2019 at 17:56:32 CET, you wrote:

> In my experience only the last chunk of the patch is necessary.  Can you 
> try this without:
> 
> 
>  >> + vm->bulk_moveable = false;
> 
> 
> Too?
> 
> Thanks,
> Tom

Sure.

I have applied only the last chunk of the patch on top of today's amd-staging-drm-next pull:

> >> @@ -2772,6 +2773,9 @@  void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
> >> 
> >>struct amdgpu_vm_bo_base **base;
> >>
> >>
> >>
> >>if (bo) {
> >> 
> >> +  if (bo->tbo.resv == vm->root.base.bo->tbo.resv)
> >> +  vm->bulk_moveable = false;
> >> +
> >> 
> >>		for (base = &bo_va->base.bo->vm_bo; *base;
> >>
> >> base = &(*base)->next) {
> >>
> >>			if (*base != &bo_va->base)

and it seems to be working as expected also. 

Thanks,
Przemek.

signature.asc
Description: This is a digitally signed message part.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: amd-staging-drm-next: Oops - BUG: unable to handle kernel NULL pointer dereference, bisected.

2019-01-31 Thread Przemek Socha
On Wednesday, 30 January 2019 at 13:42:33 CET, you wrote:
> Does the attached patch fix the issue?
> 
> Christian.

I have tested this one also - "drm/amdgpu: partial revert cleanup setting 
bulk_movable v2"

>We still need to set bulk_movable to false when new BOs are added or removed.
>
>v2: also set it to false on removal
>
>Signed-off-by: Christian König 
>---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 
> 1 file changed, 4 insertions(+)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/
>amdgpu/amdgpu_vm.c
>index 79f9dde70bc0..822546a149fa 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>@@ -332,6 +332,7 @@  static void amdgpu_vm_bo_base_init(struct 
>amdgpu_vm_bo_base *base,
>   if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
>   return;
> 
>+  vm->bulk_moveable = false;
>   if (bo->tbo.type == ttm_bo_type_kernel)
>   amdgpu_vm_bo_relocated(base);
>   else
>@@ -2772,6 +2773,9 @@  void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
>   struct amdgpu_vm_bo_base **base;
> 
>   if (bo) {
>+  if (bo->tbo.resv == vm->root.base.bo->tbo.resv)
>+  vm->bulk_moveable = false;
>+
>	for (base = &bo_va->base.bo->vm_bo; *base;
>	     base = &(*base)->next) {
>		if (*base != &bo_va->base)

and so far I have no lockup and Oops, so I think this one is ok.

Thank you very much,
Przemek.

signature.asc
Description: This is a digitally signed message part.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[pull] radeon, amdgpu drm-fixes-5.0

2019-01-31 Thread Alex Deucher
Hi Dave, Daniel,

Sorry this is a bit late.  I had Internet issues yesterday.
A few fixes for 5.0:
- Fix radeon crash on SI with VM passthrough
- Fencing fix for shared buffers
- Fix power hwmon reporting on APUs
- Powerplay fix for APUs

The following changes since commit f0e7ce1eef5854584dfb59b3824a67edee37580f:

  Merge tag 'drm-msm-fixes-2019-01-24' of 
git://people.freedesktop.org/~robclark/linux into drm-fixes (2019-01-25 
07:45:00 +1000)

are available in the Git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-fixes-5.0

for you to fetch changes up to 6e11ea9de9576a644045ffdc2067c09bc2012eda:

  drm/amdgpu: Transfer fences to dmabuf importer (2019-01-30 12:52:44 -0500)


Alex Deucher (2):
  drm/amdgpu: Add missing power attribute to APU check
  drm/radeon: check if device is root before getting pci speed caps

Chris Wilson (1):
  drm/amdgpu: Transfer fences to dmabuf importer

Gustavo A. R. Silva (1):
  drm/amd/powerplay: Fix missing break in switch

 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c|  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 59 ---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c |  1 +
 drivers/gpu/drm/radeon/ci_dpm.c   |  5 +-
 drivers/gpu/drm/radeon/si_dpm.c   |  5 +-
 5 files changed, 60 insertions(+), 13 deletions(-)
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells mask on SOC15

2019-01-31 Thread Zhao, Yong
Reserved doorbells for SDMA, IH and VCN were not properly masked out
when allocating doorbells for CP user queues. This patch fixes that.
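
As a rough sketch of what the new range check amounts to (the helper name
here is made up; reserved_doorbells_start/end are the fields this patch adds
to kgd2kfd_shared_resources, with both ends inside the reserved window):

static bool doorbell_id_is_reserved(struct kfd_dev *dev, unsigned int id)
{
	/* SDMA, IH and VCN doorbells live in one contiguous window; CP user
	 * queues must only be handed doorbell slots outside of it.
	 */
	return id >= dev->shared_resources.reserved_doorbells_start &&
	       id <= dev->shared_resources.reserved_doorbells_end;
}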

Change-Id: I670adfc3fd7725d2ed0bd9665cb7f69f8b9023c2
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 17 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h  |  4 +++
 drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c  |  3 +++
 drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c  |  3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  9 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 25 +--
 .../gpu/drm/amd/include/kgd_kfd_interface.h   | 19 +-
 7 files changed, 56 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index e957e42c539a..13710f34191a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -196,11 +196,20 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
gpu_resources.sdma_doorbell[1][i+1] =
adev->doorbell_index.sdma_engine[1] + 0x200 + 
(i >> 1);
}
-   /* Doorbells 0x0e0-0ff and 0x2e0-2ff are reserved for
-* SDMA, IH and VCN. So don't use them for the CP.
+
+   /* Because of the setting in registers like
+* SDMA0_DOORBELL_RANGE etc., BIF statically uses the
+* lower 12 bits of doorbell address for routing, in
+* order to route the CP queue doorbells to CP engine,
+* the doorbells allocated to CP queues have to be
+* outside the range set for SDMA, VCN, and IH blocks
+* Prior to SOC15, all queues use queue ID to
+* determine doorbells.
 */
-   gpu_resources.reserved_doorbell_mask = 0x1e0;
-   gpu_resources.reserved_doorbell_val  = 0x0e0;
+   gpu_resources.reserved_doorbells_start =
+   adev->doorbell_index.sdma_engine[0];
+   gpu_resources.reserved_doorbells_end =
+   adev->doorbell_index.last_non_cp;
 
	kgd2kfd_device_init(adev->kfd.dev, &gpu_resources);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
index 1cfec06f81d4..4de431f7f380 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
@@ -71,6 +71,7 @@ struct amdgpu_doorbell_index {
uint32_t vce_ring6_7;
} uvd_vce;
};
+   uint32_t last_non_cp;
uint32_t max_assignment;
/* Per engine SDMA doorbell size in dword */
uint32_t sdma_doorbell_range;
@@ -143,6 +144,7 @@ typedef enum _AMDGPU_VEGA20_DOORBELL_ASSIGNMENT
AMDGPU_VEGA20_DOORBELL64_VCE_RING2_3 = 0x18D,
AMDGPU_VEGA20_DOORBELL64_VCE_RING4_5 = 0x18E,
AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7 = 0x18F,
+   AMDGPU_VEGA20_DOORBELL64_LAST_NON_CP = 
AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7,
AMDGPU_VEGA20_DOORBELL_MAX_ASSIGNMENT= 0x18F,
AMDGPU_VEGA20_DOORBELL_INVALID   = 0x
 } AMDGPU_VEGA20_DOORBELL_ASSIGNMENT;
@@ -222,6 +224,8 @@ typedef enum _AMDGPU_DOORBELL64_ASSIGNMENT
AMDGPU_DOORBELL64_VCE_RING4_5 = 0xFE,
AMDGPU_DOORBELL64_VCE_RING6_7 = 0xFF,
 
+   AMDGPU_DOORBELL64_LAST_NON_CP = 
AMDGPU_DOORBELL64_VCE_RING6_7,
+
AMDGPU_DOORBELL64_MAX_ASSIGNMENT  = 0xFF,
AMDGPU_DOORBELL64_INVALID = 0x
 } AMDGPU_DOORBELL64_ASSIGNMENT;
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c 
b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
index 4b5d60ea3e78..fa0433199215 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
@@ -81,6 +81,9 @@ void vega10_doorbell_index_init(struct amdgpu_device *adev)
adev->doorbell_index.uvd_vce.vce_ring2_3 = 
AMDGPU_DOORBELL64_VCE_RING2_3;
adev->doorbell_index.uvd_vce.vce_ring4_5 = 
AMDGPU_DOORBELL64_VCE_RING4_5;
adev->doorbell_index.uvd_vce.vce_ring6_7 = 
AMDGPU_DOORBELL64_VCE_RING6_7;
+
+   adev->doorbell_index.last_non_cp = AMDGPU_DOORBELL64_LAST_NON_CP;
+
/* In unit of dword doorbell */
adev->doorbell_index.max_assignment = AMDGPU_DOORBELL64_MAX_ASSIGNMENT 
<< 1;
adev->doorbell_index.sdma_doorbell_range = 4;
diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c 
b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
index 53716c593b2b..b1052caaff5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
@@ -85,6 +85,9 @@ void vega20_doorbell_index_init(struct amdgpu_device *adev)

[PATCH 3/3] drm/amdkfd: Optimize out sdma doorbell array in kgd2kfd_shared_resources

2019-01-31 Thread Zhao, Yong
We can directly calculate the sdma doorbell index in the process doorbell
pages through the doorbell_index structure in amdgpu_device, so no need
to cache them in kgd2kfd_shared_resources any more, resulting in more
portable code. See the sketch of the calculation below.
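
As a sketch, the direct calculation replacing the cached table can be derived
from the sdma_doorbell[engine][i] assignments removed below: queues of a pair
share a base slot and odd queues sit 0x200 slots higher (the helper name and
exact signature are illustrative, not the final KFD code):

static u32 sdma_queue_doorbell_index(struct kgd2kfd_shared_resources *res,
				     unsigned int engine,
				     unsigned int queue_id)
{
	/* Equivalent to the removed sdma_doorbell[engine][i] entries:
	 * base + (i >> 1) for even i, base + 0x200 + (i >> 1) for odd i.
	 */
	return res->sdma_doorbell_idx[engine] + (queue_id >> 1) +
	       (queue_id & 1) * 0x200;
}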

Change-Id: Ic657799856ed0256f36b01e502ef0cab263b1f49
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 55 ++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 18 --
 .../gpu/drm/amd/include/kgd_kfd_interface.h   |  2 +-
 3 files changed, 31 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 13710f34191a..f050adc3f5da 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -131,7 +131,7 @@ static void amdgpu_doorbell_get_kfd_info(struct 
amdgpu_device *adev,
 
 void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 {
-   int i, n;
+   int i;
int last_valid_bit;
 
if (adev->kfd.dev) {
@@ -142,7 +142,9 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
.gpuvm_size = min(adev->vm_manager.max_pfn
  << AMDGPU_GPU_PAGE_SHIFT,
  AMDGPU_GMC_HOLE_START),
-   .drm_render_minor = adev->ddev->render->index
+   .drm_render_minor = adev->ddev->render->index,
+   .sdma_doorbell_idx = adev->doorbell_index.sdma_engine,
+
};
 
/* this is going to have a few of the MSBs set that we need to
@@ -172,45 +174,22 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
			&gpu_resources.doorbell_aperture_size,
			&gpu_resources.doorbell_start_offset);
 
-   if (adev->asic_type < CHIP_VEGA10) {
-		kgd2kfd_device_init(adev->kfd.dev, &gpu_resources);
-   return;
-   }
-
-   n = (adev->asic_type < CHIP_VEGA20) ? 2 : 8;
-
-   for (i = 0; i < n; i += 2) {
-   /* On SOC15 the BIF is involved in routing
-* doorbells using the low 12 bits of the
-* address. Communicate the assignments to
-* KFD. KFD uses two doorbell pages per
-* process in case of 64-bit doorbells so we
-* can use each doorbell assignment twice.
+   if (adev->asic_type >= CHIP_VEGA10) {
+   /* Because of the setting in registers like
+* SDMA0_DOORBELL_RANGE etc., BIF statically uses the
+* lower 12 bits of doorbell address for routing, in
+* order to route the CP queue doorbells to CP engine,
+* the doorbells allocated to CP queues have to be
+* outside the range set for SDMA, VCN, and IH blocks
+* Prior to SOC15, all queues use queue ID to
+* determine doorbells.
 */
-   gpu_resources.sdma_doorbell[0][i] =
-   adev->doorbell_index.sdma_engine[0] + (i >> 1);
-   gpu_resources.sdma_doorbell[0][i+1] =
-   adev->doorbell_index.sdma_engine[0] + 0x200 + 
(i >> 1);
-   gpu_resources.sdma_doorbell[1][i] =
-   adev->doorbell_index.sdma_engine[1] + (i >> 1);
-   gpu_resources.sdma_doorbell[1][i+1] =
-   adev->doorbell_index.sdma_engine[1] + 0x200 + 
(i >> 1);
+   gpu_resources.reserved_doorbells_start =
+   adev->doorbell_index.sdma_engine[0];
+   gpu_resources.reserved_doorbells_end =
+   adev->doorbell_index.last_non_cp;
}
 
-   /* Because of the setting in registers like
-* SDMA0_DOORBELL_RANGE etc., BIF statically uses the
-* lower 12 bits of doorbell address for routing, in
-* order to route the CP queue doorbells to CP engine,
-* the doorbells allocated to CP queues have to be
-* outside the range set for SDMA, VCN, and IH blocks
-* Prior to SOC15, all queues use queue ID to
-* determine doorbells.
-*/
-   gpu_resources.reserved_doorbells_start =
-   adev->doorbell_index.sdma_engine[0];
-   gpu_resources.reserved_doorbells_end =
-   adev->doorbell_index.last_non_cp;
-
	kgd2kfd_device_init(adev->kfd.dev, &gpu_resources);
}
 }
diff --git 

[PATCH 1/3] drm/amdkfd: Move a constant definition around

2019-01-31 Thread Zhao, Yong
Similar definitions should be kept consecutive.

Change-Id: I936cf076363e641c60e0704d8405ae9493718e18
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 12b66330fc6d..e5ebcca7f031 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -97,17 +97,18 @@
 #define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
 #define KFD_CWSR_TMA_OFFSET PAGE_SIZE
 
+#define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE   \
+   (KFD_MAX_NUM_OF_PROCESSES * \
+   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
+
+#define KFD_KERNEL_QUEUE_SIZE 2048
+
 /*
  * Kernel module parameter to specify maximum number of supported queues per
  * device
  */
 extern int max_num_of_queues_per_device;
 
-#define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE   \
-   (KFD_MAX_NUM_OF_PROCESSES * \
-   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
-
-#define KFD_KERNEL_QUEUE_SIZE 2048
 
 /* Kernel module parameter to specify the scheduling policy */
 extern int sched_policy;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: cleanup amdgpu_ih_process a bit more

2019-01-31 Thread Kuehling, Felix
On 2019-01-24 7:52 a.m., Christian König wrote:
> Remove the callback and call the dispatcher directly.
>
> Signed-off-by: Christian König 

This patch is Reviewed-by: Felix Kuehling 


> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c  |  6 ++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h  |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 48 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h |  2 +-
>   4 files changed, 21 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> index d0a5db777b6d..1c50be3ab8a9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> @@ -140,9 +140,7 @@ void amdgpu_ih_ring_fini(struct amdgpu_device *adev, 
> struct amdgpu_ih_ring *ih)
>* Interrupt hander (VI), walk the IH ring.
>* Returns irq process return code.
>*/
> -int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
> -   void (*callback)(struct amdgpu_device *adev,
> -struct amdgpu_ih_ring *ih))
> +int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih)
>   {
>   u32 wptr;
>   
> @@ -162,7 +160,7 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
> amdgpu_ih_ring *ih,
>   rmb();
>   
>   while (ih->rptr != wptr) {
> - callback(adev, ih);
> + amdgpu_irq_dispatch(adev, ih);
>   ih->rptr &= ih->ptr_mask;
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> index 1ccb1831382a..113a1ba13d4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> @@ -69,8 +69,6 @@ struct amdgpu_ih_funcs {
>   int amdgpu_ih_ring_init(struct amdgpu_device *adev, struct amdgpu_ih_ring 
> *ih,
>   unsigned ring_size, bool use_bus_addr);
>   void amdgpu_ih_ring_fini(struct amdgpu_device *adev, struct amdgpu_ih_ring 
> *ih);
> -int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
> -   void (*callback)(struct amdgpu_device *adev,
> -struct amdgpu_ih_ring *ih));
> +int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih);
>   
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index 8bfb3dab46f7..af4c3b1af322 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -130,29 +130,6 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev)
>   	spin_unlock_irqrestore(&adev->irq.lock, irqflags);
>   }
>   
> -/**
> - * amdgpu_irq_callback - callback from the IH ring
> - *
> - * @adev: amdgpu device pointer
> - * @ih: amdgpu ih ring
> - *
> - * Callback from IH ring processing to handle the entry at the current 
> position
> - * and advance the read pointer.
> - */
> -static void amdgpu_irq_callback(struct amdgpu_device *adev,
> - struct amdgpu_ih_ring *ih)
> -{
> - u32 ring_index = ih->rptr >> 2;
> - struct amdgpu_iv_entry entry;
> -
> -	entry.iv_entry = (const uint32_t *)&ih->ring[ring_index];
> -	amdgpu_ih_decode_iv(adev, &entry);
> -
> -	trace_amdgpu_iv(ih - &adev->irq.ih, &entry);
> -
> -	amdgpu_irq_dispatch(adev, &entry);
> -}
> -
>   /**
>* amdgpu_irq_handler - IRQ handler
>*
> @@ -170,7 +147,7 @@ irqreturn_t amdgpu_irq_handler(int irq, void *arg)
>   struct amdgpu_device *adev = dev->dev_private;
>   irqreturn_t ret;
>   
> -	ret = amdgpu_ih_process(adev, &adev->irq.ih, amdgpu_irq_callback);
> +	ret = amdgpu_ih_process(adev, &adev->irq.ih);
>   if (ret == IRQ_HANDLED)
>   pm_runtime_mark_last_busy(dev->dev);
>   return ret;
> @@ -188,7 +165,7 @@ static void amdgpu_irq_handle_ih1(struct work_struct 
> *work)
>   struct amdgpu_device *adev = container_of(work, struct amdgpu_device,
> irq.ih1_work);
>   
> -	amdgpu_ih_process(adev, &adev->irq.ih1, amdgpu_irq_callback);
> +	amdgpu_ih_process(adev, &adev->irq.ih1);
>   }
>   
>   /**
> @@ -203,7 +180,7 @@ static void amdgpu_irq_handle_ih2(struct work_struct 
> *work)
>   struct amdgpu_device *adev = container_of(work, struct amdgpu_device,
> irq.ih2_work);
>   
> -	amdgpu_ih_process(adev, &adev->irq.ih2, amdgpu_irq_callback);
> +	amdgpu_ih_process(adev, &adev->irq.ih2);
>   }
>   
>   /**
> @@ -394,14 +371,23 @@ int amdgpu_irq_add_id(struct amdgpu_device *adev,
>* Dispatches IRQ to IP blocks.
>*/
>   void amdgpu_irq_dispatch(struct amdgpu_device *adev,
> -  struct amdgpu_iv_entry *entry)
> +  struct amdgpu_ih_ring *ih)
>   {
> - unsigned client_id = entry->client_id;
> - unsigned src_id = entry->src_id;
> + u32 ring_index = ih->rptr >> 2;
> + struct 

Re: [PATCH 2/2] drm/amdgpu: use ring buffer for fault handling on GMC 9

2019-01-31 Thread Kuehling, Felix
I don't see how this solves anything. You'll still need to remove 
entries from this fault ring buffer. Otherwise you'll ignore faults for 
addresses that have faulted recently, although it may be a new fault.

Also, this reintroduces interrupt storms and repeated processing of the 
same fault in cases where we have more than 64 distinct addresses 
faulting at once.

I think the solution should be different. We do need a way to ignore 
duplicate faults for the same address. The problem you ran into was that 
after clearing an entry, you get additional interrupts for the same 
address. This could be handled by keeping track of the age of IH ring 
entries. After we update the page table and flush TLBs, there may still 
be stale entries in the IH ring. But no new interrupts should be 
generated (until that PTE is invalidated again). So we can check the 
current IH wptr. When IH processing (rptr) has caught up to that value, 
all the stale retry interrupts have been flushed, and we can remove the 
entry from the hash table (or ring buffer). Any subsequent interrupts 
for the same address are new and should be handled again.
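
A minimal sketch of that scheme (the entry container and names are
illustrative, not actual amdgpu structures; ring wrap-around handling is
omitted): stamp each tracked fault with the IH wptr seen when it was handled,
and drop it once the ring's rptr has passed that stamp.

struct fault_filter_entry {
	u64 addr;	/* faulting address from the IV entry */
	u32 wptr_stamp;	/* IH wptr sampled when the fault was first handled */
};

static bool fault_filter_entry_expired(const struct amdgpu_ih_ring *ih,
				       const struct fault_filter_entry *e)
{
	/* Once processing (rptr) has caught up to the stamped wptr, every
	 * retry interrupt already queued for this address has been consumed,
	 * so the entry can be dropped and a later fault on the same address
	 * is treated as new.
	 */
	return (s32)(ih->rptr - e->wptr_stamp) >= 0;
}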

Regards,
   Felix

On 2019-01-24 7:52 a.m., Christian König wrote:
> Further testing showed that the idea with the chash doesn't work as expected.
> In particular, we can't predict when we can remove the entries from the hash
> again.
>
> So replace the chash with a simple ring buffer for now to filter out the
> already handled faults. As long as we see less than 64 distinct addresses we
> still report them only once to the following handling.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/Kconfig   |   2 -
>   drivers/gpu/drm/Makefile  |   1 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h   |   8 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 105 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  14 -
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  40 +-
>   drivers/gpu/drm/amd/include/linux/chash.h | 366 -
>   drivers/gpu/drm/amd/lib/Kconfig   |  28 -
>   drivers/gpu/drm/amd/lib/Makefile  |  32 --
>   drivers/gpu/drm/amd/lib/chash.c   | 638 --
>   10 files changed, 16 insertions(+), 1218 deletions(-)
>   delete mode 100644 drivers/gpu/drm/amd/include/linux/chash.h
>   delete mode 100644 drivers/gpu/drm/amd/lib/Kconfig
>   delete mode 100644 drivers/gpu/drm/amd/lib/Makefile
>   delete mode 100644 drivers/gpu/drm/amd/lib/chash.c
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 4385f00e1d05..98b033e675db 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -229,8 +229,6 @@ config DRM_AMDGPU
>   
>   source "drivers/gpu/drm/amd/amdgpu/Kconfig"
>   
> -source "drivers/gpu/drm/amd/lib/Kconfig"
> -
>   source "drivers/gpu/drm/nouveau/Kconfig"
>   
>   source "drivers/gpu/drm/i915/Kconfig"
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 7f3be3506057..ac6213575691 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -56,7 +56,6 @@ obj-$(CONFIG_DRM_TTM)   += ttm/
>   obj-$(CONFIG_DRM_SCHED) += scheduler/
>   obj-$(CONFIG_DRM_TDFX)  += tdfx/
>   obj-$(CONFIG_DRM_R128)  += r128/
> -obj-y				+= amd/lib/
>   obj-$(CONFIG_HSA_AMD) += amd/amdkfd/
>   obj-$(CONFIG_DRM_RADEON)+= radeon/
>   obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> index 81e6070d255b..83c415eb40f4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> @@ -43,6 +43,11 @@
>*/
>   #define AMDGPU_GMC_HOLE_MASK0xULL
>   
> +/*
> + * Number of entries for the log of recent VM faults
> + */
> +#define AMDGPU_GMC_NUM_VM_FAULTS 64
> +
>   struct firmware;
>   
>   /*
> @@ -147,6 +152,9 @@ struct amdgpu_gmc {
>   struct kfd_vm_fault_info *vm_fault_info;
>   atomic_tvm_fault_info_updated;
>   
> + uint64_tvm_faults[AMDGPU_GMC_NUM_VM_FAULTS];
> + int vm_fault_idx;
> +
>   const struct amdgpu_gmc_funcs   *gmc_funcs;
>   
>   struct amdgpu_xgmi xgmi;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 0bc6f553dc08..3041d72c5478 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2972,22 +2972,6 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, 
> uint32_t min_vm_size,
>adev->vm_manager.fragment_size);
>   }
>   
> -static struct amdgpu_retryfault_hashtable *init_fault_hash(void)
> -{
> - struct amdgpu_retryfault_hashtable *fault_hash;
> -
> - fault_hash = kmalloc(sizeof(*fault_hash), GFP_KERNEL);
> - if (!fault_hash)
> - return fault_hash;
> -
> - INIT_CHASH_TABLE(fault_hash->hash,
> -

Re: [PATCH] drm/amdkfd: Fix if preprocessor statement above kfd_fill_iolink_info_for_cpu

2019-01-31 Thread Kuehling, Felix
Thank you, Nathan. I applied your patch to amd-staging-drm-next.

Sorry for the late response. I'm catching up with my email backlog after 
a vacation.

Regards,
   Felix

On 2019-01-21 6:52 p.m., Nathan Chancellor wrote:
> Clang warns:
>
> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:866:5: warning:
> 'CONFIG_X86_64' is not defined, evaluates to 0 [-Wundef]
> #if CONFIG_X86_64
>  ^
> 1 warning generated.
>
> Fixes: d1c234e2cd10 ("drm/amdkfd: Allow building KFD on ARM64 (v2)")
> Signed-off-by: Nathan Chancellor 
> ---
>
> Resending as I forgot to add the lists...
>
>   drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> index 5d85ff341385..2e7c44955f43 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> @@ -863,7 +863,7 @@ static int kfd_fill_mem_info_for_cpu(int numa_node_id, 
> int *avail_size,
>   return 0;
>   }
>   
> -#if CONFIG_X86_64
> +#ifdef CONFIG_X86_64
>   static int kfd_fill_iolink_info_for_cpu(int numa_node_id, int *avail_size,
>   uint32_t *num_entries,
>   struct crat_subtype_iolink *sub_type_hdr)
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amd/display: Attach VRR properties for eDP connectors

2019-01-31 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx  On Behalf Of
> Nicholas Kazlauskas
> Sent: Thursday, January 31, 2019 1:58 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Li, Sun peng (Leo) ; Wentland, Harry
> ; Kazlauskas, Nicholas
> 
> Subject: [PATCH] drm/amd/display: Attach VRR properties for eDP
> connectors
> 
> [Why]
> eDP was missing in the checks for supported VRR connectors.
> 
> [How]
> Attach the properties for eDP connectors too.
> 
> Cc: Leo Li 
> Cc: Harry Wentland 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202449
> Signed-off-by: Nicholas Kazlauskas 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index cdda68aba70e..4c7c34cae882 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -4249,7 +4249,8 @@ void amdgpu_dm_connector_init_helper(struct
> amdgpu_display_manager *dm,
>   }
> 
>   if (connector_type == DRM_MODE_CONNECTOR_HDMIA ||
> - connector_type == DRM_MODE_CONNECTOR_DisplayPort) {
> + connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
> + connector_type == DRM_MODE_CONNECTOR_eDP) {
>   drm_connector_attach_vrr_capable_property(
>			&aconnector->base);
>		drm_object_attach_property(&aconnector->base.base,
> --
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/display: Attach VRR properties for eDP connectors

2019-01-31 Thread Nicholas Kazlauskas
[Why]
eDP was missing in the checks for supported VRR connectors.

[How]
Attach the properties for eDP connectors too.

Cc: Leo Li 
Cc: Harry Wentland 
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202449
Signed-off-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index cdda68aba70e..4c7c34cae882 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4249,7 +4249,8 @@ void amdgpu_dm_connector_init_helper(struct 
amdgpu_display_manager *dm,
}
 
if (connector_type == DRM_MODE_CONNECTOR_HDMIA ||
-   connector_type == DRM_MODE_CONNECTOR_DisplayPort) {
+   connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
+   connector_type == DRM_MODE_CONNECTOR_eDP) {
drm_connector_attach_vrr_capable_property(
			&aconnector->base);
		drm_object_attach_property(&aconnector->base.base,
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Bug#921004: downgrade to firmware-amd-graphics_20180825-1_all.deb

2019-01-31 Thread Michel Dänzer
On 2019-01-31 6:01 p.m., Deucher, Alexander wrote:
> Is there a bug report?

Yeah, 921...@bugs.debian.org => https://bugs.debian.org/921004 . Doesn't
have any more information currently though, we'd have to get in touch
with the reporter.


> I haven't heard of any other issues and these updates have been upstream
> in linux-firmware for over a month now.

It entered Debian sid about two weeks ago, so might have only entered
Debian testing this week.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: Bug#921004: downgrade to firmware-amd-graphics_20180825-1_all.deb

2019-01-31 Thread Deucher, Alexander
Is there a bug report?  I haven't heard of any other issues and these updates
have been upstream in linux-firmware for over a month now.

Alex

> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Thursday, January 31, 2019 11:03 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: Fwd: Bug#921004: downgrade to firmware-amd-graphics_20180825-
> 1_all.deb
> 
> 
> Bad news I'm afraid, looks like the latest firmware (based on linux-firmware
> commit bc656509a3cfb60fcdfc905d7e23c18873e4e7b9 from
> 2019-01-14) broke some RX 580 cards.
> 
> 
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: amd-staging-drm-next: Oops - BUG: unable to handle kernel NULL pointer dereference, bisected.

2019-01-31 Thread StDenis, Tom
On 2019-01-31 4:23 a.m., Przemek Socha wrote:
> On Wednesday, 30 January 2019 at 13:42:33 CET, you wrote:
>> Does the attached patch fix the issue?
>>
>> Christian.
> 
> I have tested this one also - "drm/amdgpu: partial revert cleanup setting
> bulk_movable v2"
> 
>> We still need to set bulk_movable to false when new BOs are added or removed.
>>
>> v2: also set it to false on removal
>>
>> Signed-off-by: Christian König 
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/
>> amdgpu/amdgpu_vm.c
>> index 79f9dde70bc0..822546a149fa 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -332,6 +332,7 @@  static void amdgpu_vm_bo_base_init(struct
>> amdgpu_vm_bo_base *base,
>>  if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
>>  return;
>>
>> +vm->bulk_moveable = false;
>>  if (bo->tbo.type == ttm_bo_type_kernel)
>>  amdgpu_vm_bo_relocated(base);
>>  else
>> @@ -2772,6 +2773,9 @@  void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
>>  struct amdgpu_vm_bo_base **base;
>>
>>  if (bo) {
>> +if (bo->tbo.resv == vm->root.base.bo->tbo.resv)
>> +vm->bulk_moveable = false;
>> +
>>		for (base = &bo_va->base.bo->vm_bo; *base;
>>   base = &(*base)->next) {
>>			if (*base != &bo_va->base)
> 
> and so far I have no lockup and Oops, so I think this one is ok.

In my experience only the last chunk of the patch is necessary.  Can you 
try this without:

 >> +   vm->bulk_moveable = false;

Too?

Thanks,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Fwd: Bug#921004: downgrade to firmware-amd-graphics_20180825-1_all.deb

2019-01-31 Thread Michel Dänzer

Bad news I'm afraid, looks like the latest firmware (based on
linux-firmware commit bc656509a3cfb60fcdfc905d7e23c18873e4e7b9 from
2019-01-14) broke some RX 580 cards.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
--- Begin Message ---

Package: xserver-xorg-video-amdgpu

Version: 18.1.0-1

Severity: critical

Debian: Stretch amd64

Regression: Yes

Graphic card: AMD Radeon Rx 580


Good Morning,

since the latest update of xserver-xorg-video-amdgpu and 
firmware-amd-graphics, most GL applications do not display anything and 
display this error message in the console:


amdgpu: The CS has been cancelled because the context is lost.

This makes Debian unusable for developing and running GL programs.

--
Jean-Dominique Frattini

--- End Message ---
--- Begin Message ---

Good Morning,

downgrading to firmware-amd-graphics_20180825-1_all.deb seems to fix the 
issue.


Please let me know if I should report this bug directly to 
firmware-amd-graphics.


--- End Message ---
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: use spin_lock_irqsave to protect vm_manager.pasid_idr

2019-01-31 Thread Christian König

On 31.01.19 at 15:25, Yang, Philip wrote:

amdgpu_vm_get_task_info is called from the interrupt handler and the sched
timeout workqueue, so the irq-safe spin_lock variant is needed to avoid deadlock.

Change-Id: Ifedd4b97535bf0b5d3936edd2d9688957020efd4


Good catch, but your Signed-off-by line is missing.

With that fixed the patch is Reviewed-by: Christian König 



Regards,
Christian.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 8f394a20a9eb..bfeb9007e100 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3386,14 +3386,15 @@ void amdgpu_vm_get_task_info(struct amdgpu_device 
*adev, unsigned int pasid,
 struct amdgpu_task_info *task_info)
  {
struct amdgpu_vm *vm;
+   unsigned long flags;
  
-	spin_lock(&adev->vm_manager.pasid_lock);
+	spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
 
 	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
 	if (vm)
 		*task_info = vm->task_info;
 
-	spin_unlock(&adev->vm_manager.pasid_lock);
+	spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
  }
  
  /**


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: use spin_lock_irqsave to protect vm_manager.pasid_idr

2019-01-31 Thread Yang, Philip
amdgpu_vm_get_task_info is called from the interrupt handler and the sched
timeout workqueue, so the irq-safe spin_lock variant is needed to avoid deadlock.

Change-Id: Ifedd4b97535bf0b5d3936edd2d9688957020efd4
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 8f394a20a9eb..bfeb9007e100 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3386,14 +3386,15 @@ void amdgpu_vm_get_task_info(struct amdgpu_device 
*adev, unsigned int pasid,
 struct amdgpu_task_info *task_info)
 {
struct amdgpu_vm *vm;
+   unsigned long flags;
 
-	spin_lock(&adev->vm_manager.pasid_lock);
+	spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
 
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	if (vm)
		*task_info = vm->task_info;
 
-	spin_unlock(&adev->vm_manager.pasid_lock);
+	spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
 }
 
 /**
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Yet another RX Vega hang with another kernel panic signature. WARNING: inconsistent lock state

2019-01-31 Thread Yang, Philip
I found the same issue while debugging; I will submit a patch to fix it shortly.

Philip

On 2019-01-30 10:35 p.m., Mikhail Gavrilov wrote:
> Hi folks.
> Yet another kernel panic happens while GPU again is hang:
> 
> [ 1469.906798] 
> [ 1469.906799] WARNING: inconsistent lock state
> [ 1469.906801] 5.0.0-0.rc4.git2.2.fc30.x86_64 #1 Tainted: G C
> [ 1469.906802] 
> [ 1469.906804] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> [ 1469.906806] kworker/12:3/681 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [ 1469.906807] d591b82b
> (&(>vm_manager.pasid_lock)->rlock){?...}, at:
> amdgpu_vm_get_task_info+0x23/0x80 [amdgpu]
> [ 1469.906851] {IN-HARDIRQ-W} state was registered at:
> [ 1469.906855]   _raw_spin_lock+0x31/0x80
> [ 1469.906893]   amdgpu_vm_get_task_info+0x23/0x80 [amdgpu]
> [ 1469.906936]   gmc_v9_0_process_interrupt+0x198/0x2b0 [amdgpu]
> [ 1469.906978]   amdgpu_irq_dispatch+0x90/0x1f0 [amdgpu]
> [ 1469.907018]   amdgpu_irq_callback+0x4a/0x70 [amdgpu]
> [ 1469.907061]   amdgpu_ih_process+0x89/0x100 [amdgpu]
> [ 1469.907103]   amdgpu_irq_handler+0x22/0x50 [amdgpu]
> [ 1469.907106]   __handle_irq_event_percpu+0x3f/0x290
> [ 1469.907108]   handle_irq_event_percpu+0x31/0x80
> [ 1469.907109]   handle_irq_event+0x34/0x51
> [ 1469.907111]   handle_edge_irq+0x7c/0x1a0
> [ 1469.907114]   handle_irq+0xbf/0x100
> [ 1469.907116]   do_IRQ+0x61/0x120
> [ 1469.907118]   ret_from_intr+0x0/0x22
> [ 1469.907121]   cpuidle_enter_state+0xbf/0x470
> [ 1469.907123]   do_idle+0x1ec/0x280
> [ 1469.907125]   cpu_startup_entry+0x19/0x20
> [ 1469.907127]   start_secondary+0x1b3/0x200
> [ 1469.907129]   secondary_startup_64+0xa4/0xb0
> [ 1469.907131] irq event stamp: 5546749
> [ 1469.907133] hardirqs last  enabled at (5546749):
> [] ktime_get+0xfa/0x130
> [ 1469.907135] hardirqs last disabled at (5546748):
> [] ktime_get+0x2b/0x130
> [ 1469.907137] softirqs last  enabled at (5498318):
> [] __do_softirq+0x35f/0x46a
> [ 1469.907140] softirqs last disabled at (5497393):
> [] irq_exit+0x119/0x120
> [ 1469.907141]
> other info that might help us debug this:
> [ 1469.907142]  Possible unsafe locking scenario:
> 
> [ 1469.907143]CPU0
> [ 1469.907144]
> [ 1469.907144]   lock(&(&adev->vm_manager.pasid_lock)->rlock);
> [ 1469.907146]   
> [ 1469.907147] lock(&(&adev->vm_manager.pasid_lock)->rlock);
> [ 1469.907148]
>  *** DEADLOCK ***
> 
> [ 1469.907150] 2 locks held by kworker/12:3/681:
> [ 1469.907152]  #0: 953235a7 ((wq_completion)"events"){+.+.},
> at: process_one_work+0x1e9/0x5d0
> [ 1469.907157]  #1: 71a3d218
> ((work_completion)(&(&sched->work_tdr)->work)){+.+.}, at:
> process_one_work+0x1e9/0x5d0
> [ 1469.907160]
> stack backtrace:
> [ 1469.907163] CPU: 12 PID: 681 Comm: kworker/12:3 Tainted: G
> C5.0.0-0.rc4.git2.2.fc30.x86_64 #1
> [ 1469.907165] Hardware name: System manufacturer System Product
> Name/ROG STRIX X470-I GAMING, BIOS 1103 11/16/2018
> [ 1469.907169] Workqueue: events drm_sched_job_timedout [gpu_sched]
> [ 1469.907171] Call Trace:
> [ 1469.907176]  dump_stack+0x85/0xc0
> [ 1469.907180]  print_usage_bug.cold+0x1ae/0x1e8
> [ 1469.907183]  ? print_shortest_lock_dependencies+0x40/0x40
> [ 1469.907185]  mark_lock+0x50a/0x600
> [ 1469.907186]  ? print_shortest_lock_dependencies+0x40/0x40
> [ 1469.907189]  __lock_acquire+0x544/0x1660
> [ 1469.907191]  ? mark_held_locks+0x57/0x80
> [ 1469.907193]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> [ 1469.907195]  ? lockdep_hardirqs_on+0xed/0x180
> [ 1469.907197]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> [ 1469.907200]  ? retint_kernel+0x10/0x10
> [ 1469.907202]  lock_acquire+0xa2/0x1b0
> [ 1469.907242]  ? amdgpu_vm_get_task_info+0x23/0x80 [amdgpu]
> [ 1469.907245]  _raw_spin_lock+0x31/0x80
> [ 1469.907283]  ? amdgpu_vm_get_task_info+0x23/0x80 [amdgpu]
> [ 1469.907323]  amdgpu_vm_get_task_info+0x23/0x80 [amdgpu]
> [ 1469.907324] [ cut here ]
> 
> 
> My kernel commit is: 62967898789d
> 
> 
> 
> --
> Best Regards,
> Mike Gavrilov.
> 
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: partial revert cleanup setting bulk_movable v2

2019-01-31 Thread Zhou, David(ChunMing)
If Tom tests it OK as well, feel free to add my RB and submit it ASAP.

-David

> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian K?nig
> Sent: Thursday, January 31, 2019 3:57 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: partial revert cleanup setting bulk_movable
> v2
> 
> We still need to set bulk_movable to false when new BOs are added or
> removed.
> 
> v2: also set it to false on removal
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 79f9dde70bc0..822546a149fa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -332,6 +332,7 @@ static void amdgpu_vm_bo_base_init(struct
> amdgpu_vm_bo_base *base,
>   if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
>   return;
> 
> + vm->bulk_moveable = false;
>   if (bo->tbo.type == ttm_bo_type_kernel)
>   amdgpu_vm_bo_relocated(base);
>   else
> @@ -2772,6 +2773,9 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device
> *adev,
>   struct amdgpu_vm_bo_base **base;
> 
>   if (bo) {
> + if (bo->tbo.resv == vm->root.base.bo->tbo.resv)
> + vm->bulk_moveable = false;
> +
>   for (base = _va->base.bo->vm_bo; *base;
>base = &(*base)->next) {
>   if (*base != _va->base)
> --
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm: prefix header search paths with $(srctree)/

2019-01-31 Thread Masahiro Yamada
Currently, the Kbuild core manipulates header search paths in a crazy
way [1].

To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
the search paths in the srctree. Some Makefiles are already written in
that way, but not all. The goal of this work is to make the notation
consistent, and finally get rid of the gross hacks.

Having whitespace after -I does not matter since commit 48f6e3cf5bc6
("kbuild: do not drop -I without parameter").

[1]: https://patchwork.kernel.org/patch/9632347/

Signed-off-by: Masahiro Yamada 
---

I put all gpu/drm changes into a single patch because
they are trivial conversions.

Please let me know if I need to split this into per-driver patches.


 drivers/gpu/drm/amd/amdgpu/Makefile | 2 +-
 drivers/gpu/drm/amd/lib/Makefile| 2 +-
 drivers/gpu/drm/i915/gvt/Makefile   | 2 +-
 drivers/gpu/drm/msm/Makefile| 6 +++---
 drivers/gpu/drm/nouveau/Kbuild  | 8 
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index f76bcb9..b21ebb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -23,7 +23,7 @@
 # Makefile for the drm device driver.  This driver provides support for the
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
-FULL_AMD_PATH=$(src)/..
+FULL_AMD_PATH=$(srctree)/$(src)/..
 DISPLAY_FOLDER_NAME=display
 FULL_AMD_DISPLAY_PATH = $(FULL_AMD_PATH)/$(DISPLAY_FOLDER_NAME)
 
diff --git a/drivers/gpu/drm/amd/lib/Makefile b/drivers/gpu/drm/amd/lib/Makefile
index 6902430..d534992 100644
--- a/drivers/gpu/drm/amd/lib/Makefile
+++ b/drivers/gpu/drm/amd/lib/Makefile
@@ -27,6 +27,6 @@
 # driver components or later moved to kernel/lib for sharing with
 # other drivers.
 
-ccflags-y := -I$(src)/../include
+ccflags-y := -I $(srctree)/$(src)/../include
 
 obj-$(CONFIG_CHASH) += chash.o
diff --git a/drivers/gpu/drm/i915/gvt/Makefile 
b/drivers/gpu/drm/i915/gvt/Makefile
index b016dc7..a4a5a96 100644
--- a/drivers/gpu/drm/i915/gvt/Makefile
+++ b/drivers/gpu/drm/i915/gvt/Makefile
@@ -5,6 +5,6 @@ GVT_SOURCE := gvt.o aperture_gm.o handlers.o vgpu.o 
trace_points.o firmware.o \
execlist.o scheduler.o sched_policy.o mmio_context.o cmd_parser.o 
debugfs.o \
fb_decoder.o dmabuf.o page_track.o
 
-ccflags-y  += -I$(src) -I$(src)/$(GVT_DIR)
+ccflags-y  += -I $(srctree)/$(src) -I $(srctree)/$(src)/$(GVT_DIR)/
 i915-y += $(addprefix $(GVT_DIR)/, $(GVT_SOURCE))
 obj-$(CONFIG_DRM_I915_GVT_KVMGT)   += $(GVT_DIR)/kvmgt.o
diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 56a70c7..b7b1ebd 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
-ccflags-y := -Idrivers/gpu/drm/msm
-ccflags-y += -Idrivers/gpu/drm/msm/disp/dpu1
-ccflags-$(CONFIG_DRM_MSM_DSI) += -Idrivers/gpu/drm/msm/dsi
+ccflags-y := -I $(srctree)/$(src)
+ccflags-y += -I $(srctree)/$(src)/disp/dpu1
+ccflags-$(CONFIG_DRM_MSM_DSI) += -I $(srctree)/$(src)/dsi
 
 msm-y := \
adreno/adreno_device.o \
diff --git a/drivers/gpu/drm/nouveau/Kbuild b/drivers/gpu/drm/nouveau/Kbuild
index b17843d..b4bc88ad 100644
--- a/drivers/gpu/drm/nouveau/Kbuild
+++ b/drivers/gpu/drm/nouveau/Kbuild
@@ -1,7 +1,7 @@
-ccflags-y += -I$(src)/include
-ccflags-y += -I$(src)/include/nvkm
-ccflags-y += -I$(src)/nvkm
-ccflags-y += -I$(src)
+ccflags-y += -I $(srctree)/$(src)/include
+ccflags-y += -I $(srctree)/$(src)/include/nvkm
+ccflags-y += -I $(srctree)/$(src)/nvkm
+ccflags-y += -I $(srctree)/$(src)
 
 # NVKM - HW resource manager
 #- code also used by various userspace tools/tests
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx