Re: GPU hang trying to run OpenCL kernels on x86_64

2018-05-04 Thread Luís Mendes
Hi Slava,

The two x86_64 systems I tried are:
- System One
  Tyan S7025 with dual Xeon X5675 and 48GB registered ECC memory, with
an NVIDIA GTX 1050Ti 4GB (also used for display) and an AMD RX 550 4GB
  Running standard Ubuntu 16.04.4 with kernels
linux-image-4.13.0-38-generic and linux-image-4.4.0-122-generic,
mesa-17.2.8-0ubuntu0, libdrm-2.4.83-1
  and amdgpu-pro 17.50/amdgpu-pro 18.10
  lsb_release -a
  Description: Ubuntu 16.04.4 LTS

  BIOS configuration:
  ACPI enabled v3.0
  ACPI APIC support Enabled
  ACPI SRAT table Enabled
  SR-IOV Enabled
  Intel VT-d Disabled
  PCI MMIO 64 Bits support Disabled


- System Two
  Tyan S7002 with dual Xeon X5670 and 12GB registered ECC memory, with
an AMD RX 480
  Running Ubuntu 18.04 with kernels vanilla 4.16.7 and
linux-image-4.15.0-20-generic, mesa-18.0.0~rc5-1ubuntu1,
libdrm-2.4.91-2
  and mesa-opencl-icd, libclc-0.2.0+git20180312-1

  BIOS configuration:
  ACPI enabled v2.0
  ACPI APIC support Enabled
  ACPI SRAT table Enabled
  SR-IOV Enabled
  Intel VT-d Disabled
  PCI MMIO 64 Bits support Disabled

  amdgpu-pro-install --headless --opencl=legacy



When I try to run the attached OpenCL code (which computes the
cross-correlation between two square matrices directly from the
cross-correlation definition), the GPU hangs; there are also other
kernels where this happens.
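
For reference, a minimal hand-written kernel of the same shape (a
simplified sketch assuming square n x n inputs and a (2n-1) x (2n-1)
output; this is NOT the attached aparapi-generated code) looks like:

// Hypothetical minimal reproducer: direct 2D cross-correlation by
// definition. f and g are n x n row-major matrices; out is
// (2n-1) x (2n-1). One work-item per output displacement (di, dj),
// launched with a global size of (2n-1, 2n-1).
__kernel void cross_correlate(__global const float *f,
                              __global const float *g,
                              __global float *out,
                              const int n)
{
    const int di = (int)get_global_id(0) - (n - 1);
    const int dj = (int)get_global_id(1) - (n - 1);
    float sum = 0.0f;

    // Sum f[i][j] * g[i+di][j+dj] over the overlapping region only.
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            const int gi = i + di;
            const int gj = j + dj;
            if (gi >= 0 && gi < n && gj >= 0 && gj < n)
                sum += f[i * n + j] * g[gi * n + gj];
        }
    }
    out[(di + n - 1) * (2 * n - 1) + (dj + n - 1)] = sum;
}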

As soon as I try to run the kernel, the system hangs at the first
kernel computation on both systems, and after a couple of seconds
dmesg shows:
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last
signaled seq=2, last emitted seq=3
[drm] IP block:gmc_v8_0 is hung!
[drm] IP block:tonga_ih is hung!
[drm] IP block:gfx_v8_0 is hung!
[drm] IP block:sdma_v3_0 is hung!
[drm] IP block:uvd_v6_0 is hung!
[drm] IP block:vce_v3_0 is hung!
[drm] GPU recovery disabled.

- On another system with 32-bit armhf, 1GB RAM, a 512GB SSD, and an
AMD RX 480 or AMD RX 550
  with Ubuntu 17.10, vanilla kernel 4.16.7, mesa-18.0.2,
libdrm-2.4.92-git, libclc-git at commit
3d994f2ff2cbb4531223fe2657144cb19f0c5328 (15/Nov/2017)

  The kernels work properly on the same AMD cards.

On Fri, May 4, 2018 at 7:18 PM, Abramov, Slava  wrote:
> Luis,
>
>
> Can you please provide more details on your system environment and steps for
> configuring the software and reproducing the issue?
>
>
>
> Slava A
>
> 
> From: amd-gfx  on behalf of Luís
> Mendes 
> Sent: Friday, May 4, 2018 12:27:47 PM
> To: amd-gfx list; Koenig, Christian; Michel Dänzer
> Subject: GPU hang trying to run OpenCL kernels on x86_64
>
> Hi,
>
> I am a collaborator on the Syncleus/aparapi project on GitHub and I've
> been testing OpenCL on AMD and NVIDIA cards.
>
> Currently I have a set of kernels that hang the GPU (AMD RX 460 and
> AMD RX 550) across all compute units on x86_64 running vanilla kernel
> 4.16.7 on Ubuntu 18.04; Ubuntu 16.04.4 with AMDGPU-PRO 17.50 and
> 18.10 shows the same problems, and in fact AMDGPU-PRO 18.10 is even
> worse.
>
> However, the same set of kernels runs happily on armhf with vanilla
> Linux 4.16.7 and mesa 18.0 (mesa-opencl-icd and libclc for amdgcn),
> Ubuntu 17.10, on an AMD RX 460 and an AMD RX 550.
>
> Luís Mendes
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
May 04, 2018 10:37:41 PM com.aparapi.internal.kernel.KernelRunner 
executeInternalInner
INFO: typedef struct This_s{
   __global int *tilesGeometry;
   __global int *inputGeometry;
   __global int *threadOutputStart;
   __global int *outputGeometry;
   __global int *threadOffsetI;
   __global int *threadOffsetJ;
   __global float *matrixInF;
   __global float *matrixInG;
   __global float *matrixOut;
   int passid;
}This;
int get_pass_id(This *this){
   return this->passid;
}
short pt_ist_ceris_vipivist_jobs_CrossCorrelationKernel__signX(This *this, 
short x){
   short value = (short)((x + x) + 1);
   return((short)(value / abs(value)));
}
short pt_ist_ceris_vipivist_jobs_CrossCorrelationKernel__relocateX(This *this, 
short x, short dimX){
   short result = 
(short)(((pt_ist_ceris_vipivist_jobs_CrossCorrelationKernel__signX(this, x) + 
1) * (x + 1)) / 2);
   result = 
(short)(((pt_ist_ceris_vipivist_jobs_CrossCorrelationKernel__signX(this, 
(short)(dimX - result)) + 1) * result) / 2);
   return(result);
}
__kernel void run(
   __global int *tilesGeometry, 
   __global int *inputGeometry, 
   __global int *threadOutputStart, 
   __global int *outputGeometry, 
   __global int *threadOffsetI, 
   __global int *threadOffsetJ, 
   __global float *matrixInF, 
   __global float *matrixInG, 
   __global float *matrixOut, 
   int passid
){
   This thisStruct;
   This* this=&thisStruct;
   this->tilesGeometry = tilesGeometry;
   this->inputGeometry = inputGeometry;
   this->threadOutputStart = threadOutputStart;
   this->outputGeometry = outputGeometry;

Re: GPU hang trying to run OpenCL kernels on x86_64

2018-05-04 Thread Abramov, Slava
Luis,


Can you please provide more details on your system environment and steps for
configuring the software and reproducing the issue?



Slava A


From: amd-gfx  on behalf of Luís Mendes 

Sent: Friday, May 4, 2018 12:27:47 PM
To: amd-gfx list; Koenig, Christian; Michel Dänzer
Subject: GPU hang trying to run OpenCL kernels on x86_64

Hi,

I am a collaborator on the Syncleus/aparapi project on GitHub and I've
been testing OpenCL on AMD and NVIDIA cards.

Currently I have a set of kernels that hang the GPU (AMD RX 460 and
AMD RX 550) across all compute units on x86_64 running vanilla kernel
4.16.7 on Ubuntu 18.04; Ubuntu 16.04.4 with AMDGPU-PRO 17.50 and
18.10 shows the same problems, and in fact AMDGPU-PRO 18.10 is even
worse.

However, the same set of kernels runs happily on armhf with vanilla
Linux 4.16.7 and mesa 18.0 (mesa-opencl-icd and libclc for amdgcn),
Ubuntu 17.10, on an AMD RX 460 and an AMD RX 550.

Luís Mendes
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


GPU hang trying to run OpenCL kernels on x86_64

2018-05-04 Thread Luís Mendes
Hi,

I am a collaborator on the Syncleus/aparapi project on GitHub and I've
been testing OpenCL on AMD and NVIDIA cards.

Currently I have a set of kernels that hang the GPU (AMD RX 460 and
AMD RX 550) across all compute units on x86_64 running vanilla kernel
4.16.7 on Ubuntu 18.04; Ubuntu 16.04.4 with AMDGPU-PRO 17.50 and
18.10 shows the same problems, and in fact AMDGPU-PRO 18.10 is even
worse.

However, the same set of kernels runs happily on armhf with vanilla
Linux 4.16.7 and mesa 18.0 (mesa-opencl-icd and libclc for amdgcn),
Ubuntu 17.10, on an AMD RX 460 and an AMD RX 550.

Luís Mendes
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: amdgpu 0000:01:00.0: IH ring buffer overflow (0x00000010, 0x00000000, 0x00000020)

2018-05-04 Thread Christian König

On 27.04.2018 at 14:02, Paul Menzel wrote:

Dear Linux AMD folks,


I get the overrun message below.


Sorry for the delayed response, I was on vacation and then on sick leave :(




$ more /proc/version
Linux version 4.14.30.mx64.211 (r...@holidayincambodia.molgen.mpg.de) 
(gcc version 7.3.0 (GCC)) #1 SMP Tue Mar 27 12:40:07 CEST 2018


That one is ancient, please try an up-to-date kernel as well.

[90612.637194] amdgpu 0000:01:00.0: IH ring buffer overflow 
(0x00000010, 0x00000000, 0x00000020)


Should this be reported and fixed? How can this be debugged? 
(`drm.debug=0xe`?). We do not know how to reproduce it yet.


Yes, that should certainly be fixed. The issue is that your hardware is
producing interrupts faster than the CPU can process them.


That either sounds like a hardware problem or something is blocking the 
CPU for quite a long time with interrupts disabled.
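
For illustration, the overflow amounts to the free-running write pointer
getting more than a full ring ahead of the read pointer; a minimal
standalone sketch of that condition (illustrative only, not the actual
amdgpu IH code) is:

#include <stdint.h>
#include <stdio.h>

/* Toy model of an interrupt-handler (IH) ring buffer: the hardware
 * advances wptr as it writes entries, the driver advances rptr as it
 * consumes them, and both counters run free past the ring size. */
#define RING_SIZE 16 /* entries, power of two */

struct ih_ring {
    uint32_t rptr; /* next entry the CPU will read */
    uint32_t wptr; /* next entry the hardware will write */
};

/* The CPU fell behind by more than one full ring, so at least one
 * unread interrupt record has been overwritten. */
static int ih_overflowed(const struct ih_ring *ring)
{
    return (ring->wptr - ring->rptr) > RING_SIZE;
}

int main(void)
{
    struct ih_ring ring = { .rptr = 0x10, .wptr = 0x10 + RING_SIZE + 4 };

    if (ih_overflowed(&ring))
        printf("IH ring buffer overflow (rptr=0x%08x, wptr=0x%08x)\n",
               ring.rptr, ring.wptr);
    return 0;
}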


Regards,
Christian.




Kind regards,

Paul



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: invalidate parent bo when shadow bo was invalidated

2018-05-04 Thread Christian König

Am 04.05.2018 um 08:44 schrieb Chunming Zhou:

Shadow BO is located on GTT and its parent (PT and PD) BO could be located
on VRAM. In some cases, the BO on GTT could be evicted while the parent is
not. This may cause the shadow BO not to be put in the evict list and not
to be invalidated correctly.
v2: suggested by Christian

Change-Id: Iad10d9a3031fa2b243879b9e58ee4d8c527eb433
Signed-off-by: Chunming Zhou 
Reported-by: Shaoyun Liu 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 71dcdefce255..8e71d3984016 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2252,6 +2252,10 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
  {
struct amdgpu_vm_bo_base *bo_base;
  
+	/* shadow bo doesn't have bo base, its validation needs its parent */
+	if (bo->parent && bo->parent->shadow == bo)
+		bo = bo->parent;
+
	list_for_each_entry(bo_base, &bo->va, bo_list) {
struct amdgpu_vm *vm = bo_base->vm;
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/display: remove need of modeset flag for overlay planes

2018-05-04 Thread Andrey Grodzovsky



On 05/03/2018 02:11 PM, Harry Wentland wrote:

On 2018-05-03 04:00 AM, S, Shirish wrote:


On 5/2/2018 7:21 PM, Harry Wentland wrote:

On 2018-04-27 06:27 AM, Shirish S wrote:

This patch is in continuation to the
"843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state"
where we started to eliminate the dependency on
DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space,
which as such is not mandatory.

After deferring, this patch eliminates the dependency on the flag
for overlay planes.


Apologies for the late response. I had to think about this patch for a long 
time since I'm not quite comfortable with it.


This has to be done in stages as it's pretty complex and requires thorough
testing before we also free primary planes from the dependency on the
modeset flag.

Signed-off-by: Shirish S 
---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +---
   1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1a63c04..87b661d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
   }
   spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
   -    if (!pflip_needed) {
+    if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) {

Does this mean that whenever we have an overlay plane we won't do 
amdgpu_dm_do_flip but commit_planes_to_stream instead? Is this really the 
behavior we want?

commit_planes_to_stream was intended to program a new surface on a modeset 
whereas amdgpu_dm_do_flip was intended for pageflips.

The need for the "modeset" flag to program a new surface is what we want to
fix in this patch for the underlay plane; in the next stages we will fix
manifestations caused by this approach as and when they are seen.
Since user space doesn't send the modeset flag for a new surface, this patch
checks the plane type to construct planes_count before calling
commit_planes_to_stream().


Looking at the allow_modeset flag was never quite right and we anticipated 
having to rework this when having to deal with things like multiple planes. 
What really has to happen is that we determine the surface_update_type in 
atomic_check and then use that in atomic_commit to either program the surface 
only (UPDATE_TYPE_FAST) without locking all pipes, or to lock all pipes 
(see lock_and_validation_needed in amdgpu_dm_atomic_check) if we need to 
reprogram the mode (UPDATE_TYPE_FULL). I don't remember exactly what 
UPDATE_TYPE_MED is used for.
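
Roughly, a sketch of that check/commit split (hypothetical helpers; only
the enum names mirror DC's surface_update_type, the rest is made up for
illustration):

/* atomic_check classifies the update once; atomic_commit acts on it. */
enum surface_update_type {
    UPDATE_TYPE_FAST, /* surface programming only, no global pipe lock */
    UPDATE_TYPE_MED,  /* intermediate reprogramming */
    UPDATE_TYPE_FULL, /* full reprogramming, lock and validate all pipes */
};

struct dm_state {
    int mode_changed;   /* CRTC timings or topology changed */
    int planes_changed; /* plane set or scaling changed */
    enum surface_update_type update_type; /* decided in atomic_check */
};

/* atomic_check: derive the update type instead of trusting allow_modeset. */
void dm_atomic_check_classify(struct dm_state *state)
{
    if (state->mode_changed)
        state->update_type = UPDATE_TYPE_FULL;
    else if (state->planes_changed)
        state->update_type = UPDATE_TYPE_MED;
    else
        state->update_type = UPDATE_TYPE_FAST;
}

/* atomic_commit: act on the stored classification. */
void dm_atomic_commit_tail(struct dm_state *state)
{
    if (state->update_type == UPDATE_TYPE_FULL) {
        /* lock all pipes, revalidate, reprogram the mode */
    } else {
        /* fast path: program the new surface(s) only */
    }
}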

I don't feel comfortable taking a shortcut for DRM_PLANE_TYPE_OVERLAY without 
first having a plan and patches for how to deal with the above-mentioned.

Bhawan and Andrey had a look at this before but it was never quite ready. The 
work was non-trivial and potentially impacts lots of configurations and 
scenarios if we don't get it right. If you're curious you can look at this 
change (apologies to everyone else for posting AMD-internal link): 
http://git.amd.com:8080/#/c/103931/11


  If we use commit_planes_to_stream we end up losing things like the 
immediate_flip flag, as well as the wait for the right moment to program the 
flip that amdgpu_dm_do_flip does.

 From the code, amdgpu_dm_do_flip does what you mentioned only for the
primary plane, and hence either way it's not set for underlay.

The code wasn't designed with underlay in mind, so it will need work.

Harry


I support Harry's comments. We definitely need to strive to remove the 
dependency on the pflip_needed flag; AFAIK we are the only atomic KMS 
driver which makes a distinction between a page flip and other
surface updates. But it's better to sit and create a general plan for how 
to address it for all types of planes instead of patching for overlay only.


Andrey




Regards,
Shirish S

   Even more importantly we won't wait for fences 
(reservation_object_wait_timeout_rcu).

Harry


   WARN_ON(!dm_new_plane_state->dc_state);
     plane_states_constructed[planes_count] = 
dm_new_plane_state->dc_state;
@@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc,
     /* Remove any changed/removed planes */
   if (!enable) {
-    if (pflip_needed)
+    if (pflip_needed &&
+    plane && plane->type != DRM_PLANE_TYPE_OVERLAY)
   continue;
     if (!old_plane_crtc)
@@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc,
   if (!dm_new_crtc_state->stream)
   continue;
   -    if (pflip_needed)
+    if (pflip_needed &&
+    plane && plane->type != DRM_PLANE_TYPE_OVERLAY)
   continue;
     WARN_ON(dm_new_plane_state->dc_state);



___
amd-gfx 

Re: [Linaro-mm-sig] [PATCH 4/8] dma-buf: add peer2peer flag

2018-05-04 Thread Lucas Stach
On Wednesday, 25.04.2018 at 13:44 -0400, Alex Deucher wrote:
> On Wed, Apr 25, 2018 at 2:41 AM, Christoph Hellwig wrote:
> > On Wed, Apr 25, 2018 at 02:24:36AM -0400, Alex Deucher wrote:
> > > > It has a non-coherent transaction mode (which the chipset can opt to
> > > > not implement and still flush), to make sure the AGP horror show
> > > > doesn't happen again and GPU folks are happy with PCIe. That's at
> > > > least my understanding from digging around in amd the last time we had
> > > > coherency issues between intel and amd gpus. GPUs have some bits
> > > > somewhere (in the pagetables, or in the buffer object description
> > > > table created by userspace) to control that stuff.
> > > 
> > > Right.  We have a bit in the GPU page table entries that determines
> > > whether we snoop the CPU's cache or not.
> > 
> > I can see how that works with the GPU on the same SOC or SOC set as the
> > CPU.  But how is that going to work for a GPU that is a plain old PCIe
> > card?  The cache snooping in that case is happening in the PCIe root
> > complex.
> 
> I'm not a pci expert, but as far as I know, the device sends either a
> snooped or non-snooped transaction on the bus.  I think the
> transaction descriptor supports a no snoop attribute.  Our GPUs have
> supported this feature for probably 20 years if not more, going back
> to PCI.  Using non-snooped transactions gives lower latency and faster
> throughput compared to snooped transactions.

PCI-X (as in the thing which was all the rage before PCIe) added an
attribute phase to each transaction, where the requester can enable
relaxed ordering and/or no-snoop on a per-transaction basis. As those are
strictly performance optimizations, the root complex isn't required to
honor those attributes, but implementations that care about performance
usually will.
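
For concreteness, a little sketch of how the two pieces fit together
(illustrative only; the PTE flag is modeled on amdgpu's AMDGPU_PTE_SNOOPED
and the attribute bits on the PCIe TLP Attr field, not taken from real
driver code):

#include <stdint.h>

/* Illustrative GPU page-table flag plus the two per-transaction
 * attribute bits of a PCIe TLP header (Attr[0] = No Snoop,
 * Attr[1] = Relaxed Ordering). Both attributes are hints that a root
 * complex may legally ignore. */
#define PTE_SNOOPED               (1ULL << 2)
#define TLP_ATTR_NO_SNOOP         (1u << 0)
#define TLP_ATTR_RELAXED_ORDERING (1u << 1)

/* When the device resolves a DMA request through its page tables, the
 * snoop bit in the PTE decides which attributes go out on the bus. */
uint8_t attrs_for_request(uint64_t pte)
{
    uint8_t attrs = 0;

    if (!(pte & PTE_SNOOPED)) {
        /* Non-coherent traffic: skip CPU cache snooping and strict
         * ordering for lower latency and higher throughput. */
        attrs |= TLP_ATTR_NO_SNOOP | TLP_ATTR_RELAXED_ORDERING;
    }
    return attrs;
}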

Regards,
Lucas
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: invalidate parent bo when shadow bo was invalidated

2018-05-04 Thread Zhang, Jerry (Junwei)

On 05/04/2018 02:44 PM, Chunming Zhou wrote:

Shadow BO is located on GTT and its parent (PT and PD) BO could be located
on VRAM. In some cases, the BO on GTT could be evicted while the parent is
not. This may cause the shadow BO not to be put in the evict list and not
to be invalidated correctly.
v2: suggested by Christian

Change-Id: Iad10d9a3031fa2b243879b9e58ee4d8c527eb433
Signed-off-by: Chunming Zhou 
Reported-by: Shaoyun Liu 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 71dcdefce255..8e71d3984016 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2252,6 +2252,10 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
  {
struct amdgpu_vm_bo_base *bo_base;

+   /* shadow bo doesn't have bo base, its validation needs its parent */
+   if (bo->parent && bo->parent->shadow == bo)
+   bo = bo->parent;
+
	list_for_each_entry(bo_base, &bo->va, bo_list) {
struct amdgpu_vm *vm = bo_base->vm;



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: invalidate parent bo when shadow bo was invalidated

2018-05-04 Thread Chunming Zhou
Shadow BO is located on GTT and its parent (PT and PD) BO could be located
on VRAM. In some cases, the BO on GTT could be evicted while the parent is
not. This may cause the shadow BO not to be put in the evict list and not
to be invalidated correctly.
v2: suggested by Christian

Change-Id: Iad10d9a3031fa2b243879b9e58ee4d8c527eb433
Signed-off-by: Chunming Zhou 
Reported-by: Shaoyun Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 71dcdefce255..8e71d3984016 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2252,6 +2252,10 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
 {
struct amdgpu_vm_bo_base *bo_base;
 
+   /* shadow bo doesn't have bo base, its validation needs its parent */
+   if (bo->parent && bo->parent->shadow == bo)
+   bo = bo->parent;
+
	list_for_each_entry(bo_base, &bo->va, bo_list) {
struct amdgpu_vm *vm = bo_base->vm;
 
-- 
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: link shadow bo as well

2018-05-04 Thread zhoucm1



On 2018-05-03 18:14, Christian König wrote:

On 24.04.2018 at 09:35, Chunming Zhou wrote:
Shadow BO is located on GTT and its parent (PT and PD) BO could be located
on VRAM. In some cases, the BO on GTT could be evicted while the parent is
not. This may cause the shadow BO not to be put in the evict list and not
to be invalidated correctly.

Change-Id: Iad10d9a3031fa2b243879b9e58ee4d8c527eb433
Signed-off-by: Chunming Zhou 
Reported-by: Shaoyun Liu 


Way too much memory usage and too complicated.

What we need to do is just to handle the real BO as evicted when the 
shadow BO is evicted.


Something like the following at the start of amdgpu_vm_bo_invalidate() 
should be sufficient:


if (bo->parent && bo->parent->shadow == bo)
    bo = bo->parent;

clever, will send a patch soon.

Regards,
David Zhou


Regards,
Christian.



---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  5 -----
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 +++++++++++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  1 +
  4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c

index e1756b68a17b..9c9f6fc5c994 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -462,11 +462,6 @@ static int amdgpu_cs_validate(void *param, 
struct amdgpu_bo *bo)

  do {
  r = amdgpu_cs_bo_validate(p, bo);
  } while (r == -ENOMEM && amdgpu_cs_try_evict(p, bo));
-    if (r)
-    return r;
-
-    if (bo->shadow)
-    r = amdgpu_cs_bo_validate(p, bo->shadow);
    return r;
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h

index 540e03fa159f..8078da36aec7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -58,6 +58,7 @@ struct amdgpu_bo_va_mapping {
  /* User space allocated BO in a VM */
  struct amdgpu_bo_va {
  struct amdgpu_vm_bo_base    base;
+    struct amdgpu_vm_bo_base    shadow_base;
    /* protected by bo being reserved */
  unsigned    ref_count;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index c75b96433ee7..026147cc5104 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -450,8 +450,13 @@ static int amdgpu_vm_alloc_levels(struct 
amdgpu_device *adev,

  pt->parent = amdgpu_bo_ref(parent->base.bo);
  amdgpu_vm_bo_base_init(&entry->base, vm, pt);
+    amdgpu_vm_bo_base_init(&entry->shadow_base, vm,
+   pt->shadow);
  spin_lock(&vm->status_lock);
  list_move(&entry->base.vm_status, &vm->relocated);
+    if (pt->shadow)
+    list_move(&entry->shadow_base.vm_status,
+  &vm->relocated);
  spin_unlock(&vm->status_lock);
  }
  @@ -1875,6 +1880,7 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct 
amdgpu_device *adev,

  return NULL;
  }
  amdgpu_vm_bo_base_init(&bo_va->base, vm, bo);
+    amdgpu_vm_bo_base_init(&bo_va->shadow_base, vm, bo ? bo->shadow 
: NULL);

    bo_va->ref_count = 1;
  INIT_LIST_HEAD(_va->valids);
@@ -2220,9 +2226,11 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
  struct amdgpu_vm *vm = bo_va->base.vm;
  list_del(&bo_va->base.bo_list);
+    list_del(&bo_va->shadow_base.bo_list);
    spin_lock(&vm->status_lock);
  list_del(&bo_va->base.vm_status);
+    list_del(&bo_va->shadow_base.vm_status);
  spin_unlock(&vm->status_lock);
    list_for_each_entry_safe(mapping, next, &bo_va->valids, list) {
@@ -2456,6 +2464,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,

  goto error_unreserve;
  amdgpu_vm_bo_base_init(&vm->root.base, vm, root);
+    amdgpu_vm_bo_base_init(&vm->root.shadow_base, vm, root->shadow);
  amdgpu_bo_unreserve(vm->root.base.bo);
    if (pasid) {
@@ -2575,6 +2584,8 @@ static void amdgpu_vm_free_levels(struct 
amdgpu_device *adev,

  if (parent->base.bo) {
  list_del(&parent->base.bo_list);
  list_del(&parent->base.vm_status);
+    list_del(&parent->shadow_base.bo_list);
+    list_del(&parent->shadow_base.vm_status);
  amdgpu_bo_unref(&parent->base.bo->shadow);
  amdgpu_bo_unref(&parent->base.bo);
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h

index 30f080364c97..f4ae6c6b28b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -153,6 +153,7 @@ struct amdgpu_vm_bo_base {
    struct amdgpu_vm_pt {
  struct amdgpu_vm_bo_base    base;
+    struct amdgpu_vm_bo_base    shadow_base;
  bool    huge;
    /* array of page tables, one for each directory entry */




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org