Re: [PATCH 6/8] drm/amdgpu: allocate VM PDs/PTs on demand

2019-02-04 Thread Kuehling, Felix
On 2019-02-04 3:17 p.m., Kuehling, Felix wrote:
> This may cause regressions in KFD if PT BO allocation can trigger
> eviction fences. The previous call to amdgpu_vm_alloc_pts was in a
> context where we had temporarily removed the eviction fence. PT BO
> allocation in amdgpu_vm_bo_update is not protected like that.
>
> I vaguely remember looking into this last time you were working on our
> eviction fence code and thinking that most of the cases where we remove
> the eviction fence were no longer needed if fence synchronization
> happens through the amdgpu_sync-object API (rather than waiting on a
> reservation object directly). I think it's time I go and finally clean
> that up.

I'm looking at this now. It's not working as I was hoping.


>
> Do you know whether PT BO allocation would wait on the page-directory
> reservation object without the sync-object API anywhere?

It doesn't even matter. Buffer moves (anything calling amdgpu_sync_resv 
with AMDGPU_FENCE_OWNER_UNDEFINED) are supposed to trigger eviction 
fences. So page table BO allocation triggers eviction fences on the PD 
reservation. I don't know how to avoid this other than by removing the 
eviction fence while allocating PT BOs. With your changes this will
potentially happen every time we map or unmap memory.
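
For anyone skimming the thread, the protection add_bo_to_vm relied on looks
roughly like the sketch below: temporarily drop the KFD eviction fence from
the PD reservation around the PT allocation. Helper names and the re-arm
step are paraphrased from amdgpu_amdkfd_gpuvm.c, not a verbatim excerpt:

static int kfd_alloc_pts_without_eviction(struct amdgpu_device *adev,
					  struct amdgpu_vm *vm,
					  uint64_t va, uint64_t size)
{
	struct amdgpu_bo *pd = vm->root.base.bo;
	int r;

	/* Drop the eviction fence from the PD reservation so buffer moves
	 * issued while allocating PTs don't trigger it.
	 */
	amdgpu_amdkfd_remove_eviction_fence(pd,
			vm->process_info->eviction_fence);

	r = amdgpu_vm_alloc_pts(adev, vm, va, size);

	/* Re-arm the eviction fence on the PD before anything else can
	 * evict us.
	 */
	amdgpu_bo_fence(pd, &vm->process_info->eviction_fence->base, true);

	return r;
}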

Any better ideas?

Regards,
   Felix


>
> Regards,
>     Felix
>
> On 2019-02-04 7:42 a.m., Christian König wrote:
>> Let's start to allocate VM PDs/PTs on demand instead of pre-allocating
>> them during mapping.
>>
>> Signed-off-by: Christian König 
>> ---
>>.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   6 -
>>drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c   |   9 --
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  10 --
>>drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 136 +-
>>drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|   3 -
>>5 files changed, 38 insertions(+), 126 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index d7b10d79f1de..2176c92f9377 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -492,12 +492,6 @@ static int add_bo_to_vm(struct amdgpu_device *adev, 
>> struct kgd_mem *mem,
>>  vm->process_info->eviction_fence,
>>  NULL, NULL);
>>
>> -ret = amdgpu_vm_alloc_pts(adev, vm, va, amdgpu_bo_size(bo));
>> -if (ret) {
>> -pr_err("Failed to allocate pts, err=%d\n", ret);
>> -goto err_alloc_pts;
>> -}
>> -
>>  ret = vm_validate_pt_pd_bos(vm);
>>  if (ret) {
>>  pr_err("validate_pt_pd_bos() failed\n");
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>> index 7e22be7ca68a..54dd02a898b9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>> @@ -92,15 +92,6 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, 
>> struct amdgpu_vm *vm,
>>  return -ENOMEM;
>>  }
>>
>> -r = amdgpu_vm_alloc_pts(adev, (*bo_va)->base.vm, csa_addr,
>> -size);
>> -if (r) {
>> -DRM_ERROR("failed to allocate pts for static CSA, err=%d\n", r);
>> -amdgpu_vm_bo_rmv(adev, *bo_va);
>> -ttm_eu_backoff_reservation(&ticket, &list);
>> -return r;
>> -}
>> -
>>  r = amdgpu_vm_bo_map(adev, *bo_va, csa_addr, 0, size,
>>   AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE |
>>   AMDGPU_PTE_EXECUTABLE);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> index d21dd2f369da..e141e3b13112 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> @@ -627,11 +627,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
>> *data,
>>
>>  switch (args->operation) {
>>  case AMDGPU_VA_OP_MAP:
>> -r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args->va_address,
>> -args->map_size);
>> -if (r)
>> -goto error_backoff;
>> -
>>  va_flags = amdgpu_gmc_get_pte_flags(adev, args->flags);
>>  r = amdgpu_vm_bo_map(adev, bo_va, args->va_address,
>>   args->offset_in_bo, args->map_size,
>> @@ -647,11 +642,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
>> *data,
>>  args->map_size);
>>  break;
>>  case AMDGPU_VA_OP_REPLACE:
>> -r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args->va_address,
>> -args->map_size);
>> -if (r)
>> -goto error_backoff;
>> -
>>  va_flags = amdgpu_gmc_get_pte_flags(adev, 

Re: After upgrade mesa to 19.0.0~rc1 all vulkan based application stop working ["vulkan-cube" received SIGSEGV in radv_pipeline_init_blend_state at ../src/amd/vulkan/radv_pipeline.c:699]

2019-02-04 Thread Mikhail Gavrilov
On Mon, 4 Feb 2019 at 19:59,  wrote:
>
> Sorry to pop in the bug report.
>
> I run git linux(amd-staging-drm-next), drm, llvm (erk!), mesa-gl, mesa-vulkan,
> xserver, xf86-video-amdgpu from yesterday, that on AMD tahiti xt.
>
> I have run dota 2 vulkan, artifact (vulkan), cs:go (still opengl) without
> issues (I did play all of them yesterday night). I have a very old 
> vulkanloader
> though.
>
> Could you try dota2 vulkan and cs:go (free to play games) if you have room on
> your gaming storage and good download speed?

I tried dota2 vulkan and saw the same crash:
Thread 1 "dota2" received signal SIGSEGV, Segmentation fault.
radv_pipeline_init_blend_state (pipeline=0x557a64905800,
extra=0x7ffc2c6ad750, pCreateInfo
=0x7ffc2c6ad7c0) at ../src/amd/vulkan/radv_pipeline.c:699
699 VkBlendOp eqRGB = att->colorBlendOp;
Argument list to give program being debugged when it is started is
"+engine_experimental_drop_frame_ticks 1".

All OpenGL applications are working as expected.

I propose continuing the discussion in a more appropriate place:
https://bugs.freedesktop.org/show_bug.cgi?id=109543

--
Best Regards,
Mike Gavrilov.
GNU gdb (GDB) Fedora 8.2.50.20190120-13.fc30
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/mikhail/.local/share/Steam/steamapps/common/dota 2 
beta/game/bin/linuxsteamrt64/dota2...
Missing separate debuginfo for 
/home/mikhail/.local/share/Steam/steamapps/common/dota 2 
beta/game/bin/linuxsteamrt64/dota2
Try: dnf --enablerepo='*debug*' install 
/usr/lib/debug/.build-id/3c/1a1d14631823da3e1735c2955ee5e282905ead.debug
(No debugging symbols found in 
/home/mikhail/.local/share/Steam/steamapps/common/dota 2 
beta/game/bin/linuxsteamrt64/dota2)
Function "main" not defined.
Make breakpoint pending on future shared library load? (y or [n]) [answered N; 
input not from terminal]
LD_PRELOAD = 
:/home/mikhail/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so:/home/mikhail/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so
ERROR: ld.so: object 
'/home/mikhail/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from 
LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
pid 5733 != 5732, skipping destruction (fork without exec?)
pid 5734 != 5732, skipping destruction (fork without exec?)
pid 5735 != 5732, skipping destruction (fork without exec?)
ERROR: ld.so: object 
'/home/mikhail/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from 
LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
>>> Adding process 5732 for game ID 570
pid 5736 != 5732, skipping destruction (fork without exec?)
>>> Adding process 5737 for game ID 570
pid 5738 != 5732, skipping destruction (fork without exec?)
warning: Loadable section ".note.gnu.property" outside of ELF segments
ERROR: ld.so: object 
'/home/mikhail/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from 
LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
saving roaming config store to 'sharedconfig.vdf'
roaming config store 2 saved successfully
>>> Adding process 5739 for game ID 570
Missing separate debuginfo for 
/home/mikhail/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so
Try: dnf --enablerepo='*debug*' install 
/usr/lib/debug/.build-id/32/32b4ac074a0e92c6ee859e3ff4f0d2a141381c.debug
Missing separate debuginfo for 
/home/mikhail/.local/share/Steam/steamapps/common/dota 2 
beta/game/bin/linuxsteamrt64/libtcmalloc_minimal.so.0
Try: dnf --enablerepo='*debug*' install 
/usr/lib/debug/.build-id/ca/916378ae7ccebea645b2fbc5512c9ed37571ab.debug
warning: Loadable section ".note.gnu.property" outside of ELF segments
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
Missing separate debuginfo for 

Re: [PATCH] drm/amd/display: Don't re-enable CRC when CONFIG_DEBUG_FS isn't defined

2019-02-04 Thread Wentland, Harry
On 2019-02-04 9:32 a.m., Nicholas Kazlauskas wrote:
> [Why]
> When CONFIG_DEBUG_FS isn't defined then amdgpu_dm_crtc_set_crc_source
> is NULL. This causes a compilation error since it's being called
> unconditionally.
> 
> [How]
> Guard the call based on CONFIG_DEBUG_FS - CRC capture isn't supported
> without this.
> 
> Cc: Leo Li 
> Cc: Harry Wentland 
> Fixes: 43a6a02eb355 ("drm/amd/display: Re-enable CRC capture following 
> modeset")
> Signed-off-by: Nicholas Kazlauskas 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index a2453900e15a..cacb8fe5a1ad 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -5284,9 +5284,11 @@ static void amdgpu_dm_atomic_commit_tail(struct 
> drm_atomic_state *state)
>  
>   manage_dm_interrupts(adev, acrtc, true);
>  
> +#ifdef CONFIG_DEBUG_FS
>   /* The stream has changed so CRC capture needs to re-enabled. */
>   if (dm_new_crtc_state->crc_enabled)
>   amdgpu_dm_crtc_set_crc_source(crtc, "auto");
> +#endif
>   }
>  
>   /* update planes when needed per crtc*/
> 
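
A side note on why an #ifdef is needed here rather than the usual
IS_ENABLED() idiom: per the commit message, amdgpu_dm_crtc_set_crc_source
is defined to NULL without CONFIG_DEBUG_FS, so the call must not be
compiled at all. Hedged sketch of the alternative that would *not* build
in that configuration:

	/* Only works if the function has a real declaration (or stub)
	 * in !CONFIG_DEBUG_FS builds; here it is defined to NULL, so
	 * the #ifdef in the hunk above is the right fix.
	 */
	if (IS_ENABLED(CONFIG_DEBUG_FS) && dm_new_crtc_state->crc_enabled)
		amdgpu_dm_crtc_set_crc_source(crtc, "auto");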
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/8] drm/amdgpu: cleanup VM dw estimation a bit

2019-02-04 Thread Kuehling, Felix
On 2019-02-04 7:42 a.m., Christian König wrote:
> No functional change.
>
> Signed-off-by: Christian König 

Minor indentation issue inline. With that fixed, this patch is 
Reviewed-by: Felix Kuehling 


> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 16 
>   1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 44950f3b8801..69b0bee0661e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1804,13 +1804,12 @@ static int amdgpu_vm_bo_update_mapping(struct 
> amdgpu_device *adev,
>   /*
>* reserve space for two commands every (1 << BLOCK_SIZE)
>*  entries or 2k dwords (whatever is smaller)
> - *
> - * The second command is for the shadow pagetables.
>*/
> + ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1);
> +
> + /* The second command is for the shadow pagetables. */

Using spaces instead of TABs.


>   if (vm->root.base.bo->shadow)
> - ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1) 
> * 2;
> - else
> - ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1);
> + ncmds *= 2;
>   
>   /* padding, etc. */
>   ndw = 64;
> @@ -1829,10 +1828,11 @@ static int amdgpu_vm_bo_update_mapping(struct 
> amdgpu_device *adev,
>   ndw += ncmds * 10;
>   
>   /* extra commands for begin/end fragments */
> + ncmds = 2 * adev->vm_manager.fragment_size;
>   if (vm->root.base.bo->shadow)
> - ndw += 2 * 10 * adev->vm_manager.fragment_size * 2;
> - else
> - ndw += 2 * 10 * adev->vm_manager.fragment_size;
> + ncmds *= 2;
> +
> + ndw += 10 * ncmds;
>   
>   params.func = amdgpu_vm_do_set_ptes;
>   }
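
To sanity-check the cleanup, a worked instance of the estimate with
illustrative numbers (block_size = 9, fragment_size = 4, nptes = 2048,
shadow PD present; contributions from code outside the quoted hunks are
ignored):

	ncmds = (2048 >> min(9, 11)) + 1 = 5, doubled for the shadow -> 10
	ndw   = 64 (padding) + 10 * 10 (update commands) = 164
	ncmds = 2 * 4 = 8 begin/end fragment commands, doubled -> 16
	ndw  += 10 * 16 = 160, for 324 dwords reserved in total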
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/8] drm/amdgpu: fix waiting for BO moves with CPU based PD/PT updates

2019-02-04 Thread Kuehling, Felix
On 2019-02-04 7:42 a.m., Christian König wrote:
> Otherwise we open up the possibility to use uninitialized memory.
>
> Signed-off-by: Christian König 

This patch is Reviewed-by: Felix Kuehling 


> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 ++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index c70621db3bb7..44950f3b8801 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1779,13 +1779,18 @@ static int amdgpu_vm_bo_update_mapping(struct 
> amdgpu_device *adev,
>   if (pages_addr)
>   params.src = ~0;
>   
> - /* Wait for PT BOs to be free. PTs share the same resv. object
> + /* Wait for PT BOs to be idle. PTs share the same resv. object
>* as the root PD BO
>*/
>   r = amdgpu_vm_wait_pd(adev, vm, owner);
>   if (unlikely(r))
>   return r;
>   
> + /* Wait for any BO move to be completed */
> + r = dma_fence_wait(exclusive, true);
> + if (unlikely(r))
> + return r;
> +
>   params.func = amdgpu_vm_cpu_set_ptes;
>   params.pages_addr = pages_addr;
>   return amdgpu_vm_update_ptes(&params, start, last + 1,
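
One detail worth flagging, purely as an illustration: dma_fence_wait()
cannot take a NULL fence, so if the exclusive fence may legitimately be
absent (no pending move), a NULL check would be needed around the new wait:

	/* Wait for any BO move to be completed */
	if (exclusive) {
		r = dma_fence_wait(exclusive, true);
		if (unlikely(r))
			return r;
	}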
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/powerplay: add override pcie parameters for Vega20 (v2)

2019-02-04 Thread Alex Deucher
On Mon, Feb 4, 2019 at 3:05 PM Kasiviswanathan, Harish
 wrote:
>
> >
> > For example, in my machine the hierarchy looks as shown below. The current code 
> > only looks at the top 2 and decides incorrectly as Gen 4.
> >
> > amdgpu 0000:07:00.0: lnksta:1103 lnkcap2:180001e lnkcap:400d04   -- 
> > PCIE_SPEED_16_0GT - Gen 4
> > pcieport 0000:06:00.0: lnksta:7103 lnkcap2:180001e lnkcap:700d04   -- 
> > PCIE_SPEED_16_0GT - Gen 4
> > pcieport 0000:05:00.0: lnksta:1103 lnkcap2:180001e lnkcap:10473904 -- 
> > "PCIE_SPEED_8_0GT - Gen 3" // lnkcap2:_xx1e indicates Gen 4
> > pcieport 0000:04:10.0: lnksta:103 lnkcap2:10e lnkcap:10416103 -- 
> > PCIE_SPEED_8_0GT - Gen 3
> > pcieport 0000:03:00.0: lnksta:103 lnkcap2:10e lnkcap:416103 -- 
> > PCIE_SPEED_8_0GT - Gen 3
> > pcieport 0000:00:03.0: lnksta:7103 lnkcap2:e lnkcap:77a3103 -- 
> > PCIE_SPEED_8_0GT - Gen 3
>
> First, there is a minor error in my previous email: both bridges 0000:06:00.0 &
> 0000:05:00.0 are reporting Gen 4.
>
> > This seems wrong.  Is 0000:06:00.0 a gen 4 bridge that is part of the
> > board design or something?
>
> Yes, both bridges 0000:06:00.0 & 0000:05:00.0 are part of the AMD Vega20
> board. If I unplug the card, they don't show up.
>
> >  Maybe we need to make sure we go up the first bridge level that is not 
> > part of board configuration?
>
> This might be what we want. But how? Can we ignore the bridges and stop at 
> the first switch?

Maybe, or maybe we need special handling for boards with bridges on
board.  Maybe we can adjust based on sku?

>
>
> > That said, if the bridge supports gen4, it should work even if the
> > downstream link has less bandwidth.
>
> Working is not the issue here. The SMU (or rather the PPTable) has to be
> updated with the highest PCIe gen supported by the system.

Well, my understanding in general is that the SMU just controls the
speed of the link between the GPU and the port above it.  However, in
the case of these boards, maybe the SMU controls the speed of the link
between the "board" and the port above it, which also encompasses the
bridges on the board.

Going back to your last idea, I think maybe it makes sense to walk the
pcie topology to the root to get the pcie speeds and widths.  Other
than peer to peer transfers, it doesn't make sense to run the
clock/lanes higher than the slowest link in the hierarchy.
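
A minimal sketch of that walk (untested; it mirrors the
pcie_bandwidth_available() loop using the stock pcie_get_speed_cap(),
pcie_get_width_cap() and pci_upstream_bridge() helpers, and leaves the
PPTable clamping out):

static void min_upstream_pcie_caps(struct pci_dev *pdev,
				   enum pci_bus_speed *speed,
				   enum pcie_link_width *width)
{
	*speed = PCI_SPEED_UNKNOWN;
	*width = PCIE_LNK_WIDTH_UNKNOWN;

	/* Walk from the GPU up to the root port, keeping the slowest
	 * supported speed and the narrowest width along the way.
	 */
	while (pdev) {
		enum pci_bus_speed s = pcie_get_speed_cap(pdev);
		enum pcie_link_width w = pcie_get_width_cap(pdev);

		if (s != PCI_SPEED_UNKNOWN && s < *speed)
			*speed = s;
		if (w != PCIE_LNK_WIDTH_UNKNOWN && w < *width)
			*width = w;

		pdev = pci_upstream_bridge(pdev);
	}
}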

Alex

>
> Best Regards,
> Harish
>
>
>
>
>
> From: Alex Deucher 
> Sent: Monday, February 4, 2019 1:53 PM
> To: Kasiviswanathan, Harish
> Cc: Li, Colin; Xi, Sheldon; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amd/powerplay: add override pcie parameters for 
> Vega20 (v2)
>
>
> On Mon, Feb 4, 2019 at 12:55 PM Kasiviswanathan, Harish
>  wrote:
> >
> > Hi Alex,
> >
> > I already have that patch in my code. The function 
> > amdgpu_device_get_pcie_info() updates platform caps 
> > (CAIL_PCIE_LINK_SPEED_SUPPORT_GEN*) based on pcie_get_speed_cap() of the 
> > bus AMD GPU is connected. I think to get the system caps going up just one 
> > level  in PCI hierarchy is not sufficient. It should be check all the way 
> > up to the root. Like the function pcie_bandwidth_available(). See  
> > https://elixir.bootlin.com/linux/latest/source/drivers/pci/pci.c#L5503
>
>
>
> Nevermind that patch. It only affected gen1 and gen2 platforms, so
> irrelevant for this discussion.  I'm not a pcie expert, but I think
> looking at the next bridge up should be sufficient.  We've been doing
> this for years already for smu7 (polaris, fiji, etc.) and smu9
> (vega10, 12) devices and it's worked correctly.
> pcie_bandwidth_available() walks the entire tree because it wants to
> know the slowest link in the hierarchy which would impact performance
> even if downstream links have higher bandwidth.
>
> >
> > For example, in my machine the hierarchy looks as shown below. The current code 
> > only looks at the top 2 and decides incorrectly as Gen 4.
> >
> > amdgpu 0000:07:00.0: lnksta:1103 lnkcap2:180001e lnkcap:400d04   -- 
> > PCIE_SPEED_16_0GT - Gen 4
> > pcieport 0000:06:00.0: lnksta:7103 lnkcap2:180001e lnkcap:700d04   -- 
> > PCIE_SPEED_16_0GT - Gen 4
> > pcieport 0000:05:00.0: lnksta:1103 lnkcap2:180001e lnkcap:10473904 -- 
> > PCIE_SPEED_8_0GT - Gen 3
> > pcieport 0000:04:10.0: lnksta:103 lnkcap2:10e lnkcap:10416103 -- 
> > PCIE_SPEED_8_0GT - Gen 3
> > pcieport 0000:03:00.0: lnksta:103 lnkcap2:10e lnkcap:416103 -- 
> > PCIE_SPEED_8_0GT - Gen 3
> > pcieport 0000:00:03.0: lnksta:7103 lnkcap2:e lnkcap:77a3103 -- 
> > PCIE_SPEED_8_0GT - Gen 3
>
> This seems wrong.  Is 0000:06:00.0 a gen 4 bridge that is part of the
> board design or something?  Maybe we need to make sure we go up the
> first bridge level that is not part of board configuration?  That
> said, if the bridge supports gen4, it should work even if the
> downstream link has less bandwidth.
>
> Alex
>
> >
> >
> > Best Regards,
> > Harish
> >
> >
> >
> >
> >
> > From: Alex Deucher 

Re: [PATCH 6/8] drm/amdgpu: allocate VM PDs/PTs on demand

2019-02-04 Thread Kuehling, Felix
This may cause regressions in KFD if PT BO allocation can trigger 
eviction fences. The previous call to amdgpu_vm_alloc_pts was in a 
context where we had temporarily removed the eviction fence. PT BO 
allocation in amdgpu_vm_bo_update is not protected like that.

I vaguely remember looking into this last time you were working on our 
eviction fence code and thinking that most of the cases where we remove 
the eviction fence were no longer needed if fence synchronization 
happens through the amdgpu_sync-object API (rather than waiting on a 
reservation object directly). I think it's time I go and finally clean 
that up.

Do you know whether PT BO allocation would wait on the page-directory 
reservation object without the sync-object API anywhere?

Regards,
   Felix

On 2019-02-04 7:42 a.m., Christian König wrote:
> Let's start to allocate VM PDs/PTs on demand instead of pre-allocating
> them during mapping.
>
> Signed-off-by: Christian König 
> ---
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   6 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c   |   9 --
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  10 --
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 136 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|   3 -
>   5 files changed, 38 insertions(+), 126 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index d7b10d79f1de..2176c92f9377 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -492,12 +492,6 @@ static int add_bo_to_vm(struct amdgpu_device *adev, 
> struct kgd_mem *mem,
>   vm->process_info->eviction_fence,
>   NULL, NULL);
>   
> - ret = amdgpu_vm_alloc_pts(adev, vm, va, amdgpu_bo_size(bo));
> - if (ret) {
> - pr_err("Failed to allocate pts, err=%d\n", ret);
> - goto err_alloc_pts;
> - }
> -
>   ret = vm_validate_pt_pd_bos(vm);
>   if (ret) {
>   pr_err("validate_pt_pd_bos() failed\n");
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> index 7e22be7ca68a..54dd02a898b9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> @@ -92,15 +92,6 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, 
> struct amdgpu_vm *vm,
>   return -ENOMEM;
>   }
>   
> - r = amdgpu_vm_alloc_pts(adev, (*bo_va)->base.vm, csa_addr,
> - size);
> - if (r) {
> - DRM_ERROR("failed to allocate pts for static CSA, err=%d\n", r);
> - amdgpu_vm_bo_rmv(adev, *bo_va);
> - ttm_eu_backoff_reservation(&ticket, &list);
> - return r;
> - }
> -
>   r = amdgpu_vm_bo_map(adev, *bo_va, csa_addr, 0, size,
>AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE |
>AMDGPU_PTE_EXECUTABLE);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index d21dd2f369da..e141e3b13112 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -627,11 +627,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
> *data,
>   
>   switch (args->operation) {
>   case AMDGPU_VA_OP_MAP:
> - r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args->va_address,
> - args->map_size);
> - if (r)
> - goto error_backoff;
> -
>   va_flags = amdgpu_gmc_get_pte_flags(adev, args->flags);
>   r = amdgpu_vm_bo_map(adev, bo_va, args->va_address,
>args->offset_in_bo, args->map_size,
> @@ -647,11 +642,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
> *data,
>   args->map_size);
>   break;
>   case AMDGPU_VA_OP_REPLACE:
> - r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args->va_address,
> - args->map_size);
> - if (r)
> - goto error_backoff;
> -
>   va_flags = amdgpu_gmc_get_pte_flags(adev, args->flags);
>   r = amdgpu_vm_bo_replace_map(adev, bo_va, args->va_address,
>args->offset_in_bo, args->map_size,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index f7d410a5d848..0b8a1aa56c4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -504,47 +504,6 @@ static void amdgpu_vm_pt_next(struct amdgpu_device *adev,
>   }
>   }
>   
> -/**
> - * amdgpu_vm_pt_first_leaf - get first leaf PD/PT
> - *
> - * @adev: amdgpu_device pointer
> - * @vm: amdgpu_vm 

Re: [PATCH] drm/amd/powerplay: add override pcie parameters for Vega20 (v2)

2019-02-04 Thread Kasiviswanathan, Harish
>
> For example, in my machine the hierarchy looks as shown below. The current code 
> only looks at the top 2 and decides incorrectly as Gen 4.
>
> amdgpu 0000:07:00.0: lnksta:1103 lnkcap2:180001e lnkcap:400d04   -- 
> PCIE_SPEED_16_0GT - Gen 4
> pcieport 0000:06:00.0: lnksta:7103 lnkcap2:180001e lnkcap:700d04   -- 
> PCIE_SPEED_16_0GT - Gen 4
> pcieport 0000:05:00.0: lnksta:1103 lnkcap2:180001e lnkcap:10473904 -- 
> "PCIE_SPEED_8_0GT - Gen 3" // lnkcap2:_xx1e indicates Gen 4
> pcieport 0000:04:10.0: lnksta:103 lnkcap2:10e lnkcap:10416103 -- 
> PCIE_SPEED_8_0GT - Gen 3
> pcieport 0000:03:00.0: lnksta:103 lnkcap2:10e lnkcap:416103 -- 
> PCIE_SPEED_8_0GT - Gen 3
> pcieport 0000:00:03.0: lnksta:7103 lnkcap2:e lnkcap:77a3103 -- 
> PCIE_SPEED_8_0GT - Gen 3

First, there is a minor error in my previous email: both bridges 0000:06:00.0 &
0000:05:00.0 are reporting Gen 4.

> This seems wrong.  Is 0000:06:00.0 a gen 4 bridge that is part of the
> board design or something?

Yes, both bridges 0000:06:00.0 & 0000:05:00.0 are part of the AMD Vega20
board. If I unplug the card, they don't show up.

>  Maybe we need to make sure we go up the first bridge level that is not part 
> of board configuration?

This might be what we want. But how? Can we ignore the bridges and stop at the 
first switch?


> That said, if the bridge supports gen4, it should work even if the downstream
> link has less bandwidth.

Working is not the issue here. The SMU (or rather the PPTable) has to be
updated with the highest PCIe gen supported by the system.

Best Regards,
Harish





From: Alex Deucher 
Sent: Monday, February 4, 2019 1:53 PM
To: Kasiviswanathan, Harish
Cc: Li, Colin; Xi, Sheldon; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/powerplay: add override pcie parameters for Vega20 
(v2)
  

On Mon, Feb 4, 2019 at 12:55 PM Kasiviswanathan, Harish
 wrote:
>
> Hi Alex,
>
> I already have that patch in my code. The function 
> amdgpu_device_get_pcie_info() updates platform caps 
> (CAIL_PCIE_LINK_SPEED_SUPPORT_GEN*) based on pcie_get_speed_cap() of the bus 
> AMD GPU is connected. I think going up just one level in the PCI hierarchy is
> not sufficient to get the system caps; it should check all the way up to the
> root, like the function pcie_bandwidth_available(). See
> https://elixir.bootlin.com/linux/latest/source/drivers/pci/pci.c#L5503



Nevermind that patch. It only affected gen1 and gen2 platforms, so
irrelevant for this discussion.  I'm not a pcie expert, but I think
looking at the next bridge up should be sufficient.  We've been doing
this for years already for smu7 (polaris, fiji, etc.) and smu9
(vega10, 12) devices and it's worked correctly.
pcie_bandwidth_available() walks the entire tree because it wants to
know the slowest link in the hierarchy which would impact performance
even if downstream links have higher bandwidth.

>
> For example, in my machine the hierarchy looks as shown below. The current code 
> only looks at the top 2 and decides incorrectly as Gen 4.
>
> amdgpu 0000:07:00.0: lnksta:1103 lnkcap2:180001e lnkcap:400d04   -- 
> PCIE_SPEED_16_0GT - Gen 4
> pcieport 0000:06:00.0: lnksta:7103 lnkcap2:180001e lnkcap:700d04   -- 
> PCIE_SPEED_16_0GT - Gen 4
> pcieport 0000:05:00.0: lnksta:1103 lnkcap2:180001e lnkcap:10473904 -- 
> PCIE_SPEED_8_0GT - Gen 3
> pcieport 0000:04:10.0: lnksta:103 lnkcap2:10e lnkcap:10416103 -- 
> PCIE_SPEED_8_0GT - Gen 3
> pcieport 0000:03:00.0: lnksta:103 lnkcap2:10e lnkcap:416103 -- 
> PCIE_SPEED_8_0GT - Gen 3
> pcieport 0000:00:03.0: lnksta:7103 lnkcap2:e lnkcap:77a3103 -- 
> PCIE_SPEED_8_0GT - Gen 3

This seems wrong.  Is 0000:06:00.0 a gen 4 bridge that is part of the
board design or something?  Maybe we need to make sure we go up the
first bridge level that is not part of board configuration?  That
said, if the bridge supports gen4, it should work even if the
downstream link has less bandwidth.

Alex

>
>
> Best Regards,
> Harish
>
>
>
>
>
> From: Alex Deucher 
> Sent: Friday, February 1, 2019 5:53 PM
> To: Kasiviswanathan, Harish
> Cc: Li, Colin; Xi, Sheldon; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amd/powerplay: add override pcie parameters for 
> Vega20 (v2)
>
>
>
>
>
>
>
>
>
> Right.  We have the max gen and width supported by the card and the platform 
> already in the driver thanks to amdgpu_device_get_pcie_info(), so we just 
> need to use the platform gen caps to limit the max pcie gen in dpm1.  You 
> don't want to use the current status because that might have been changed 
> by sw at some point.  The asic caps are defined by 
> CAIL_ASIC_PCIE_LINK_SPEED_SUPPORT_GEN* while the platform caps are defined by 
> CAIL_PCIE_LINK_SPEED_SUPPORT_GEN*.  I think the original code was correct.  
> What you may have been running into was a bug in pcie_get_speed_cap() that 
> recently got fixed:
>
>
> commit f1f90e254e46e0a14220e4090041f68256fbe297
> Author: Mikulas 

Re: [pull] amdgpu, amdkfd, ttm, sched drm-next-5.1

2019-02-04 Thread Dave Airlie
On Tue, 5 Feb 2019 at 04:35, Alex Deucher  wrote:
>
> On Sun, Feb 3, 2019 at 11:57 PM Dave Airlie  wrote:
> >
> > On Mon, 4 Feb 2019 at 13:27, Dave Airlie  wrote:
> > >
> > > On Sat, 26 Jan 2019 at 09:15, Alex Deucher  wrote:
> > > >
> > > > Hi Dave, Daniel,
> > > >
> > > > New stuff for 5.1.
> > > > amdgpu:
> > > > - DC bandwidth formula updates
> > > > - Support for DCC on scanout surfaces
> > > > - Support for multiple IH rings on soc15 asics
> > > > - Fix xgmi locking
> > > > - Add sysfs interface to get pcie usage stats
> > > > - Simplify DC i2c/aux code
> > > > - Initial support for BACO on vega10/20
> > > > - New runtime SMU feature debug interface
> > > > - Expand existing sysfs power interfaces to new clock domains
> > > > - Handle kexec properly
> > > > - Simplify IH programming
> > > > - Rework doorbell handling across asics
> > > > - Drop old CI DPM implementation
> > > > - DC page flipping fixes
> > > > - Misc SR-IOV fixes
> > > >
> > >
> > > Hi Alex,
> > >
> > > I pulled this, but it introduced a warning, so I dropped it for now
> > > and the fix is on the list, please add it and resend,
> > >
> >
> > Just realised the warning was already there, so I pulled this again,
> > please send another next with the warning fix when you can.
>
> I was planning to send it via -fixes since the original patch was in
> 5.0, but I can include it in both -fixes and -next if you'd prefer.

Oh wow, I'm not sure how I missed it, please send via -fixes and I'll backmerge.

Dave.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/powerplay: add override pcie parameters for Vega20 (v2)

2019-02-04 Thread Alex Deucher
On Mon, Feb 4, 2019 at 12:55 PM Kasiviswanathan, Harish
 wrote:
>
> Hi Alex,
>
> I already have that patch in my code. The function 
> amdgpu_device_get_pcie_info() updates platform caps 
> (CAIL_PCIE_LINK_SPEED_SUPPORT_GEN*) based on pcie_get_speed_cap() of the bus 
> AMD GPU is connected. I think going up just one level in the PCI hierarchy is
> not sufficient to get the system caps; it should check all the way up to the
> root, like the function pcie_bandwidth_available(). See
> https://elixir.bootlin.com/linux/latest/source/drivers/pci/pci.c#L5503

Nevermind that patch. It only affected gen1 and gen2 platforms, so
irrelevant for this discussion.  I'm not a pcie expert, but I think
looking at the next bridge up should be sufficient.  We've been doing
this for years already for smu7 (polaris, fiji, etc.) and smu9
(vega10, 12) devices and it's worked correctly.
pcie_bandwidth_available() walks the entire tree because it wants to
know the slowest link in the hierarchy which would impact performance
even if downstream links have higher bandwidth.

>
> For example, in my machine the hierarchy looks as shown below. The current code 
> only looks at the top 2 and decides incorrectly as Gen 4.
>
> amdgpu 0000:07:00.0: lnksta:1103 lnkcap2:180001e lnkcap:400d04   -- 
> PCIE_SPEED_16_0GT - Gen 4
> pcieport 0000:06:00.0: lnksta:7103 lnkcap2:180001e lnkcap:700d04   -- 
> PCIE_SPEED_16_0GT - Gen 4
> pcieport 0000:05:00.0: lnksta:1103 lnkcap2:180001e lnkcap:10473904 -- 
> PCIE_SPEED_8_0GT - Gen 3
> pcieport 0000:04:10.0: lnksta:103 lnkcap2:10e lnkcap:10416103 -- 
> PCIE_SPEED_8_0GT - Gen 3
> pcieport 0000:03:00.0: lnksta:103 lnkcap2:10e lnkcap:416103 -- 
> PCIE_SPEED_8_0GT - Gen 3
> pcieport 0000:00:03.0: lnksta:7103 lnkcap2:e lnkcap:77a3103 -- 
> PCIE_SPEED_8_0GT - Gen 3

This seems wrong.  Is 0000:06:00.0 a gen 4 bridge that is part of the
board design or something?  Maybe we need to make sure we go up the
first bridge level that is not part of board configuration?  That
said, if the bridge supports gen4, it should work even if the
downstream link has less bandwidth.

Alex

>
>
> Best Regards,
> Harish
>
>
>
>
>
> From: Alex Deucher 
> Sent: Friday, February 1, 2019 5:53 PM
> To: Kasiviswanathan, Harish
> Cc: Li, Colin; Xi, Sheldon; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amd/powerplay: add override pcie parameters for 
> Vega20 (v2)
>
>
>
>
>
>
>
>
>
> Right.  We have the max gen and width supported by the card and the platform 
> already in the driver thanks to amdgpu_device_get_pcie_info(), so we just 
> need to use the platform gen caps to limit the max pcie gen in dpm1.  You 
> don't want to use the current status because that might have been changed by 
> sw at some point.  The asic caps are defined by 
> CAIL_ASIC_PCIE_LINK_SPEED_SUPPORT_GEN* while the platform caps are defined by 
> CAIL_PCIE_LINK_SPEED_SUPPORT_GEN*.  I think the original code was correct.  
> What you may have been running into was a bug in pcie_get_speed_cap() that 
> recently got fixed:
>
>
> commit f1f90e254e46e0a14220e4090041f68256fbe297
> Author: Mikulas Patocka 
> Date:   Mon Nov 26 10:37:13 2018 -0600
>
> PCI: Fix incorrect value returned from pcie_get_speed_cap()
>
> The macros PCI_EXP_LNKCAP_SLS_*GB are values, not bit masks.  We must mask
> the register and compare it against them.
>
> This fixes errors like this:
>
>   amdgpu: [powerplay] failed to send message 261 ret is 0
>
> when a PCIe-v3 card is plugged into a PCIe-v1 slot, because the slot is
> being incorrectly reported as PCIe-v3 capable.
>
> 6cf57be0f78e, which appeared in v4.17, added pcie_get_speed_cap() with the
> incorrect test of PCI_EXP_LNKCAP_SLS as a bitmask.  5d9a63304032, which
> appeared in v4.19, changed amdgpu to use pcie_get_speed_cap(), so the
> amdgpu bug reports below are regressions in v4.19.
>
> Fixes: 6cf57be0f78e ("PCI: Add pcie_get_speed_cap() to find max supported 
> link speed")
> Fixes: 5d9a63304032 ("drm/amdgpu: use pcie functions for link width and 
> speed")
> Link: https://bugs.freedesktop.org/show_bug.cgi?id=108704
> Link: https://bugs.freedesktop.org/show_bug.cgi?id=108778
> Signed-off-by: Mikulas Patocka 
> [bhelgaas: update comment, remove use of PCI_EXP_LNKCAP_SLS_8_0GB and
> PCI_EXP_LNKCAP_SLS_16_0GB since those should be covered by 
> PCI_EXP_LNKCAP2,
> remove test of PCI_EXP_LNKCAP for zero, since that register is required]
> Signed-off-by: Bjorn Helgaas 
> Acked-by: Alex Deucher 
> Cc: sta...@vger.kernel.org  # v4.17+
>
>
>
>
> On Fri, Feb 1, 2019 at 5:43 PM Kasiviswanathan, Harish 
>  wrote:
>
>
> Let me explain the problem. I am also adding Colin & Sheldon from the SMU team.
>
> Currently, PPTable entries (inside VBIOS) reflect ASIC capabilities and not 
> the system capabilities. So if we put a Gen4-capable Vega20 in a system that 
> supports only Gen3, the PPTable has Gen4 for DPM 1 (the higher DPM state). Later 
> on when SMU 

Re: [pull] amdgpu, amdkfd, ttm, sched drm-next-5.1

2019-02-04 Thread Alex Deucher
On Sun, Feb 3, 2019 at 11:57 PM Dave Airlie  wrote:
>
> On Mon, 4 Feb 2019 at 13:27, Dave Airlie  wrote:
> >
> > On Sat, 26 Jan 2019 at 09:15, Alex Deucher  wrote:
> > >
> > > Hi Dave, Daniel,
> > >
> > > New stuff for 5.1.
> > > amdgpu:
> > > - DC bandwidth formula updates
> > > - Support for DCC on scanout surfaces
> > > - Support for multiple IH rings on soc15 asics
> > > - Fix xgmi locking
> > > - Add sysfs interface to get pcie usage stats
> > > - Simplify DC i2c/aux code
> > > - Initial support for BACO on vega10/20
> > > - New runtime SMU feature debug interface
> > > - Expand existing sysfs power interfaces to new clock domains
> > > - Handle kexec properly
> > > - Simplify IH programming
> > > - Rework doorbell handling across asics
> > > - Drop old CI DPM implementation
> > > - DC page flipping fixes
> > > - Misc SR-IOV fixes
> > >
> >
> > Hi Alex,
> >
> > I pulled this, but it introduced a warning, so I dropped it for now
> > and the fix is on the list, please add it and resend,
> >
>
> Just realised the warning was already there, so I pulled this again,
> please send another next with the warning fix when you can.

I was planning to send it via -fixes since the original patch was in
5.0, but I can include it in both -fixes and -next if you'd prefer.

Alex
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu notifier v7

2019-02-04 Thread Yang, Philip
Replace our MMU notifier with hmm_mirror_ops.sync_cpu_device_pagetables
callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a dependency in
DRM_AMDGPU_USERPTR Kconfig.

It supports both KFD userptr and gfx userptr paths.

Change-Id: Ie62c3c5e3c5b8521ab3b438d1eff2aa2a003835e
Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/Kconfig |   6 +-
 drivers/gpu/drm/amd/amdgpu/Makefile|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 154 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h |   2 +-
 4 files changed, 70 insertions(+), 94 deletions(-)
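
For orientation, the rough shape of the hmm_mirror API this moves to (as of
the ~4.20-era HMM; a hedged sketch, everything prefixed my_ is a placeholder
rather than driver code):

#include <linux/hmm.h>

static int my_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
					 const struct hmm_update *update)
{
	/* Invalidate device mappings for [update->start, update->end);
	 * if !update->blockable we must not sleep and may return -EAGAIN.
	 */
	return 0;
}

static void my_release(struct hmm_mirror *mirror)
{
	/* The mm is going away; schedule lazy teardown. */
}

static const struct hmm_mirror_ops my_mirror_ops = {
	.sync_cpu_device_pagetables = my_sync_cpu_device_pagetables,
	.release = my_release,
};

static int my_register(struct hmm_mirror *mirror, struct mm_struct *mm)
{
	mirror->ops = &my_mirror_ops;
	return hmm_mirror_register(mirror, mm); /* replaces mmu_notifier_register() */
}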

diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
b/drivers/gpu/drm/amd/amdgpu/Kconfig
index 9221e5489069..960a63355705 100644
--- a/drivers/gpu/drm/amd/amdgpu/Kconfig
+++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
@@ -26,10 +26,10 @@ config DRM_AMDGPU_CIK
 config DRM_AMDGPU_USERPTR
bool "Always enable userptr write support"
depends on DRM_AMDGPU
-   select MMU_NOTIFIER
+   select HMM_MIRROR
help
- This option selects CONFIG_MMU_NOTIFIER if it isn't already
- selected to enabled full userptr support.
+ This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it
+ isn't already selected to enabled full userptr support.
 
 config DRM_AMDGPU_GART_DEBUGFS
bool "Allow GART access through debugfs"
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 466da5954a68..851001ced5e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -172,7 +172,7 @@ endif
 amdgpu-$(CONFIG_COMPAT) += amdgpu_ioc32.o
 amdgpu-$(CONFIG_VGA_SWITCHEROO) += amdgpu_atpx_handler.o
 amdgpu-$(CONFIG_ACPI) += amdgpu_acpi.o
-amdgpu-$(CONFIG_MMU_NOTIFIER) += amdgpu_mn.o
+amdgpu-$(CONFIG_HMM_MIRROR) += amdgpu_mn.o
 
 include $(FULL_AMD_PATH)/powerplay/Makefile
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 3e6823fdd939..e356867d2308 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -45,7 +45,7 @@
 
 #include 
 #include 
-#include <linux/mmu_notifier.h>
+#include <linux/hmm.h>
 #include 
 #include 
 #include 
@@ -58,7 +58,6 @@
  *
  * @adev: amdgpu device pointer
  * @mm: process address space
- * @mn: MMU notifier structure
  * @type: type of MMU notifier
  * @work: destruction work item
  * @node: hash table node to find structure by adev and mn
@@ -66,6 +65,7 @@
  * @objects: interval tree containing amdgpu_mn_nodes
  * @read_lock: mutex for recursive locking of @lock
  * @recursion: depth of recursion
+ * @mirror: HMM mirror function support
  *
  * Data for each amdgpu device and process address space.
  */
@@ -73,7 +73,6 @@ struct amdgpu_mn {
/* constant after initialisation */
	struct amdgpu_device	*adev;
	struct mm_struct	*mm;
-   struct mmu_notifier mn;
enum amdgpu_mn_type type;
 
/* only used on destruction */
@@ -85,8 +84,9 @@ struct amdgpu_mn {
/* objects protected by lock */
struct rw_semaphore lock;
struct rb_root_cached   objects;
-   struct mutex		read_lock;
-   atomic_t		recursion;
+
+   /* HMM mirror */
+   struct hmm_mirror   mirror;
 };
 
 /**
@@ -103,7 +103,7 @@ struct amdgpu_mn_node {
 };
 
 /**
- * amdgpu_mn_destroy - destroy the MMU notifier
+ * amdgpu_mn_destroy - destroy the HMM mirror
  *
  * @work: previously sheduled work item
  *
@@ -129,28 +129,26 @@ static void amdgpu_mn_destroy(struct work_struct *work)
}
	up_write(&amn->lock);
	mutex_unlock(&adev->mn_lock);
-   mmu_notifier_unregister_no_release(&amn->mn, amn->mm);
+
+   hmm_mirror_unregister(&amn->mirror);
kfree(amn);
 }
 
 /**
- * amdgpu_mn_release - callback to notify about mm destruction
+ * amdgpu_hmm_mirror_release - callback to notify about mm destruction
  *
- * @mn: our notifier
- * @mm: the mm this callback is about
+ * @mirror: the HMM mirror (mm) this callback is about
  *
- * Shedule a work item to lazy destroy our notifier.
+ * Shedule a work item to lazy destroy HMM mirror.
  */
-static void amdgpu_mn_release(struct mmu_notifier *mn,
- struct mm_struct *mm)
+static void amdgpu_hmm_mirror_release(struct hmm_mirror *mirror)
 {
-   struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn, mn);
+   struct amdgpu_mn *amn = container_of(mirror, struct amdgpu_mn, mirror);
 
	INIT_WORK(&amn->work, amdgpu_mn_destroy);
	schedule_work(&amn->work);
 }
 
-
 /**
  * amdgpu_mn_lock - take the write side lock for this notifier
  *
@@ -181,14 +179,10 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
 static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
 {
if (blockable)
-   mutex_lock(&amn->read_lock);
-   else if (!mutex_trylock(&amn->read_lock))
+   down_read(&amn->lock);
+   else if (!down_read_trylock(&amn->lock))
return -EAGAIN;
 
-   if 

[PATCH 0/3] Use HMM to replace get_user_pages

2019-02-04 Thread Yang, Philip
Hi Christian,

This patch is rebased to lastest HMM. Please review the GEM and CS part changes
in patch 3/3.

Thanks,

Philip Yang (3):
  drm/amdgpu: use HMM mirror callback to replace mmu notifier v7
  drm/amdkfd: avoid HMM change cause circular lock dependency v2
  drm/amdgpu: replace get_user_pages with HMM address mirror helpers v6

 drivers/gpu/drm/amd/amdgpu/Kconfig|   6 +-
 drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   1 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  95 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 158 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c| 179 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 173 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   3 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  32 ++--
 12 files changed, 285 insertions(+), 386 deletions(-)

-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 3/3] drm/amdgpu: replace get_user_pages with HMM address mirror helpers v6

2019-02-04 Thread Yang, Philip
Use HMM helper function hmm_vma_fault() to get physical pages backing
userptr and start CPU page table update track of those pages. Then use
hmm_vma_range_done() to check if those pages are updated before
amdgpu_cs_submit for gfx or before user queues are resumed for kfd.

If userptr pages are updated, for gfx, amdgpu_cs_ioctl will restart
from scratch, for kfd, restore worker is rescheduled to retry.

HMM simplifies the concurrent CPU page table update check, so remove the
guptasklock, mmu_invalidations and last_set_pages fields from the
amdgpu_ttm_tt struct.

HMM does not pin the page (increase page ref count), so remove related
operations like release_pages(), put_page(), mark_page_dirty().
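
A hedged sketch of that fault/validate cycle against the ~4.20 HMM API
(the function and retry policy are placeholders; it assumes the caller set
up range->vma and range->pfns, and error handling is trimmed):

static int my_submit_with_userptr(struct hmm_range *range,
				  unsigned long start, unsigned long npages)
{
	int ret;

retry:
	range->start = start;
	range->end = start + (npages << PAGE_SHIFT);

	ret = hmm_vma_fault(range, true);  /* get pages, start CPU PT tracking */
	if (ret)
		return ret;

	/* ... build the submission from range->pfns ... */

	if (!hmm_vma_range_done(range)) {
		/* CPU page tables changed while we were setting up: gfx
		 * restarts amdgpu_cs_ioctl from scratch, KFD reschedules
		 * its restore worker.  Here we simply retry.
		 */
		goto retry;
	}

	return 0;
}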

Change-Id: I2e8c0c6f0d8c21e5596a32d7fc91762778bc9e67
Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   1 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  95 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 158 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c|  25 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 173 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   3 +-
 9 files changed, 198 insertions(+), 277 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 0b31a1859023..0e1711a75b68 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -61,7 +61,6 @@ struct kgd_mem {
 
atomic_t invalid;
struct amdkfd_process_info *process_info;
-   struct page **user_pages;
 
struct amdgpu_sync sync;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index d7b10d79f1de..ae2d838d31ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -582,28 +582,12 @@ static int init_user_pages(struct kgd_mem *mem, struct 
mm_struct *mm,
goto out;
}
 
-   /* If no restore worker is running concurrently, user_pages
-* should not be allocated
-*/
-   WARN(mem->user_pages, "Leaking user_pages array");
-
-   mem->user_pages = kvmalloc_array(bo->tbo.ttm->num_pages,
-  sizeof(struct page *),
-  GFP_KERNEL | __GFP_ZERO);
-   if (!mem->user_pages) {
-   pr_err("%s: Failed to allocate pages array\n", __func__);
-   ret = -ENOMEM;
-   goto unregister_out;
-   }
-
-   ret = amdgpu_ttm_tt_get_user_pages(bo->tbo.ttm, mem->user_pages);
+   ret = amdgpu_ttm_tt_get_user_pages(bo->tbo.ttm, bo->tbo.ttm->pages);
if (ret) {
pr_err("%s: Failed to get user pages: %d\n", __func__, ret);
-   goto free_out;
+   goto unregister_out;
}
 
-   amdgpu_ttm_tt_set_user_pages(bo->tbo.ttm, mem->user_pages);
-
ret = amdgpu_bo_reserve(bo, true);
if (ret) {
pr_err("%s: Failed to reserve BO\n", __func__);
@@ -616,11 +600,7 @@ static int init_user_pages(struct kgd_mem *mem, struct 
mm_struct *mm,
amdgpu_bo_unreserve(bo);
 
 release_out:
-   if (ret)
-   release_pages(mem->user_pages, bo->tbo.ttm->num_pages);
-free_out:
-   kvfree(mem->user_pages);
-   mem->user_pages = NULL;
+   amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
 unregister_out:
if (ret)
amdgpu_mn_unregister(bo);
@@ -679,7 +659,6 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
ctx->kfd_bo.priority = 0;
	ctx->kfd_bo.tv.bo = &bo->tbo;
ctx->kfd_bo.tv.num_shared = 1;
-   ctx->kfd_bo.user_pages = NULL;
	list_add(&ctx->kfd_bo.tv.head, &ctx->list);
 
	amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);
@@ -743,7 +722,6 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
ctx->kfd_bo.priority = 0;
	ctx->kfd_bo.tv.bo = &bo->tbo;
ctx->kfd_bo.tv.num_shared = 1;
-   ctx->kfd_bo.user_pages = NULL;
	list_add(&ctx->kfd_bo.tv.head, &ctx->list);
 
i = 0;
@@ -1371,15 +1349,6 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
	list_del(&bo_list_entry->head);
	mutex_unlock(&process_info->lock);
 
-   /* Free user pages if necessary */
-   if (mem->user_pages) {
-   pr_debug("%s: Freeing user_pages array\n", __func__);
-   if (mem->user_pages[0])
-   release_pages(mem->user_pages,
-   mem->bo->tbo.ttm->num_pages);
-   kvfree(mem->user_pages);
-   }
-
	ret = reserve_bo_and_cond_vms(mem, NULL, BO_VM_ALL, &ctx);
if (unlikely(ret))
return ret;
@@ -1855,25 +1824,11 @@ static int update_invalid_user_pages(struct 

[PATCH 2/3] drm/amdkfd: avoid HMM change cause circular lock dependency v2

2019-02-04 Thread Yang, Philip
There is circular lock between gfx and kfd path with HMM change:
lock(dqm) -> bo::reserve -> amdgpu_mn_lock

To avoid this, move init/unint_mqd() out of lock(dqm), to remove nested
locking between mmap_sem and bo::reserve. The locking order
is: bo::reserve -> amdgpu_mn_lock(p->mn)
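
In other words, the reordering the diff below performs (schematic only):

	/* before (deadlock-prone):        after:
	 *
	 *   dqm_lock(dqm);                  init_mqd();     <- takes bo::reserve
	 *   init_mqd();  <- bo::reserve     dqm_lock(dqm);
	 *   ...                             list_add(...);
	 *   dqm_unlock(dqm);                dqm_unlock(dqm);
	 */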

Change-Id: I2ec09a47571f6b4c8eaef93f22c0a600f5f70153
Signed-off-by: Philip Yang 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 32 ++-
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 8372556b52eb..efe0d3c0215b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1156,21 +1156,17 @@ static int create_queue_cpsch(struct 
device_queue_manager *dqm, struct queue *q,
int retval;
struct mqd_manager *mqd_mgr;
 
-   retval = 0;
-
-   dqm_lock(dqm);
-
if (dqm->total_queue_count >= max_num_of_queues_per_device) {
pr_warn("Can't create new usermode queue because %d queues were 
already created\n",
dqm->total_queue_count);
retval = -EPERM;
-   goto out_unlock;
+   goto out;
}
 
if (q->properties.type == KFD_QUEUE_TYPE_SDMA) {
	retval = allocate_sdma_queue(dqm, &q->sdma_id);
if (retval)
-   goto out_unlock;
+   goto out;
q->properties.sdma_queue_id =
q->sdma_id / get_num_sdma_engines(dqm);
q->properties.sdma_engine_id =
@@ -1181,6 +1177,9 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
if (retval)
goto out_deallocate_sdma_queue;
 
+   /* Do init_mqd before dqm_lock(dqm) to avoid circular locking order:
+* lock(dqm) -> bo::reserve
+*/
mqd_mgr = dqm->ops.get_mqd_manager(dqm,
get_mqd_type_from_queue_type(q->properties.type));
 
@@ -1188,6 +1187,7 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
retval = -ENOMEM;
goto out_deallocate_doorbell;
}
+
/*
 * Eviction state logic: we only mark active queues as evicted
 * to avoid the overhead of restoring inactive queues later
@@ -1196,9 +1196,7 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
q->properties.is_evicted = (q->properties.queue_size > 0 &&
q->properties.queue_percent > 0 &&
q->properties.queue_address != 0);
-
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
-
q->properties.tba_addr = qpd->tba_addr;
q->properties.tma_addr = qpd->tma_addr;
	retval = mqd_mgr->init_mqd(mqd_mgr, &q->mqd, &q->mqd_mem_obj,
@@ -1206,6 +1204,8 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
if (retval)
goto out_deallocate_doorbell;
 
+   dqm_lock(dqm);
+
	list_add(&q->list, &qpd->queues_list);
qpd->queue_count++;
if (q->properties.is_active) {
@@ -1233,9 +1233,7 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
 out_deallocate_sdma_queue:
if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
deallocate_sdma_queue(dqm, q->sdma_id);
-out_unlock:
-   dqm_unlock(dqm);
-
+out:
return retval;
 }
 
@@ -1398,8 +1396,6 @@ static int destroy_queue_cpsch(struct 
device_queue_manager *dqm,
qpd->reset_wavefronts = true;
}
 
-   mqd_mgr->uninit_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
-
/*
 * Unconditionally decrement this counter, regardless of the queue's
 * type
@@ -1410,6 +1406,9 @@ static int destroy_queue_cpsch(struct 
device_queue_manager *dqm,
 
dqm_unlock(dqm);
 
+   /* Do uninit_mqd after dqm_unlock(dqm) to avoid circular locking */
+   mqd_mgr->uninit_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
+
return retval;
 
 failed:
@@ -1631,7 +1630,11 @@ static int process_termination_cpsch(struct 
device_queue_manager *dqm,
qpd->reset_wavefronts = false;
}
 
-   /* lastly, free mqd resources */
+   dqm_unlock(dqm);
+
+   /* Lastly, free mqd resources.
+* Do uninit_mqd() after dqm_unlock to avoid circular locking.
+*/
	list_for_each_entry_safe(q, next, &qpd->queues_list, list) {
mqd_mgr = dqm->ops.get_mqd_manager(dqm,
get_mqd_type_from_queue_type(q->properties.type));
@@ -1645,7 +1648,6 @@ static int process_termination_cpsch(struct 
device_queue_manager *dqm,
}
 
 out:
-   dqm_unlock(dqm);
return retval;
 }
 
-- 
2.17.1


Re: [PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu notifier v6

2019-02-04 Thread Christian König

On 04.02.19 at 18:17, Yang, Philip wrote:


On 2019-02-04 10:18 a.m., Christian König wrote:

On 04.02.19 at 16:06, Yang, Philip wrote:

Replace our MMU notifier with hmm_mirror_ops.sync_cpu_device_pagetables
callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a dependency in
DRM_AMDGPU_USERPTR Kconfig.

It supports both KFD userptr and gfx userptr paths.

The dependent HMM patches from Jérôme Glisse are all merged into the 4.20.0
kernel now.

Change-Id: Ie62c3c5e3c5b8521ab3b438d1eff2aa2a003835e
Signed-off-by: Philip Yang 
---
   drivers/gpu/drm/amd/amdgpu/Kconfig |   6 +-
   drivers/gpu/drm/amd/amdgpu/Makefile    |   2 +-
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 139 +++--
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h |   2 +-
   4 files changed, 67 insertions(+), 82 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig
b/drivers/gpu/drm/amd/amdgpu/Kconfig
index 9221e5489069..960a63355705 100644
--- a/drivers/gpu/drm/amd/amdgpu/Kconfig
+++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
@@ -26,10 +26,10 @@ config DRM_AMDGPU_CIK
   config DRM_AMDGPU_USERPTR
   bool "Always enable userptr write support"
   depends on DRM_AMDGPU
-    select MMU_NOTIFIER
+    select HMM_MIRROR
   help
-  This option selects CONFIG_MMU_NOTIFIER if it isn't already
-  selected to enabled full userptr support.
+  This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it
+  isn't already selected to enabled full userptr support.
   config DRM_AMDGPU_GART_DEBUGFS
   bool "Allow GART access through debugfs"
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 466da5954a68..851001ced5e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -172,7 +172,7 @@ endif
   amdgpu-$(CONFIG_COMPAT) += amdgpu_ioc32.o
   amdgpu-$(CONFIG_VGA_SWITCHEROO) += amdgpu_atpx_handler.o
   amdgpu-$(CONFIG_ACPI) += amdgpu_acpi.o
-amdgpu-$(CONFIG_MMU_NOTIFIER) += amdgpu_mn.o
+amdgpu-$(CONFIG_HMM_MIRROR) += amdgpu_mn.o
   include $(FULL_AMD_PATH)/powerplay/Makefile
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 3e6823fdd939..5d518d2bb9be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -45,7 +45,7 @@
   #include 
   #include 
-#include <linux/mmu_notifier.h>
+#include <linux/hmm.h>
   #include 
   #include 
   #include 
@@ -58,7 +58,6 @@
    *
    * @adev: amdgpu device pointer
    * @mm: process address space
- * @mn: MMU notifier structure
    * @type: type of MMU notifier
    * @work: destruction work item
    * @node: hash table node to find structure by adev and mn
@@ -66,6 +65,7 @@
    * @objects: interval tree containing amdgpu_mn_nodes
    * @read_lock: mutex for recursive locking of @lock
    * @recursion: depth of recursion
+ * @mirror: HMM mirror function support
    *
    * Data for each amdgpu device and process address space.
    */
@@ -73,7 +73,6 @@ struct amdgpu_mn {
   /* constant after initialisation */
   struct amdgpu_device    *adev;
   struct mm_struct    *mm;
-    struct mmu_notifier    mn;
   enum amdgpu_mn_type    type;
   /* only used on destruction */
@@ -87,6 +86,9 @@ struct amdgpu_mn {
   struct rb_root_cached    objects;
   struct mutex    read_lock;
   atomic_t    recursion;

With HMM we don't need this any more. Please remove it and simplify
amdgpu_mn_read_lock() and amdgpu_mn_read_unlock().


Thanks, this makes sense because HMM uses hmm->mirror_sem to serialize
invalidate ranges.

amn->read_lock is also not needed anymore; it was used to protect the
atomic operations atomic_inc_return(amn->recursion) and
down_read(amn->lock).

Just one amn->lock is needed to sync amdgpu_cs_submit and userptr
update. I will submit another patch.


Sounds good. I will take a closer look at the other patches tomorrow.

Thanks,
Christian.



Philip


Apart from that looks good to me,
Christian.


+
+    /* HMM mirror */
+    struct hmm_mirror    mirror;
   };
   /**
@@ -103,7 +105,7 @@ struct amdgpu_mn_node {
   };
   /**
- * amdgpu_mn_destroy - destroy the MMU notifier
+ * amdgpu_mn_destroy - destroy the HMM mirror
    *
    * @work: previously sheduled work item
    *
@@ -129,28 +131,26 @@ static void amdgpu_mn_destroy(struct work_struct
*work)
   }
    up_write(&amn->lock);
    mutex_unlock(&adev->mn_lock);
-    mmu_notifier_unregister_no_release(&amn->mn, amn->mm);
+
+    hmm_mirror_unregister(&amn->mirror);
   kfree(amn);
   }
   /**
- * amdgpu_mn_release - callback to notify about mm destruction
+ * amdgpu_hmm_mirror_release - callback to notify about mm destruction
    *
- * @mn: our notifier
- * @mm: the mm this callback is about
+ * @mirror: the HMM mirror (mm) this callback is about
    *
- * Shedule a work item to lazy destroy our notifier.
+ * Shedule a work item to lazy destroy HMM mirror.
    */
-static void amdgpu_mn_release(struct mmu_notifier *mn,
-  struct mm_struct *mm)
+static 

Re: [PATCH] drm/amd/powerplay: add override pcie parameters for Vega20 (v2)

2019-02-04 Thread Kasiviswanathan, Harish
Hi Alex,

I already have that patch in my code. The function 
amdgpu_device_get_pcie_info() updates platform caps 
(CAIL_PCIE_LINK_SPEED_SUPPORT_GEN*) based on pcie_get_speed_cap() of the bus 
AMD GPU is connected. I think going up just one level in the PCI hierarchy is
not sufficient to get the system caps; it should check all the way up to the
root, like the function pcie_bandwidth_available(). See
https://elixir.bootlin.com/linux/latest/source/drivers/pci/pci.c#L5503

For e.g. In my machine the hierarchy looks as shown below. The current code 
only looks at the top 2 and decides incorrectly as Gen 4.

amdgpu :07:00.0: lnksta:1103 lnkcap2:180001e lnkcap:400d04   -- 
PCIE_SPEED_16_0GT - Gen 4
pcieport :06:00.0: lnksta:7103 lnkcap2:180001e lnkcap:700d04   -- 
PCIE_SPEED_16_0GT - Gen 4
pcieport :05:00.0: lnksta:1103 lnkcap2:180001e lnkcap:10473904 -- 
PCIE_SPEED_8_0GT - Gen 3
pcieport :04:10.0: lnksta:103 lnkcap2:10e lnkcap:10416103 -- 
PCIE_SPEED_8_0GT - Gen 3
pcieport :03:00.0: lnksta:103 lnkcap2:10e lnkcap:416103 -- PCIE_SPEED_8_0GT 
- Gen 3
pcieport :00:03.0: lnksta:7103 lnkcap2:e lnkcap:77a3103 -- PCIE_SPEED_8_0GT 
- Gen 3
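
Something along these lines would capture every link on the path (untested
sketch; the helper name is made up, only pcie_get_speed_cap() and
pci_upstream_bridge() are real kernel APIs):

/* Walk every upstream bridge, the way pcie_bandwidth_available() does,
 * and keep the slowest link-speed capability found on the path. */
static enum pci_bus_speed amdgpu_path_speed_cap(struct pci_dev *pdev)
{
	enum pci_bus_speed min_speed = PCI_SPEED_UNKNOWN;

	while (pdev) {
		enum pci_bus_speed speed = pcie_get_speed_cap(pdev);

		if (speed != PCI_SPEED_UNKNOWN &&
		    (min_speed == PCI_SPEED_UNKNOWN || speed < min_speed))
			min_speed = speed;

		pdev = pci_upstream_bridge(pdev);
	}
	return min_speed;
}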


Best Regards,
Harish





From: Alex Deucher 
Sent: Friday, February 1, 2019 5:53 PM
To: Kasiviswanathan, Harish
Cc: Li, Colin; Xi, Sheldon; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/powerplay: add override pcie parameters for Vega20 
(v2)

Right. We have the max gen and width supported by the card and the platform 
already in the driver thanks to amdgpu_device_get_pcie_info(), so we just need 
to use the platform gen caps to limit the max PCIe gen in DPM1. You don't want 
to use the current status because that might have been changed by software at 
some point. The ASIC caps are defined by CAIL_ASIC_PCIE_LINK_SPEED_SUPPORT_GEN* 
while the platform caps are defined by CAIL_PCIE_LINK_SPEED_SUPPORT_GEN*. I 
think the original code was correct. What you may have been running into was a 
bug in pcie_get_speed_cap() that recently got fixed:


commit f1f90e254e46e0a14220e4090041f68256fbe297
Author: Mikulas Patocka 
Date:   Mon Nov 26 10:37:13 2018 -0600

    PCI: Fix incorrect value returned from pcie_get_speed_cap()
    
    The macros PCI_EXP_LNKCAP_SLS_*GB are values, not bit masks.  We must mask
    the register and compare it against them.
    
    This fixes errors like this:
    
  amdgpu: [powerplay] failed to send message 261 ret is 0
    
    when a PCIe-v3 card is plugged into a PCIe-v1 slot, because the slot is
    being incorrectly reported as PCIe-v3 capable.
    
    6cf57be0f78e, which appeared in v4.17, added pcie_get_speed_cap() with the
    incorrect test of PCI_EXP_LNKCAP_SLS as a bitmask.  5d9a63304032, which
    appeared in v4.19, changed amdgpu to use pcie_get_speed_cap(), so the
    amdgpu bug reports below are regressions in v4.19.
    
    Fixes: 6cf57be0f78e ("PCI: Add pcie_get_speed_cap() to find max supported 
link speed")
    Fixes: 5d9a63304032 ("drm/amdgpu: use pcie functions for link width and 
speed")
    Link: https://bugs.freedesktop.org/show_bug.cgi?id=108704
    Link: https://bugs.freedesktop.org/show_bug.cgi?id=108778
    Signed-off-by: Mikulas Patocka 
    [bhelgaas: update comment, remove use of PCI_EXP_LNKCAP_SLS_8_0GB and
    PCI_EXP_LNKCAP_SLS_16_0GB since those should be covered by PCI_EXP_LNKCAP2,
    remove test of PCI_EXP_LNKCAP for zero, since that register is required]
    Signed-off-by: Bjorn Helgaas 
    Acked-by: Alex Deucher 
    Cc: sta...@vger.kernel.org  # v4.17+
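
The gist of that fix, for reference (illustrative fragment, not the actual
patch hunk):

	/* wrong: PCI_EXP_LNKCAP_SLS_8_0GB is a value, not a bit mask */
	if (lnkcap & PCI_EXP_LNKCAP_SLS_8_0GB)
		return PCIE_SPEED_8_0GT;

	/* right: mask the Supported Link Speeds field, then compare */
	if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_8_0GB)
		return PCIE_SPEED_8_0GT;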




On Fri, Feb 1, 2019 at 5:43 PM Kasiviswanathan, Harish 
 wrote:
 

Let me explain the problem. I also adding Colin & Sheldon from the SMU team.

Currently, PPTable entries (inside the VBIOS) reflect ASIC capabilities, not 
the system capabilities. So if we put a Gen4-capable Vega20 in a system that 
supports only Gen3, the PPTable has Gen4 for DPM1 (the higher DPM state). 
Later on, when the SMU wants to switch to DPM1, it fails since Gen4 is not 
supported. The SMU, however, doesn't fall back to Gen3 but instead to Gen0. 
This is causing a performance issue.

As I understand it, this override is required only for the higher DPM state.



Best Regards,
Harish



From: Alex Deucher 
Sent: Friday, February 1, 2019 5:34 PM
To: Kasiviswanathan, Harish
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/powerplay: add override pcie parameters for Vega20 
(v2)

Do we actually need the current status? We have the caps of the platform and 
the device. For the lowest DPM state we generally want the lowest common gen, 
and the highest common gen for the highest DPM state. If we use the current 
status, we will most likely end up with the highest gen in the lowest DPM 
state, because the highest gen is generally negotiated at power-up.
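
In pseudocode, the idea is simply this (the names are illustrative, not the
actual powerplay structures):

	/* highest DPM state: fastest gen both the ASIC and the platform support */
	highest_dpm.pcie_gen = min(asic_gen_cap, platform_gen_cap);
	/* lowest DPM state: the lowest common gen, i.e. Gen1 */
	lowest_dpm.pcie_gen = PCIE_GEN1;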


Alex



On Fri, Feb 1, 2019 at 5:28 PM Kasiviswanathan, Harish 
  wrote:
 

Thanks, Alex. I saw that function. It gets the caps. The function requires link 
status 

Re: [Intel-gfx] [PATCH 2/6] drm/drv: Prepare to remove drm_dev_unplug()

2019-02-04 Thread Noralf Trønnes


Den 04.02.2019 16.41, skrev Daniel Vetter:
> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
>> The only thing now that makes drm_dev_unplug() special is that it sets
>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
>> can remove drm_dev_unplug().
>>
>> Signed-off-by: Noralf Trønnes 
>> ---

[...]

>>  drivers/gpu/drm/drm_drv.c | 27 +++
>>  include/drm/drm_drv.h | 10 --
>>  2 files changed, 19 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> index 05bbc2b622fc..e0941200edc6 100644
>> --- a/drivers/gpu/drm/drm_drv.c
>> +++ b/drivers/gpu/drm/drm_drv.c
>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>>   */
>>  void drm_dev_unplug(struct drm_device *dev)
>>  {
>> -/*
>> - * After synchronizing any critical read section is guaranteed to see
>> - * the new value of ->unplugged, and any critical section which might
>> - * still have seen the old value of ->unplugged is guaranteed to have
>> - * finished.
>> - */
>> -dev->unplugged = true;
>> -synchronize_srcu(&drm_unplug_srcu);
>> -
>>  drm_dev_unregister(dev);
>>  drm_dev_put(dev);
>>  }
>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>>   * drm_dev_register() but does not deallocate the device. The caller must 
>> call
>>   * drm_dev_put() to drop their final reference.
>>   *
>> - * A special form of unregistering for hotpluggable devices is 
>> drm_dev_unplug(),
>> - * which can be called while there are still open users of @dev.
>> + * This function can be called while there are still open users of @dev as 
>> long
>> + * as the driver protects its device resources using drm_dev_enter() and
>> + * drm_dev_exit().
>>   *
>>   * This should be called first in the device teardown code to make sure
>> - * userspace can't access the device instance any more.
>> + * userspace can't access the device instance any more. Drivers that support
>> + * device unplug will probably want to call drm_atomic_helper_shutdown() 
>> first
> 
> Read once more with a bit more coffee, spotted this:
> 
> s/first/afterwards/ - shutting down the hw before we've taken it away from
> userspace is kinda the wrong way round. It should be the inverse of driver
> load, which is 1) allocate structures 2) prep hw 3) register driver with
> the world (simplified ofc).
> 

The problem is that drm_dev_unregister() sets the device as unplugged
and if drm_atomic_helper_shutdown() is called afterwards it's not
allowed to touch hardware.

I know it's the wrong order, but the only way to do it in the right
order is to have a separate function that sets unplugged:

drm_dev_unregister();
drm_atomic_helper_shutdown();
drm_dev_set_unplugged();
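
where drm_dev_set_unplugged() would just carry the code this patch moves
around, roughly (untested sketch, reusing the existing drm_unplug_srcu):

static void drm_dev_set_unplugged(struct drm_device *dev)
{
	dev->unplugged = true;
	synchronize_srcu(&drm_unplug_srcu);
}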

Noralf.

>> + * in order to disable the hardware on regular driver module unload.
>>   */
>>  void drm_dev_unregister(struct drm_device *dev)
>>  {
>> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
>>  if (drm_core_check_feature(dev, DRIVER_LEGACY))
>>  drm_lastclose(dev);
>>  
>> +/*
>> + * After synchronizing any critical read section is guaranteed to see
>> + * the new value of ->unplugged, and any critical section which might
>> + * still have seen the old value of ->unplugged is guaranteed to have
>> + * finished.
>> + */
>> +dev->unplugged = true;
>> +synchronize_srcu(&drm_unplug_srcu);
>> +
>>  dev->registered = false;
>>  
>>  drm_client_dev_unregister(dev);
>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
>> index ca46a45a9cce..c50696c82a42 100644
>> --- a/include/drm/drm_drv.h
>> +++ b/include/drm/drm_drv.h
>> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
>>   * drm_dev_is_unplugged - is a DRM device unplugged
>>   * @dev: DRM device
>>   *
>> - * This function can be called to check whether a hotpluggable is unplugged.
>> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
>> - * unplugged, these two functions guarantee that any store before calling
>> - * drm_dev_unplug() is visible to callers of this function after it 
>> completes
>> + * This function can be called to check whether @dev is unregistered. This 
>> can
>> + * be used to detect that the underlying parent device is gone.
> 
> I think it'd be good to keep the first part, and just update the reference
> to drm_dev_unregister. So:
> 
>  * This function can be called to check whether a hotpluggable device is
>  * unplugged. Unplugging itself is signalled through drm_dev_unregister().
>  * If a device is
>  * unplugged, these two functions guarantee that any store before calling
>  * drm_dev_unregister() is visible to callers of this function after it
>  * completes.
> 
> I think your version shrugs a few important details under the rug. With
> those nits addressed:
> 
> Reviewed-by: Daniel Vetter 
> 
> Cheers, Daniel
> 
>>   *
>> - * WARNING: This function 

RE: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells mask on SOC15

2019-02-04 Thread Zeng, Oak
>then we should probably use 1024 for mmCP_MEC_DOORBELL_RANGE_UPPER.
This is also what I thought.

Thanks,
Oak

From: Zhao, Yong
Sent: Monday, February 4, 2019 12:28 PM
To: Zeng, Oak ; Kuehling, Felix ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells 
mask on SOC15

The decision we need to make here is which value we should program 
mmCP_MEC_DOORBELL_RANGE_UPPER. It is not very consistent right now throughout 
our code. If we let the CP use all the doorbells left over by SDMA, IH and VCN, 
then we should probably use 1024 for mmCP_MEC_DOORBELL_RANGE_UPPER. If we 
continue to use adev->doorbell_index.userqueue_end * 2, as is currently in the 
code, we should limit the CP doorbells in the range of 0 to 
adev->doorbell_index.userqueue_end * 2 when allocating CP user queue doorbells.

It seems to me that either way would work.

Regards,
Yong

From: Zeng, Oak
Sent: Monday, February 4, 2019 12:15 PM
To: Kuehling, Felix; Zhao, Yong; 
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells 
mask on SOC15

Correct. MEC_DOORBELL_RANGE should cover both user and kernel CP 
(KIQ/HIQ/DIQ/amdgpu kernel MEC rings) queues. USERQUEUE_START/END should be 
renamed to USER_CP_QUEUE_START/END.

Thanks,
Oak

-Original Message-
From: Kuehling, Felix
Sent: Monday, February 4, 2019 11:35 AM
To: Zeng, Oak <oak.z...@amd.com>; Zhao, Yong <yong.z...@amd.com>; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells 
mask on SOC15

I don't see anything about userqueue start/end in Yong's patch. That said, I 
think you're mixing up two different things. MEC_DOORBELL_RANGE is not the same 
as userqueues? We have MEC kernel queues as well that would have to fall in the 
same range. And we have SDMA user queues that don't belong in the MEC doorbell 
range. Therefore I think the name USERQUEUE_START/END is misleading.

Regards,
   Felix

On 2019-02-02 10:01 a.m., Zeng, Oak wrote:
>
> Hi Yong,
>
> mmCP_MEC_DOORBELL_RANGE_LOWER/UPPER are used by the CP to double-check that
> a doorbell ring was routed to the CP. Programming them is a must. We have to
> keep userqueue_start/end. Sorry for the confusion.
>
> Regards,
>
> Oak
>
> *From:* amd-gfx <amd-gfx-boun...@lists.freedesktop.org> *On Behalf Of
> *Zhao, Yong
> *Sent:* Friday, February 1, 2019 7:04 PM
> *To:* Kuehling, Felix <felix.kuehl...@amd.com>;
> amd-gfx@lists.freedesktop.org; Zeng, Oak <oak.z...@amd.com>
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user
> queue doorbells mask on SOC15
>
> Oak,
>
> It is a good idea to remove the userqueue_start/end. However, when
> doing so, I came to the following code in gfx_v9_0_kiq_init_register() :
>
> /* enable the doorbell if requested */
>
> if (ring->use_doorbell) {
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_LOWER,
>
> (adev->doorbell_index.kiq * 2) << 2);
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_UPPER,
>
> (adev->doorbell_index.userqueue_end * 2) << 2);
>
> }
>
> I remember that you reached the conclusion in one email thread that
> mmCP_MEC_DOORBELL_RANGE_UPPER/LOWER did not have effect at all on
> gfx9. Have we got any firm conclusions from HW engineer in the end? If
> those two registers are still valid, we should enforce the range for
> CP queues.
>
> Regards,
>
> Yong
>
> --
> --
>
> *From:*Kuehling, Felix
> *Sent:* Friday, February 1, 2019 3:03 PM
> *To:* Zhao, Yong; 
> amd-gfx@lists.freedesktop.org
> 
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user
> queue doorbells mask on SOC15
>
> On 2019-01-31 5:27 p.m., Zhao, Yong wrote:
> > Reserved doorbells for SDMA IH and VCN were not properly masked out
> > when allocating doorbells for CP user queues. This patch fixed that.
> >
> > Change-Id: I670adfc3fd7725d2ed0bd9665cb7f69f8b9023c2
> > Signed-off-by: Yong Zhao <yong.z...@amd.com>
>
> One mostly cosmetic comment inline. With that fixed, the series is
> Reviewed-by: Felix Kuehling <felix.kuehl...@amd.com>
>
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 17 ++---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h  |  4 +++
> >   drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  9 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 25
> >+--
> >   .../gpu/drm/amd/include/kgd_kfd_interface.h   | 19 

Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells mask on SOC15

2019-02-04 Thread Zhao, Yong
The decision we need to make here is which value we should program 
mmCP_MEC_DOORBELL_RANGE_UPPER. It is not very consistent right now throughout 
our code. If we let the CP use all the doorbells left over by SDMA, IH and VCN, 
then we should probably use 1024 for mmCP_MEC_DOORBELL_RANGE_UPPER. If we 
continue to use adev->doorbell_index.userqueue_end * 2, as is currently in the 
code, we should limit the CP doorbells in the range of 0 to 
adev->doorbell_index.userqueue_end * 2 when allocating CP user queue doorbells.

It seems to me that either way would work.
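
For reference, option (a) would look roughly like this in
gfx_v9_0_kiq_init_register() (sketch only; the 1024 is the hypothetical total
number of 64-bit doorbell slots given to the CP, not a value from the current
code):

	WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_LOWER,
		     (adev->doorbell_index.kiq * 2) << 2);
	/* cover every doorbell left over by SDMA, IH and VCN */
	WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_UPPER,
		     (1024 * 2) << 2);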

Regards,
Yong

From: Zeng, Oak
Sent: Monday, February 4, 2019 12:15 PM
To: Kuehling, Felix; Zhao, Yong; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells 
mask on SOC15

Correct. MEC_DOORBELL_RANGE should cover both user and kernel CP 
(KIQ/HIQ/DIQ/amdgpu kernel MEC rings) queues. USERQUEUE_START/END should be 
renamed to USER_CP_QUEUE_START/END.

Thanks,
Oak

-Original Message-
From: Kuehling, Felix
Sent: Monday, February 4, 2019 11:35 AM
To: Zeng, Oak ; Zhao, Yong ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells 
mask on SOC15

I don't see anything about userqueue start/end in Yong's patch. That said, I 
think you're mixing up two different things. MEC_DOORBELL_RANGE is not the same 
as userqueues? We have MEC kernel queues as well that would have to fall in the 
same range. And we have SDMA user queues that don't belong in the MEC doorbell 
range. Therefore I think the name USERQUEUE_START/END is misleading.

Regards,
   Felix

On 2019-02-02 10:01 a.m., Zeng, Oak wrote:
>
> Hi Yong,
>
> mmCP_MEC_DOORBELL_RANGE_LOWER/UPPER are used by the CP to double-check that
> a doorbell ring was routed to the CP. Programming them is a must. We have to
> keep userqueue_start/end. Sorry for the confusion.
>
> Regards,
>
> Oak
>
> *From:* amd-gfx  *On Behalf Of
> *Zhao, Yong
> *Sent:* Friday, February 1, 2019 7:04 PM
> *To:* Kuehling, Felix ;
> amd-gfx@lists.freedesktop.org; Zeng, Oak 
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user
> queue doorbells mask on SOC15
>
> Oak,
>
> It is a good idea to remove the userqueue_start/end. However, when
> doing so, I came to the following code in gfx_v9_0_kiq_init_register() :
>
> /* enable the doorbell if requested */
>
> if (ring->use_doorbell) {
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_LOWER,
>
> (adev->doorbell_index.kiq * 2) << 2);
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_UPPER,
>
> (adev->doorbell_index.userqueue_end * 2) << 2);
>
> }
>
> I remember that you reached the conclusion in one email thread that
> mmCP_MEC_DOORBELL_RANGE_UPPER/LOWER did not have effect at all on
> gfx9. Have we got any firm conclusions from HW engineer in the end? If
> those two registers are still valid, we should enforce the range for
> CP queues.
>
> Regards,
>
> Yong
>
> --
> --
>
> *From:*Kuehling, Felix
> *Sent:* Friday, February 1, 2019 3:03 PM
> *To:* Zhao, Yong; amd-gfx@lists.freedesktop.org
> 
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user
> queue doorbells mask on SOC15
>
> On 2019-01-31 5:27 p.m., Zhao, Yong wrote:
> > Reserved doorbells for SDMA IH and VCN were not properly masked out
> > when allocating doorbells for CP user queues. This patch fixed that.
> >
> > Change-Id: I670adfc3fd7725d2ed0bd9665cb7f69f8b9023c2
> > Signed-off-by: Yong Zhao  > >
>
> One mostly cosmetic comment inline. With that fixed, the series is
> Reviewed-by: Felix Kuehling  >
>
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 17 ++---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h  |  4 +++
> >   drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  9 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 25
> >+--
> >   .../gpu/drm/amd/include/kgd_kfd_interface.h   | 19 +-
> >   7 files changed, 56 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index e957e42c539a..13710f34191a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -196,11 +196,20 @@ void amdgpu_amdkfd_device_init(struct
> amdgpu_device *adev)
> > gpu_resources.sdma_doorbell[1][i+1] =
> > adev->doorbell_index.sdma_engine[1] + 0x200 + (i >> 1);
> >}
> > - /* Doorbells 0x0e0-0ff and 0x2e0-2ff are reserved for
> > -  * SDMA, IH and VCN. So don't use them for the CP.
> > +
> > + /* Because of the 

Re: [PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu notifier v6

2019-02-04 Thread Yang, Philip


On 2019-02-04 10:18 a.m., Christian König wrote:
> On 04.02.19 at 16:06, Yang, Philip wrote:
>> Replace our MMU notifier with hmm_mirror_ops.sync_cpu_device_pagetables
>> callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a dependency in
>> DRM_AMDGPU_USERPTR Kconfig.
>>
>> It supports both KFD userptr and gfx userptr paths.
>>
>> The dependent HMM patches from Jérôme Glisse are all merged into the
>> 4.20 kernel now.
>>
>> Change-Id: Ie62c3c5e3c5b8521ab3b438d1eff2aa2a003835e
>> Signed-off-by: Philip Yang 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/Kconfig |   6 +-
>>   drivers/gpu/drm/amd/amdgpu/Makefile    |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 139 +++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h |   2 +-
>>   4 files changed, 67 insertions(+), 82 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
>> b/drivers/gpu/drm/amd/amdgpu/Kconfig
>> index 9221e5489069..960a63355705 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/Kconfig
>> +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
>> @@ -26,10 +26,10 @@ config DRM_AMDGPU_CIK
>>   config DRM_AMDGPU_USERPTR
>>   bool "Always enable userptr write support"
>>   depends on DRM_AMDGPU
>> -    select MMU_NOTIFIER
>> +    select HMM_MIRROR
>>   help
>> -  This option selects CONFIG_MMU_NOTIFIER if it isn't already
>> -  selected to enabled full userptr support.
>> +  This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it
>> +  isn't already selected to enable full userptr support.
>>   config DRM_AMDGPU_GART_DEBUGFS
>>   bool "Allow GART access through debugfs"
>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
>> b/drivers/gpu/drm/amd/amdgpu/Makefile
>> index 466da5954a68..851001ced5e8 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>> @@ -172,7 +172,7 @@ endif
>>   amdgpu-$(CONFIG_COMPAT) += amdgpu_ioc32.o
>>   amdgpu-$(CONFIG_VGA_SWITCHEROO) += amdgpu_atpx_handler.o
>>   amdgpu-$(CONFIG_ACPI) += amdgpu_acpi.o
>> -amdgpu-$(CONFIG_MMU_NOTIFIER) += amdgpu_mn.o
>> +amdgpu-$(CONFIG_HMM_MIRROR) += amdgpu_mn.o
>>   include $(FULL_AMD_PATH)/powerplay/Makefile
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> index 3e6823fdd939..5d518d2bb9be 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> @@ -45,7 +45,7 @@
>>   #include 
>>   #include 
>> -#include <linux/mmu_notifier.h>
>> +#include <linux/hmm.h>
>>   #include 
>>   #include 
>>   #include 
>> @@ -58,7 +58,6 @@
>>    *
>>    * @adev: amdgpu device pointer
>>    * @mm: process address space
>> - * @mn: MMU notifier structure
>>    * @type: type of MMU notifier
>>    * @work: destruction work item
>>    * @node: hash table node to find structure by adev and mn
>> @@ -66,6 +65,7 @@
>>    * @objects: interval tree containing amdgpu_mn_nodes
>>    * @read_lock: mutex for recursive locking of @lock
>>    * @recursion: depth of recursion
>> + * @mirror: HMM mirror function support
>>    *
>>    * Data for each amdgpu device and process address space.
>>    */
>> @@ -73,7 +73,6 @@ struct amdgpu_mn {
>>   /* constant after initialisation */
>>   struct amdgpu_device    *adev;
>>   struct mm_struct    *mm;
>> -    struct mmu_notifier    mn;
>>   enum amdgpu_mn_type    type;
>>   /* only used on destruction */
>> @@ -87,6 +86,9 @@ struct amdgpu_mn {
>>   struct rb_root_cached    objects;
>>   struct mutex    read_lock;
> 
>>   atomic_t    recursion;
> 
> With HMM we don't need this any more. Please remove it and simplify 
> amdgpu_mn_read_lock() and amdgpu_mn_read_unlock().
> 
Thanks, this makes sense because HMM uses hmm->mirror_sem to serialize 
invalidate ranges.
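
The conversion boils down to registering a struct hmm_mirror with two
callbacks instead of a struct mmu_notifier, roughly (sketch; the sync
callback name is illustrative, amdgpu_hmm_mirror_release is from the patch):

	static struct hmm_mirror_ops amdgpu_hmm_mirror_ops = {
		.sync_cpu_device_pagetables = amdgpu_mn_sync_pagetables,
		.release = amdgpu_hmm_mirror_release,
	};

	amn->mirror.ops = &amdgpu_hmm_mirror_ops;
	r = hmm_mirror_register(&amn->mirror, mm);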

amn->read_lock is also not needed anymore; it was used to protect the 
atomic operations atomic_inc_return(&amn->recursion) and 
down_read(&amn->lock).

Just one amn->lock is needed to sync amdgpu_cs_submit and userptr 
update. I will submit another patch.
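
After dropping the recursion counter and read_lock, amdgpu_mn_read_lock()
could shrink to something like this (untested sketch):

static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
{
	if (blockable)
		down_read(&amn->lock);
	else if (!down_read_trylock(&amn->lock))
		return -EAGAIN;

	return 0;
}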

Philip

> Apart from that looks good to me,
> Christian.
> 
>> +
>> +    /* HMM mirror */
>> +    struct hmm_mirror    mirror;
>>   };
>>   /**
>> @@ -103,7 +105,7 @@ struct amdgpu_mn_node {
>>   };
>>   /**
>> - * amdgpu_mn_destroy - destroy the MMU notifier
>> + * amdgpu_mn_destroy - destroy the HMM mirror
>>    *
>>    * @work: previously scheduled work item
>>    *
>> @@ -129,28 +131,26 @@ static void amdgpu_mn_destroy(struct work_struct 
>> *work)
>>   }
>>   up_write(&amn->lock);
>>   mutex_unlock(&adev->mn_lock);
>> -    mmu_notifier_unregister_no_release(&amn->mn, amn->mm);
>> +
>> +    hmm_mirror_unregister(&amn->mirror);
>>   kfree(amn);
>>   }
>>   /**
>> - * amdgpu_mn_release - callback to notify about mm destruction
>> + * amdgpu_hmm_mirror_release - callback to notify about mm destruction
>>    *
>> - * @mn: our notifier
>> - * @mm: the mm this callback is about
>> + * @mirror: the HMM mirror (mm) this callback is about
>>    *
>> - * Shedule a work item to lazy 

RE: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells mask on SOC15

2019-02-04 Thread Zeng, Oak
Correct. MEC_DOORBELL_RANGE should cover both user and kernel CP 
(KIQ/HIQ/DIQ/amdgpu kernel MEC rings) queues. USERQUEUE_START/END should be 
renamed to USER_CP_QUEUE_START/END.

Thanks,
Oak

-Original Message-
From: Kuehling, Felix 
Sent: Monday, February 4, 2019 11:35 AM
To: Zeng, Oak ; Zhao, Yong ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells 
mask on SOC15

I don't see anything about userqueue start/end in Yong's patch. That said, I 
think you're mixing up two different things. MEC_DOORBELL_RANGE is not the same 
as userqueues? We have MEC kernel queues as well that would have to fall in the 
same range. And we have SDMA user queues that don't belong in the MEC doorbell 
range. Therefore I think the name USERQUEUE_START/END is misleading.

Regards,
   Felix

On 2019-02-02 10:01 a.m., Zeng, Oak wrote:
>
> Hi Yong,
>
> mmCP_MEC_DOORBELL_RANGE_LOWER/UPPER are used by the CP to double-check that
> a doorbell ring was routed to the CP. Programming them is a must. We have to
> keep userqueue_start/end. Sorry for the confusion.
>
> Regards,
>
> Oak
>
> *From:* amd-gfx  *On Behalf Of 
> *Zhao, Yong
> *Sent:* Friday, February 1, 2019 7:04 PM
> *To:* Kuehling, Felix ; 
> amd-gfx@lists.freedesktop.org; Zeng, Oak 
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user 
> queue doorbells mask on SOC15
>
> Oak,
>
> It is a good idea to remove the userqueue_start/end. However, when 
> doing so, I came to the following code in gfx_v9_0_kiq_init_register() :
>
>         /* enable the doorbell if requested */
>
>         if (ring->use_doorbell) {
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_LOWER,
>
>                     (adev->doorbell_index.kiq * 2) << 2);
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_UPPER,
>
> (adev->doorbell_index.userqueue_end * 2) << 2);
>
>         }
>
> I remember that you reached the conclusion in one email thread that 
> mmCP_MEC_DOORBELL_RANGE_UPPER/LOWER did not have effect at all on 
> gfx9. Have we got any firm conclusions from HW engineer in the end? If 
> those two registers are still valid, we should enforce the range for 
> CP queues.
>
> Regards,
>
> Yong
>
> --
> --
>
> *From:*Kuehling, Felix
> *Sent:* Friday, February 1, 2019 3:03 PM
> *To:* Zhao, Yong; amd-gfx@lists.freedesktop.org 
> 
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user 
> queue doorbells mask on SOC15
>
> On 2019-01-31 5:27 p.m., Zhao, Yong wrote:
> > Reserved doorbells for SDMA IH and VCN were not properly masked out 
> > when allocating doorbells for CP user queues. This patch fixed that.
> >
> > Change-Id: I670adfc3fd7725d2ed0bd9665cb7f69f8b9023c2
> > Signed-off-by: Yong Zhao  > >
>
> One mostly cosmetic comment inline. With that fixed, the series is
> Reviewed-by: Felix Kuehling  >
>
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 17 ++---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h  |  4 +++
> >   drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  9 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 25 
> >+--
> >   .../gpu/drm/amd/include/kgd_kfd_interface.h   | 19 +-
> >   7 files changed, 56 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index e957e42c539a..13710f34191a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -196,11 +196,20 @@ void amdgpu_amdkfd_device_init(struct
> amdgpu_device *adev)
> > gpu_resources.sdma_doorbell[1][i+1] =
> > adev->doorbell_index.sdma_engine[1] + 0x200 + (i >> 1);
> >    }
> > - /* Doorbells 0x0e0-0ff and 0x2e0-2ff are reserved for
> > -  * SDMA, IH and VCN. So don't use them for the CP.
> > +
> > + /* Because of the setting in registers like
> > +  * SDMA0_DOORBELL_RANGE etc., BIF statically uses the
> > +  * lower 12 bits of doorbell address for routing, in
> > +  * order to route the CP queue doorbells to CP engine,
> > +  * the doorbells allocated to CP queues have to be
> > +  * outside the range set for SDMA, VCN, and IH blocks.
> > +  * Prior to SOC15, all queues use queue ID to
> > +  * determine doorbells.
> > */
> > - gpu_resources.reserved_doorbell_mask = 0x1e0;
> > - gpu_resources.reserved_doorbell_val  = 0x0e0;
> > + gpu_resources.reserved_doorbells_start =
> > + adev->doorbell_index.sdma_engine[0];
> > + gpu_resources.reserved_doorbells_end =
> > + 

Re: [PATCH] drm/amdgpu: add a workaround for GDS ordered append hangs with compute queues

2019-02-04 Thread Alex Deucher
On Mon, Feb 4, 2019 at 11:09 AM Koenig, Christian
 wrote:
>
> On 04.02.19 at 16:55, Marek Olšák wrote:
>
> On Mon, Feb 4, 2019 at 7:42 AM Christian König 
>  wrote:
>>
>> At least from coding style, backward compatibility etc.. this looks sane to 
>> me, so feel free to add an Acked-by.
>>
>> But I absolutely can't judge if that is correct from the hardware point of 
>> view or not.
>
>
> Our GDS docs say that writing the register resets the wave counter.
>
>>
>>
>> And I think that somebody else looking at this is mandatory for it to be 
>> committed.
>
>
> There won't be anybody else. Nobody here really understands GDS, nobody here 
> really cares about GDS, and hw guys weren't helpful with the hangs. If I 
> didn't discover this by luck, GDS OA would be unusable on Linux.
>
>
> Great, as long as Alex is ok with this feel free to go ahead.

Acked-by: Alex Deucher 

>
> Christian.
>
>
> Marek
>
>


Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user queue doorbells mask on SOC15

2019-02-04 Thread Kuehling, Felix
I don't see anything about userqueue start/end in Yong's patch. That 
said, I think you're mixing up two different things. MEC_DOORBELL_RANGE 
is not the same as userqueues? We have MEC kernel queues as well that 
would have to fall in the same range. And we have SDMA user queues that 
don't belong in the MEC doorbell range. Therefore I think the name 
USERQUEUE_START/END is misleading.

Regards,
   Felix

On 2019-02-02 10:01 a.m., Zeng, Oak wrote:
>
> Hi Yong,
>
> mmCP_MEC_DOORBELL_RANGE_LOWER/UPPER are used by the CP to double-check that
> a doorbell ring was routed to the CP. Programming them is a must. We have to
> keep userqueue_start/end. Sorry for the confusion.
>
> Regards,
>
> Oak
>
> *From:* amd-gfx  *On Behalf Of 
> *Zhao, Yong
> *Sent:* Friday, February 1, 2019 7:04 PM
> *To:* Kuehling, Felix ; 
> amd-gfx@lists.freedesktop.org; Zeng, Oak 
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user 
> queue doorbells mask on SOC15
>
> Oak,
>
> It is a good idea to remove the userqueue_start/end. However, when 
> doing so, I came to the following code in gfx_v9_0_kiq_init_register() :
>
>         /* enable the doorbell if requested */
>
>         if (ring->use_doorbell) {
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_LOWER,
>
>                     (adev->doorbell_index.kiq * 2) << 2);
>
> WREG32_SOC15(GC, 0, mmCP_MEC_DOORBELL_RANGE_UPPER,
>
> (adev->doorbell_index.userqueue_end * 2) << 2);
>
>         }
>
> I remember that you reached the conclusion in one email thread that 
> mmCP_MEC_DOORBELL_RANGE_UPPER/LOWER did not have effect at all on 
> gfx9. Have we got any firm conclusions from HW engineer in the end? If 
> those two registers are still valid, we should enforce the range for 
> CP queues.
>
> Regards,
>
> Yong
>
> 
>
> *From:*Kuehling, Felix
> *Sent:* Friday, February 1, 2019 3:03 PM
> *To:* Zhao, Yong; amd-gfx@lists.freedesktop.org 
> 
> *Subject:* Re: [PATCH 2/3] drm/amdkfd: Fix bugs regarding CP user 
> queue doorbells mask on SOC15
>
> On 2019-01-31 5:27 p.m., Zhao, Yong wrote:
> > Reserved doorbells for SDMA IH and VCN were not properly masked out
> > when allocating doorbells for CP user queues. This patch fixed that.
> >
> > Change-Id: I670adfc3fd7725d2ed0bd9665cb7f69f8b9023c2
> > Signed-off-by: Yong Zhao <yong.z...@amd.com>
>
> One mostly cosmetic comment inline. With that fixed, the series is
> Reviewed-by: Felix Kuehling  >
>
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    | 17 ++---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h  |  4 +++
> >   drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c  |  3 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  9 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 25 +--
> >   .../gpu/drm/amd/include/kgd_kfd_interface.h   | 19 +-
> >   7 files changed, 56 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index e957e42c539a..13710f34191a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -196,11 +196,20 @@ void amdgpu_amdkfd_device_init(struct 
> amdgpu_device *adev)
> > gpu_resources.sdma_doorbell[1][i+1] =
> > adev->doorbell_index.sdma_engine[1] + 0x200 + (i >> 1);
> >    }
> > - /* Doorbells 0x0e0-0ff and 0x2e0-2ff are reserved for
> > -  * SDMA, IH and VCN. So don't use them for the CP.
> > +
> > + /* Because of the setting in registers like
> > +  * SDMA0_DOORBELL_RANGE etc., BIF statically uses the
> > +  * lower 12 bits of doorbell address for routing, in
> > +  * order to route the CP queue doorbells to CP engine,
> > +  * the doorbells allocated to CP queues have to be
> > +  * outside the range set for SDMA, VCN, and IH blocks.
> > +  * Prior to SOC15, all queues use queue ID to
> > +  * determine doorbells.
> > */
> > - gpu_resources.reserved_doorbell_mask = 0x1e0;
> > - gpu_resources.reserved_doorbell_val  = 0x0e0;
> > + gpu_resources.reserved_doorbells_start =
> > + adev->doorbell_index.sdma_engine[0];
> > + gpu_resources.reserved_doorbells_end =
> > + adev->doorbell_index.last_non_cp;
> >
> >    kgd2kfd_device_init(adev->kfd.dev, &gpu_resources);
> >    }
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
> > index 1cfec06f81d4..4de431f7f380 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
> > @@ -71,6 +71,7 @@ struct amdgpu_doorbell_index {
> >  

Re: [PATCH 01/35] drm/amd/display: Use udelay when waiting between aux retries

2019-02-04 Thread sylvain . bertrand
On Mon, Feb 04, 2019 at 03:43:36PM +, Wentland, Harry wrote:
> DRM actually bumped this to 32 due to an issue with a Dell 4k display.

As I feared there is a retry counter higher in the code. My bad.

> It depends. I wouldn't call one or the other more correct. I seem to remember
> that the DP spec is quite vague on these retries but I could be mistaken.
> Since our driver hasn't shown any problems with the DRM code and I believe
> others (such as i915) also pass DP compliance without issues I wouldn't
> proactively change this.

I reacted to this matter since my DP monitor (iiyama) had trouble getting out
of the DP off state, with logs which seemed to point towards a failure to get
the EDID. It happens rarely, and I could not figure out how to reproduce it
reliably. It could be gone for good with the latest fixes.

I guess fixes for popular but defective hardware (for instance that Dell 4k)
have to go in the code from a compliance standpoint, unfortunately.

regards,

-- 
Sylvain


Re: [PATCH] drm/amdgpu: add a workaround for GDS ordered append hangs with compute queues

2019-02-04 Thread Koenig, Christian
On 04.02.19 at 16:55, Marek Olšák wrote:
On Mon, Feb 4, 2019 at 7:42 AM Christian König 
<ckoenig.leichtzumer...@gmail.com> 
wrote:
At least from coding style, backward compatibility etc.. this looks sane to me, 
so feel free to add an Acked-by.

But I absolutely can't judge if that is correct from the hardware point of view 
or not.

Our GDS docs say that writing the register resets the wave counter.


And I think that somebody else looking at this is mandatory for it to be 
committed.

There won't be anybody else. Nobody here really understands GDS, nobody here 
really cares about GDS, and hw guys weren't helpful with the hangs. If I didn't 
discover this by luck, GDS OA would be unusable on Linux.

Great, as long as Alex is ok with this feel free to go ahead.

Christian.


Marek



Re: [PATCH] drm/amdgpu: add a workaround for GDS ordered append hangs with compute queues

2019-02-04 Thread Marek Olšák
FYI, when I push this, I'll bump the DRM version.

Marek

On Mon, Feb 4, 2019 at 10:55 AM Marek Olšák  wrote:

> On Mon, Feb 4, 2019 at 7:42 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> At least from coding style, backward compatibility etc.. this looks sane
>> to me, so feel free to add an Acked-by.
>>
>> But I absolutely can't judge if that is correct from the hardware point
>> of view or not.
>>
>
> Our GDS docs say that writing the register resets the wave counter.
>
>
>>
>> And I think that somebody else looking at this is mandatory for it to be
>> committed.
>>
>
> There won't be anybody else. Nobody here really understands GDS, nobody
> here really cares about GDS, and hw guys weren't helpful with the hangs. If
> I didn't discover this by luck, GDS OA would be unusable on Linux.
>
> Marek
>


RE: [PATCH] drm/amd/display: Use context parameters to enable FBC

2019-02-04 Thread Li, Roman
Reviewed-by: Roman Li 

-Original Message-
From: amd-gfx  On Behalf Of S, Shirish
Sent: Monday, February 4, 2019 4:20 AM
To: Li, Sun peng (Leo) ; Wu, Hersen ; 
Wentland, Harry 
Cc: amd-gfx@lists.freedesktop.org; S, Shirish 
Subject: [PATCH] drm/amd/display: Use context parameters to enable FBC

[What]
FBC fails to get enabled when switching between LINEAR (console/VT) and 
non-LINEAR (GUI) based rendering, due to the default value of the tiling info 
stored in current_state, which is used to decide whether or not to turn FBC 
on or off.

[How]
Use the context structure's tiling information, which is coherent with the 
screen updates.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index db0ef41..fd7cd5b 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -2535,7 +2535,7 @@ static void dce110_apply_ctx_for_surface(
}
 
if (dc->fbc_compressor)
-   enable_fbc(dc, dc->current_state);
+   enable_fbc(dc, context);
 }
 
 static void dce110_power_down_fe(struct dc *dc, struct pipe_ctx *pipe_ctx)
--
2.7.4



Re: [PATCH] drm/amdgpu: add a workaround for GDS ordered append hangs with compute queues

2019-02-04 Thread Marek Olšák
On Mon, Feb 4, 2019 at 7:42 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> At least from coding style, backward compatibility etc.. this looks sane
> to me, so feel free to add an Acked-by.
>
> But I absolutely can't judge if that is correct from the hardware point of
> view or not.
>

Our GDS docs say that writing the register resets the wave counter.


>
> And I think that somebody else looking at this is mandatory for it to be
> committed.
>

There won't be anybody else. Nobody here really understands GDS, nobody
here really cares about GDS, and hw guys weren't helpful with the hangs. If
I didn't discover this by luck, GDS OA would be unusable on Linux.

Marek


Re: [PATCH] drm/amd/display: Use context parameters to enable FBC

2019-02-04 Thread Wentland, Harry
On 2019-02-04 4:20 a.m., S, Shirish wrote:
> [What]
> FBC fails to get enabled when switching between LINEAR (console/VT)
> and non-LINEAR (GUI) based rendering, due to the default value of
> the tiling info stored in current_state, which is used to decide
> whether or not to turn FBC on or off.
> 
> [How]
> Use the context structure's tiling information, which is coherent with
> the screen updates.
> 
> Signed-off-by: Shirish S 
> ---
>  drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
> b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> index db0ef41..fd7cd5b 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> @@ -2535,7 +2535,7 @@ static void dce110_apply_ctx_for_surface(
>   }
>  
>   if (dc->fbc_compressor)
> - enable_fbc(dc, dc->current_state);
> + enable_fbc(dc, context);

Looks good, but I'd want Roman's RB.

Acked-by: Harry Wentland 

Harry

>  }
>  
>  static void dce110_power_down_fe(struct dc *dc, struct pipe_ctx *pipe_ctx)
> 


Fw: [PATCH 15/35] drm/amd/display: pass vline_config parameter by reference.

2019-02-04 Thread Lakha, Bhawanpreet




From: amd-gfx  on behalf of Bhawanpreet 
Lakha 
Sent: February 4, 2019 10:45 AM
To: amd-gfx@lists.freedesktop.org
Cc: Sun, Yongqiang
Subject: [PATCH 15/35] drm/amd/display: pass vline_config parameter by 
reference.

From: Yongqiang Sun 

Change-Id: Icfe018c7579ad2b3ef65195f578b8e44422d53f3
Signed-off-by: Yongqiang Sun 
Reviewed-by: Aric Cyr 
Acked-by: Bhawanpreet Lakha 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 4 ++--
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c| 8 
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h| 2 +-
 drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h | 2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index e0ac009f00ab..d8579b207300 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1560,11 +1560,11 @@ static void commit_planes_do_stream_update(struct dc 
*dc,

 if (stream_update->vline0_config && 
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt)
 
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt(
-   pipe_ctx->stream_res.tg, VLINE0, 
stream->vline0_config);
+   pipe_ctx->stream_res.tg, VLINE0, 
&stream->vline0_config);

 if (stream_update->vline1_config && 
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt)
 
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt(
-   pipe_ctx->stream_res.tg, VLINE1, 
stream->vline1_config);
+   pipe_ctx->stream_res.tg, VLINE1, 
&stream->vline1_config);

 if ((stream_update->hdr_static_metadata && 
!stream->use_dynamic_meta) ||
 stream_update->vrr_infopacket ||
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
index 1d4f9b48ed7d..cefa322df8a6 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
@@ -95,19 +95,19 @@ static void optc1_disable_stereo(struct timing_generator 
*optc)
 void optc1_program_vline_interrupt(
 struct timing_generator *optc,
 enum vline_select vline,
-   struct vline_config vline_config)
+   const struct vline_config *vline_config)
 {
 struct optc *optc1 = DCN10TG_FROM_TG(optc);

 switch (vline) {
 case VLINE0:
 REG_SET_2(OTG_VERTICAL_INTERRUPT0_POSITION, 0,
-   OTG_VERTICAL_INTERRUPT0_LINE_START, 
vline_config.start_line,
-   OTG_VERTICAL_INTERRUPT0_LINE_END, 
vline_config.end_line);
+   OTG_VERTICAL_INTERRUPT0_LINE_START, 
vline_config->start_line,
+   OTG_VERTICAL_INTERRUPT0_LINE_END, 
vline_config->end_line);
 break;
 case VLINE1:
 REG_SET(OTG_VERTICAL_INTERRUPT1_POSITION, 0,
-   OTG_VERTICAL_INTERRUPT1_LINE_START, 
vline_config.start_line);
+   OTG_VERTICAL_INTERRUPT1_LINE_START, 
vline_config->start_line);
 break;
 default:
 break;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
index 8eb71c0160a7..b34c8a240598 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
@@ -485,7 +485,7 @@ void optc1_program_timing(

 void optc1_program_vline_interrupt(struct timing_generator *optc,
 enum vline_select vline,
-   struct vline_config vline_config);
+   const struct vline_config *vline_config);

 void optc1_program_global_sync(
 struct timing_generator *optc);
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
index df64cf73ceb9..d22a406c19c0 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
@@ -151,7 +151,7 @@ struct timing_generator_funcs {
 bool use_vbios);
 void (*program_vline_interrupt)(struct timing_generator *optc,
 enum vline_select vline,
-   struct vline_config vline_config);
+   const struct vline_config *vline_config);
 bool (*enable_crtc)(struct timing_generator *tg);
 bool (*disable_crtc)(struct timing_generator *tg);
 bool (*is_counter_moving)(struct timing_generator 

[PATCH 15/35] drm/amd/display: pass vline_config parameter by reference.

2019-02-04 Thread Bhawanpreet Lakha
From: Yongqiang Sun 

Change-Id: Icfe018c7579ad2b3ef65195f578b8e44422d53f3
Signed-off-by: Yongqiang Sun 
Reviewed-by: Aric Cyr 
Acked-by: Bhawanpreet Lakha 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 4 ++--
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c| 8 
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h| 2 +-
 drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h | 2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index e0ac009f00ab..d8579b207300 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1560,11 +1560,11 @@ static void commit_planes_do_stream_update(struct dc 
*dc,
 
if (stream_update->vline0_config && 
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt)

pipe_ctx->stream_res.tg->funcs->program_vline_interrupt(
-   pipe_ctx->stream_res.tg, VLINE0, 
stream->vline0_config);
+   pipe_ctx->stream_res.tg, VLINE0, 
&stream->vline0_config);
 
if (stream_update->vline1_config && 
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt)

pipe_ctx->stream_res.tg->funcs->program_vline_interrupt(
-   pipe_ctx->stream_res.tg, VLINE1, 
stream->vline1_config);
+   pipe_ctx->stream_res.tg, VLINE1, 
&stream->vline1_config);
 
if ((stream_update->hdr_static_metadata && 
!stream->use_dynamic_meta) ||
stream_update->vrr_infopacket ||
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
index 1d4f9b48ed7d..cefa322df8a6 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
@@ -95,19 +95,19 @@ static void optc1_disable_stereo(struct timing_generator 
*optc)
 void optc1_program_vline_interrupt(
struct timing_generator *optc,
enum vline_select vline,
-   struct vline_config vline_config)
+   const struct vline_config *vline_config)
 {
struct optc *optc1 = DCN10TG_FROM_TG(optc);
 
switch (vline) {
case VLINE0:
REG_SET_2(OTG_VERTICAL_INTERRUPT0_POSITION, 0,
-   OTG_VERTICAL_INTERRUPT0_LINE_START, 
vline_config.start_line,
-   OTG_VERTICAL_INTERRUPT0_LINE_END, 
vline_config.end_line);
+   OTG_VERTICAL_INTERRUPT0_LINE_START, 
vline_config->start_line,
+   OTG_VERTICAL_INTERRUPT0_LINE_END, 
vline_config->end_line);
break;
case VLINE1:
REG_SET(OTG_VERTICAL_INTERRUPT1_POSITION, 0,
-   OTG_VERTICAL_INTERRUPT1_LINE_START, 
vline_config.start_line);
+   OTG_VERTICAL_INTERRUPT1_LINE_START, 
vline_config->start_line);
break;
default:
break;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
index 8eb71c0160a7..b34c8a240598 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
@@ -485,7 +485,7 @@ void optc1_program_timing(
 
 void optc1_program_vline_interrupt(struct timing_generator *optc,
enum vline_select vline,
-   struct vline_config vline_config);
+   const struct vline_config *vline_config);
 
 void optc1_program_global_sync(
struct timing_generator *optc);
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
index df64cf73ceb9..d22a406c19c0 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
@@ -151,7 +151,7 @@ struct timing_generator_funcs {
bool use_vbios);
void (*program_vline_interrupt)(struct timing_generator *optc,
enum vline_select vline,
-   struct vline_config vline_config);
+   const struct vline_config *vline_config);
bool (*enable_crtc)(struct timing_generator *tg);
bool (*disable_crtc)(struct timing_generator *tg);
bool (*is_counter_moving)(struct timing_generator *tg);
-- 
2.17.1



Re: [PATCH v3] drm/amdgpu: Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES

2019-02-04 Thread Grodzovsky, Andrey
Thanks, I will update, add your RB and push.

Andrey

On 2/3/19 10:51 AM, Koenig, Christian wrote:

@@ -1090,6 +1091,14 @@ static int amdgpu_cs_process_fence_dep(struct 
amdgpu_cs_parser *p,

fence = amdgpu_ctx_get_fence(ctx, entity,
 deps[i].handle);
+
+   if (chunk->chunk_id == AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES) {
+   struct drm_sched_fence *s_fence = 
to_drm_sched_fence(fence);
+
+   dma_fence_put(fence);
+   fence = dma_fence_get(&s_fence->scheduled);


You need to change the order here or otherwise the dma_fence_put could
destroy the fence we want.
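
i.e. take the reference to the scheduled fence before dropping the original
one, something like this (sketch):

	if (chunk->chunk_id == AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES) {
		struct drm_sched_fence *s_fence = to_drm_sched_fence(fence);
		struct dma_fence *old = fence;

		fence = dma_fence_get(&s_fence->scheduled);
		dma_fence_put(old);
	}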

Apart from that looks really good to me,
Christian.




Re: [PATCH 01/35] drm/amd/display: Use udelay when waiting between aux retries

2019-02-04 Thread Wentland, Harry
On 2019-02-01 5:55 p.m., sylvain.bertr...@gmail.com wrote:
> On Fri, Feb 01, 2019 at 09:20:56PM +, Wentland, Harry wrote:
>> DRM's AUX code uses usleep_range in drm_dp_dpcd_access.
> 
> My bad, forgot about the usleep_range switch. That said AUX_RETRY_INTERVAL is
> 500 us, with a usleep_range top bound of 600 us.
> 
> Then, it would mean DC DP timeout retries would happen after ~1ms, and DRM
> after ~600us. Additionally, it seems the number of retries is 3 in the DRM
> code and 7 in the DC code.
> 

DRM actually bumped this to 32 due to an issue with a Dell 4k display.
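
For reference, the retry loop in drm_dp_dpcd_access() is shaped roughly like
this (paraphrased from memory, not a verbatim copy):

	for (retry = 0; retry < 32; retry++) {
		ret = aux->transfer(aux, &msg);
		if (ret >= 0 && msg.reply == DP_AUX_NATIVE_REPLY_ACK)
			return ret;

		usleep_range(AUX_RETRY_INTERVAL, AUX_RETRY_INTERVAL + 100);
	}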

> I may be wrong, but it seems the DC code is much more "insisting" on the aux
> channel (EDID retrieval) and much more forgiving of the monitor's ability to
> respond within the timeout (~1ms).
> 
> If I did read the code the right way, it may be more reasonable to have
> similar behavior in the DRM code and the DC code, right?
> 

It depends. I wouldn't call one or the other more correct. I seem to remember 
that the DP spec is quite vague on these retries, but I could be mistaken. 
Since our driver hasn't shown any problems with the DRM code, and I believe 
others (such as i915) also pass DP compliance without issues, I wouldn't 
proactively change this.

Harry


Re: [Intel-gfx] [PATCH 2/6] drm/drv: Prepare to remove drm_dev_unplug()

2019-02-04 Thread Daniel Vetter
On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> The only thing now that makes drm_dev_unplug() special is that it sets
> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> can remove drm_dev_unplug().
> 
> Signed-off-by: Noralf Trønnes 
> ---
> 
> Maybe s/unplugged/unregistered/ ?
> 
> I looked at drm_device->registered, but using that would mean that
> drm_dev_is_unplugged() would return true before drm_device is registered.
> And given that its current purpose is to prevent race against connector
> registration, I stayed away from it.

Yeah I think we need to keep the registered state separate from unplugged.
Iirc this exact scenario is what we discussed when you revamped the
unplug infrastructure.

> 
> Noralf.
> 
> 
>  drivers/gpu/drm/drm_drv.c | 27 +++
>  include/drm/drm_drv.h | 10 --
>  2 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 05bbc2b622fc..e0941200edc6 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>   */
>  void drm_dev_unplug(struct drm_device *dev)
>  {
> - /*
> -  * After synchronizing any critical read section is guaranteed to see
> -  * the new value of ->unplugged, and any critical section which might
> -  * still have seen the old value of ->unplugged is guaranteed to have
> -  * finished.
> -  */
> - dev->unplugged = true;
> - synchronize_srcu(&drm_unplug_srcu);
> -
>   drm_dev_unregister(dev);
>   drm_dev_put(dev);
>  }
> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>   * drm_dev_register() but does not deallocate the device. The caller must 
> call
>   * drm_dev_put() to drop their final reference.
>   *
> - * A special form of unregistering for hotpluggable devices is 
> drm_dev_unplug(),
> - * which can be called while there are still open users of @dev.
> + * This function can be called while there are still open users of @dev as 
> long
> + * as the driver protects its device resources using drm_dev_enter() and
> + * drm_dev_exit().
>   *
>   * This should be called first in the device teardown code to make sure
> - * userspace can't access the device instance any more.
> + * userspace can't access the device instance any more. Drivers that support
> + * device unplug will probably want to call drm_atomic_helper_shutdown() 
> first

Read once more with a bit more coffee, spotted this:

s/first/afterwards/ - shutting down the hw before we've taken it away from
userspace is kinda the wrong way round. It should be the inverse of driver
load, which is 1) allocate structures 2) prep hw 3) register driver with
the world (simplified ofc).

> + * in order to disable the hardware on regular driver module unload.
>   */
>  void drm_dev_unregister(struct drm_device *dev)
>  {
> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
>   if (drm_core_check_feature(dev, DRIVER_LEGACY))
>   drm_lastclose(dev);
>  
> + /*
> +  * After synchronizing any critical read section is guaranteed to see
> +  * the new value of ->unplugged, and any critical section which might
> +  * still have seen the old value of ->unplugged is guaranteed to have
> +  * finished.
> +  */
> + dev->unplugged = true;
> + synchronize_srcu(&drm_unplug_srcu);
> +
>   dev->registered = false;
>  
>   drm_client_dev_unregister(dev);
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index ca46a45a9cce..c50696c82a42 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
>   * drm_dev_is_unplugged - is a DRM device unplugged
>   * @dev: DRM device
>   *
> - * This function can be called to check whether a hotpluggable is unplugged.
> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
> - * unplugged, these two functions guarantee that any store before calling
> - * drm_dev_unplug() is visible to callers of this function after it completes
> + * This function can be called to check whether @dev is unregistered. This 
> can
> + * be used to detect that the underlying parent device is gone.

I think it'd be good to keep the first part, and just update the reference
to drm_dev_unregister. So:

 * This function can be called to check whether a hotpluggable device is
 * unplugged. Unplugging itself is signalled through drm_dev_unregister().
 * If a device is
 * unplugged, these two functions guarantee that any store before calling
 * drm_dev_unregister() is visible to callers of this function after it
 * completes.

I think your version sweeps a few important details under the rug. With
those nits addressed:

Reviewed-by: Daniel Vetter 

Cheers, Daniel

>   *
> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
> - * recommended that drivers instead use the 

Re: [PATCH] drm/amdgu/vce_v3: start vce block before ring test

2019-02-04 Thread Liu, Leo

On 2/4/19 7:49 AM, Koenig, Christian wrote:
> On 04.02.19 at 13:44, S, Shirish wrote:
>> vce ring test fails during resume since mmVCE_RB_RPTR*
>> is not initialized/updated.
>>
>> Hence start vce block before ring test.
> Mhm, I wonder why this ever worked. But yeah, same problem seems to
> exist for VCE 2 as well.
>
> Leo any comment on this?

The UVD and VCE start functions were in hw_init originally, from bring-up,
on all the HW. Later the DPM developers moved them to
set_powergating_state() for some reason.

@Shirish, are you sure the vce_v3_0_start() is not there?

Simply adding it back to hw_init might break the DPM logic, so
please make sure.


Thanks,

Leo


>
> Thanks,
> Christian.
>
>> Signed-off-by: Shirish S 
>> ---
>> * vce_v4_0.c's hw_init sequence already has this change.
>>
>>drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 
>>1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>> index 6ec65cf1..d809c10 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>> @@ -469,6 +469,10 @@ static int vce_v3_0_hw_init(void *handle)
>>  int r, i;
>>  struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>
>> +r = vce_v3_0_start(adev);
>> +if (r)
>> +return r;
>> +
>>  vce_v3_0_override_vce_clock_gating(adev, true);
>>
>>  amdgpu_asic_set_vce_clocks(adev, 1, 1);


Re: [PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu notifier v6

2019-02-04 Thread Christian König

On 04.02.19 at 16:06, Yang, Philip wrote:

Replace our MMU notifier with hmm_mirror_ops.sync_cpu_device_pagetables
callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a dependency in
DRM_AMDGPU_USERPTR Kconfig.

It supports both KFD userptr and gfx userptr paths.

The dependent HMM patchset from Jérôme Glisse is now fully merged into
the 4.20.0 kernel.

Change-Id: Ie62c3c5e3c5b8521ab3b438d1eff2aa2a003835e
Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/Kconfig |   6 +-
  drivers/gpu/drm/amd/amdgpu/Makefile|   2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 139 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h |   2 +-
  4 files changed, 67 insertions(+), 82 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
b/drivers/gpu/drm/amd/amdgpu/Kconfig
index 9221e5489069..960a63355705 100644
--- a/drivers/gpu/drm/amd/amdgpu/Kconfig
+++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
@@ -26,10 +26,10 @@ config DRM_AMDGPU_CIK
  config DRM_AMDGPU_USERPTR
bool "Always enable userptr write support"
depends on DRM_AMDGPU
-   select MMU_NOTIFIER
+   select HMM_MIRROR
help
- This option selects CONFIG_MMU_NOTIFIER if it isn't already
- selected to enabled full userptr support.
+ This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it
+ isn't already selected to enable full userptr support.
  
  config DRM_AMDGPU_GART_DEBUGFS

bool "Allow GART access through debugfs"
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 466da5954a68..851001ced5e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -172,7 +172,7 @@ endif
  amdgpu-$(CONFIG_COMPAT) += amdgpu_ioc32.o
  amdgpu-$(CONFIG_VGA_SWITCHEROO) += amdgpu_atpx_handler.o
  amdgpu-$(CONFIG_ACPI) += amdgpu_acpi.o
-amdgpu-$(CONFIG_MMU_NOTIFIER) += amdgpu_mn.o
+amdgpu-$(CONFIG_HMM_MIRROR) += amdgpu_mn.o
  
  include $(FULL_AMD_PATH)/powerplay/Makefile
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c

index 3e6823fdd939..5d518d2bb9be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -45,7 +45,7 @@
  
  #include 

  #include 
-#include 
+#include 
  #include 
  #include 
  #include 
@@ -58,7 +58,6 @@
   *
   * @adev: amdgpu device pointer
   * @mm: process address space
- * @mn: MMU notifier structure
   * @type: type of MMU notifier
   * @work: destruction work item
   * @node: hash table node to find structure by adev and mn
@@ -66,6 +65,7 @@
   * @objects: interval tree containing amdgpu_mn_nodes
   * @read_lock: mutex for recursive locking of @lock
   * @recursion: depth of recursion
+ * @mirror: HMM mirror function support
   *
   * Data for each amdgpu device and process address space.
   */
@@ -73,7 +73,6 @@ struct amdgpu_mn {
/* constant after initialisation */
struct amdgpu_device    *adev;
struct mm_struct        *mm;
-   struct mmu_notifier mn;
enum amdgpu_mn_type type;
  
  	/* only used on destruction */

@@ -87,6 +86,9 @@ struct amdgpu_mn {
struct rb_root_cached   objects;
struct mutex            read_lock;
atomic_t                recursion;


With HMM we don't need this any more. Please remove it and simplify 
amdgpu_mn_read_lock() and amdgpu_mn_read_unlock().
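
Something like this minimal sketch, assuming both the recursion counter and
the read_lock mutex go away (untested; the non_owner rwsem variants may
still be needed, this is just the shape of the simplification):

static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
{
	if (blockable)
		down_read(&amn->lock);
	else if (!down_read_trylock(&amn->lock))
		return -EAGAIN;

	return 0;
}

static void amdgpu_mn_read_unlock(struct amdgpu_mn *amn)
{
	up_read(&amn->lock);
}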


Apart from that looks good to me,
Christian.


+
+   /* HMM mirror */
+   struct hmm_mirror   mirror;
  };
  
  /**

@@ -103,7 +105,7 @@ struct amdgpu_mn_node {
  };
  
  /**

- * amdgpu_mn_destroy - destroy the MMU notifier
+ * amdgpu_mn_destroy - destroy the HMM mirror
   *
   * @work: previously sheduled work item
   *
@@ -129,28 +131,26 @@ static void amdgpu_mn_destroy(struct work_struct *work)
}
up_write(&amn->lock);
mutex_unlock(&adev->mn_lock);
-   mmu_notifier_unregister_no_release(&amn->mn, amn->mm);
+
+   hmm_mirror_unregister(&amn->mirror);
kfree(amn);
  }
  
  /**

- * amdgpu_mn_release - callback to notify about mm destruction
+ * amdgpu_hmm_mirror_release - callback to notify about mm destruction
   *
- * @mn: our notifier
- * @mm: the mm this callback is about
+ * @mirror: the HMM mirror (mm) this callback is about
   *
- * Shedule a work item to lazy destroy our notifier.
+ * Schedule a work item to lazily destroy the HMM mirror.
   */
-static void amdgpu_mn_release(struct mmu_notifier *mn,
- struct mm_struct *mm)
+static void amdgpu_hmm_mirror_release(struct hmm_mirror *mirror)
  {
-   struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn, mn);
+   struct amdgpu_mn *amn = container_of(mirror, struct amdgpu_mn, mirror);
  
  	INIT_WORK(&amn->work, amdgpu_mn_destroy);

schedule_work(&amn->work);
  }
  
-

  /**
   * amdgpu_mn_lock - take the write side lock for this notifier
   *
@@ -237,141 +237,126 @@ static void amdgpu_mn_invalidate_node(struct 
amdgpu_mn_node *node,
  /**
 

[PATCH 2/3] drm/amdkfd: avoid HMM change cause circular lock dependency v2

2019-02-04 Thread Yang, Philip
There is a circular lock dependency between the gfx and kfd paths with
the HMM change:
lock(dqm) -> bo::reserve -> amdgpu_mn_lock

To avoid this, move init/uninit_mqd() out of lock(dqm), to remove the
nested locking between mmap_sem and bo::reserve. The resulting locking
order is: bo::reserve -> amdgpu_mn_lock(p->mn)
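
To make the inversion concrete, a rough sketch of the reordered
create_queue_cpsch() flow (the init helper named here is illustrative;
the real code calls the mqd manager directly, as in the diff below):

	/* everything that takes bo::reserve now runs before dqm_lock() */
	retval = init_mqd_resources(dqm, q);	/* reserves the MQD BO */
	if (retval)
		return retval;

	dqm_lock(dqm);
	list_add(&q->list, &qpd->queues_list);
	qpd->queue_count++;
	dqm_unlock(dqm);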

Change-Id: I2ec09a47571f6b4c8eaef93f22c0a600f5f70153
Signed-off-by: Philip Yang 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 32 ++-
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 8372556b52eb..efe0d3c0215b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1156,21 +1156,17 @@ static int create_queue_cpsch(struct 
device_queue_manager *dqm, struct queue *q,
int retval;
struct mqd_manager *mqd_mgr;
 
-   retval = 0;
-
-   dqm_lock(dqm);
-
if (dqm->total_queue_count >= max_num_of_queues_per_device) {
pr_warn("Can't create new usermode queue because %d queues were 
already created\n",
dqm->total_queue_count);
retval = -EPERM;
-   goto out_unlock;
+   goto out;
}
 
if (q->properties.type == KFD_QUEUE_TYPE_SDMA) {
retval = allocate_sdma_queue(dqm, &q->sdma_id);
if (retval)
-   goto out_unlock;
+   goto out;
q->properties.sdma_queue_id =
q->sdma_id / get_num_sdma_engines(dqm);
q->properties.sdma_engine_id =
@@ -1181,6 +1177,9 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
if (retval)
goto out_deallocate_sdma_queue;
 
+   /* Do init_mqd before dqm_lock(dqm) to avoid circular locking order:
+* lock(dqm) -> bo::reserve
+*/
mqd_mgr = dqm->ops.get_mqd_manager(dqm,
get_mqd_type_from_queue_type(q->properties.type));
 
@@ -1188,6 +1187,7 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
retval = -ENOMEM;
goto out_deallocate_doorbell;
}
+
/*
 * Eviction state logic: we only mark active queues as evicted
 * to avoid the overhead of restoring inactive queues later
@@ -1196,9 +1196,7 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
q->properties.is_evicted = (q->properties.queue_size > 0 &&
q->properties.queue_percent > 0 &&
q->properties.queue_address != 0);
-
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
-
q->properties.tba_addr = qpd->tba_addr;
q->properties.tma_addr = qpd->tma_addr;
retval = mqd_mgr->init_mqd(mqd_mgr, &q->mqd, &q->mqd_mem_obj,
@@ -1206,6 +1204,8 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
if (retval)
goto out_deallocate_doorbell;
 
+   dqm_lock(dqm);
+
list_add(&q->list, &qpd->queues_list);
qpd->queue_count++;
if (q->properties.is_active) {
@@ -1233,9 +1233,7 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
 out_deallocate_sdma_queue:
if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
deallocate_sdma_queue(dqm, q->sdma_id);
-out_unlock:
-   dqm_unlock(dqm);
-
+out:
return retval;
 }
 
@@ -1398,8 +1396,6 @@ static int destroy_queue_cpsch(struct 
device_queue_manager *dqm,
qpd->reset_wavefronts = true;
}
 
-   mqd_mgr->uninit_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
-
/*
 * Unconditionally decrement this counter, regardless of the queue's
 * type
@@ -1410,6 +1406,9 @@ static int destroy_queue_cpsch(struct 
device_queue_manager *dqm,
 
dqm_unlock(dqm);
 
+   /* Do uninit_mqd after dqm_unlock(dqm) to avoid circular locking */
+   mqd_mgr->uninit_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
+
return retval;
 
 failed:
@@ -1631,7 +1630,11 @@ static int process_termination_cpsch(struct 
device_queue_manager *dqm,
qpd->reset_wavefronts = false;
}
 
-   /* lastly, free mqd resources */
+   dqm_unlock(dqm);
+
+   /* Lastly, free mqd resources.
+* Do uninit_mqd() after dqm_unlock to avoid circular locking.
+*/
list_for_each_entry_safe(q, next, &qpd->queues_list, list) {
mqd_mgr = dqm->ops.get_mqd_manager(dqm,
get_mqd_type_from_queue_type(q->properties.type));
@@ -1645,7 +1648,6 @@ static int process_termination_cpsch(struct 
device_queue_manager *dqm,
}
 
 out:
-   dqm_unlock(dqm);
return retval;
 }
 
-- 
2.17.1


[PATCH 3/3] drm/amdgpu: replace get_user_pages with HMM address mirror helpers v6

2019-02-04 Thread Yang, Philip
Use the HMM helper function hmm_vma_fault() to get the physical pages
backing a userptr and to start tracking CPU page table updates on those
pages. Then use hmm_vma_range_done() to check whether those pages were
updated, before amdgpu_cs_submit for gfx or before user queues are
resumed for kfd.

If the userptr pages were updated: for gfx, amdgpu_cs_ioctl restarts
from scratch; for kfd, the restore worker is rescheduled to retry.

HMM simplifies the concurrent CPU page table update check, so remove
the guptasklock, mmu_invalidations and last_set_pages fields from the
amdgpu_ttm_tt struct.

HMM does not pin the pages (does not increase the page ref count), so
remove the related operations like release_pages(), put_page() and
mark_page_dirty().
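
The resulting usage pattern is roughly this (a sketch against the 4.20-era
HMM API; the helper name and the error handling are illustrative):

static int userptr_get_pages_sketch(struct hmm_range *range)
{
	int r;

	/* fault in the pages and start tracking CPU page table updates */
	r = hmm_vma_fault(range, true);
	if (r)
		return r;

	/* ... program the GPU mapping from range->pfns ... */

	/* stop tracking; false means the CPU page tables changed meanwhile */
	if (!hmm_vma_range_done(range))
		return -EAGAIN;	/* gfx: restart the CS ioctl, kfd: retry worker */

	return 0;
}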

Change-Id: I2e8c0c6f0d8c21e5596a32d7fc91762778bc9e67
Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   1 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  95 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 158 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c|  25 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 173 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   3 +-
 9 files changed, 198 insertions(+), 277 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 0b31a1859023..0e1711a75b68 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -61,7 +61,6 @@ struct kgd_mem {
 
atomic_t invalid;
struct amdkfd_process_info *process_info;
-   struct page **user_pages;
 
struct amdgpu_sync sync;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index d7b10d79f1de..ae2d838d31ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -582,28 +582,12 @@ static int init_user_pages(struct kgd_mem *mem, struct 
mm_struct *mm,
goto out;
}
 
-   /* If no restore worker is running concurrently, user_pages
-* should not be allocated
-*/
-   WARN(mem->user_pages, "Leaking user_pages array");
-
-   mem->user_pages = kvmalloc_array(bo->tbo.ttm->num_pages,
-  sizeof(struct page *),
-  GFP_KERNEL | __GFP_ZERO);
-   if (!mem->user_pages) {
-   pr_err("%s: Failed to allocate pages array\n", __func__);
-   ret = -ENOMEM;
-   goto unregister_out;
-   }
-
-   ret = amdgpu_ttm_tt_get_user_pages(bo->tbo.ttm, mem->user_pages);
+   ret = amdgpu_ttm_tt_get_user_pages(bo->tbo.ttm, bo->tbo.ttm->pages);
if (ret) {
pr_err("%s: Failed to get user pages: %d\n", __func__, ret);
-   goto free_out;
+   goto unregister_out;
}
 
-   amdgpu_ttm_tt_set_user_pages(bo->tbo.ttm, mem->user_pages);
-
ret = amdgpu_bo_reserve(bo, true);
if (ret) {
pr_err("%s: Failed to reserve BO\n", __func__);
@@ -616,11 +600,7 @@ static int init_user_pages(struct kgd_mem *mem, struct 
mm_struct *mm,
amdgpu_bo_unreserve(bo);
 
 release_out:
-   if (ret)
-   release_pages(mem->user_pages, bo->tbo.ttm->num_pages);
-free_out:
-   kvfree(mem->user_pages);
-   mem->user_pages = NULL;
+   amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
 unregister_out:
if (ret)
amdgpu_mn_unregister(bo);
@@ -679,7 +659,6 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
ctx->kfd_bo.priority = 0;
ctx->kfd_bo.tv.bo = &bo->tbo;
ctx->kfd_bo.tv.num_shared = 1;
-   ctx->kfd_bo.user_pages = NULL;
list_add(&ctx->kfd_bo.tv.head, &ctx->list);
 
amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);
@@ -743,7 +722,6 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
ctx->kfd_bo.priority = 0;
ctx->kfd_bo.tv.bo = &bo->tbo;
ctx->kfd_bo.tv.num_shared = 1;
-   ctx->kfd_bo.user_pages = NULL;
list_add(&ctx->kfd_bo.tv.head, &ctx->list);
 
i = 0;
@@ -1371,15 +1349,6 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
list_del(&bo_list_entry->head);
mutex_unlock(&process_info->lock);
 
-   /* Free user pages if necessary */
-   if (mem->user_pages) {
-   pr_debug("%s: Freeing user_pages array\n", __func__);
-   if (mem->user_pages[0])
-   release_pages(mem->user_pages,
-   mem->bo->tbo.ttm->num_pages);
-   kvfree(mem->user_pages);
-   }
-
ret = reserve_bo_and_cond_vms(mem, NULL, BO_VM_ALL, &ctx);
if (unlikely(ret))
return ret;
@@ -1855,25 +1824,11 @@ static int update_invalid_user_pages(struct 

[PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu notifier v6

2019-02-04 Thread Yang, Philip
Replace our MMU notifier with hmm_mirror_ops.sync_cpu_device_pagetables
callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a dependency in
DRM_AMDGPU_USERPTR Kconfig.

It supports both KFD userptr and gfx userptr paths.

The dependent HMM patchset from Jérôme Glisse is now fully merged into
the 4.20.0 kernel.
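
The core of the replacement looks roughly like this (a sketch assuming the
4.20-era hmm_mirror API; only the gfx callback is shown and the registration
helper is illustrative):

static const struct hmm_mirror_ops amdgpu_hmm_mirror_ops_sketch = {
	.release = amdgpu_hmm_mirror_release,
	.sync_cpu_device_pagetables = amdgpu_mn_invalidate_range_start_gfx,
};

static int amdgpu_mn_register_mirror_sketch(struct amdgpu_mn *amn,
					    struct mm_struct *mm)
{
	amn->mirror.ops = &amdgpu_hmm_mirror_ops_sketch;
	/* replaces mmu_notifier_register(&amn->mn, mm) */
	return hmm_mirror_register(&amn->mirror, mm);
}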

Change-Id: Ie62c3c5e3c5b8521ab3b438d1eff2aa2a003835e
Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/Kconfig |   6 +-
 drivers/gpu/drm/amd/amdgpu/Makefile|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 139 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h |   2 +-
 4 files changed, 67 insertions(+), 82 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
b/drivers/gpu/drm/amd/amdgpu/Kconfig
index 9221e5489069..960a63355705 100644
--- a/drivers/gpu/drm/amd/amdgpu/Kconfig
+++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
@@ -26,10 +26,10 @@ config DRM_AMDGPU_CIK
 config DRM_AMDGPU_USERPTR
bool "Always enable userptr write support"
depends on DRM_AMDGPU
-   select MMU_NOTIFIER
+   select HMM_MIRROR
help
- This option selects CONFIG_MMU_NOTIFIER if it isn't already
- selected to enabled full userptr support.
+ This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it
+ isn't already selected to enable full userptr support.
 
 config DRM_AMDGPU_GART_DEBUGFS
bool "Allow GART access through debugfs"
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 466da5954a68..851001ced5e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -172,7 +172,7 @@ endif
 amdgpu-$(CONFIG_COMPAT) += amdgpu_ioc32.o
 amdgpu-$(CONFIG_VGA_SWITCHEROO) += amdgpu_atpx_handler.o
 amdgpu-$(CONFIG_ACPI) += amdgpu_acpi.o
-amdgpu-$(CONFIG_MMU_NOTIFIER) += amdgpu_mn.o
+amdgpu-$(CONFIG_HMM_MIRROR) += amdgpu_mn.o
 
 include $(FULL_AMD_PATH)/powerplay/Makefile
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 3e6823fdd939..5d518d2bb9be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -45,7 +45,7 @@
 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -58,7 +58,6 @@
  *
  * @adev: amdgpu device pointer
  * @mm: process address space
- * @mn: MMU notifier structure
  * @type: type of MMU notifier
  * @work: destruction work item
  * @node: hash table node to find structure by adev and mn
@@ -66,6 +65,7 @@
  * @objects: interval tree containing amdgpu_mn_nodes
  * @read_lock: mutex for recursive locking of @lock
  * @recursion: depth of recursion
+ * @mirror: HMM mirror function support
  *
  * Data for each amdgpu device and process address space.
  */
@@ -73,7 +73,6 @@ struct amdgpu_mn {
/* constant after initialisation */
struct amdgpu_device    *adev;
struct mm_struct        *mm;
-   struct mmu_notifier mn;
enum amdgpu_mn_type type;
 
/* only used on destruction */
@@ -87,6 +86,9 @@ struct amdgpu_mn {
struct rb_root_cached   objects;
struct mutex            read_lock;
atomic_t                recursion;
+
+   /* HMM mirror */
+   struct hmm_mirror   mirror;
 };
 
 /**
@@ -103,7 +105,7 @@ struct amdgpu_mn_node {
 };
 
 /**
- * amdgpu_mn_destroy - destroy the MMU notifier
+ * amdgpu_mn_destroy - destroy the HMM mirror
  *
  * @work: previously sheduled work item
  *
@@ -129,28 +131,26 @@ static void amdgpu_mn_destroy(struct work_struct *work)
}
up_write(&amn->lock);
mutex_unlock(&adev->mn_lock);
-   mmu_notifier_unregister_no_release(&amn->mn, amn->mm);
+
+   hmm_mirror_unregister(&amn->mirror);
kfree(amn);
 }
 
 /**
- * amdgpu_mn_release - callback to notify about mm destruction
+ * amdgpu_hmm_mirror_release - callback to notify about mm destruction
  *
- * @mn: our notifier
- * @mm: the mm this callback is about
+ * @mirror: the HMM mirror (mm) this callback is about
  *
- * Shedule a work item to lazy destroy our notifier.
+ * Schedule a work item to lazily destroy the HMM mirror.
  */
-static void amdgpu_mn_release(struct mmu_notifier *mn,
- struct mm_struct *mm)
+static void amdgpu_hmm_mirror_release(struct hmm_mirror *mirror)
 {
-   struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn, mn);
+   struct amdgpu_mn *amn = container_of(mirror, struct amdgpu_mn, mirror);
 
INIT_WORK(&amn->work, amdgpu_mn_destroy);
schedule_work(&amn->work);
 }
 
-
 /**
  * amdgpu_mn_lock - take the write side lock for this notifier
  *
@@ -237,141 +237,126 @@ static void amdgpu_mn_invalidate_node(struct 
amdgpu_mn_node *node,
 /**
  * amdgpu_mn_invalidate_range_start_gfx - callback to notify about mm change
  *
- * @mn: our notifier
- * @range: mmu notifier context
+ * @mirror: the hmm_mirror (mm) is about to update
+ * @update: the update start, end address
  *
  * Block for operations on BOs to 

[PATCH 0/3] Use HMM to replace get_user_pages

2019-02-04 Thread Yang, Philip
Hi Christian,

This patch is rebased to lastest HMM. Please review the GEM and CS part changes
in patch 3/3.

Regards,

Philip Yang (3):
  drm/amdgpu: use HMM mirror callback to replace mmu notifier v6
  drm/amdkfd: avoid HMM change cause circular lock dependency v2
  drm/amdgpu: replace get_user_pages with HMM address mirror helpers v6

 drivers/gpu/drm/amd/amdgpu/Kconfig|   6 +-
 drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   1 -
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  95 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 158 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c| 164 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 173 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   3 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  32 ++--
 12 files changed, 282 insertions(+), 374 deletions(-)

-- 
2.17.1



Re: After upgrade mesa to 19.0.0~rc1 all vulkan based application stop working ["vulkan-cube" received SIGSEGV in radv_pipeline_init_blend_state at ../src/amd/vulkan/radv_pipeline.c:699]

2019-02-04 Thread sylvain . bertrand
> > List of tested application:
> > - Rise of the Tomb Raider

Sorry to pop in the bug report.

I run git linux (amd-staging-drm-next), drm, llvm (erk!), mesa-gl, mesa-vulkan,
xserver and xf86-video-amdgpu from yesterday, all on an AMD Tahiti XT.

I have run dota 2 vulkan, artifact (vulkan), cs:go (still opengl) without
issues (I did play all of them last night). I have a very old vulkan loader
though.

Could you try dota2 vulkan and cs:go (free to play games) if you have room on
your gaming storage and good download speed?

I have Rise of the Tomb Raider, I'll download it and give it a try tonight since
I have room on my gaming ssd (I did clear it months ago).

-- 
Sylvain


[PATCH] drm/amd/display: Don't re-enable CRC when CONFIG_DEBUG_FS isn't defined

2019-02-04 Thread Nicholas Kazlauskas
[Why]
When CONFIG_DEBUG_FS isn't defined then amdgpu_dm_crtc_set_crc_source
is NULL. This causes a compilation error since it's being called
unconditionally.

[How]
Guard the call based on CONFIG_DEBUG_FS - CRC capture isn't supported
without this.

Cc: Leo Li 
Cc: Harry Wentland 
Fixes: 43a6a02eb355 ("drm/amd/display: Re-enable CRC capture following modeset")
Signed-off-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a2453900e15a..cacb8fe5a1ad 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5284,9 +5284,11 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
 
manage_dm_interrupts(adev, acrtc, true);
 
+#ifdef CONFIG_DEBUG_FS
/* The stream has changed so CRC capture needs to re-enabled. */
if (dm_new_crtc_state->crc_enabled)
amdgpu_dm_crtc_set_crc_source(crtc, "auto");
+#endif
}
 
/* update planes when needed per crtc*/
-- 
2.17.1



Re: [PATCH 6/6] drm/drv: Remove drm_dev_unplug()

2019-02-04 Thread Oleksandr Andrushchenko
On 2/3/19 5:42 PM, Noralf Trønnes wrote:
> There are no users left.
>
> Signed-off-by: Noralf Trønnes 
Reviewed-by: Oleksandr Andrushchenko 
> ---
>   drivers/gpu/drm/drm_drv.c | 17 -
>   include/drm/drm_drv.h |  1 -
>   2 files changed, 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index e0941200edc6..87210d4a9e53 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -354,23 +354,6 @@ void drm_dev_exit(int idx)
>   }
>   EXPORT_SYMBOL(drm_dev_exit);
>   
> -/**
> - * drm_dev_unplug - unplug a DRM device
> - * @dev: DRM device
> - *
> - * This unplugs a hotpluggable DRM device, which makes it inaccessible to
> - * userspace operations. Entry-points can use drm_dev_enter() and
> - * drm_dev_exit() to protect device resources in a race free manner. This
> - * essentially unregisters the device like drm_dev_unregister(), but can be
> - * called while there are still open users of @dev.
> - */
> -void drm_dev_unplug(struct drm_device *dev)
> -{
> - drm_dev_unregister(dev);
> - drm_dev_put(dev);
> -}
> -EXPORT_SYMBOL(drm_dev_unplug);
> -
>   /*
>* DRM internal mount
>* We want to be able to allocate our own "struct address_space" to control
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index c50696c82a42..b8765a6fc092 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -730,7 +730,6 @@ void drm_dev_put(struct drm_device *dev);
>   void drm_put_dev(struct drm_device *dev);
>   bool drm_dev_enter(struct drm_device *dev, int *idx);
>   void drm_dev_exit(int idx);
> -void drm_dev_unplug(struct drm_device *dev);
>   
>   /**
>* drm_dev_is_unplugged - is a DRM device unplugged


Re: [PATCH 5/6] drm/xen: Use drm_dev_unregister()

2019-02-04 Thread Oleksandr Andrushchenko
On 2/3/19 5:41 PM, Noralf Trønnes wrote:
> drm_dev_unplug() has been stripped down and is going away. Open code its
> 2 remaining function calls.
>
> Also remove the drm_dev_is_unplugged() check since this can't be true
> before drm_dev_unregister() is called which happens after the check.
>
> Cc: Oleksandr Andrushchenko 
> Signed-off-by: Noralf Trønnes 
> ---
>   drivers/gpu/drm/xen/xen_drm_front.c | 7 ++-
>   1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/xen/xen_drm_front.c 
> b/drivers/gpu/drm/xen/xen_drm_front.c
> index 3e78a832d7f9..5c5eb24c6342 100644
> --- a/drivers/gpu/drm/xen/xen_drm_front.c
> +++ b/drivers/gpu/drm/xen/xen_drm_front.c
> @@ -576,12 +576,9 @@ static void xen_drm_drv_fini(struct xen_drm_front_info 
> *front_info)
>   if (!dev)
>   return;
>   
> - /* Nothing to do if device is already unplugged */
> - if (drm_dev_is_unplugged(dev))
> - return;
xen_drm_drv_fini is called when the backend changes its state [1],
so I just use the check above to prevent possible race conditions here,
e.g. to not allow the unregister code to run if it is already in progress.
So I think we should keep this and probably just add a comment explaining
why it is here.
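E.g. something like this (just a sketch of the check plus the comment):

	/* Nothing to do if device is already unplugged: the backend can
	 * change its state several times, so fini may be entered again
	 * while the unregister code already ran or is in progress.
	 */
	if (drm_dev_is_unplugged(dev))
		return;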
> -
>   drm_kms_helper_poll_fini(dev);
> - drm_dev_unplug(dev);
> + drm_dev_unregister(dev);
> + drm_dev_put(dev);
>   
>   front_info->drm_info = NULL;
>   
[1] https://elixir.bootlin.com/linux/v5.0-rc5/ident/displback_disconnect


Re: [PATCH 2/6] drm/drv: Prepare to remove drm_dev_unplug()

2019-02-04 Thread Oleksandr Andrushchenko
On 2/3/19 5:41 PM, Noralf Trønnes wrote:
> The only thing now that makes drm_dev_unplug() special is that it sets
> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> can remove drm_dev_unplug().
>
> Signed-off-by: Noralf Trønnes 
Reviewed-by: Oleksandr Andrushchenko 
> ---
>
> Maybe s/unplugged/unregistered/ ?
>
> I looked at drm_device->registered, but using that would mean that
> drm_dev_is_unplugged() would return before drm_device is registered.
> And given that its current purpose is to prevent race against connector
> registration, I stayed away from it.
>
> Noralf.
>
>
>   drivers/gpu/drm/drm_drv.c | 27 +++
>   include/drm/drm_drv.h | 10 --
>   2 files changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 05bbc2b622fc..e0941200edc6 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>*/
>   void drm_dev_unplug(struct drm_device *dev)
>   {
> - /*
> -  * After synchronizing any critical read section is guaranteed to see
> -  * the new value of ->unplugged, and any critical section which might
> -  * still have seen the old value of ->unplugged is guaranteed to have
> -  * finished.
> -  */
> - dev->unplugged = true;
> - synchronize_srcu(&drm_unplug_srcu);
> -
>   drm_dev_unregister(dev);
>   drm_dev_put(dev);
>   }
> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>* drm_dev_register() but does not deallocate the device. The caller must 
> call
>* drm_dev_put() to drop their final reference.
>*
> - * A special form of unregistering for hotpluggable devices is 
> drm_dev_unplug(),
> - * which can be called while there are still open users of @dev.
> + * This function can be called while there are still open users of @dev as 
> long
> + * as the driver protects its device resources using drm_dev_enter() and
> + * drm_dev_exit().
>*
>* This should be called first in the device teardown code to make sure
> - * userspace can't access the device instance any more.
> + * userspace can't access the device instance any more. Drivers that support
> + * device unplug will probably want to call drm_atomic_helper_shutdown() 
> first
> + * in order to disable the hardware on regular driver module unload.
>*/
>   void drm_dev_unregister(struct drm_device *dev)
>   {
> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
>   if (drm_core_check_feature(dev, DRIVER_LEGACY))
>   drm_lastclose(dev);
>   
> + /*
> +  * After synchronizing any critical read section is guaranteed to see
> +  * the new value of ->unplugged, and any critical section which might
> +  * still have seen the old value of ->unplugged is guaranteed to have
> +  * finished.
> +  */
> + dev->unplugged = true;
> + synchronize_srcu(&drm_unplug_srcu);
> +
>   dev->registered = false;
>   
>   drm_client_dev_unregister(dev);
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index ca46a45a9cce..c50696c82a42 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
>* drm_dev_is_unplugged - is a DRM device unplugged
>* @dev: DRM device
>*
> - * This function can be called to check whether a hotpluggable is unplugged.
> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
> - * unplugged, these two functions guarantee that any store before calling
> - * drm_dev_unplug() is visible to callers of this function after it completes
> + * This function can be called to check whether @dev is unregistered. This 
> can
> + * be used to detect that the underlying parent device is gone.
>*
> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
> - * recommended that drivers instead use the underlying drm_dev_enter() and
> + * WARNING: This function fundamentally races against drm_dev_unregister(). 
> It
> + * is recommended that drivers instead use the underlying drm_dev_enter() and
>* drm_dev_exit() function pairs.
>*/
>   static inline bool drm_dev_is_unplugged(struct drm_device *dev)


Re: [PATCH 1/6] drm: Fix drm_release() and device unplug

2019-02-04 Thread Oleksandr Andrushchenko
On 2/3/19 5:41 PM, Noralf Trønnes wrote:
> If userspace has open fd(s) when drm_dev_unplug() is run, it will result
> in drm_dev_unregister() being called twice. First in drm_dev_unplug() and
> then later in drm_release() through the call to drm_put_dev().
>
> Since userspace already holds a ref on drm_device through the drm_minor,
> it's not necessary to add extra ref counting based on no open file
> handles. Instead just drm_dev_put() unconditionally in drm_dev_unplug().
>
> We now has this:
s/has/have ?
> - Userpace holds a ref on drm_device as long as there's open fd(s)
> - The driver holds a ref on drm_device as long as it's bound to the
>struct device
>
> When both sides are done with drm_device, it is released.
>
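In code terms the lifecycle is roughly (a sketch, not actual driver code):

	drm = drm_dev_alloc(...);	/* the driver's reference */
	drm_dev_register(drm, 0);
	/* each open fd: drm_minor_acquire() -> drm_dev_get(drm) */
	...
	drm_dev_unregister(drm);
	drm_dev_put(drm);		/* driver ref dropped unconditionally now */
	/* last close: drm_release() -> drm_minor_release() -> drm_dev_put() */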
> Signed-off-by: Noralf Trønnes 
Reviewed-by: Oleksandr Andrushchenko 
> ---
>   drivers/gpu/drm/drm_drv.c  | 6 +-
>   drivers/gpu/drm/drm_file.c | 6 ++
>   2 files changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 381581b01d48..05bbc2b622fc 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -376,11 +376,7 @@ void drm_dev_unplug(struct drm_device *dev)
>   synchronize_srcu(&drm_unplug_srcu);
>   
>   drm_dev_unregister(dev);
> -
> - mutex_lock(&drm_global_mutex);
> - if (dev->open_count == 0)
> - drm_dev_put(dev);
> - mutex_unlock(&drm_global_mutex);
> + drm_dev_put(dev);
>   }
>   EXPORT_SYMBOL(drm_dev_unplug);
>   
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 46f48f245eb5..3f20f598cd7c 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -479,11 +479,9 @@ int drm_release(struct inode *inode, struct file *filp)
>   
>   drm_file_free(file_priv);
>   
> - if (!--dev->open_count) {
> + if (!--dev->open_count)
>   drm_lastclose(dev);
> - if (drm_dev_is_unplugged(dev))
> - drm_put_dev(dev);
> - }
> +
>   mutex_unlock(&drm_global_mutex);
>   
>   drm_minor_release(minor);


RE: [PATCH] drm/amd/display: Use context parameters to enable FBC

2019-02-04 Thread Wu, Hersen
+ Roman

-Original Message-
From: S, Shirish  
Sent: Monday, February 4, 2019 4:20 AM
To: Li, Sun peng (Leo) ; Wu, Hersen ; 
Wentland, Harry 
Cc: amd-gfx@lists.freedesktop.org; S, Shirish 
Subject: [PATCH] drm/amd/display: Use context parameters to enable FBC

[What]
FBC fails to get enabled when switching between LINEAR (console/VT) and
non-LINEAR (GUI) based rendering, because the decision to turn FBC on or
off is based on the default tiling info stored in current_state.

[How]
Use the context structure's tiling information, which is coherent with the
screen updates.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index db0ef41..fd7cd5b 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -2535,7 +2535,7 @@ static void dce110_apply_ctx_for_surface(
}
 
if (dc->fbc_compressor)
-   enable_fbc(dc, dc->current_state);
+   enable_fbc(dc, context);
 }
 
 static void dce110_power_down_fe(struct dc *dc, struct pipe_ctx *pipe_ctx)
--
2.7.4



Re: [PATCH 5/6] drm/xen: Use drm_dev_unregister()

2019-02-04 Thread Noralf Trønnes


On 04.02.2019 at 11:42, Oleksandr Andrushchenko wrote:
> On 2/3/19 5:41 PM, Noralf Trønnes wrote:
>> drm_dev_unplug() has been stripped down and is going away. Open code its
>> 2 remaining function calls.
>>
>> Also remove the drm_dev_is_unplugged() check since this can't be true
>> before drm_dev_unregister() is called which happens after the check.
>>
>> Cc: Oleksandr Andrushchenko 
>> Signed-off-by: Noralf Trønnes 
>> ---
>>   drivers/gpu/drm/xen/xen_drm_front.c | 7 ++-
>>   1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xen/xen_drm_front.c 
>> b/drivers/gpu/drm/xen/xen_drm_front.c
>> index 3e78a832d7f9..5c5eb24c6342 100644
>> --- a/drivers/gpu/drm/xen/xen_drm_front.c
>> +++ b/drivers/gpu/drm/xen/xen_drm_front.c
>> @@ -576,12 +576,9 @@ static void xen_drm_drv_fini(struct xen_drm_front_info 
>> *front_info)
>>  if (!dev)
>>  return;
>>   
>> -/* Nothing to do if device is already unplugged */
>> -if (drm_dev_is_unplugged(dev))
>> -return;
> xen_drm_drv_fini is called when the backend changes its state [1],
> so I just use the check above to prevent possible race conditions here,
> e.g. to not allow the unregister code to run if it is already in progress.
> So I think we should keep this and probably just add a comment explaining
> why it is here.

Ok, it's just me not reading the code closely enough. I'll put it back
in the next version.

Noralf.

>> -
>>  drm_kms_helper_poll_fini(dev);
>> -drm_dev_unplug(dev);
>> +drm_dev_unregister(dev);
>> +drm_dev_put(dev);
>>   
>>  front_info->drm_info = NULL;
>>   
> [1] https://elixir.bootlin.com/linux/v5.0-rc5/ident/displback_disconnect
> 


[PATCH 3/8] drm/amdgpu: clear PDs/PTs only after initializing them

2019-02-04 Thread Christian König
Clear the VM PDs/PTs only after initializing all the structures.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 69b0bee0661e..a52444f4bd38 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -945,10 +945,6 @@ int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
if (r)
return r;
 
-   r = amdgpu_vm_clear_bo(adev, vm, pt, cursor.level, ats);
-   if (r)
-   goto error_free_pt;
-
if (vm->use_cpu_for_update) {
r = amdgpu_bo_kmap(pt, NULL);
if (r)
@@ -961,6 +957,10 @@ int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
pt->parent = amdgpu_bo_ref(cursor.parent->base.bo);
 
amdgpu_vm_bo_base_init(&pt->base, vm, pt);
+
+   r = amdgpu_vm_clear_bo(adev, vm, pt, cursor.level, ats);
+   if (r)
+   goto error_free_pt;
}
 
return 0;
@@ -3053,13 +3053,14 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
if (r)
goto error_unreserve;
 
+   amdgpu_vm_bo_base_init(&vm->root.base, vm, root);
+
r = amdgpu_vm_clear_bo(adev, vm, root,
   adev->vm_manager.root_level,
   vm->pte_support_ats);
if (r)
goto error_unreserve;
 
-   amdgpu_vm_bo_base_init(&vm->root.base, vm, root);
amdgpu_bo_unreserve(vm->root.base.bo);
 
if (pasid) {
-- 
2.17.1



[PATCH 2/8] drm/amdgpu: cleanup VM dw estimation a bit

2019-02-04 Thread Christian König
No functional change.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 44950f3b8801..69b0bee0661e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1804,13 +1804,12 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
/*
 * reserve space for two commands every (1 << BLOCK_SIZE)
 *  entries or 2k dwords (whatever is smaller)
- *
- * The second command is for the shadow pagetables.
 */
+   ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1);
+
+ /* The second command is for the shadow pagetables. */
if (vm->root.base.bo->shadow)
-   ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1) 
* 2;
-   else
-   ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1);
+   ncmds *= 2;
 
/* padding, etc. */
ndw = 64;
@@ -1829,10 +1828,11 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
ndw += ncmds * 10;
 
/* extra commands for begin/end fragments */
+   ncmds = 2 * adev->vm_manager.fragment_size;
if (vm->root.base.bo->shadow)
-   ndw += 2 * 10 * adev->vm_manager.fragment_size * 2;
-   else
-   ndw += 2 * 10 * adev->vm_manager.fragment_size;
+   ncmds *= 2;
+
+   ndw += 10 * ncmds;
 
params.func = amdgpu_vm_do_set_ptes;
}
-- 
2.17.1



[PATCH 1/8] drm/amdgpu: fix waiting for BO moves with CPU based PD/PT updates

2019-02-04 Thread Christian König
Otherwise we open up the possibility to use uninitialized memory.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index c70621db3bb7..44950f3b8801 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1779,13 +1779,18 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
if (pages_addr)
params.src = ~0;
 
-   /* Wait for PT BOs to be free. PTs share the same resv. object
+   /* Wait for PT BOs to be idle. PTs share the same resv. object
 * as the root PD BO
 */
r = amdgpu_vm_wait_pd(adev, vm, owner);
if (unlikely(r))
return r;
 
+   /* Wait for any BO move to be completed */
+   r = dma_fence_wait(exclusive, true);
+   if (unlikely(r))
+   return r;
+
params.func = amdgpu_vm_cpu_set_ptes;
params.pages_addr = pages_addr;
return amdgpu_vm_update_ptes(&params, start, last + 1,
-- 
2.17.1



[PATCH 4/8] drm/amdgpu: rework shadow handling during PD clear

2019-02-04 Thread Christian König
This way we only deal with the real BO in here.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 64 --
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a52444f4bd38..283a9e9878be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -788,38 +788,58 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
 
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (r)
-   goto error;
+   return r;
 
r = amdgpu_ttm_alloc_gart(&bo->tbo);
if (r)
return r;
 
+   if (bo->shadow) {
+   r = ttm_bo_validate(&bo->shadow->tbo, &bo->shadow->placement,
+   &ctx);
+   if (r)
+   return r;
+
+   r = amdgpu_ttm_alloc_gart(&bo->shadow->tbo);
+   if (r)
+   return r;
+
+   }
+
r = amdgpu_job_alloc_with_ib(adev, 64, &job);
if (r)
-   goto error;
+   return r;
 
-   addr = amdgpu_bo_gpu_offset(bo);
-   if (ats_entries) {
-   uint64_t ats_value;
+   while (1) {
+   addr = amdgpu_bo_gpu_offset(bo);
+   if (ats_entries) {
+   uint64_t ats_value;
 
-   ats_value = AMDGPU_PTE_DEFAULT_ATC;
-   if (level != AMDGPU_VM_PTB)
-   ats_value |= AMDGPU_PDE_PTE;
+   ats_value = AMDGPU_PTE_DEFAULT_ATC;
+   if (level != AMDGPU_VM_PTB)
+   ats_value |= AMDGPU_PDE_PTE;
 
-   amdgpu_vm_set_pte_pde(adev, &job->ibs[0], addr, 0,
- ats_entries, 0, ats_value);
-   addr += ats_entries * 8;
-   }
+   amdgpu_vm_set_pte_pde(adev, &job->ibs[0], addr, 0,
+ ats_entries, 0, ats_value);
+   addr += ats_entries * 8;
+   }
 
-   if (entries) {
-   uint64_t value = 0;
+   if (entries) {
+   uint64_t value = 0;
 
-   /* Workaround for fault priority problem on GMC9 */
-   if (level == AMDGPU_VM_PTB && adev->asic_type >= CHIP_VEGA10)
-   value = AMDGPU_PTE_EXECUTABLE;
+   /* Workaround for fault priority problem on GMC9 */
+   if (level == AMDGPU_VM_PTB &&
+   adev->asic_type >= CHIP_VEGA10)
+   value = AMDGPU_PTE_EXECUTABLE;
+
+   amdgpu_vm_set_pte_pde(adev, &job->ibs[0], addr, 0,
+ entries, 0, value);
+   }
 
-   amdgpu_vm_set_pte_pde(adev, &job->ibs[0], addr, 0,
- entries, 0, value);
+   if (bo->shadow)
+   bo = bo->shadow;
+   else
+   break;
}
 
amdgpu_ring_pad_ib(ring, &job->ibs[0]);
@@ -838,16 +858,10 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
amdgpu_bo_fence(bo, fence, true);
dma_fence_put(fence);
 
-   if (bo->shadow)
-   return amdgpu_vm_clear_bo(adev, vm, bo->shadow,
- level, pte_support_ats);
-
return 0;
 
 error_free:
amdgpu_job_free(job);
-
-error:
return r;
 }
 
-- 
2.17.1



[PATCH 7/8] drm/amdgpu: free PDs/PTs on demand

2019-02-04 Thread Christian König
When something is unmapped we now free the affected PDs/PTs again.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 71 +++---
 1 file changed, 53 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0b8a1aa56c4a..a3dd2b397299 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -515,12 +515,31 @@ static void amdgpu_vm_pt_next(struct amdgpu_device *adev,
  */
 static void amdgpu_vm_pt_first_dfs(struct amdgpu_device *adev,
   struct amdgpu_vm *vm,
+  struct amdgpu_vm_pt_cursor *start,
   struct amdgpu_vm_pt_cursor *cursor)
 {
-   amdgpu_vm_pt_start(adev, vm, 0, cursor);
+   if (start)
+   *cursor = *start;
+   else
+   amdgpu_vm_pt_start(adev, vm, 0, cursor);
while (amdgpu_vm_pt_descendant(adev, cursor));
 }
 
+/**
+ * amdgpu_vm_pt_continue_dfs - check if the deep first search should continue
+ *
+ * @start: starting point for the search
+ * @entry: current entry
+ *
+ * Returns:
+ * True when the search should continue, false otherwise.
+ */
+static bool amdgpu_vm_pt_continue_dfs(struct amdgpu_vm_pt_cursor *start,
+ struct amdgpu_vm_pt *entry)
+{
+   return entry && (!start || entry != start->entry);
+}
+
 /**
  * amdgpu_vm_pt_next_dfs - get the next node for a deep first search
  *
@@ -546,11 +565,11 @@ static void amdgpu_vm_pt_next_dfs(struct amdgpu_device 
*adev,
 /**
  * for_each_amdgpu_vm_pt_dfs_safe - safe deep first search of all PDs/PTs
  */
-#define for_each_amdgpu_vm_pt_dfs_safe(adev, vm, cursor, entry)		\
-	for (amdgpu_vm_pt_first_dfs((adev), (vm), &(cursor)),			\
+#define for_each_amdgpu_vm_pt_dfs_safe(adev, vm, start, cursor, entry)	\
+	for (amdgpu_vm_pt_first_dfs((adev), (vm), (start), &(cursor)),		\
	     (entry) = (cursor).entry, amdgpu_vm_pt_next_dfs((adev), &(cursor));\
-	     (entry); (entry) = (cursor).entry,				\
-	     amdgpu_vm_pt_next_dfs((adev), &(cursor)))
+	     amdgpu_vm_pt_continue_dfs((start), (entry));			\
+	     (entry) = (cursor).entry, amdgpu_vm_pt_next_dfs((adev), &(cursor)))
 
 /**
  * amdgpu_vm_get_pd_bo - add the VM PD to a validation list
@@ -931,32 +950,46 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
return r;
 }
 
+/**
+ * amdgpu_vm_free_table - free one PD/PT
+ *
+ * @entry: PDE to free
+ */
+static void amdgpu_vm_free_table(struct amdgpu_vm_pt *entry)
+{
+   if (entry->base.bo) {
+   entry->base.bo->vm_bo = NULL;
+   list_del(&entry->base.vm_status);
+   amdgpu_bo_unref(&entry->base.bo->shadow);
+   amdgpu_bo_unref(&entry->base.bo);
+   }
+   kvfree(entry->entries);
+   entry->entries = NULL;
+}
+
 /**
  * amdgpu_vm_free_pts - free PD/PT levels
  *
  * @adev: amdgpu device structure
  * @vm: amdgpu vm structure
+ * @start: optional cursor where to start freeing PDs/PTs
  *
  * Free the page directory or page table level and all sub levels.
  */
 static void amdgpu_vm_free_pts(struct amdgpu_device *adev,
-  struct amdgpu_vm *vm)
+  struct amdgpu_vm *vm,
+  struct amdgpu_vm_pt_cursor *start)
 {
struct amdgpu_vm_pt_cursor cursor;
struct amdgpu_vm_pt *entry;
 
-   for_each_amdgpu_vm_pt_dfs_safe(adev, vm, cursor, entry) {
+   vm->bulk_moveable = false;
 
-   if (entry->base.bo) {
-   entry->base.bo->vm_bo = NULL;
-   list_del(&entry->base.vm_status);
-   amdgpu_bo_unref(&entry->base.bo->shadow);
-   amdgpu_bo_unref(&entry->base.bo);
-   }
-   kvfree(entry->entries);
-   }
+   for_each_amdgpu_vm_pt_dfs_safe(adev, vm, start, cursor, entry)
+   amdgpu_vm_free_table(entry);
 
-   BUG_ON(vm->root.base.bo);
+   if (start)
+   amdgpu_vm_free_table(start->entry);
 }
 
 /**
@@ -1377,7 +1410,7 @@ static void amdgpu_vm_invalidate_pds(struct amdgpu_device 
*adev,
struct amdgpu_vm_pt_cursor cursor;
struct amdgpu_vm_pt *entry;
 
-   for_each_amdgpu_vm_pt_dfs_safe(adev, vm, cursor, entry)
+   for_each_amdgpu_vm_pt_dfs_safe(adev, vm, NULL, cursor, entry)
if (entry->base.bo && !entry->base.moved)
amdgpu_vm_bo_relocated(&entry->base);
 }
@@ -1684,6 +1717,7 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
/* Mark all child entries as huge */
while (cursor.pfn < frag_start) {
cursor.entry->huge = true;
+

Re: [PATCH 0/6] drm/drv: Remove drm_dev_unplug()

2019-02-04 Thread Koenig, Christian
Adding Andrey who looked into cleaning this up a while ago as well.

Christian.

On 03.02.19 at 16:41, Noralf Trønnes wrote:
> This series removes drm_dev_unplug() and moves the unplugged state
> setting to drm_dev_unregister(). All drivers will now have access to the
> unplugged state if they so desire.
>
> The drm_device ref handling wrt the last fd closed after unregister
> has been simplified, which also fixed a double drm_dev_unregister()
> situation.
>
> Noralf.
>
> Noralf Trønnes (6):
>drm: Fix drm_release() and device unplug
>drm/drv: Prepare to remove drm_dev_unplug()
>drm/amd: Use drm_dev_unregister()
>drm/udl: Use drm_dev_unregister()
>drm/xen: Use drm_dev_unregister()
>drm/drv: Remove drm_dev_unplug()
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  3 +-
>   drivers/gpu/drm/drm_drv.c   | 48 -
>   drivers/gpu/drm/drm_file.c  |  6 ++--
>   drivers/gpu/drm/udl/udl_drv.c   |  3 +-
>   drivers/gpu/drm/xen/xen_drm_front.c |  7 ++--
>   include/drm/drm_drv.h   | 11 +++---
>   6 files changed, 27 insertions(+), 51 deletions(-)
>



Re: [PATCH] drm/amdgu/vce_v3: start vce block before ring test

2019-02-04 Thread Koenig, Christian
On 04.02.19 at 13:44, S, Shirish wrote:
> vce ring test fails during resume since mmVCE_RB_RPTR*
> is not initialized/updated.
>
> Hence start vce block before ring test.

Mhm, I wonder why this ever worked. But yeah, same problem seems to 
exist for VCE 2 as well.

Leo any comment on this?

Thanks,
Christian.

>
> Signed-off-by: Shirish S 
> ---
> * vce_v4_0.c's hw_init sequence already has this change.
>
>   drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
> b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> index 6ec65cf1..d809c10 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
> @@ -469,6 +469,10 @@ static int vce_v3_0_hw_init(void *handle)
>   int r, i;
>   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   
> + r = vce_v3_0_start(adev);
> + if (r)
> + return r;
> +
>   vce_v3_0_override_vce_clock_gating(adev, true);
>   
>   amdgpu_asic_set_vce_clocks(adev, 1, 1);



[PATCH] drm/amdgu/vce_v3: start vce block before ring test

2019-02-04 Thread S, Shirish
vce ring test fails during resume since mmVCE_RB_RPTR*
is not initialized/updated.

Hence start vce block before ring test.

Signed-off-by: Shirish S 
---
* vce_v4_0.c's hw_init sequence already has this change.

 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6ec65cf1..d809c10 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -469,6 +469,10 @@ static int vce_v3_0_hw_init(void *handle)
int r, i;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   r = vce_v3_0_start(adev);
+   if (r)
+   return r;
+
vce_v3_0_override_vce_clock_gating(adev, true);
 
amdgpu_asic_set_vce_clocks(adev, 1, 1);
-- 
2.7.4



[PATCH 6/8] drm/amdgpu: allocate VM PDs/PTs on demand

2019-02-04 Thread Christian König
Let's start to allocate VM PDs/PTs on demand instead of pre-allocating
them during mapping.

Signed-off-by: Christian König 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   6 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c   |   9 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  10 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 136 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|   3 -
 5 files changed, 38 insertions(+), 126 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index d7b10d79f1de..2176c92f9377 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -492,12 +492,6 @@ static int add_bo_to_vm(struct amdgpu_device *adev, struct 
kgd_mem *mem,
vm->process_info->eviction_fence,
NULL, NULL);
 
-   ret = amdgpu_vm_alloc_pts(adev, vm, va, amdgpu_bo_size(bo));
-   if (ret) {
-   pr_err("Failed to allocate pts, err=%d\n", ret);
-   goto err_alloc_pts;
-   }
-
ret = vm_validate_pt_pd_bos(vm);
if (ret) {
pr_err("validate_pt_pd_bos() failed\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 7e22be7ca68a..54dd02a898b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -92,15 +92,6 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
return -ENOMEM;
}
 
-   r = amdgpu_vm_alloc_pts(adev, (*bo_va)->base.vm, csa_addr,
-   size);
-   if (r) {
-   DRM_ERROR("failed to allocate pts for static CSA, err=%d\n", r);
-   amdgpu_vm_bo_rmv(adev, *bo_va);
-   ttm_eu_backoff_reservation(&ticket, &list);
-   return r;
-   }
-
r = amdgpu_vm_bo_map(adev, *bo_va, csa_addr, 0, size,
 AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE |
 AMDGPU_PTE_EXECUTABLE);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d21dd2f369da..e141e3b13112 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -627,11 +627,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 
switch (args->operation) {
case AMDGPU_VA_OP_MAP:
-   r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args->va_address,
-   args->map_size);
-   if (r)
-   goto error_backoff;
-
va_flags = amdgpu_gmc_get_pte_flags(adev, args->flags);
r = amdgpu_vm_bo_map(adev, bo_va, args->va_address,
 args->offset_in_bo, args->map_size,
@@ -647,11 +642,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
args->map_size);
break;
case AMDGPU_VA_OP_REPLACE:
-   r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args->va_address,
-   args->map_size);
-   if (r)
-   goto error_backoff;
-
va_flags = amdgpu_gmc_get_pte_flags(adev, args->flags);
r = amdgpu_vm_bo_replace_map(adev, bo_va, args->va_address,
 args->offset_in_bo, args->map_size,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index f7d410a5d848..0b8a1aa56c4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -504,47 +504,6 @@ static void amdgpu_vm_pt_next(struct amdgpu_device *adev,
}
 }
 
-/**
- * amdgpu_vm_pt_first_leaf - get first leaf PD/PT
- *
- * @adev: amdgpu_device pointer
- * @vm: amdgpu_vm structure
- * @start: start addr of the walk
- * @cursor: state to initialize
- *
- * Start a walk and go directly to the leaf node.
- */
-static void amdgpu_vm_pt_first_leaf(struct amdgpu_device *adev,
-   struct amdgpu_vm *vm, uint64_t start,
-   struct amdgpu_vm_pt_cursor *cursor)
-{
-   amdgpu_vm_pt_start(adev, vm, start, cursor);
-   while (amdgpu_vm_pt_descendant(adev, cursor));
-}
-
-/**
- * amdgpu_vm_pt_next_leaf - get next leaf PD/PT
- *
- * @adev: amdgpu_device pointer
- * @cursor: current state
- *
- * Walk the PD/PT tree to the next leaf node.
- */
-static void amdgpu_vm_pt_next_leaf(struct amdgpu_device *adev,
-  struct amdgpu_vm_pt_cursor *cursor)
-{
-   amdgpu_vm_pt_next(adev, cursor);
-   if (cursor->pfn != ~0ll)
-   while (amdgpu_vm_pt_descendant(adev, cursor));
-}
-
-/**
- * for_each_amdgpu_vm_pt_leaf 

[PATCH 8/8] drm/amdgpu: drop the huge page flag

2019-02-04 Thread Christian König
Not needed any more since we now free PDs/PTs on demand.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  1 -
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a3dd2b397299..8f4fd8a7fb96 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1383,10 +1383,6 @@ static void amdgpu_vm_update_pde(struct 
amdgpu_pte_update_params *params,
uint64_t pde, pt, flags;
unsigned level;
 
-   /* Don't update huge pages here */
-   if (entry->huge)
-   return;
-
for (level = 0, pbo = bo->parent; pbo; ++level)
pbo = pbo->parent;
 
@@ -1649,13 +1645,6 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
continue;
}
 
-   /* If it isn't already handled it can't be a huge page */
-   if (cursor.entry->huge) {
-   /* Add the entry to the relocated list to update it. */
-   cursor.entry->huge = false;
-   amdgpu_vm_bo_relocated(&cursor.entry->base);
-   }
-
shift = amdgpu_vm_level_shift(adev, cursor.level);
parent_shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
if (adev->asic_type < CHIP_VEGA10) {
@@ -1714,9 +1703,8 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
} while (frag_start < entry_end);
 
if (amdgpu_vm_pt_descendant(adev, &cursor)) {
-   /* Mark all child entries as huge */
+   /* Free all child entries */
while (cursor.pfn < frag_start) {
-   cursor.entry->huge = true;
amdgpu_vm_free_pts(adev, params->vm, &cursor);
amdgpu_vm_pt_next(adev, &cursor);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 116605c038d2..3c6537ef659c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -140,7 +140,6 @@ struct amdgpu_vm_bo_base {
 
 struct amdgpu_vm_pt {
struct amdgpu_vm_bo_base        base;
-   boolhuge;
 
/* array of page tables, one for each directory entry */
struct amdgpu_vm_pt *entries;
-- 
2.17.1



Re: [PATCH] drm/amdgpu: add a workaround for GDS ordered append hangs with compute queues

2019-02-04 Thread Christian König
At least from coding style, backward compatibility etc. this looks sane 
to me, so feel free to add an Acked-by.

But I absolutely can't judge if that is correct from the hardware point 
of view or not.

And I think that somebody else looking at this is mandatory for it to be 
committed.

Christian.

On 28.01.19 at 18:25, Marek Olšák wrote:

Ping

On Tue, Jan 22, 2019 at 3:05 PM Marek Olšák wrote:

From: Marek Olšák <marek.ol...@amd.com>

I'm not increasing the DRM version because GDS isn't totally
without bugs yet.

v2: update emit_ib_size

Signed-off-by: Marek Olšák <marek.ol...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h |  2 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   | 19 +++-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   | 21 +++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 40 +++++--
 include/uapi/drm/amdgpu_drm.h           |  5 
 5 files changed, 82 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
index ecbcefe49a98..f89f5734d985 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
@@ -30,20 +30,22 @@ struct amdgpu_bo;
 struct amdgpu_gds_asic_info {
        uint32_t        total_size;
        uint32_t        gfx_partition_size;
        uint32_t        cs_partition_size;
 };

 struct amdgpu_gds {
        struct amdgpu_gds_asic_info     mem;
        struct amdgpu_gds_asic_info     gws;
        struct amdgpu_gds_asic_info     oa;
+       uint32_t gds_compute_max_wave_id;
+
        /* At present, GDS, GWS and OA resources for gfx (graphics)
         * is always pre-allocated and available for graphics
operation.
         * Such resource is shared between all gfx clients.
         * TODO: move this operation to user space
         * */
        struct amdgpu_bo*               gds_gfx_bo;
        struct amdgpu_bo*               gws_gfx_bo;
        struct amdgpu_bo*               oa_gfx_bo;
 };

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 7984292f9282..a59e0fdf5a97 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -2257,20 +2257,36 @@ static void
gfx_v7_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 }

 static void gfx_v7_0_ring_emit_ib_compute(struct amdgpu_ring *ring,
                                          struct amdgpu_job *job,
                                          struct amdgpu_ib *ib,
                                          uint32_t flags)
 {
        unsigned vmid = AMDGPU_JOB_GET_VMID(job);
        u32 control = INDIRECT_BUFFER_VALID | ib->length_dw | (vmid << 24);

+       /* Currently, there is a high possibility to get wave ID mismatch
+        * between ME and GDS, leading to a hw deadlock, because ME generates
+        * different wave IDs than the GDS expects. This situation happens
+        * randomly when at least 5 compute pipes use GDS ordered append.
+        * The wave IDs generated by ME are also wrong after suspend/resume.
+        * Those are probably bugs somewhere else in the kernel driver.
+        *
+        * Writing GDS_COMPUTE_MAX_WAVE_ID resets wave ID counters in ME and
+        * GDS to 0 for this ring (me/pipe).
+        */
+       if (ib->flags & AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID) {
+               amdgpu_ring_write(ring, PACKET3(PACKET3_SET_CONFIG_REG, 1));
+               amdgpu_ring_write(ring, mmGDS_COMPUTE_MAX_WAVE_ID - PACKET3_SET_CONFIG_REG_START);
+               amdgpu_ring_write(ring, ring->adev->gds.gds_compute_max_wave_id);
+       }
+
        amdgpu_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
        amdgpu_ring_write(ring,
 #ifdef __BIG_ENDIAN
                                          (2 << 0) |
 #endif
                                          (ib->gpu_addr & 0xFFFFFFFC));
        amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr) & 0xFFFFFFFF);
        amdgpu_ring_write(ring, control);
 }

@@ -4993,21 +5009,21 @@ static const struct amdgpu_ring_funcs
gfx_v7_0_ring_funcs_compute = {
        .get_rptr = gfx_v7_0_ring_get_rptr,
        .get_wptr = gfx_v7_0_ring_get_wptr_compute,
        .set_wptr = gfx_v7_0_ring_set_wptr_compute,
        .emit_frame_size =
                20 + /* gfx_v7_0_ring_emit_gds_switch */
                7 + /* gfx_v7_0_ring_emit_hdp_flush */
                5 + /* hdp invalidate */
                7 + /* gfx_v7_0_ring_emit_pipeline_sync */
                CIK_FLUSH_GPU_TLB_NUM_WREG * 5 + 7 + /*

[PATCH 5/8] drm/amdgpu: let amdgpu_vm_clear_bo figure out ats status

2019-02-04 Thread Christian König
Instead of providing it from the outside, figure out the ats status in
the function itself from the data structures.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 52 ++
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 283a9e9878be..f7d410a5d848 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -747,8 +747,6 @@ bool amdgpu_vm_ready(struct amdgpu_vm *vm)
  * @adev: amdgpu_device pointer
  * @vm: VM to clear BO from
  * @bo: BO to clear
- * @level: level this BO is at
- * @pte_support_ats: indicate ATS support from PTE
  *
  * Root PD needs to be reserved when calling this.
  *
@@ -756,10 +754,11 @@ bool amdgpu_vm_ready(struct amdgpu_vm *vm)
  * 0 on success, errno otherwise.
  */
 static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
- struct amdgpu_vm *vm, struct amdgpu_bo *bo,
- unsigned level, bool pte_support_ats)
+ struct amdgpu_vm *vm,
+ struct amdgpu_bo *bo)
 {
struct ttm_operation_ctx ctx = { true, false };
+   unsigned level = adev->vm_manager.root_level;
struct dma_fence *fence = NULL;
unsigned entries, ats_entries;
struct amdgpu_ring *ring;
@@ -768,17 +767,32 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
int r;
 
entries = amdgpu_bo_size(bo) / 8;
+   if (vm->pte_support_ats) {
+   ats_entries = amdgpu_vm_level_shift(adev, level);
+   ats_entries += AMDGPU_GPU_PAGE_SHIFT;
+   ats_entries = AMDGPU_GMC_HOLE_START >> ats_entries;
 
-   if (pte_support_ats) {
-   if (level == adev->vm_manager.root_level) {
-   ats_entries = amdgpu_vm_level_shift(adev, level);
-   ats_entries += AMDGPU_GPU_PAGE_SHIFT;
-   ats_entries = AMDGPU_GMC_HOLE_START >> ats_entries;
+   if (!bo->parent) {
ats_entries = min(ats_entries, entries);
entries -= ats_entries;
} else {
-   ats_entries = entries;
-   entries = 0;
+   struct amdgpu_bo *ancestor = bo;
+   struct amdgpu_vm_pt *pt;
+
+   ++level;
+   while (ancestor->parent->parent) {
+   ancestor = ancestor->parent;
+   ++level;
+   }
+
+   pt = container_of(ancestor->vm_bo, struct amdgpu_vm_pt,
+ base);
+   if ((pt - vm->root.entries) >= ats_entries) {
+   ats_entries = 0;
+   } else {
+   ats_entries = entries;
+   entries = 0;
+   }
}
} else {
ats_entries = 0;
@@ -911,7 +925,6 @@ int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
 {
struct amdgpu_vm_pt_cursor cursor;
struct amdgpu_bo *pt;
-   bool ats = false;
uint64_t eaddr;
int r;
 
@@ -921,9 +934,6 @@ int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
 
eaddr = saddr + size - 1;
 
-   if (vm->pte_support_ats)
-   ats = saddr < AMDGPU_GMC_HOLE_START;
-
saddr /= AMDGPU_GPU_PAGE_SIZE;
eaddr /= AMDGPU_GPU_PAGE_SIZE;
 
@@ -972,7 +982,7 @@ int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
 
amdgpu_vm_bo_base_init(&entry->base, vm, pt);
 
-   r = amdgpu_vm_clear_bo(adev, vm, pt, cursor.level, ats);
+   r = amdgpu_vm_clear_bo(adev, vm, pt);
if (r)
goto error_free_pt;
}
@@ -3069,9 +3079,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
 
amdgpu_vm_bo_base_init(&vm->root.base, vm, root);
 
-   r = amdgpu_vm_clear_bo(adev, vm, root,
-  adev->vm_manager.root_level,
-  vm->pte_support_ats);
+   r = amdgpu_vm_clear_bo(adev, vm, root);
if (r)
goto error_unreserve;
 
@@ -3166,9 +3174,8 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, 
struct amdgpu_vm *vm, uns
 * changing any other state, in case it fails.
 */
if (pte_support_ats != vm->pte_support_ats) {
-   r = amdgpu_vm_clear_bo(adev, vm, vm->root.base.bo,
-  adev->vm_manager.root_level,
-  pte_support_ats);
+   vm->pte_support_ats = pte_support_ats;
+   r = amdgpu_vm_clear_bo(adev, vm, vm->root.base.bo);
if (r)
goto free_idr;
}
@@ 

Re: [PATCH] Revert "Revert "drm/amd/powerplay: support Vega10 SOCclk and DCEFclk dpm level settings""

2019-02-04 Thread Michel Dänzer
On 2019-02-03 4:17 a.m., Kenneth Feng wrote:
> This reverts commit ea37fc706e4cde83b39ad2104eec0241e752b8ea.
> Since we have another patch to fix the below problem,
> we need to revert the 'revert'
> https://bugs.freedesktop.org/show_bug.cgi?id=109462
> Acked by Alex Deucher
> 
> Signed-off-by: Kenneth Feng 

Revert "Revert "..."" is kind of an ugly shortlog. :) It's nicer to
re-use the original commit's log, possibly amended with an explanation
why the change is applied again.
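
For example, something along these lines (wording invented for
illustration):

    drm/amd/powerplay: support Vega10 SOCclk and DCEFclk dpm level settings

    <original commit log>

    Re-apply this after the regression tracked in
    https://bugs.freedesktop.org/show_bug.cgi?id=109462 was fixed by a
    separate change.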


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: After upgrade mesa to 19.0.0~rc1 all vulkan based application stop working ["vulkan-cube" received SIGSEGV in radv_pipeline_init_blend_state at ../src/amd/vulkan/radv_pipeline.c:699]

2019-02-04 Thread Michel Dänzer
On 2019-02-03 7:15 p.m., Mikhail Gavrilov wrote:
> Hi folks.
> Today I tried updating mesa to 19.0.0~rc1, but after the update all
> Vulkan-based applications stopped working.
> 
> List of tested applications:
> - Rise of the Tomb Raider
> - F1 2017
> - Any game launched via Steam Play DXVK
> - vulkan-cube (backtrace is included in this message)
> 
> Last working mesa version:
> 18.3.2-1

Please report RADV issues at
https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa&component=Drivers/Vulkan/radeon .


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 0/6] drm/drv: Remove drm_dev_unplug()

2019-02-04 Thread Daniel Vetter
On Sun, Feb 03, 2019 at 04:41:54PM +0100, Noralf Trønnes wrote:
> This series removes drm_dev_unplug() and moves the unplugged state
> setting to drm_dev_unregister(). All drivers will now have access to the
> unplugged state if they so desire.
> 
> The drm_device ref handling wrt the last fd closed after unregister
> has been simplified, which also fixed a double drm_dev_unregister()
> situation.

Nice. On the series:

Reviewed-by: Daniel Vetter 

But I'd say definitely wait 1-2 weeks for some more driver acks before
applying this all.
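
FWIW, the check drivers can then do in their entry points stays trivial.
Just a sketch, with example_driver_ioctl being a made-up name:

    #include <drm/drm_drv.h>
    #include <drm/drm_file.h>

    /* Bail out of a driver entry point once the device is gone. With
     * this series the unplugged state is set by drm_dev_unregister()
     * itself, so every driver can rely on it.
     */
    static int example_driver_ioctl(struct drm_device *dev, void *data,
                                    struct drm_file *file)
    {
            if (drm_dev_is_unplugged(dev))
                    return -ENODEV;

            /* hardware/device state access goes below this point */
            return 0;
    }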
-Daniel

> 
> Noralf.
> 
> Noralf Trønnes (6):
>   drm: Fix drm_release() and device unplug
>   drm/drv: Prepare to remove drm_dev_unplug()
>   drm/amd: Use drm_dev_unregister()
>   drm/udl: Use drm_dev_unregister()
>   drm/xen: Use drm_dev_unregister()
>   drm/drv: Remove drm_dev_unplug()
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  3 +-
>  drivers/gpu/drm/drm_drv.c   | 48 -
>  drivers/gpu/drm/drm_file.c  |  6 ++--
>  drivers/gpu/drm/udl/udl_drv.c   |  3 +-
>  drivers/gpu/drm/xen/xen_drm_front.c |  7 ++--
>  include/drm/drm_drv.h   | 11 +++---
>  6 files changed, 27 insertions(+), 51 deletions(-)
> 
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH] drm/amd/display: Use context parameters to enable FBC

2019-02-04 Thread S, Shirish
[What]
FBC fails to get enabled when switching between LINEAR (console/VT)
and non-LINEAR (GUI) based rendering, because the decision to turn
FBC on or off is based on the default tiling info stored in
current_state.

[How]
Use the context structure's tiling information, which is coherent
with the screen updates.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index db0ef41..fd7cd5b 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -2535,7 +2535,7 @@ static void dce110_apply_ctx_for_surface(
}
 
if (dc->fbc_compressor)
-   enable_fbc(dc, dc->current_state);
+   enable_fbc(dc, context);
 }
 
 static void dce110_power_down_fe(struct dc *dc, struct pipe_ctx *pipe_ctx)
-- 
2.7.4



After upgrade mesa to 19.0.0~rc1 all vulkan based application stop working ["vulkan-cube" received SIGSEGV in radv_pipeline_init_blend_state at ../src/amd/vulkan/radv_pipeline.c:699]

2019-02-04 Thread Mikhail Gavrilov
Hi folks.
Today I tried updating mesa to 19.0.0~rc1, but after the update all
Vulkan-based applications stopped working.

List of tested applications:
- Rise of the Tomb Raider
- F1 2017
- Any game launched via Steam Play DXVK
- vulkan-cube (backtrace is included in this message)

Last working mesa version:
18.3.2-1

--
Best Regards,
Mike Gavrilov.
$ gdb vulkan-cube
GNU gdb (GDB) Fedora 8.2.50.20190120-13.fc30
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from vulkan-cube...
Reading symbols from 
/usr/lib/debug/usr/bin/vulkan-cube-1.1.82.0-1.fc29.x86_64.debug...
(gdb) r
Starting program: /usr/bin/vulkan-cube 
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
[New Thread 0x72e3c700 (LWP 11328)]

Thread 1 "vulkan-cube" received signal SIGSEGV, Segmentation fault.
radv_pipeline_init_blend_state (pipeline=0x558e7880, extra=0x7fffbd80, 
pCreateInfo=0x7fffbdf0) at ../src/amd/vulkan/radv_pipeline.c:699
699 VkBlendOp eqRGB = att->colorBlendOp;
(gdb) thread apply all bt full

Thread 2 (Thread 0x72e3c700 (LWP 11328)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x5565e108) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
__ret = -512
oldtype = 0
err = <optimized out>
oldtype = <optimized out>
err = <optimized out>
__ret = <optimized out>
resultvar = <optimized out>
__arg4 = <optimized out>
__arg3 = <optimized out>
__arg2 = <optimized out>
__arg1 = <optimized out>
_a4 = <optimized out>
_a3 = <optimized out>
_a2 = <optimized out>
_a1 = <optimized out>
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x5565e0b8, cond=0x5565e0e0) at pthread_cond_wait.c:502
spin = 0
buffer = {__routine = 0x77bff250 <__condvar_cleanup_waiting>, __arg = 0x72e3bd80, __canceltype = 0, __prev = 0x0}
cbuffer = {wseq = 0, cond = 0x5565e0e0, mutex = 0x5565e0b8, private = 0}
rt = <optimized out>
err = <optimized out>
g = 0
flags = <optimized out>
g1_start = <optimized out>
signals = <optimized out>
result = 0
wseq = 0
seq = 0
private = 0
maxspin = <optimized out>
err = <optimized out>
result = <optimized out>
wseq = <optimized out>
g = <optimized out>
seq = <optimized out>
flags = <optimized out>
private = <optimized out>
signals = <optimized out>
g1_start = <optimized out>
spin = <optimized out>
buffer = <optimized out>
cbuffer = <optimized out>
rt = <optimized out>
s = <optimized out>
#2  __pthread_cond_wait (cond=0x5565e0e0, mutex=0x5565e0b8) at pthread_cond_wait.c:655
No locals.
#3  0x770a14a3 in cnd_wait (mtx=0x5565e0b8, cond=0x5565e0e0) at ../src/../include/c11/threads_posix.h:155
__PRETTY_FUNCTION__ = <optimized out>

[PATCH] drm/amdgpu/display: fix compiler errors [-Werror,-Wparentheses-equality]

2019-02-04 Thread Vishwakarma, Pratik
Remove extraneous parentheses around the comparison
to silence this warning.

Signed-off-by: Pratik Vishwakarma 
---
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c 
b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
index 7d102ac0d61b..1ef0074302c5 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
@@ -63,7 +63,7 @@ void scaler_settings_calculation(struct dcn_bw_internal_vars 
*v)
if (v->interlace_output[k] == 1.0) {
v->v_ratio[k] = 2.0 * v->v_ratio[k];
}
-   if ((v->underscan_output[k] == 1.0)) {
+   if (v->underscan_output[k] == 1.0) {
v->h_ratio[k] = v->h_ratio[k] * v->under_scan_factor;
v->v_ratio[k] = v->v_ratio[k] * v->under_scan_factor;
}
-- 
2.17.1
