[PATCH] drm/amd/powerplay: avoid possible buffer overflow

2019-05-28 Thread Young Xiao
Make sure the clock level to be enforced is within the allowed
ranges for the PP_SCLK and PP_MCLK cases.

Signed-off-by: Young Xiao <92siuy...@gmail.com>
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
index 707cd4b..ae6cbe8 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
@@ -1836,6 +1836,12 @@ static int vega12_force_clock_level(struct pp_hwmgr *hwmgr,
case PP_SCLK:
soft_min_level = mask ? (ffs(mask) - 1) : 0;
soft_max_level = mask ? (fls(mask) - 1) : 0;
+   if (soft_max_level >= data->dpm_table.gfx_table.count) {
+   pr_err("Clock level specified %d is over max allowed 
%d\n",
+   soft_max_level,
+   data->dpm_table.gfx_table.count - 1);
+   return -EINVAL;
+   }
 
 data->dpm_table.gfx_table.dpm_state.soft_min_level =
 data->dpm_table.gfx_table.dpm_levels[soft_min_level].value;
@@ -1856,6 +1862,12 @@ static int vega12_force_clock_level(struct pp_hwmgr *hwmgr,
case PP_MCLK:
soft_min_level = mask ? (ffs(mask) - 1) : 0;
soft_max_level = mask ? (fls(mask) - 1) : 0;
+   if (soft_max_level >= data->dpm_table.mem_table.count) {
+   pr_err("Clock level specified %d is over max allowed %d\n",
+   soft_max_level,
+   data->dpm_table.mem_table.count - 1);
+   return -EINVAL;
+   }
 
 data->dpm_table.mem_table.dpm_state.soft_min_level =
 data->dpm_table.mem_table.dpm_levels[soft_min_level].value;
-- 
2.7.4



[PATCH v2] drm/amdgpu/display: Fix reload driver error

2019-05-28 Thread Emily Deng
Issue:
The following error occurs when reloading the driver:
[ 3986.567739] sysfs: cannot create duplicate filename '/devices/pci:00/:00:07.0/drm_dp_aux_dev'
[ 3986.567743] CPU: 6 PID: 1767 Comm: modprobe Tainted: G   OE 5.0.0-rc1-custom #1
[ 3986.567745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3986.567746] Call Trace:
..
[ 3986.567808]  drm_dp_aux_register_devnode+0xdc/0x140 [drm_kms_helper]
..
[ 3986.569081] kobject_add_internal failed for drm_dp_aux_dev with -EEXIST, don't try to register things with the same name in the same directory.

Reproduce sequence:
1. modprobe amdgpu
2. modprobe -r amdgpu
3. modprobe amdgpu

Root cause:
When unloading the driver, it doesn't unregister the aux.

v2: Don't use has_aux

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 8fe1685..941313b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3760,6 +3760,13 @@ int amdgpu_dm_connector_atomic_get_property(struct drm_connector *connector,
return ret;
 }
 
+static void amdgpu_dm_connector_unregister(struct drm_connector *connector)
+{
+   struct amdgpu_dm_connector *amdgpu_dm_connector = to_amdgpu_dm_connector(connector);
+
+   drm_dp_aux_unregister(&amdgpu_dm_connector->dm_dp_aux.aux);
+}
+
 static void amdgpu_dm_connector_destroy(struct drm_connector *connector)
 {
 struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector);
@@ -3788,6 +3795,11 @@ static void amdgpu_dm_connector_destroy(struct drm_connector *connector)
 drm_dp_cec_unregister_connector(&aconnector->dm_dp_aux.aux);
drm_connector_unregister(connector);
drm_connector_cleanup(connector);
+   if (aconnector->i2c) {
+   i2c_del_adapter(&aconnector->i2c->base);
+   kfree(aconnector->i2c);
+   }
+
kfree(connector);
 }
 
@@ -3846,7 +3858,8 @@ static const struct drm_connector_funcs amdgpu_dm_connector_funcs = {
.atomic_duplicate_state = amdgpu_dm_connector_atomic_duplicate_state,
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
.atomic_set_property = amdgpu_dm_connector_atomic_set_property,
-   .atomic_get_property = amdgpu_dm_connector_atomic_get_property
+   .atomic_get_property = amdgpu_dm_connector_atomic_get_property,
+   .early_unregister = amdgpu_dm_connector_unregister
 };
 
 static int get_modes(struct drm_connector *connector)
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

2019-05-28 Thread Deng, Emily
Hi Christian,
 I have reverted the previous change as you suggested and sent this new
patch. Could you help review it?

Best wishes
Emily Deng



>-Original Message-
>From: amd-gfx  On Behalf Of Deng,
>Emily
>Sent: Wednesday, May 29, 2019 10:52 AM
>To: amd-gfx@lists.freedesktop.org
>Subject: RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer
>
>Ping..
>
>Best wishes
>Emily Deng
>
>
>
>>-Original Message-
>>From: Deng, Emily 
>>Sent: Tuesday, May 28, 2019 6:14 PM
>>To: Deng, Emily ; amd-gfx@lists.freedesktop.org
>>Subject: RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer
>>
>>Ping ..
>>
>>Best wishes
>>Emily Deng
>>
>>
>>
>>>-Original Message-
>>>From: amd-gfx  On Behalf Of
>>>Emily Deng
>>>Sent: Tuesday, May 28, 2019 4:06 PM
>>>To: amd-gfx@lists.freedesktop.org
>>>Cc: Deng, Emily 
>>>Subject: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer
>>>
>>>As gfx_v9_0_sw_fini destroys clear_state_obj and also unpins it, there is
>>>no need to call amdgpu_bo_free_kernel in gfx_v9_0_sw_fini, or it will
>>>trigger an unpin warning.
>>>
>>>Signed-off-by: Emily Deng 
>>>---
>>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +---
>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>>b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>>index c763733..cc5a382 100644
>>>--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>>+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>>@@ -1794,9 +1794,7 @@ static int gfx_v9_0_sw_fini(void *handle)
>>>
>>> gfx_v9_0_mec_fini(adev);
>>> gfx_v9_0_ngg_fini(adev);
>>>-amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj,
>>>-&adev->gfx.rlc.clear_state_gpu_addr,
>>>-(void **)&adev->gfx.rlc.cs_ptr);
>>>+amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
>>> if (adev->asic_type == CHIP_RAVEN) {
>>> amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj,
>>> &adev->gfx.rlc.cp_table_gpu_addr,
>>>--
>>>2.7.4
>>>
>-Original Message-
>From: Koenig, Christian 
>Sent: Tuesday, May 28, 2019 3:43 PM
>To: Deng, Emily ; Quan, Evan
>; amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>
>Am 28.05.19 um 09:38 schrieb Deng, Emily:
>>> -Original Message-
>>> From: Koenig, Christian 
>>> Sent: Tuesday, May 28, 2019 3:04 PM
>>> To: Quan, Evan ; Deng, Emily
>;
>>> amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>>>
>>> Ok in this case the patch is a NAK.
>>>
>>> The correct solution is to stop using amdgpu_bo_free_kernel in
>>> gfx_v9_0_sw_fini.
>> So we would just leak the memory here and not destroy the bo? I don't think
>> it is correct.
>
>Oh, no. That's not what I meant.
>
>We should stop using amdgpu_bo_free_kernel and instead use
>amdgpu_bo_free!

>Sorry for not being clear here,
>Christian.
Thanks for your good suggestion. I will revert this patch and submit another
one.

Best wishes
Emily Deng
>
>>> BTW: Are we using the kernel pointer somewhere? Cause that one
>became
>>> completely invalid because of patch "drm/amdgpu: pin the csb buffer
>>> on hw init".
>>>
>>> Christian.
>>>
>>> Am 28.05.19 um 03:42 schrieb Quan, Evan:
 The original unpin in hw_fini was introduced by
 https://lists.freedesktop.org/archives/amd-gfx/2018-July/023681.html

 Evan
> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian K?nig
> Sent: Monday, May 27, 2019 7:02 PM
> To: Deng, Emily ; amd-
>g...@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>
> Am 27.05.19 um 10:41 schrieb Emily Deng:
>> As gfx_v9_0_sw_fini destroys clear_state_obj and also unpins it, there is
>> no need to call csb_vram unpin in gfx_v9_0_hw_fini, or it will trigger an
>> unpin warning.
>>
>> v2: For suspend, still need to do unpin
>>
>> Signed-off-by: Emily Deng 
>> ---
>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> index 5eb70e8..5b1ff48 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> @@ -3395,7 +3395,8 @@ static int gfx_v9_0_hw_fini(void *handle)
>>  gfx_v9_0_cp_enable(adev, false);
>>  adev->gfx.rlc.funcs->stop(adev);
>>
>> -gfx_v9_0_csb_vram_unpin(adev);
>> +if 

RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

2019-05-28 Thread Deng, Emily
Ping..

Best wishes
Emily Deng



>-Original Message-
>From: Deng, Emily 
>Sent: Tuesday, May 28, 2019 6:14 PM
>To: Deng, Emily ; amd-gfx@lists.freedesktop.org
>Subject: RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer
>
>Ping ..
>
>Best wishes
>Emily Deng
>
>
>
>>-Original Message-
>>From: amd-gfx  On Behalf Of
>>Emily Deng
>>Sent: Tuesday, May 28, 2019 4:06 PM
>>To: amd-gfx@lists.freedesktop.org
>>Cc: Deng, Emily 
>>Subject: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer
>>
>>As gfx_v9_0_sw_fini destroys clear_state_obj and also unpins it, there is
>>no need to call amdgpu_bo_free_kernel in gfx_v9_0_sw_fini, or it will
>>trigger an unpin warning.
>>
>>Signed-off-by: Emily Deng 
>>---
>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +---
>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>index c763733..cc5a382 100644
>>--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>@@ -1794,9 +1794,7 @@ static int gfx_v9_0_sw_fini(void *handle)
>>
>>  gfx_v9_0_mec_fini(adev);
>>  gfx_v9_0_ngg_fini(adev);
>>- amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj,
>>- &adev->gfx.rlc.clear_state_gpu_addr,
>>- (void **)&adev->gfx.rlc.cs_ptr);
>>+ amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
>>  if (adev->asic_type == CHIP_RAVEN) {
>>  amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj,
>>  &adev->gfx.rlc.cp_table_gpu_addr,
>>--
>>2.7.4
>>

RE: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 3.0

2019-05-28 Thread Li, Ching-shih (Louis)
Hi Christian,

Your explanation matches my code study and test results. I will remove the
original read; Shirish and I will re-test, and I will submit the patch once
testing is done.
As for the other related code, yes, I assume there may be similar issues as
well. My plan is to create an internal ticket and write test code to check
whether any problem can be hit; then we can fix it. The current patch focuses
on the current Chrome issue.

Thanks for your review and information.

BR,
Louis

From: Christian König 
Sent: Tuesday, May 28, 2019 3:24 PM
To: Li, Ching-shih (Louis) ; Liu, Leo ; 
S, Shirish ; Grodzovsky, Andrey ; 
Zhang, Jerry ; Deng, Emily ; Deucher, 
Alexander 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 
3.0

Wow, really good catch!

The underlying problem is most likely that VCE block is either power or clock 
gated and because of this the readptr read always returns zero.

Now amdgpu_ring_alloc() informs the power management code that the block is 
about to be used and so the gating is turned off.

Mhm, that is probably wrong in a whole bunch of other places; at least the UVD
and VCN code comes to mind.

I agree with Leo that you should remove the original read (so as not to read
twice) and it would be really nice if you could double check the other code
(UVD/VCN) for similar problems as well.

Regards,
Christian.

Am 27.05.19 um 19:20 schrieb Li, Ching-shih (Louis):
I don't mean to read it twice. The solution is to move the first read later. I
didn't modify the original code, to keep the diff small and simple. I guess
removing the original read there should work.


From: Liu, Leo 
Sent: Tuesday, May 28, 2019 12:40 AM
To: Li, Ching-shih (Louis) 
; S, Shirish 
; Grodzovsky, Andrey 
; Zhang, Jerry 
; Deng, Emily 
; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 
3.0


int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
{
struct amdgpu_device *adev = ring->adev;

uint32_t rptr = amdgpu_ring_get_rptr(ring);

unsigned i;
int r, timeout = adev->usec_timeout;

/* skip ring test for sriov*/
if (amdgpu_sriov_vf(adev))
return 0;

r = amdgpu_ring_alloc(ring, 16);
if (r)
return r;

amdgpu_ring_write(ring, VCE_CMD_END);
amdgpu_ring_commit(ring);



Above is the original code, where rptr is read when the function is called.
Below is your patch; my question is: why do you need to get rptr twice?

@@ -1084,6 +1084,8 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)

if (r)

   return r;



+   rptr = amdgpu_ring_get_rptr(ring);

+

amdgpu_ring_write(ring, VCE_CMD_END);

amdgpu_ring_commit(ring);




On 5/27/19 12:22 PM, Li, Ching-shih (Louis) wrote:
Hi Leo,

Yes, I confirm it is the root cause of the Chrome S3 issue. Whenever the
system is resumed, the original read always returns zero. However, I have no
idea why it fails, and I haven't verified this problem on a CRB or any other
Linux platform yet.
I think the ideal solution would be an indicator, e.g. a register, for the
driver to check whether the related firmware and hardware are ready, so the
driver can be sure it is OK to read rptr. Without any reference document, I
can only try to solve the problem by modifying the driver. Debug traces reveal
that only the first rptr read fails; the read in the check loop is fine.
Therefore a solution came to mind: read rptr later to obtain the initial
value. Tests prove it works on Chrome platforms. FYI~

BR,
Louis

From: Liu, Leo 
Sent: Monday, May 27, 2019 9:01 PM
To: S, Shirish ; Grodzovsky, 
Andrey ; Zhang, 
Jerry ; Deng, Emily 
; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org; Li, 
Ching-shih (Louis) 
Subject: Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 
3.0



On 5/27/19 3:42 AM, S, Shirish wrote:

From: Louis Li 



[What]

vce ring test fails consistently during resume in the s3 cycle, due to

mismatched read & write pointers.

On debug/analysis it is found that the rptr to be compared is not being

correctly updated/read, which leads to this failure.

Below is the failure signature:

  [drm:amdgpu_vce_ring_test_ring] *ERROR* amdgpu: ring 12 test failed

  [drm:amdgpu_device_ip_resume_phase2] *ERROR* resume of IP block  
failed -110

  [drm:amdgpu_device_resume] *ERROR* amdgpu_device_ip_resume failed (-110).



[How]

fetch rptr 

Re: [PATCH v15 05/17] arm64: untag user pointers passed to memory syscalls

2019-05-28 Thread Khalid Aziz
On Tue, 2019-05-28 at 16:40 +0100, Catalin Marinas wrote:
> On Tue, May 28, 2019 at 03:54:11PM +0100, Andrew Murray wrote:
> > On Mon, May 27, 2019 at 03:37:20PM +0100, Catalin Marinas wrote:
> > > On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> > > > This patch is a part of a series that extends arm64 kernel ABI
> > > > to allow to
> > > > pass tagged user pointers (with the top byte set to something
> > > > other than 0x00) as syscall arguments.
> > > > 
> > > > This patch allows tagged pointers to be passed to the following
> > > > memory
> > > > syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock,
> > > > mlock2,
> > > > mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
> > > > remap_file_pages, shmat and shmdt.
> > > > 
> > > > This is done by untagging pointers passed to these syscalls in
> > > > the
> > > > prologues of their handlers.
> > > > 
> > > > Signed-off-by: Andrey Konovalov 
> > > 
> > > Actually, I don't think any of these wrappers get called (have
> > > you
> > > tested this patch?). Following commit 4378a7d4be30 ("arm64:
> > > implement
> > > syscall wrappers"), I think we have other macro names for
> > > overriding the
> > > sys_* ones.
> > 
> > What is the value in adding these wrappers?
> 
> Not much value, initially proposed just to keep the core changes
> small.
> I'm fine with moving them back in the generic code (but see below).
> 
> I think another aspect is how we define the ABI. Is allowing tags to
> mlock() for example something specific to arm64 or would sparc ADI
> need
> the same? In the absence of other architectures defining such ABI, my
> preference would be to keep the wrappers in the arch code.
> 
> Assuming sparc won't implement untagged_addr(), we can place the
> macros
> back in the generic code but, as per the review here, we need to be
> more
> restrictive on where we allow tagged addresses. For example, if
> mmap()
> gets a tagged address with MAP_FIXED, is it expected to return the
> tag?

I would recommend against any ABI differences between ARM64 MTE/TBI and
sparc ADI unless it simply can not be helped. My understanding of
MTE/TBI is limited, so I will explain how sparc ADI works. On sparc, a
tagged address has no meaning until following steps happen:

1. set the user mode PSTATE.mcde bit. This acts as the master switch to
enable ADI for a process.

2. set TTE.mcd bit on TLB entries that match the address range ADI is
being enabled on.

3. Store version tag for the range of addresses userspace wants ADI
enabled on using "stxa" instruction. These tags are stored in physical
memory by the memory controller.

Steps 1 and 2 are accomplished by userspace by calling mprotect() with
PROT_ADI. Tags are set by storing tags in a loop, for example:

version = 10;
tmp_addr = shmaddr;
end = shmaddr + BUFFER_SIZE;
while (tmp_addr < end) {
asm volatile(
"stxa %1, [%0]0x90\n\t"
:
: "r" (tmp_addr), "r" (version));
tmp_addr += adi_blksz;
}

With these semantics, giving mmap() or shmat() a tagged address is
meaningless since no tags have been stored at the addresses mmap() will
allocate, and one cannot store tags before the memory range has been
allocated. If we choose to allow tagged addresses to come into mmap()
and shmat(), sparc code can strip the tags unconditionally and that may
help simplify the ABI and/or code.

> 
> My thoughts on allowing tags (quick look):
> 
> brk - no
> get_mempolicy - yes
> madvise - yes
> mbind - yes
> mincore - yes
> mlock, mlock2, munlock - yes
> mmap - no (we may change this with MTE but not for TBI)
> mmap_pgoff - not used on arm64
> mprotect - yes
> mremap - yes for old_address, no for new_address (on par with mmap)
> msync - yes
> munmap - probably no (mmap does not return tagged ptrs)
> remap_file_pages - no (also deprecated syscall)
> shmat, shmdt - shall we allow tagged addresses on shared memory?
> 
> The above is only about the TBI ABI while ignoring hardware MTE. For
> the
> latter, we may want to change the mmap() to allow pre-colouring on
> page
> fault which means that munmap()/mprotect() should also support tagged
> pointers. Possibly mremap() as well but we need to decide whether it
> should allow re-colouring the page (probably no, in which case
> old_address and new_address should have the same tag). For some of
> these
> we'll end up with arm64 specific wrappers again, unless sparc ADI
> adopts
> exactly the same ABI restrictions.
> 

Let us keep any restrictions common across ARM64 and sparc. Pre-coloring
on sparc in the kernel would mean the kernel will have to execute
stxa instructions in a loop for each page being faulted in. Not that
big a deal, but doesn't that assume the entire page has the same tag,
which is deduced from the upper bits of the address? Shouldn't we support
tags at the same granularity level as what the hardware supports? We

[PATCH 1/2] mm/hmm.c: support automatic NUMA balancing

2019-05-28 Thread Kuehling, Felix
From: Philip Yang 

While a page is being migrated by NUMA balancing, HMM fails to detect this
condition and still returns the old page. The application will use the new,
migrated page, but the driver passes the old page's physical address to the
GPU, which crashes the application later.

Use pte_protnone(pte) to detect this condition; hmm_vma_do_fault will then
allocate a new page.

Link: http://lkml.kernel.org/r/20190510195258.9930-2-felix.kuehl...@amd.com
Signed-off-by: Philip Yang 
Signed-off-by: Felix Kuehling 
Reviewed-by: Jérôme Glisse 
Cc: Alex Deucher 
Cc: Dave Airlie 
Signed-off-by: Andrew Morton 
---
 mm/hmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 0db8491090b8..599d8e82db67 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -559,7 +559,7 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk,
 
 static inline uint64_t pte_to_hmm_pfn_flags(struct hmm_range *range, pte_t pte)
 {
-   if (pte_none(pte) || !pte_present(pte))
+   if (pte_none(pte) || !pte_present(pte) || pte_protnone(pte))
return 0;
return pte_write(pte) ? range->flags[HMM_PFN_VALID] |
range->flags[HMM_PFN_WRITE] :
-- 
2.17.1


[PATCH 0/2] Two HMM patches from MMOTS

2019-05-28 Thread Kuehling, Felix
These are two important HMM bug fixes to fix the HMM-based userptr
implementation. They are already staged in MMOTS:
https://www.ozlabs.org/~akpm/mmots/broken-out/

Kuehling, Felix (1):
  mm/hmm.c: only set FAULT_FLAG_ALLOW_RETRY for non-blocking

Philip Yang (1):
  mm/hmm.c: support automatic NUMA balancing

 mm/hmm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
2.17.1


[PATCH 2/2] mm/hmm.c: only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-28 Thread Kuehling, Felix
From: "Kuehling, Felix" 

Don't set this flag by default in hmm_vma_do_fault. It is set
conditionally just a few lines below. Setting it unconditionally
can lead to handle_mm_fault doing a non-blocking fault, returning
-EBUSY and unlocking mmap_sem unexpectedly.

Link: http://lkml.kernel.org/r/20190510195258.9930-3-felix.kuehl...@amd.com
Signed-off-by: Felix Kuehling 
Reviewed-by: Jérôme Glisse 
Cc: Alex Deucher 
Cc: Dave Airlie 
Signed-off-by: Andrew Morton 
---
 mm/hmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 599d8e82db67..018cb5c241a6 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -339,7 +339,7 @@ struct hmm_vma_walk {
 static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
bool write_fault, uint64_t *pfn)
 {
-   unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;
+   unsigned int flags = FAULT_FLAG_REMOTE;
struct hmm_vma_walk *hmm_vma_walk = walk->private;
struct hmm_range *range = hmm_vma_walk->range;
struct vm_area_struct *vma = walk->vma;
-- 
2.17.1


[PATCH 1/3] drm/amdkfd: Simplify eviction state logic

2019-05-28 Thread Kuehling, Felix
Always mark evicted queues with q->properties.is_evicted = true, even
queues that are inactive for other reason. This simplifies maintaining
the eviction state as it doesn't require updating is_evicted when other
queue activation conditions change.

On the other hand, we now need to check those other queue activation
conditions whenever an evicted queue is restored. To minimize code
duplication, move the queue activation check into a macro so it can be
maintained in one central place.

Signed-off-by: Felix Kuehling 
Reviewed-by: Philip Cox 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 84 ++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 15 +---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 10 +--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 10 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  5 ++
 5 files changed, 56 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index ece35c7a77b5..c18355d4cb95 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -289,13 +289,11 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
}
q->properties.vmid = qpd->vmid;
/*
-* Eviction state logic: we only mark active queues as evicted
-* to avoid the overhead of restoring inactive queues later
+* Eviction state logic: mark all queues as evicted, even ones
+* not currently active. Restoring inactive queues later only
+* updates the is_evicted flag but is a no-op otherwise.
 */
-   if (qpd->evicted)
-   q->properties.is_evicted = (q->properties.queue_size > 0 &&
-   q->properties.queue_percent > 0 &&
-   q->properties.queue_address != 0);
+   q->properties.is_evicted = !!qpd->evicted;
 
q->properties.tba_addr = qpd->tba_addr;
q->properties.tma_addr = qpd->tma_addr;
@@ -518,14 +516,6 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
}
mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
q->properties.type)];
-   /*
-* Eviction state logic: we only mark active queues as evicted
-* to avoid the overhead of restoring inactive queues later
-*/
-   if (pdd->qpd.evicted)
-   q->properties.is_evicted = (q->properties.queue_size > 0 &&
-   q->properties.queue_percent > 0 &&
-   q->properties.queue_address != 0);
 
/* Save previous activity state for counters */
prev_active = q->properties.is_active;
@@ -590,7 +580,7 @@ static int evict_process_queues_nocpsch(struct device_queue_manager *dqm,
struct queue *q;
struct mqd_manager *mqd_mgr;
struct kfd_process_device *pdd;
-   int retval = 0;
+   int retval, ret = 0;
 
dqm_lock(dqm);
if (qpd->evicted++ > 0) /* already evicted, do nothing */
@@ -600,25 +590,31 @@ static int evict_process_queues_nocpsch(struct device_queue_manager *dqm,
pr_info_ratelimited("Evicting PASID %u queues\n",
pdd->process->pasid);
 
-   /* unactivate all active queues on the qpd */
+   /* Mark all queues as evicted. Deactivate all active queues on
+* the qpd.
+*/
 list_for_each_entry(q, &qpd->queues_list, list) {
+   q->properties.is_evicted = true;
if (!q->properties.is_active)
continue;
+
mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
q->properties.type)];
-   q->properties.is_evicted = true;
q->properties.is_active = false;
retval = mqd_mgr->destroy_mqd(mqd_mgr, q->mqd,
KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN,
KFD_UNMAP_LATENCY_MS, q->pipe, q->queue);
-   if (retval)
-   goto out;
+   if (retval && !ret)
+   /* Return the first error, but keep going to
+* maintain a consistent eviction state
+*/
+   ret = retval;
dqm->queue_count--;
}
 
 out:
dqm_unlock(dqm);
-   return retval;
+   return ret;
 }
 
 static int evict_process_queues_cpsch(struct device_queue_manager *dqm,
@@ -636,11 +632,14 @@ static int evict_process_queues_cpsch(struct device_queue_manager *dqm,
pr_info_ratelimited("Evicting PASID %u queues\n",
pdd->process->pasid);
 
-   /* unactivate all active queues on the qpd */
+   /* Mark all queues as evicted. Deactivate all active queues on
+* 

[PATCH 3/3] drm/amdkfd: Implement queue priority controls for gfx9

2019-05-28 Thread Kuehling, Felix
From: Jay Cornwall 

Ported from gfx8, no changes in register setup.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index a65efca36217..818944b52e11 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -67,6 +67,12 @@ static void update_cu_mask(struct mqd_manager *mm, void *mqd,
m->compute_static_thread_mgmt_se3);
 }
 
+static void set_priority(struct v9_mqd *m, struct queue_properties *q)
+{
+   m->cp_hqd_pipe_priority = pipe_priority_map[q->priority];
+   m->cp_hqd_queue_priority = q->priority;
+}
+
 static struct kfd_mem_obj *allocate_mqd(struct kfd_dev *kfd,
struct queue_properties *q)
 {
@@ -141,9 +147,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
1 << CP_HQD_QUANTUM__QUANTUM_SCALE__SHIFT |
10 << CP_HQD_QUANTUM__QUANTUM_DURATION__SHIFT;
 
-   m->cp_hqd_pipe_priority = 1;
-   m->cp_hqd_queue_priority = 15;
-
if (q->format == KFD_QUEUE_FORMAT_AQL) {
m->cp_hqd_aql_control =
1 << CP_HQD_AQL_CONTROL__CONTROL0__SHIFT;
@@ -246,6 +249,7 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
m->cp_hqd_ctx_save_control = 0;
 
update_cu_mask(mm, mqd, q);
+   set_priority(m, q);
 
q->is_active = QUEUE_IS_ACTIVE(*q);
 
-- 
2.17.1


[PATCH 0/3] KFD upstreaming

2019-05-28 Thread Kuehling, Felix
New feature: queue priorities

The eviction state logic change is preparation for some debugger support
we're working on but haven't settled on the final ABI yet.

Felix Kuehling (1):
  drm/amdkfd: Simplify eviction state logic

Jay Cornwall (1):
  drm/amdkfd: Implement queue priority controls for gfx9

ozeng (1):
  drm/amdkfd: CP queue priority controls

 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 84 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  | 20 +
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  2 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 27 +++---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 20 ++---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 22 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 16 
 7 files changed, 113 insertions(+), 78 deletions(-)

-- 
2.17.1


[PATCH 2/3] drm/amdkfd: CP queue priority controls

2019-05-28 Thread Kuehling, Felix
From: ozeng 

Translate queue priority into pipe priority and write to MQDs.
The priority values are used to perform queue and pipe arbitration.

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  | 20 +++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  2 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 12 ---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 12 ---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 11 ++
 5 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 9307811bc427..cc04b362f510 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -25,6 +25,26 @@
 #include "amdgpu_amdkfd.h"
 #include "kfd_device_queue_manager.h"
 
+/* Mapping queue priority to pipe priority, indexed by queue priority */
+int pipe_priority_map[] = {
+   KFD_PIPE_PRIORITY_CS_LOW,
+   KFD_PIPE_PRIORITY_CS_LOW,
+   KFD_PIPE_PRIORITY_CS_LOW,
+   KFD_PIPE_PRIORITY_CS_LOW,
+   KFD_PIPE_PRIORITY_CS_LOW,
+   KFD_PIPE_PRIORITY_CS_LOW,
+   KFD_PIPE_PRIORITY_CS_LOW,
+   KFD_PIPE_PRIORITY_CS_MEDIUM,
+   KFD_PIPE_PRIORITY_CS_MEDIUM,
+   KFD_PIPE_PRIORITY_CS_MEDIUM,
+   KFD_PIPE_PRIORITY_CS_MEDIUM,
+   KFD_PIPE_PRIORITY_CS_HIGH,
+   KFD_PIPE_PRIORITY_CS_HIGH,
+   KFD_PIPE_PRIORITY_CS_HIGH,
+   KFD_PIPE_PRIORITY_CS_HIGH,
+   KFD_PIPE_PRIORITY_CS_HIGH
+};
+
 struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_dev *dev)
 {
struct kfd_mem_obj *mqd_mem_obj = NULL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
index 56af256a191b..66b8c67e5340 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
@@ -62,7 +62,7 @@
  * per KFD_MQD_TYPE for each device.
  *
  */
-
+extern int pipe_priority_map[];
 struct mqd_manager {
int (*init_mqd)(struct mqd_manager *mm, void **mqd,
struct kfd_mem_obj **mqd_mem_obj, uint64_t *gart_addr,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 5370a897526a..e911438d76b3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -66,6 +66,12 @@ static void update_cu_mask(struct mqd_manager *mm, void *mqd,
m->compute_static_thread_mgmt_se3);
 }
 
+static void set_priority(struct cik_mqd *m, struct queue_properties *q)
+{
+   m->cp_hqd_pipe_priority = pipe_priority_map[q->priority];
+   m->cp_hqd_queue_priority = q->priority;
+}
+
 static struct kfd_mem_obj *allocate_mqd(struct kfd_dev *kfd,
struct queue_properties *q)
 {
@@ -81,7 +87,6 @@ static struct kfd_mem_obj *allocate_mqd(struct kfd_dev *kfd,
return mqd_mem_obj;
 }
 
-
 static int init_mqd(struct mqd_manager *mm, void **mqd,
struct kfd_mem_obj **mqd_mem_obj, uint64_t *gart_addr,
struct queue_properties *q)
@@ -131,8 +136,7 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 * 1 = CS_MEDIUM (typically between HP3D and GFX
 * 2 = CS_HIGH (typically above HP3D)
 */
-   m->cp_hqd_pipe_priority = 1;
-   m->cp_hqd_queue_priority = 15;
+   set_priority(m, q);
 
if (q->format == KFD_QUEUE_FORMAT_AQL)
m->cp_hqd_iq_rptr = AQL_ENABLE;
@@ -230,6 +234,7 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
m->cp_hqd_pq_control |= NO_UPDATE_RPTR;
 
update_cu_mask(mm, mqd, q);
+   set_priority(m, q);
 
q->is_active = QUEUE_IS_ACTIVE(*q);
 
@@ -354,6 +359,7 @@ static int update_mqd_hiq(struct mqd_manager *mm, void *mqd,
 
q->is_active = QUEUE_IS_ACTIVE(*q);
 
+   set_priority(m, q);
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index 335aded8855d..00e6a5955120 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -31,6 +31,7 @@
 #include "gca/gfx_8_0_sh_mask.h"
 #include "gca/gfx_8_0_enum.h"
 #include "oss/oss_3_0_sh_mask.h"
+
 #define CP_MQD_CONTROL__PRIV_STATE__SHIFT 0x8
 
 static inline struct vi_mqd *get_mqd(void *mqd)
@@ -68,6 +69,12 @@ static void update_cu_mask(struct mqd_manager *mm, void *mqd,
m->compute_static_thread_mgmt_se3);
 }
 
+static void set_priority(struct vi_mqd *m, struct queue_properties *q)
+{
+   m->cp_hqd_pipe_priority = pipe_priority_map[q->priority];
+   m->cp_hqd_queue_priority = q->priority;
+}
+
 static struct kfd_mem_obj *allocate_mqd(struct kfd_dev *kfd,
struct 

Re: [PATCH] drm/amdkfd: Fix a potential circular lock

2019-05-28 Thread Kuehling, Felix
On 2019-05-28 2:28 p.m., Zeng, Oak wrote:
> The idea to break the circular lock dependency is to unlock dqm
> temporarily before calling init_mqd in call stack #1 (see below)
>
> [  513.604034] ==
> [  513.604205] WARNING: possible circular locking dependency detected
> [  513.604375] 4.18.0-kfd-root #2 Tainted: GW
> [  513.604530] --
> [  513.604699] kswapd0/611 is trying to acquire lock:
> [  513.604840] d254022e (&dqm->lock_hidden){+.+.}, at: evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
> [  513.605150]
> but task is already holding lock:
> [  513.605307] 961547fc (&anon_vma->rwsem){}, at: page_lock_anon_vma_read+0xe4/0x250
> [  513.605540]
> which lock already depends on the new lock.
>
> [  513.605747]
> the existing dependency chain (in reverse order) is:
> [  513.605944]
> -> #4 (&anon_vma->rwsem){}:
> [  513.606106]__vma_adjust+0x147/0x7f0
> [  513.606231]__split_vma+0x179/0x190
> [  513.606353]mprotect_fixup+0x217/0x260
> [  513.606553]do_mprotect_pkey+0x211/0x380
> [  513.606752]__x64_sys_mprotect+0x1b/0x20
> [  513.606954]do_syscall_64+0x50/0x1a0
> [  513.607149]entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  513.607380]
> -> #3 (&mapping->i_mmap_rwsem){}:
> [  513.607678]rmap_walk_file+0x1f0/0x280
> [  513.607887]page_referenced+0xdd/0x180
> [  513.608081]shrink_page_list+0x853/0xcb0
> [  513.608279]shrink_inactive_list+0x33b/0x700
> [  513.608483]shrink_node_memcg+0x37a/0x7f0
> [  513.608682]shrink_node+0xd8/0x490
> [  513.608869]balance_pgdat+0x18b/0x3b0
> [  513.609062]kswapd+0x203/0x5c0
> [  513.609241]kthread+0x100/0x140
> [  513.609420]ret_from_fork+0x24/0x30
> [  513.609607]
> -> #2 (fs_reclaim){+.+.}:
> [  513.609883]kmem_cache_alloc_trace+0x34/0x2e0
> [  513.610093]reservation_object_reserve_shared+0x139/0x300
> [  513.610326]ttm_bo_init_reserved+0x291/0x480 [ttm]
> [  513.610567]amdgpu_bo_do_create+0x1d2/0x650 [amdgpu]
> [  513.610811]amdgpu_bo_create+0x40/0x1f0 [amdgpu]
> [  513.611041]amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu]
> [  513.611290]amdgpu_bo_create_kernel+0x12/0x70 [amdgpu]
> [  513.611584]amdgpu_ttm_init+0x2cb/0x560 [amdgpu]
> [  513.611823]gmc_v9_0_sw_init+0x400/0x750 [amdgpu]
> [  513.612491]amdgpu_device_init+0x14eb/0x1990 [amdgpu]
> [  513.612730]amdgpu_driver_load_kms+0x78/0x290 [amdgpu]
> [  513.612958]drm_dev_register+0x111/0x1a0
> [  513.613171]amdgpu_pci_probe+0x11c/0x1e0 [amdgpu]
> [  513.613389]local_pci_probe+0x3f/0x90
> [  513.613581]pci_device_probe+0x102/0x1c0
> [  513.613779]driver_probe_device+0x2a7/0x480
> [  513.613984]__driver_attach+0x10a/0x110
> [  513.614179]bus_for_each_dev+0x67/0xc0
> [  513.614372]bus_add_driver+0x1eb/0x260
> [  513.614565]driver_register+0x5b/0xe0
> [  513.614756]do_one_initcall+0xac/0x357
> [  513.614952]do_init_module+0x5b/0x213
> [  513.615145]load_module+0x2542/0x2d30
> [  513.615337]__do_sys_finit_module+0xd2/0x100
> [  513.615541]do_syscall_64+0x50/0x1a0
> [  513.615731]entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  513.615963]
> -> #1 (reservation_ww_class_mutex){+.+.}:
> [  513.616293]amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu]
> [  513.616554]init_mqd+0x223/0x260 [amdgpu]
> [  513.616779]create_queue_nocpsch+0x4d9/0x600 [amdgpu]
> [  513.617031]pqm_create_queue+0x37c/0x520 [amdgpu]
> [  513.617270]kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu]
> [  513.617522]kfd_ioctl+0x202/0x350 [amdgpu]
> [  513.617724]do_vfs_ioctl+0x9f/0x6c0
> [  513.617914]ksys_ioctl+0x66/0x70
> [  513.618095]__x64_sys_ioctl+0x16/0x20
> [  513.618286]do_syscall_64+0x50/0x1a0
> [  513.618476]entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  513.618695]
> -> #0 (&dqm->lock_hidden){+.+.}:
> [  513.618984]__mutex_lock+0x98/0x970
> [  513.619197]evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
> [  513.619459]kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
> [  513.619710]kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
> [  513.620103]amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
> [  513.620363]amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
> [  513.620614]__mmu_notifier_invalidate_range_start+0x70/0xb0
> [  513.620851]try_to_unmap_one+0x7fc/0x8f0
> [  513.621049]rmap_walk_anon+0x121/0x290
> [  513.621242]try_to_unmap+0x93/0xf0
> [  513.621428]shrink_page_list+0x606/0xcb0
> [  513.621625]

Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-28 Thread Dave Airlie
On Wed, 29 May 2019 at 02:47, Emil Velikov  wrote:
>
> On 2019/05/28, Koenig, Christian wrote:
> > Am 28.05.19 um 18:10 schrieb Emil Velikov:
> > > On 2019/05/28, Daniel Vetter wrote:
> > >> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
> > >>  wrote:
> > >>> Am 28.05.19 um 09:38 schrieb Daniel Vetter:
> >  [SNIP]
> > > Might be a good idea looking into reverting it partially, so that at
> > > least command submission and buffer allocation is still blocked.
> >  I thought the issue is a lot more than vainfo, it's pretty much every
> >  hacked up compositor under the sun getting this wrong one way or
> >  another. Thinking about this some more, I also have no idea how you'd
> >  want to deprecate rendering on primary nodes in general. Apparently
> >  that breaks -modesetting already, and probably lots more compositors.
> >  And it looks like we're finally achieving the goal kms set out 10
> >  years ago, and new compositors are sprouting up all the time. I guess
> >  we could just break them all (on new hardware) and tell them to all
> >  suck it up. But I don't think that's a great option. And just
> >  deprecating this on amdgpu is going to be even harder, since then
> >  everywhere else it'll keep working, and it's just amdgpu.ko that looks
> >  broken.
> > 
> >  Aside: I'm not supporting Emil's idea here because it fixes any issues
> >  Intel has - Intel doesn't care. I support it because reality sucks,
> >  people get this render vs. primary vs. multi-gpu prime wrong all the
> >  time (that's also why we have hardcoded display+gpu pairs in mesa for
> >  the various soc combinations out there), and this looks like a
> >  pragmatic solution. It'd be nice if every compositor and everything
> >  else would perfectly support multi gpu and only use render nodes for
> >  rendering, and only primary nodes for display. But reality is that
> >  people hack on stuff until gears on screen and then move on to more
> >  interesting things (to them). So I don't think we'll ever win this :-/
> > >>> Yeah, but this is a classic case of working around user space issues by
> > >>> making kernel changes instead of fixing user space.
> > >>>
> > >>> Having privileged (output control) and unprivileged (rendering control)
> > >>> functionality behind the same node is a mistake we have made a long time
> > >>> ago and render nodes finally seemed to be a way to fix that.
> > >>>
> > >>> I mean why are compositors using the primary node in the first place?
> > >>> Because they want to have access to privileged resources I think and in
> > >>> this case it is perfectly ok to do so.
> > >>>
> > >>> Now extending unprivileged access to the primary node actually sounds
> > >>> like a step into the wrong direction to me.
> > >>>
> > >>> I rather think that we should go down the route of completely dropping
> > >>> command submission and buffer allocation through the primary node for
> > >>> non master clients. And then as next step at some point drop support for
> > >>> authentication/flink.
> > >>>
> > >>> I mean we have done this with UMS as well and I don't see much other way
> > >>> to move forward and get rid of those ancient interface in the long term.
> > >> Well kms had some really good benefits that drove quick adoption, like
> > >> "suspend/resume actually has a chance of working" or "comes with
> > >> buffer management so you can run multiple gears".
> > >>
> > >> The render node thing is a lot more niche use case (prime, better priv
> > >> separation), plus "it's cleaner design". And the "cleaner design" part
> > >> is something that empirically doesn't seem to matter :-/ Just two
> > >> examples:
> > >> - KHR_display/leases just iterated display resources on the fd needed
> > >> for rendering (and iirc there was even a patch to expose that for
> > >> render nodes too so it works with DRI3), because implementing
> > >> protocols is too hard. Barely managed to stop that one before it
> > >> happened.
> > >> - Various video players use the vblank ioctl on directly to schedule
> > >> frames, without telling the compositor. I discovered that when I
> > >> wanted to limit the vblank ioctl to master clients only. Again,
> > >> apparently too hard to use the existing extensions, or fix the bugs in
> > >> there, or whatever. One userspace got fixed last year, but it'll
> > >> probably get copypasted around forever :-/
> > >>
> > >> So I don't think we'll ever manage to roll a clean split out, and best
> > >> we can do is give in and just hand userspace what it wants. As much as
> > >> that's misguided and unclean and all that. Maybe it'll result in a
> > >> least fewer stuff getting run as root to hack around this, because
> > >> fixing properly seems not to be on the table.
> > >>
> > >> The beauty of kms is that we've achieved the mission, everyone's
> > >> writing their own thing. Which is also terrible, and I don't 

Re: [PATCH] drm/amdkfd: Return proper error code for gws alloc API

2019-05-28 Thread Deucher, Alexander
Reviewed-by: Alex Deucher 

From: amd-gfx  on behalf of Zeng, Oak 

Sent: Tuesday, May 28, 2019 3:53 PM
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix; Zeng, Oak
Subject: [PATCH] drm/amdkfd: Return proper error code for gws alloc API

Change-Id: Iaa81c6aa5f97e888291771e1aa70b02fccdcb9e0
Signed-off-by: Oak Zeng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 87177ed..e304271 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2124,7 +2124,7 @@ int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **mem

 *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
 if (!*mem)
-   return -EINVAL;
+   return -ENOMEM;

 mutex_init(&(*mem)->lock);
 (*mem)->bo = amdgpu_bo_ref(gws_bo);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index aab2aa6..491f0dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1576,7 +1576,7 @@ static int kfd_ioctl_alloc_queue_gws(struct file *filep,

 if (!hws_gws_support ||
 dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
-   return -EINVAL;
+   return -ENODEV;

 dev = kfd_device_by_id(args->gpu_id);
 if (!dev) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index c2c570e..da09586 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -103,7 +103,7 @@ int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,

 /* Only allow one queue per process can have GWS assigned */
 if (gws && pdd->qpd.num_gws)
-   return -EINVAL;
+   return -EBUSY;

 if (!gws && pdd->qpd.num_gws == 0)
 return -EINVAL;
--
2.7.4


[PATCH] drm/amdkfd: Return proper error code for gws alloc API

2019-05-28 Thread Zeng, Oak
Change-Id: Iaa81c6aa5f97e888291771e1aa70b02fccdcb9e0
Signed-off-by: Oak Zeng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 87177ed..e304271 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2124,7 +2124,7 @@ int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **mem
 
*mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
if (!*mem)
-   return -EINVAL;
+   return -ENOMEM;
 
mutex_init(&(*mem)->lock);
(*mem)->bo = amdgpu_bo_ref(gws_bo);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index aab2aa6..491f0dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1576,7 +1576,7 @@ static int kfd_ioctl_alloc_queue_gws(struct file *filep,
 
if (!hws_gws_support ||
dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
-   return -EINVAL;
+   return -ENODEV;
 
dev = kfd_device_by_id(args->gpu_id);
if (!dev) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index c2c570e..da09586 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -103,7 +103,7 @@ int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,
 
/* Only allow one queue per process can have GWS assigned */
if (gws && pdd->qpd.num_gws)
-   return -EINVAL;
+   return -EBUSY;
 
if (!gws && pdd->qpd.num_gws == 0)
return -EINVAL;
-- 
2.7.4


[PATCH] drm/amdgpu: fix a race in GPU reset with IB test

2019-05-28 Thread Alex Deucher
Split late_init into two functions, one (do_late_init) which
just does the hw init, and late_init which calls do_late_init
and schedules the IB test work.  Call do_late_init in
the GPU reset code to run the init code, but not schedule
the IB test code.  The IB test code is called directly
in the gpu reset code so no need to run the IB tests
in a separate work thread.  If we do, we end up racing.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 43 +-
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a8c2201cd04..6b90840307dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1869,19 +1869,7 @@ static int amdgpu_device_set_pg_state(struct amdgpu_device *adev, enum amd_power
return 0;
 }
 
-/**
- * amdgpu_device_ip_late_init - run late init for hardware IPs
- *
- * @adev: amdgpu_device pointer
- *
- * Late initialization pass for hardware IPs.  The list of all the hardware
- * IPs that make up the asic is walked and the late_init callbacks are run.
- * late_init covers any special initialization that an IP requires
- * after all of the have been initialized or something that needs to happen
- * late in the init process.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
+static int amdgpu_device_do_ip_late_init(struct amdgpu_device *adev)
 {
int i = 0, r;
 
@@ -1902,14 +1890,35 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
 
-   queue_delayed_work(system_wq, &adev->late_init_work,
-  msecs_to_jiffies(AMDGPU_RESUME_MS));
-
amdgpu_device_fill_reset_magic(adev);
 
return 0;
 }
 
+/**
+ * amdgpu_device_ip_late_init - run late init for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Late initialization pass for hardware IPs.  The list of all the hardware
+ * IPs that make up the asic is walked and the late_init callbacks are run.
+ * late_init covers any special initialization that an IP requires
+ * after all of the have been initialized or something that needs to happen
+ * late in the init process.
+ * Returns 0 on success, negative error code on failure.
+ */
+static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
+{
+   int r;
+
+   r = amdgpu_device_do_ip_late_init(adev);
+
+   queue_delayed_work(system_wq, &adev->late_init_work,
+  msecs_to_jiffies(AMDGPU_RESUME_MS));
+
+   return r;
+}
+
 /**
  * amdgpu_device_ip_fini - run fini for hardware IPs
  *
@@ -3502,7 +3511,7 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,
if (vram_lost)
amdgpu_device_fill_reset_magic(tmp_adev);
 
-   r = amdgpu_device_ip_late_init(tmp_adev);
+   r = amdgpu_device_do_ip_late_init(tmp_adev);
if (r)
goto out;
 
-- 
2.20.1


[PATCH 0/2] drm/amd/display: Add HDR output metadata support for amdgpu

2019-05-28 Thread Nicholas Kazlauskas
This patch series enables HDR output metadata support in amdgpu using the
DRM HDR interface merged in drm-misc-next. Enabled for DCE and DCN ASICs
over DP and HDMI.

It's limited to static HDR metadata support for now since that's all the
DRM interface supports. It requires a modeset for entering and exiting HDR
but the metadata can be changed without one.

Cc: Harry Wentland 

Nicholas Kazlauskas (2):
  drm/amd/display: Expose HDR output metadata for supported connectors
  drm/amd/display: Only force modesets when toggling HDR

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 155 +-
 1 file changed, 151 insertions(+), 4 deletions(-)

-- 
2.17.1


[PATCH 1/2] drm/amd/display: Expose HDR output metadata for supported connectors

2019-05-28 Thread Nicholas Kazlauskas
[Why]
For userspace to send static HDR metadata to the display we need to
attach the property on the connector and send it to DC.

[How]
The property is attached to HDMI and DP connectors. Since the metadata
isn't actually available when creating the connector this isn't a
property we can dynamically support based on the extension block
being available or not.

When the HDR metadata is changed, a modeset will be forced for now.
We need to switch from 8bpc to 10bpc in most cases anyway, and we want
to fully exit HDR mode when userspace gives us a NULL metadata, so this
isn't completely unnecessary.

The requirement can later be reduced to just entering and exiting HDR
or switching max bpc.

Cc: Harry Wentland 
Signed-off-by: Nicholas Kazlauskas 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 125 ++
 1 file changed, 125 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 995f9df66142..eb31acca7ed6 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3871,6 +3871,121 @@ enum drm_mode_status amdgpu_dm_connector_mode_valid(struct drm_connector *connec
return result;
 }
 
+static int fill_hdr_info_packet(const struct drm_connector_state *state,
+   struct dc_info_packet *out)
+{
+   struct hdmi_drm_infoframe frame;
+   unsigned char buf[30]; /* 26 + 4 */
+   ssize_t len;
+   int ret, i;
+
+   memset(out, 0, sizeof(*out));
+
+   if (!state->hdr_output_metadata)
+   return 0;
+
+   ret = drm_hdmi_infoframe_set_hdr_metadata(&frame, state);
+   if (ret)
+   return ret;
+
+   len = hdmi_drm_infoframe_pack_only(&frame, buf, sizeof(buf));
+   if (len < 0)
+   return (int)len;
+
+   /* Static metadata is a fixed 26 bytes + 4 byte header. */
+   if (len != 30)
+   return -EINVAL;
+
+   /* Prepare the infopacket for DC. */
+   switch (state->connector->connector_type) {
+   case DRM_MODE_CONNECTOR_HDMIA:
+   out->hb0 = 0x87; /* type */
+   out->hb1 = 0x01; /* version */
+   out->hb2 = 0x1A; /* length */
+   out->sb[0] = buf[3]; /* checksum */
+   i = 1;
+   break;
+
+   case DRM_MODE_CONNECTOR_DisplayPort:
+   case DRM_MODE_CONNECTOR_eDP:
+   out->hb0 = 0x00; /* sdp id, zero */
+   out->hb1 = 0x87; /* type */
+   out->hb2 = 0x1D; /* payload len - 1 */
+   out->hb3 = (0x13 << 2); /* sdp version */
+   out->sb[0] = 0x01; /* version */
+   out->sb[1] = 0x1A; /* length */
+   i = 2;
+   break;
+
+   default:
+   return -EINVAL;
+   }
+
+   memcpy(&out->sb[i], &buf[4], 26);
+   out->valid = true;
+
+   print_hex_dump(KERN_DEBUG, "HDR SB:", DUMP_PREFIX_NONE, 16, 1, out->sb,
+  sizeof(out->sb), false);
+
+   return 0;
+}
+
+static bool
+is_hdr_metadata_different(const struct drm_connector_state *old_state,
+ const struct drm_connector_state *new_state)
+{
+   struct drm_property_blob *old_blob = old_state->hdr_output_metadata;
+   struct drm_property_blob *new_blob = new_state->hdr_output_metadata;
+
+   if (old_blob != new_blob) {
+   if (old_blob && new_blob &&
+   old_blob->length == new_blob->length)
+   return memcmp(old_blob->data, new_blob->data,
+ old_blob->length);
+
+   return true;
+   }
+
+   return false;
+}
+
+static int
+amdgpu_dm_connector_atomic_check(struct drm_connector *conn,
+struct drm_connector_state *new_con_state)
+{
+   struct drm_atomic_state *state = new_con_state->state;
+   struct drm_connector_state *old_con_state =
+   drm_atomic_get_old_connector_state(state, conn);
+   struct drm_crtc *crtc = new_con_state->crtc;
+   struct drm_crtc_state *new_crtc_state;
+   int ret;
+
+   if (!crtc)
+   return 0;
+
+   if (is_hdr_metadata_different(old_con_state, new_con_state)) {
+   struct dc_info_packet hdr_infopacket;
+
+   ret = fill_hdr_info_packet(new_con_state, &hdr_infopacket);
+   if (ret)
+   return ret;
+
+   new_crtc_state = drm_atomic_get_crtc_state(state, crtc);
+   if (IS_ERR(new_crtc_state))
+   return PTR_ERR(new_crtc_state);
+
+   /*
+* DC considers the stream backends changed if the
+* static metadata changes. Forcing the modeset also
+* gives a simple way for userspace to switch from
+* 8bpc to 10bpc when setting the metadata.
+*/
+   

[PATCH 2/2] drm/amd/display: Only force modesets when toggling HDR

2019-05-28 Thread Nicholas Kazlauskas
[Why]
We can issue HDR static metadata as part of stream updates for
non-modesets as long as we force a modeset when entering or exiting HDR.

This avoids unnecessary blanking for simple metadata updates.

[How]
When changing scaling and abm for the stream, also check whether HDR has
changed and send the stream update. This will only happen in non-modeset
cases.

Cc: Harry Wentland 
Signed-off-by: Nicholas Kazlauskas 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 34 +++
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index eb31acca7ed6..443b13ec268d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3978,9 +3978,16 @@ amdgpu_dm_connector_atomic_check(struct drm_connector *conn,
 * DC considers the stream backends changed if the
 * static metadata changes. Forcing the modeset also
 * gives a simple way for userspace to switch from
-* 8bpc to 10bpc when setting the metadata.
+* 8bpc to 10bpc when setting the metadata to enter
+* or exit HDR.
+*
+* Changing the static metadata after it's been
+* set is permissible, however. So only force a
+* modeset if we're entering or exiting HDR.
 */
-   new_crtc_state->mode_changed = true;
+   new_crtc_state->mode_changed =
+   !old_con_state->hdr_output_metadata ||
+   !new_con_state->hdr_output_metadata;
}
 
return 0;
@@ -5881,7 +5888,9 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
struct amdgpu_crtc *acrtc = to_amdgpu_crtc(dm_new_con_state->base.crtc);
struct dc_surface_update dummy_updates[MAX_SURFACES];
struct dc_stream_update stream_update;
+   struct dc_info_packet hdr_packet;
struct dc_stream_status *status = NULL;
+   bool abm_changed, hdr_changed, scaling_changed;
 
memset(&dummy_updates, 0, sizeof(dummy_updates));
memset(&stream_update, 0, sizeof(stream_update));
@@ -5898,11 +5907,19 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
 
-   if (!is_scaling_state_different(dm_new_con_state, dm_old_con_state) &&
-   (dm_new_crtc_state->abm_level == dm_old_crtc_state->abm_level))
+   scaling_changed = is_scaling_state_different(dm_new_con_state,
+dm_old_con_state);
+
+   abm_changed = dm_new_crtc_state->abm_level !=
+ dm_old_crtc_state->abm_level;
+
+   hdr_changed =
+   is_hdr_metadata_different(old_con_state, new_con_state);
+
+   if (!scaling_changed && !abm_changed && !hdr_changed)
continue;
 
-   if (is_scaling_state_different(dm_new_con_state, dm_old_con_state)) {
+   if (scaling_changed) {

update_stream_scaling_settings(&dm_new_con_state->base.crtc->mode,
dm_new_con_state, (struct dc_stream_state *)dm_new_crtc_state->stream);
 
@@ -5910,12 +5927,17 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
stream_update.dst = dm_new_crtc_state->stream->dst;
}
 
-   if (dm_new_crtc_state->abm_level != dm_old_crtc_state->abm_level) {
+   if (abm_changed) {
dm_new_crtc_state->stream->abm_level = dm_new_crtc_state->abm_level;
 
stream_update.abm_level = &dm_new_crtc_state->abm_level;
}
 
+   if (hdr_changed) {
+   fill_hdr_info_packet(new_con_state, &hdr_packet);
+   stream_update.hdr_static_metadata = &hdr_packet;
+   }
+
status = dc_stream_get_status(dm_new_crtc_state->stream);
WARN_ON(!status);
WARN_ON(!status->plane_count);
-- 
2.17.1


Re: [PATCH v2 4/4] drm/amdkfd: Check against device cgroup

2019-05-28 Thread Tejun Heo
Hello,

On Fri, May 17, 2019 at 08:12:17PM +, Kuehling, Felix wrote:
> Patches 1,2,4 will be submitted through amd-staging-drm-next. Patch 3 
> goes through the cgroup tree. Patch 4 depends on patch 3. So submitting 
> patch 4 will need to wait until we rebase amd-staging-drm-next on a new 
> enough kernel release that includes patch 3.
> 
> Patch 1 and 2 could be submitted now or wait for patch 3 as well so we 
> submit all our cgroup stuff in amdgpu and KFD together. It probably 
> makes most sense to wait since unused code tends to rot.
> 
> Patches 1,2,4 are already reviewed by me. Feel free to add my Acked-by 
> to patch 3.

Please feel free to add my acked-by and take patch 3 with the rest of
the patchset.

Thanks.

-- 
tejun

Re: [PATCH v2 0/4] AMDKFD (AMD GPU compute) support for device cgroup.

2019-05-28 Thread Tejun Heo
Hello,

On Fri, May 17, 2019 at 08:04:42PM +, Kasiviswanathan, Harish wrote:
> 1). Documentation for user on how to use device cgroup for amdkfd device. I 
> have some more information on this in patch 4. 

I see.  Yeah, I just missed that.

Thanks.

-- 
tejun

[PATCH] drm/amdkfd: Fix a potential circular lock

2019-05-28 Thread Zeng, Oak
The idea to break the circular lock dependency is to unlock dqm
temporarily before calling init_mqd in call stack #1 (see below)

[  513.604034] ==
[  513.604205] WARNING: possible circular locking dependency detected
[  513.604375] 4.18.0-kfd-root #2 Tainted: GW
[  513.604530] --
[  513.604699] kswapd0/611 is trying to acquire lock:
[  513.604840] d254022e (&dqm->lock_hidden){+.+.}, at: evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.605150]
   but task is already holding lock:
[  513.605307] 961547fc (&anon_vma->rwsem){}, at: page_lock_anon_vma_read+0xe4/0x250
[  513.605540]
   which lock already depends on the new lock.

[  513.605747]
   the existing dependency chain (in reverse order) is:
[  513.605944]
   -> #4 (&anon_vma->rwsem){}:
[  513.606106]__vma_adjust+0x147/0x7f0
[  513.606231]__split_vma+0x179/0x190
[  513.606353]mprotect_fixup+0x217/0x260
[  513.606553]do_mprotect_pkey+0x211/0x380
[  513.606752]__x64_sys_mprotect+0x1b/0x20
[  513.606954]do_syscall_64+0x50/0x1a0
[  513.607149]entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.607380]
   -> #3 (&mapping->i_mmap_rwsem){}:
[  513.607678]rmap_walk_file+0x1f0/0x280
[  513.607887]page_referenced+0xdd/0x180
[  513.608081]shrink_page_list+0x853/0xcb0
[  513.608279]shrink_inactive_list+0x33b/0x700
[  513.608483]shrink_node_memcg+0x37a/0x7f0
[  513.608682]shrink_node+0xd8/0x490
[  513.608869]balance_pgdat+0x18b/0x3b0
[  513.609062]kswapd+0x203/0x5c0
[  513.609241]kthread+0x100/0x140
[  513.609420]ret_from_fork+0x24/0x30
[  513.609607]
   -> #2 (fs_reclaim){+.+.}:
[  513.609883]kmem_cache_alloc_trace+0x34/0x2e0
[  513.610093]reservation_object_reserve_shared+0x139/0x300
[  513.610326]ttm_bo_init_reserved+0x291/0x480 [ttm]
[  513.610567]amdgpu_bo_do_create+0x1d2/0x650 [amdgpu]
[  513.610811]amdgpu_bo_create+0x40/0x1f0 [amdgpu]
[  513.611041]amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu]
[  513.611290]amdgpu_bo_create_kernel+0x12/0x70 [amdgpu]
[  513.611584]amdgpu_ttm_init+0x2cb/0x560 [amdgpu]
[  513.611823]gmc_v9_0_sw_init+0x400/0x750 [amdgpu]
[  513.612491]amdgpu_device_init+0x14eb/0x1990 [amdgpu]
[  513.612730]amdgpu_driver_load_kms+0x78/0x290 [amdgpu]
[  513.612958]drm_dev_register+0x111/0x1a0
[  513.613171]amdgpu_pci_probe+0x11c/0x1e0 [amdgpu]
[  513.613389]local_pci_probe+0x3f/0x90
[  513.613581]pci_device_probe+0x102/0x1c0
[  513.613779]driver_probe_device+0x2a7/0x480
[  513.613984]__driver_attach+0x10a/0x110
[  513.614179]bus_for_each_dev+0x67/0xc0
[  513.614372]bus_add_driver+0x1eb/0x260
[  513.614565]driver_register+0x5b/0xe0
[  513.614756]do_one_initcall+0xac/0x357
[  513.614952]do_init_module+0x5b/0x213
[  513.615145]load_module+0x2542/0x2d30
[  513.615337]__do_sys_finit_module+0xd2/0x100
[  513.615541]do_syscall_64+0x50/0x1a0
[  513.615731]entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.615963]
   -> #1 (reservation_ww_class_mutex){+.+.}:
[  513.616293]amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu]
[  513.616554]init_mqd+0x223/0x260 [amdgpu]
[  513.616779]create_queue_nocpsch+0x4d9/0x600 [amdgpu]
[  513.617031]pqm_create_queue+0x37c/0x520 [amdgpu]
[  513.617270]kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu]
[  513.617522]kfd_ioctl+0x202/0x350 [amdgpu]
[  513.617724]do_vfs_ioctl+0x9f/0x6c0
[  513.617914]ksys_ioctl+0x66/0x70
[  513.618095]__x64_sys_ioctl+0x16/0x20
[  513.618286]do_syscall_64+0x50/0x1a0
[  513.618476]entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  513.618695]
   -> #0 (&dqm->lock_hidden){+.+.}:
[  513.618984]__mutex_lock+0x98/0x970
[  513.619197]evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[  513.619459]kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
[  513.619710]kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
[  513.620103]amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
[  513.620363]amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
[  513.620614]__mmu_notifier_invalidate_range_start+0x70/0xb0
[  513.620851]try_to_unmap_one+0x7fc/0x8f0
[  513.621049]rmap_walk_anon+0x121/0x290
[  513.621242]try_to_unmap+0x93/0xf0
[  513.621428]shrink_page_list+0x606/0xcb0
[  513.621625]shrink_inactive_list+0x33b/0x700
[  513.621835]shrink_node_memcg+0x37a/0x7f0
[  513.622034]shrink_node+0xd8/0x490
[  513.622219]balance_pgdat+0x18b/0x3b0
[  513.622410]kswapd+0x203/0x5c0
[  513.622589]

Re: Quick question for the mobile Raven Ridge auto-rotate function

2019-05-28 Thread Deucher, Alexander
The FCH team is working on a driver for it and should have something available 
by August.

Alex


From: amd-gfx  on behalf of Luya 
Tshimbalanga 
Sent: Thursday, May 23, 2019 1:34 AM
To: amd-gfx@lists.freedesktop.org
Subject: Quick question for the mobile Raven Ridge auto-rotate function

[CAUTION: External Email]

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hello team,

Thank you for making mobile Raven Ridge nearly fully functional with the
open source driver for multiple devices like HP Envy x360 Ryzen 2500u.
However, missing is the ability to auto-rotate the screen when switching
from landscape to portrait in tablet mode.

Which channel will be better to request enabling that function in open
source driver? See the related issue below:

Red Hat bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1651886

Linux kernel report: https://bugzilla.kernel.org/show_bug.cgi?id=199715

I will be willing to test such driver to report the functionality.


Thanks in advance,

Luya Tshimbalanga
Fedora Design Team

-BEGIN PGP SIGNATURE-

iQEzBAEBCAAdFiEEWyB+BQtYiFz4GUNDXlKBdNiiYJoFAlzmMM8ACgkQXlKBdNii
YJq/wAgAn8jWhzXLLbBYjtai4f0C4rw1cKt1MDi48qOxlSiPaOplfG8RLrCGVFL3
jMgReHhN/U3zfy3SRBXa2zsTspdVKRnxewMxJHJmS653prOAEVsfd41b/XDMInmp
kFCcTKXwR9GlUYPbSuj3pMLSwq3OHmBlPjnpL4NMXlmrcQ6psN9I992Itg8HEoh2
3vGF5qRdKuidLnu9xRNLceLjvpvTyJ5fhH/Ry5sylX5oJhdW7WlR5HE+Smsgu7hW
rVGgGl6yFdUEGaiFRqhPxuEpC07i7Vi5REqB+s/YUgzM+Wn86taA4Ld1R9dMeZIm
g7EJeO8c6RKII2MOKWrKtTpvuUg+RQ==
=Ftfg
-END PGP SIGNATURE-


Re: [PATCH 07/10] drm/amd/display: use ttm_eu_reserve_buffers instead of amdgpu_bo_reserve v2

2019-05-28 Thread Alex Deucher
On Tue, May 28, 2019 at 1:05 PM Sam Ravnborg  wrote:
>
> Hi Christian.
>
> On Tue, May 28, 2019 at 06:25:54PM +0200, Christian König wrote:
> > From: Chunming Zhou 
> >
> > add ticket for display bo, so that it can preempt busy bo.
> >
> > v2: fix stupid rebase error
> >
> > Change-Id: I9f031cdcc8267de00e819ae303baa0a52df8ebb9
> What is this?
> I do not recall seeing this in a changelog before?

It's an artifact of gerrit which we use internally for git management.

Alex

>
> (Sorry for not commenting on the patch, most of it is beyond my
> understanding for now).
>
> Sam
>
> > Signed-off-by: Chunming Zhou 
> > Reviewed-by: Christian König 
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 21 ++-
> >  1 file changed, 16 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 4a1755bce96c..56f320f3fd72 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -4182,6 +4182,9 @@ static int dm_plane_helper_prepare_fb(struct 
> > drm_plane *plane,
> >   struct amdgpu_device *adev;
> >   struct amdgpu_bo *rbo;
> >   struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
> > + struct list_head list;
> > + struct ttm_validate_buffer tv;
> > + struct ww_acquire_ctx ticket;
> >   uint64_t tiling_flags;
> >   uint32_t domain;
> >   int r;
> > @@ -4198,9 +4201,17 @@ static int dm_plane_helper_prepare_fb(struct 
> > drm_plane *plane,
> >   obj = new_state->fb->obj[0];
> >   rbo = gem_to_amdgpu_bo(obj);
> >   adev = amdgpu_ttm_adev(rbo->tbo.bdev);
> > - r = amdgpu_bo_reserve(rbo, false);
> > - if (unlikely(r != 0))
> > + INIT_LIST_HEAD(&list);
> > +
> > + tv.bo = &rbo->tbo;
> > + tv.num_shared = 1;
> > + list_add(&tv.head, &list);
> > +
> > + r = ttm_eu_reserve_buffers(&ticket, &list, false, NULL, true);
> > + if (r) {
> > + dev_err(adev->dev, "fail to reserve bo (%d)\n", r);
> >   return r;
> > + }
> >
> >   if (plane->type != DRM_PLANE_TYPE_CURSOR)
> >   domain = amdgpu_display_supported_domains(adev);
> > @@ -4211,21 +4222,21 @@ static int dm_plane_helper_prepare_fb(struct 
> > drm_plane *plane,
> >   if (unlikely(r != 0)) {
> >   if (r != -ERESTARTSYS)
> >   DRM_ERROR("Failed to pin framebuffer with error 
> > %d\n", r);
> > - amdgpu_bo_unreserve(rbo);
> > + ttm_eu_backoff_reservation(&ticket, &list);
> >   return r;
> >   }
> >
> >   r = amdgpu_ttm_alloc_gart(&rbo->tbo);
> >   if (unlikely(r != 0)) {
> >   amdgpu_bo_unpin(rbo);
> > - amdgpu_bo_unreserve(rbo);
> > + ttm_eu_backoff_reservation(&ticket, &list);
> >   DRM_ERROR("%p bind failed\n", rbo);
> >   return r;
> >   }
> >
> >   amdgpu_bo_get_tiling_flags(rbo, &tiling_flags);
> >
> > - amdgpu_bo_unreserve(rbo);
> > + ttm_eu_backoff_reservation(&ticket, &list);
> >
> >   afb->address = amdgpu_bo_gpu_offset(rbo);
> >
> > --
> > 2.17.1
> >
> > ___
> > dri-devel mailing list
> > dri-de...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH 07/10] drm/amd/display: use ttm_eu_reserve_buffers instead of amdgpu_bo_reserve v2

2019-05-28 Thread Sam Ravnborg
Hi Christian.

On Tue, May 28, 2019 at 06:25:54PM +0200, Christian König wrote:
> From: Chunming Zhou 
> 
> add ticket for display bo, so that it can preempt busy bo.
> 
> v2: fix stupid rebase error
> 
> Change-Id: I9f031cdcc8267de00e819ae303baa0a52df8ebb9
What is this?
I do not recall seeing this in a changelog before?

(Sorry for not commenting on the patch, most of it is beyond my
understanding for now).

Sam

> Signed-off-by: Chunming Zhou 
> Reviewed-by: Christian König 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 21 ++-
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 4a1755bce96c..56f320f3fd72 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -4182,6 +4182,9 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
> *plane,
>   struct amdgpu_device *adev;
>   struct amdgpu_bo *rbo;
>   struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
> + struct list_head list;
> + struct ttm_validate_buffer tv;
> + struct ww_acquire_ctx ticket;
>   uint64_t tiling_flags;
>   uint32_t domain;
>   int r;
> @@ -4198,9 +4201,17 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
> *plane,
>   obj = new_state->fb->obj[0];
>   rbo = gem_to_amdgpu_bo(obj);
>   adev = amdgpu_ttm_adev(rbo->tbo.bdev);
> - r = amdgpu_bo_reserve(rbo, false);
> - if (unlikely(r != 0))
> + INIT_LIST_HEAD(&list);
> +
> + tv.bo = &rbo->tbo;
> + tv.num_shared = 1;
> + list_add(&tv.head, &list);
> +
> + r = ttm_eu_reserve_buffers(&ticket, &list, false, NULL, true);
> + if (r) {
> + dev_err(adev->dev, "fail to reserve bo (%d)\n", r);
>   return r;
> + }
>  
>   if (plane->type != DRM_PLANE_TYPE_CURSOR)
>   domain = amdgpu_display_supported_domains(adev);
> @@ -4211,21 +4222,21 @@ static int dm_plane_helper_prepare_fb(struct 
> drm_plane *plane,
>   if (unlikely(r != 0)) {
>   if (r != -ERESTARTSYS)
>   DRM_ERROR("Failed to pin framebuffer with error %d\n", 
> r);
> - amdgpu_bo_unreserve(rbo);
> + ttm_eu_backoff_reservation(&ticket, &list);
>   return r;
>   }
>  
>   r = amdgpu_ttm_alloc_gart(&rbo->tbo);
>   if (unlikely(r != 0)) {
>   amdgpu_bo_unpin(rbo);
> - amdgpu_bo_unreserve(rbo);
> + ttm_eu_backoff_reservation(&ticket, &list);
>   DRM_ERROR("%p bind failed\n", rbo);
>   return r;
>   }
>  
>   amdgpu_bo_get_tiling_flags(rbo, &tiling_flags);
>  
> - amdgpu_bo_unreserve(rbo);
> + ttm_eu_backoff_reservation(&ticket, &list);
>  
>   afb->address = amdgpu_bo_gpu_offset(rbo);
>  
> -- 
> 2.17.1
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-28 Thread Catalin Marinas
On Thu, May 23, 2019 at 02:31:16PM -0700, Kees Cook wrote:
> syzkaller already attempts to randomly inject non-canonical and
> 0x addresses for user pointers in syscalls in an effort to
> find bugs like CVE-2017-5123 where waitid() via unchecked put_user() was
> able to write directly to kernel memory[1].
> 
> It seems that using TBI by default and not allowing a switch back to
> "normal" ABI without a reboot actually means that userspace cannot inject
> kernel pointers into syscalls any more, since they'll get universally
> stripped now. Is my understanding correct, here? i.e. exploiting
> CVE-2017-5123 would be impossible under TBI?
> 
> If so, then I think we should commit to the TBI ABI and have a boot
> flag to disable it, but NOT have a process flag, as that would allow
> attackers to bypass the masking. The only flag should be "TBI or MTE".
> 
> If so, can I get top byte masking for other architectures too? Like,
> just to strip high bits off userspace addresses? ;)

Just for fun, hack/attempt at your idea which should not interfere with
TBI. Only briefly tested on arm64 (and the s390 __TYPE_IS_PTR macro is
pretty weird ;)):

--8<-
diff --git a/arch/s390/include/asm/compat.h b/arch/s390/include/asm/compat.h
index 63b46e30b2c3..338455a74eff 100644
--- a/arch/s390/include/asm/compat.h
+++ b/arch/s390/include/asm/compat.h
@@ -11,9 +11,6 @@
 
 #include 
 
-#define __TYPE_IS_PTR(t) (!__builtin_types_compatible_p( \
-   typeof(0?(__force t)0:0ULL), u64))
-
 #define __SC_DELOUSE(t,v) ({ \
BUILD_BUG_ON(sizeof(t) > 4 && !__TYPE_IS_PTR(t)); \
(__force t)(__TYPE_IS_PTR(t) ? ((v) & 0x7fffffff) : (v)); \
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e2870fe1be5b..b1b9fe8502da 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -119,8 +119,15 @@ struct io_uring_params;
 #define __TYPE_IS_L(t) (__TYPE_AS(t, 0L))
 #define __TYPE_IS_UL(t)(__TYPE_AS(t, 0UL))
 #define __TYPE_IS_LL(t) (__TYPE_AS(t, 0LL) || __TYPE_AS(t, 0ULL))
+#define __TYPE_IS_PTR(t) (!__builtin_types_compatible_p(typeof(0 ? (__force t)0 : 0ULL), u64))
 #define __SC_LONG(t, a) __typeof(__builtin_choose_expr(__TYPE_IS_LL(t), 0LL, 0L)) a
+#ifdef CONFIG_64BIT
+#define __SC_CAST(t, a)(__TYPE_IS_PTR(t) \
+   ? (__force t) ((__u64)a & ~(1UL << 55)) \
+   : (__force t) a)
+#else
 #define __SC_CAST(t, a)(__force t) a
+#endif
 #define __SC_ARGS(t, a)a
 #define __SC_TEST(t, a) (void)BUILD_BUG_ON_ZERO(!__TYPE_IS_LL(t) && sizeof(t) > sizeof(long))
 

-- 
Catalin


Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-28 Thread Emil Velikov
On 2019/05/28, Koenig, Christian wrote:
> Am 28.05.19 um 18:10 schrieb Emil Velikov:
> > On 2019/05/28, Daniel Vetter wrote:
> >> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
> >>  wrote:
> >>> Am 28.05.19 um 09:38 schrieb Daniel Vetter:
>  [SNIP]
> > Might be a good idea looking into reverting it partially, so that at
> > least command submission and buffer allocation is still blocked.
>  I thought the issue is a lot more than vainfo, it's pretty much every
>  hacked up compositor under the sun getting this wrong one way or
>  another. Thinking about this some more, I also have no idea how you'd
>  want to deprecate rendering on primary nodes in general. Apparently
>  that breaks -modesetting already, and probably lots more compositors.
>  And it looks like we're finally achieve the goal kms set out to 10
>  years ago, and new compositors are sprouting up all the time. I guess
>  we could just break them all (on new hardware) and tell them to all
>  suck it up. But I don't think that's a great option. And just
>  deprecating this on amdgpu is going to be even harder, since then
>  everywhere else it'll keep working, and it's just amdgpu.ko that looks
>  broken.
> 
>  Aside: I'm not supporting Emil's idea here because it fixes any issues
>  Intel has - Intel doesn't care. I support it because reality sucks,
>  people get this render vs. primary vs. multi-gpu prime wrong all the
>  time (that's also why we have hardcoded display+gpu pairs in mesa for
>  the various soc combinations out there), and this looks like a
>  pragmatic solution. It'd be nice if every compositor and everything
>  else would perfectly support multi gpu and only use render nodes for
>  rendering, and only primary nodes for display. But reality is that
>  people hack on stuff until gears on screen and then move on to more
>  interesting things (to them). So I don't think we'll ever win this :-/
> >>> Yeah, but this is a classic case of working around user space issues by
> >>> making kernel changes instead of fixing user space.
> >>>
> >>> Having privileged (output control) and unprivileged (rendering control)
> >>> functionality behind the same node is a mistake we have made a long time
> >>> ago and render nodes finally seemed to be a way to fix that.
> >>>
> >>> I mean why are compositors using the primary node in the first place?
> >>> Because they want to have access to privileged resources I think and in
> >>> this case it is perfectly ok to do so.
> >>>
> >>> Now extending unprivileged access to the primary node actually sounds
> >>> like a step into the wrong direction to me.
> >>>
> >>> I rather think that we should go down the route of completely dropping
> >>> command submission and buffer allocation through the primary node for
> >>> non master clients. And then as next step at some point drop support for
> >>> authentication/flink.
> >>>
> >>> I mean we have done this with UMS as well and I don't see much other way
> >>> to move forward and get rid of those ancient interface in the long term.
> >> Well kms had some really good benefits that drove quick adoption, like
> >> "suspend/resume actually has a chance of working" or "comes with
> >> buffer management so you can run multiple gears".
> >>
> >> The render node thing is a lot more niche use case (prime, better priv
> >> separation), plus "it's cleaner design". And the "cleaner design" part
> >> is something that empirically doesn't seem to matter :-/ Just two
> >> examples:
> >> - KHR_display/leases just iterated display resources on the fd needed
> >> for rendering (and iirc there was even a patch to expose that for
> >> render nodes too so it works with DRI3), because implementing
> >> protocols is too hard. Barely managed to stop that one before it
> >> happened.
> >> - Various video players use the vblank ioctl on directly to schedule
> >> frames, without telling the compositor. I discovered that when I
> >> wanted to limite the vblank ioctl to master clients only. Again,
> >> apparently too hard to use the existing extensions, or fix the bugs in
> >> there, or whatever. One userspace got fixed last year, but it'll
> >> probably get copypasted around forever :-/
> >>
> >> So I don't think we'll ever manage to roll a clean split out, and best
> >> we can do is give in and just hand userspace what it wants. As much as
> >> that's misguided and unclean and all that. Maybe it'll result in a
> >> least fewer stuff getting run as root to hack around this, because
> >> fixing properly seems not to be on the table.
> >>
> >> The beauty of kms is that we've achieved the mission, everyone's
> >> writing their own thing. Which is also terrible, and I don't think
> >> it'll get better.
> >
> > With the risk of coming rude I will repeat my earlier comment:
> >
> > The problem is _neither_ Intel nor libva specific.
> >
> >
> >
> > That said, let's step back for a moment 

Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-28 Thread Catalin Marinas
On Tue, May 28, 2019 at 04:56:45PM +0100, Dave P Martin wrote:
> On Tue, May 28, 2019 at 04:40:58PM +0100, Catalin Marinas wrote:
> 
> [...]
> 
> > My thoughts on allowing tags (quick look):
> >
> > brk - no
> 
> [...]
> 
> > mlock, mlock2, munlock - yes
> > mmap - no (we may change this with MTE but not for TBI)
> 
> [...]
> 
> > mprotect - yes
> 
> I haven't followed this discussion closely... what's the rationale for
> the inconsistencies here (feel free to refer me back to the discussion
> if it's elsewhere).

_My_ rationale (feel free to disagree) is that mmap() by default would
not return a tagged address (ignoring MTE for now). If it gets passed a
tagged address or a "tagged NULL" (for lack of a better name) we don't
have clear semantics of whether the returned address should be tagged in
this ABI relaxation. I'd rather reserve this specific behaviour if we
overload the non-zero tag meaning of mmap() for MTE. Similar reasoning
for mremap(), at least on the new_address argument (not entirely sure
about old_address).

munmap() should probably follow the mmap() rules.

As for brk(), I don't see why the user would need to pass a tagged
address, we can't associate any meaning to this tag.

For the rest, since it's likely such addresses would have been tagged by
malloc() in user space, we should allow tagged pointers.
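For reference, the untagging these rules rely on can be sketched as a
sign-extension from bit 55, which clears a tag in bits 63:56 of a user address
while leaving kernel addresses unchanged (a user-space sketch of the arm64
behaviour, not the kernel code itself):

```c
#include <assert.h>
#include <stdint.h>

/* Strip an arm64 TBI tag: sign-extend from bit 55. For user addresses
 * (bit 55 clear) this zeroes bits 63:56; kernel addresses (bit 55 set)
 * come back unchanged. */
static inline uint64_t untag_addr(uint64_t addr)
{
	return (uint64_t)(((int64_t)(addr << 8)) >> 8);
}
```

So a pointer tagged by malloc(), say with tag 0x5a in the top byte, untags back
to the plain address, which is why syscalls such as mprotect() can accept
tagged pointers simply by untagging their arguments.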

-- 
Catalin


[PATCH 06/10] drm/ttm: fix busy memory to fail other user v10

2019-05-28 Thread Christian König
BOs on the LRU might be blocked during command submission
and cause OOM situations.

Avoid this by blocking for the first busy BO not locked by
the same ticket as the BO we are searching space for.

v10: completely start over with the patch since we didn't
 handle a whole bunch of corner cases.
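The policy described above — evict the first BO whose lock we can take,
otherwise remember the first busy BO so the caller can block on it — can be
sketched like this (pthreads and a toy list as stand-ins for reservation
objects and the TTM LRU; not the actual TTM code):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Toy stand-in for a buffer object on the LRU. */
struct bo {
	pthread_mutex_t resv;	/* stand-in for the reservation lock */
	struct bo *next;	/* next BO in LRU order */
};

/* Walk the LRU: return the first BO whose lock we can take (the caller
 * evicts it and unlocks), otherwise report the first busy BO so the
 * caller can sleep on it instead of failing with -EBUSY. */
static struct bo *pick_evictable(struct bo *lru, struct bo **first_busy)
{
	for (struct bo *bo = lru; bo; bo = bo->next) {
		if (pthread_mutex_trylock(&bo->resv) == 0)
			return bo;
		if (!*first_busy)
			*first_busy = bo;
	}
	return NULL;
}
```

In the real patch the busy BO is additionally skipped when it is locked by our
own acquire ticket, since blocking on it would deadlock.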

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 77 ++--
 1 file changed, 66 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 3bc31da1df67..98aa90056807 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -774,32 +774,72 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
  * b. Otherwise, trylock it.
  */
 static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
-   struct ttm_operation_ctx *ctx, bool *locked)
+   struct ttm_operation_ctx *ctx, bool *locked, bool *busy)
 {
bool ret = false;
 
-   *locked = false;
if (bo->resv == ctx->resv) {
reservation_object_assert_held(bo->resv);
if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
|| !list_empty(&bo->ddestroy))
ret = true;
+   *locked = false;
+   if (busy)
+   *busy = false;
} else {
-   *locked = reservation_object_trylock(bo->resv);
-   ret = *locked;
+   ret = reservation_object_trylock(bo->resv);
+   *locked = ret;
+   if (busy)
+   *busy = !ret;
}
 
return ret;
 }
 
+/**
+ * ttm_mem_evict_wait_busy - wait for a busy BO to become available
+ *
+ * @busy_bo: BO which couldn't be locked with trylock
+ * @ctx: operation context
+ * @ticket: acquire ticket
+ *
+ * Try to lock a busy buffer object to avoid failing eviction.
+ */
+static int ttm_mem_evict_wait_busy(struct ttm_buffer_object *busy_bo,
+  struct ttm_operation_ctx *ctx,
+  struct ww_acquire_ctx *ticket)
+{
+   int r;
+
+   if (!busy_bo || !ticket)
+   return -EBUSY;
+
+   if (ctx->interruptible)
+   r = reservation_object_lock_interruptible(busy_bo->resv,
+ ticket);
+   else
+   r = reservation_object_lock(busy_bo->resv, ticket);
+
+   /*
+* TODO: It would be better to keep the BO locked until allocation is at
+* least tried one more time, but that would mean a much larger rework
+* of TTM.
+*/
+   if (!r)
+   reservation_object_unlock(busy_bo->resv);
+
+   return r == -EDEADLK ? -EAGAIN : r;
+}
+
 static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
   uint32_t mem_type,
   const struct ttm_place *place,
-  struct ttm_operation_ctx *ctx)
+  struct ttm_operation_ctx *ctx,
+  struct ww_acquire_ctx *ticket)
 {
+   struct ttm_buffer_object *bo = NULL, *busy_bo = NULL;
struct ttm_bo_global *glob = bdev->glob;
struct ttm_mem_type_manager *man = &bdev->man[mem_type];
-   struct ttm_buffer_object *bo = NULL;
bool locked = false;
unsigned i;
int ret;
@@ -807,8 +847,15 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
spin_lock(&glob->lru_lock);
for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
list_for_each_entry(bo, &man->lru[i], lru) {
-   if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked))
+   bool busy;
+
+   if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
+   &busy)) {
+   if (busy && !busy_bo &&
+   bo->resv->lock.ctx != ticket)
+   busy_bo = bo;
continue;
+   }
 
if (place && !bdev->driver->eviction_valuable(bo,
  place)) {
@@ -827,8 +874,13 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
}
 
if (!bo) {
+   if (busy_bo)
+   ttm_bo_get(busy_bo);
spin_unlock(&glob->lru_lock);
-   return -EBUSY;
+   ret = ttm_mem_evict_wait_busy(busy_bo, ctx, ticket);
+   if (busy_bo)
+   ttm_bo_put(busy_bo);
+   return ret;
}
 
kref_get(&bo->list_kref);
@@ -914,7 +966,8 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object 
*bo,
return ret;
if (mem->mm_node)
break;
-   ret = 

[PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3

2019-05-28 Thread Christian König
This avoids OOM situations when we have lots of threads
submitting at the same time.

v3: apply this to the whole driver, not just CS

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c| 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 20f2955d2a55..3e2da24cd17a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}
 
r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-  &duplicates, true);
+  &duplicates, false);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 06f83cac0d3a..f660628e6af9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
list_add(&csa_tv.head, &list);
amdgpu_vm_get_pd_bo(vm, &list, &pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
+   r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
if (r) {
DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d513a5ad03dd..ed25a4e14404 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
 
amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
+   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
if (r) {
dev_err(adev->dev, "leaking bo va because "
"we fail to reserve bo (%d)\n", r);
@@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 
amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
+   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, false);
if (r)
goto error_unref;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index c430e8259038..d60593cc436e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, 
bool no_intr)
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
int r;
 
-   r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
+   r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
dev_err(adev->dev, "%p reserve failed\n", bo);
-- 
2.17.1


[PATCH 09/10] drm/amdgpu: create GDS, GWS and OA in system domain

2019-05-28 Thread Christian König
And only move them in on validation. This allows for better control
when multiple processes are fighting over those resources.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 93b2c5a48a71..30493429851e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -495,7 +495,11 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 #endif
 
bo->tbo.bdev = &adev->mman.bdev;
-   amdgpu_bo_placement_from_domain(bo, bp->domain);
+   if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA |
+ AMDGPU_GEM_DOMAIN_GDS))
+   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
+   else
+   amdgpu_bo_placement_from_domain(bo, bp->domain);
if (bp->type == ttm_bo_type_kernel)
bo->tbo.priority = 1;
 
-- 
2.17.1


[PATCH 08/10] drm/amdgpu: drop some validation failure messages

2019-05-28 Thread Christian König
The messages about amdgpu_cs_list_validate are duplicated because the
caller will complain into the logs as well and we can also get
interrupted by a signal here.

Also fix the caller to not report -EAGAIN from validation.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fff558cf385b..20f2955d2a55 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -671,16 +671,12 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
*p,
}
 
r = amdgpu_cs_list_validate(p, &duplicates);
-   if (r) {
-   DRM_ERROR("amdgpu_cs_list_validate(duplicates) failed.\n");
+   if (r)
goto error_validate;
-   }
 
r = amdgpu_cs_list_validate(p, &p->validated);
-   if (r) {
-   DRM_ERROR("amdgpu_cs_list_validate(validated) failed.\n");
+   if (r)
goto error_validate;
-   }
 
amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved,
 p->bytes_moved_vis);
@@ -1383,7 +1379,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
if (r) {
if (r == -ENOMEM)
DRM_ERROR("Not enough memory for command 
submission!\n");
-   else if (r != -ERESTARTSYS)
+   else if (r != -ERESTARTSYS && r != -EAGAIN)
DRM_ERROR("Failed to process the buffer list %d!\n", r);
goto out;
}
-- 
2.17.1


[PATCH 07/10] drm/amd/display: use ttm_eu_reserve_buffers instead of amdgpu_bo_reserve v2

2019-05-28 Thread Christian König
From: Chunming Zhou 

add ticket for display bo, so that it can preempt busy bo.

v2: fix stupid rebase error

Change-Id: I9f031cdcc8267de00e819ae303baa0a52df8ebb9
Signed-off-by: Chunming Zhou 
Reviewed-by: Christian König 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 21 ++-
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 4a1755bce96c..56f320f3fd72 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4182,6 +4182,9 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
struct amdgpu_device *adev;
struct amdgpu_bo *rbo;
struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
+   struct list_head list;
+   struct ttm_validate_buffer tv;
+   struct ww_acquire_ctx ticket;
uint64_t tiling_flags;
uint32_t domain;
int r;
@@ -4198,9 +4201,17 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
obj = new_state->fb->obj[0];
rbo = gem_to_amdgpu_bo(obj);
adev = amdgpu_ttm_adev(rbo->tbo.bdev);
-   r = amdgpu_bo_reserve(rbo, false);
-   if (unlikely(r != 0))
+   INIT_LIST_HEAD(&list);
+
+   tv.bo = &rbo->tbo;
+   tv.num_shared = 1;
+   list_add(&tv.head, &list);
+
+   r = ttm_eu_reserve_buffers(&ticket, &list, false, NULL, true);
+   if (r) {
+   dev_err(adev->dev, "fail to reserve bo (%d)\n", r);
return r;
+   }
 
if (plane->type != DRM_PLANE_TYPE_CURSOR)
domain = amdgpu_display_supported_domains(adev);
@@ -4211,21 +4222,21 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("Failed to pin framebuffer with error %d\n", 
r);
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(&ticket, &list);
return r;
}
 
r = amdgpu_ttm_alloc_gart(&rbo->tbo);
if (unlikely(r != 0)) {
amdgpu_bo_unpin(rbo);
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(&ticket, &list);
DRM_ERROR("%p bind failed\n", rbo);
return r;
}
 
amdgpu_bo_get_tiling_flags(rbo, &tiling_flags);
 
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(&ticket, &list);
 
afb->address = amdgpu_bo_gpu_offset(rbo);
 
-- 
2.17.1


[PATCH 05/10] drm/ttm: immediately move BOs to the new LRU v2

2019-05-28 Thread Christian König
Move BOs which are currently in a lower domain to the new
LRU before allocating backing space while validating.

This makes sure that we always have enough entries on the
LRU to allow for other processes to wait for an operation
to complete.

v2: generalize the test

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 43 +++-
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 0dbb23b0dedd..3bc31da1df67 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -166,20 +166,22 @@ static void ttm_bo_release_list(struct kref *list_kref)
ttm_mem_global_free(bdev->glob->mem_glob, acc_size);
 }
 
-void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
+static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
+ struct ttm_mem_reg *mem)
 {
struct ttm_bo_device *bdev = bo->bdev;
struct ttm_mem_type_manager *man;
 
reservation_object_assert_held(bo->resv);
+   BUG_ON(!list_empty(&bo->lru));
 
if (!list_empty(&bo->lru))
return;
 
-   if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT)
+   if (mem->placement & TTM_PL_FLAG_NO_EVICT)
return;
 
-   man = &bdev->man[bo->mem.mem_type];
+   man = &bdev->man[mem->mem_type];
list_add_tail(&bo->lru, &man->lru[bo->priority]);
kref_get(&bo->list_kref);
 
@@ -189,6 +191,11 @@ void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
kref_get(&bo->list_kref);
}
 }
+
+void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
+{
+   ttm_bo_add_mem_to_lru(bo, &bo->mem);
+}
 EXPORT_SYMBOL(ttm_bo_add_to_lru);
 
 static void ttm_bo_ref_bug(struct kref *list_kref)
@@ -1001,6 +1008,14 @@ static int ttm_bo_mem_placement(struct ttm_buffer_object *bo,
 
mem->mem_type = mem_type;
mem->placement = cur_flags;
+
+   if (bo->mem.mem_type < mem_type && !list_empty(&bo->lru)) {
+   spin_lock(&bo->bdev->glob->lru_lock);
+   ttm_bo_del_from_lru(bo);
+   ttm_bo_add_mem_to_lru(bo, mem);
+   spin_unlock(&bo->bdev->glob->lru_lock);
+   }
+
return 0;
 }
 
@@ -1034,7 +1049,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
if (ret == -EBUSY)
continue;
if (ret)
-   return ret;
+   goto error;
 
type_found = true;
mem->mm_node = NULL;
@@ -1044,13 +1059,13 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
man = &bdev->man[mem->mem_type];
ret = (*man->func->get_node)(man, bo, place, mem);
if (unlikely(ret))
-   return ret;
+   goto error;
 
if (mem->mm_node) {
ret = ttm_bo_add_move_fence(bo, man, mem);
if (unlikely(ret)) {
(*man->func->put_node)(man, mem);
-   return ret;
+   goto error;
}
return 0;
}
@@ -1063,7 +1078,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
if (ret == -EBUSY)
continue;
if (ret)
-   return ret;
+   goto error;
 
type_found = true;
mem->mm_node = NULL;
@@ -1075,15 +1090,23 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
return 0;
 
if (ret && ret != -EBUSY)
-   return ret;
+   goto error;
}
 
+   ret = -ENOMEM;
if (!type_found) {
pr_err(TTM_PFX "No compatible memory type found\n");
-   return -EINVAL;
+   ret = -EINVAL;
}
 
-   return -ENOMEM;
+error:
+   if (bo->mem.mem_type == TTM_PL_SYSTEM && !list_empty(&bo->lru)) {
+   spin_lock(&bo->bdev->glob->lru_lock);
+   ttm_bo_move_to_lru_tail(bo, NULL);
+   spin_unlock(&bo->bdev->glob->lru_lock);
+   }
+
+   return ret;
 }
 EXPORT_SYMBOL(ttm_bo_mem_space);
 
-- 
2.17.1


[PATCH 04/10] drm/ttm: cleanup ttm_bo_mem_space

2019-05-28 Thread Christian König
We tried this once before, but that turned out to be more
complicated than thought. With all the right prerequisites
it looks like we can do this now.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 127 ++-
 1 file changed, 66 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 702cd89adbf9..0dbb23b0dedd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -893,13 +893,12 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
  * space, or we've evicted everything and there isn't enough space.
  */
 static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
-   uint32_t mem_type,
-   const struct ttm_place *place,
-   struct ttm_mem_reg *mem,
-   struct ttm_operation_ctx *ctx)
+ const struct ttm_place *place,
+ struct ttm_mem_reg *mem,
+ struct ttm_operation_ctx *ctx)
 {
struct ttm_bo_device *bdev = bo->bdev;
-   struct ttm_mem_type_manager *man = &bdev->man[mem_type];
+   struct ttm_mem_type_manager *man = &bdev->man[mem->mem_type];
int ret;
 
do {
@@ -908,11 +907,11 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
return ret;
if (mem->mm_node)
break;
-   ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
+   ret = ttm_mem_evict_first(bdev, mem->mem_type, place, ctx);
if (unlikely(ret != 0))
return ret;
} while (1);
-   mem->mem_type = mem_type;
+
return ttm_bo_add_move_fence(bo, man, mem);
 }
 
@@ -960,6 +959,51 @@ static bool ttm_bo_mt_compatible(struct ttm_mem_type_manager *man,
return true;
 }
 
+/**
+ * ttm_bo_mem_placement - check if placement is compatible
+ * @bo: BO to find memory for
+ * @place: where to search
+ * @mem: the memory object to fill in
+ * @ctx: operation context
+ *
+ * Check if placement is compatible and fill in mem structure.
+ * Returns -EBUSY if placement won't work or negative error code.
+ * 0 when placement can be used.
+ */
+static int ttm_bo_mem_placement(struct ttm_buffer_object *bo,
+   const struct ttm_place *place,
+   struct ttm_mem_reg *mem,
+   struct ttm_operation_ctx *ctx)
+{
+   struct ttm_bo_device *bdev = bo->bdev;
+   uint32_t mem_type = TTM_PL_SYSTEM;
+   struct ttm_mem_type_manager *man;
+   uint32_t cur_flags = 0;
+   int ret;
+
+   ret = ttm_mem_type_from_place(place, &mem_type);
+   if (ret)
+   return ret;
+
+   man = &bdev->man[mem_type];
+   if (!man->has_type || !man->use_type)
+   return -EBUSY;
+
+   if (!ttm_bo_mt_compatible(man, mem_type, place, &cur_flags))
+   return -EBUSY;
+
+   cur_flags = ttm_bo_select_caching(man, bo->mem.placement, cur_flags);
+   /*
+* Use the access and other non-mapping-related flag bits from
+* the memory placement flags to the current flags
+*/
+   ttm_flag_masked(&cur_flags, place->flags, ~TTM_PL_MASK_MEMTYPE);
+
+   mem->mem_type = mem_type;
+   mem->placement = cur_flags;
+   return 0;
+}
+
 /**
  * Creates space for memory region @mem according to its type.
  *
@@ -974,11 +1018,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
struct ttm_operation_ctx *ctx)
 {
struct ttm_bo_device *bdev = bo->bdev;
-   struct ttm_mem_type_manager *man;
-   uint32_t mem_type = TTM_PL_SYSTEM;
-   uint32_t cur_flags = 0;
bool type_found = false;
-   bool type_ok = false;
int i, ret;
 
ret = reservation_object_reserve_shared(bo->resv, 1);
@@ -988,37 +1028,20 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
mem->mm_node = NULL;
for (i = 0; i < placement->num_placement; ++i) {
const struct ttm_place *place = &placement->placement[i];
+   struct ttm_mem_type_manager *man;
 
-   ret = ttm_mem_type_from_place(place, &mem_type);
+   ret = ttm_bo_mem_placement(bo, place, mem, ctx);
+   if (ret == -EBUSY)
+   continue;
if (ret)
return ret;
-   man = &bdev->man[mem_type];
-   if (!man->has_type || !man->use_type)
-   continue;
-
-   type_ok = ttm_bo_mt_compatible(man, mem_type, place,
-   &cur_flags);
-
-   if (!type_ok)
-   continue;
 
type_found = true;
-   cur_flags = ttm_bo_select_caching(man, bo->mem.placement,
-  

[PATCH 01/10] drm/ttm: Make LRU removal optional v2

2019-05-28 Thread Christian König
We are already doing this for DMA-buf imports and also for
amdgpu VM BOs for quite a while now.

If this doesn't run into any problems we are probably going
to stop removing BOs from the LRU altogether.

v2: drop BUG_ON from ttm_bo_add_to_lru

Signed-off-by: Christian König 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  9 
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  4 ++--
 drivers/gpu/drm/qxl/qxl_release.c |  2 +-
 drivers/gpu/drm/radeon/radeon_gem.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_object.c|  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c  | 23 ++-
 drivers/gpu/drm/ttm/ttm_execbuf_util.c| 20 +---
 drivers/gpu/drm/virtio/virtgpu_ioctl.c|  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c  |  3 ++-
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.h|  2 +-
 include/drm/ttm/ttm_bo_driver.h   |  5 +++-
 include/drm/ttm/ttm_execbuf_util.h|  3 ++-
 14 files changed, 46 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index e1cae4a37113..647e18f9e136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -574,7 +574,7 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);
 
ret = ttm_eu_reserve_buffers(&ctx->ticket, &ctx->list,
-false, &ctx->duplicates);
+false, &ctx->duplicates, true);
if (!ret)
ctx->reserved = true;
else {
@@ -647,7 +647,7 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
}
 
ret = ttm_eu_reserve_buffers(&ctx->ticket, &ctx->list,
-false, &ctx->duplicates);
+false, &ctx->duplicates, true);
if (!ret)
ctx->reserved = true;
else
@@ -1800,7 +1800,8 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
}
 
/* Reserve all BOs and page tables for validation */
-   ret = ttm_eu_reserve_buffers(&ticket, &resv_list, false, &duplicates);
+   ret = ttm_eu_reserve_buffers(&ticket, &resv_list, false, &duplicates,
+true);
WARN(!list_empty(&duplicates), "Duplicates should be empty");
if (ret)
goto out_free;
@@ -2006,7 +2007,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
}
 
ret = ttm_eu_reserve_buffers(&ctx.ticket, &ctx.list,
-false, &duplicate_save);
+false, &duplicate_save, true);
if (ret) {
pr_debug("Memory eviction: TTM Reserve Failed. Try again\n");
goto ttm_reserve_fail;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index d72cc583ebd1..fff558cf385b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}
 
r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-  &duplicates);
+  &duplicates, true);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 54dd02a898b9..06f83cac0d3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
list_add(&csa_tv.head, &list);
amdgpu_vm_get_pd_bo(vm, &list, &pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
+   r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
if (r) {
DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 7b840367004c..d513a5ad03dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
 
amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates);
+   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
if (r) {
dev_err(adev->dev, "leaking bo va because "
"we fail to reserve bo (%d)\n", r);
@@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 
amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates);
+   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
if 

[PATCH 02/10] drm/ttm: return immediately in case of a signal

2019-05-28 Thread Christian König
When a signal arrives we should return immediately for
handling it and not try other placements first.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 06bbcd2679b2..7b59e5ecde7f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -979,7 +979,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
uint32_t cur_flags = 0;
bool type_found = false;
bool type_ok = false;
-   bool has_erestartsys = false;
int i, ret;
 
ret = reservation_object_reserve_shared(bo->resv, 1);
@@ -1070,8 +1069,8 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
mem->placement = cur_flags;
return 0;
}
-   if (ret == -ERESTARTSYS)
-   has_erestartsys = true;
+   if (ret && ret != -EBUSY)
+   return ret;
}
 
if (!type_found) {
@@ -1079,7 +1078,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
return -EINVAL;
}
 
-   return (has_erestartsys) ? -ERESTARTSYS : -ENOMEM;
+   return -ENOMEM;
 }
 EXPORT_SYMBOL(ttm_bo_mem_space);
 
-- 
2.17.1


[PATCH 03/10] drm/ttm: remove manual placement preference

2019-05-28 Thread Christian König
If drivers don't prefer a system memory placement
they should not put it into the placement list first.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 7b59e5ecde7f..702cd89adbf9 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1012,8 +1012,12 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
ttm_flag_masked(&cur_flags, place->flags,
~TTM_PL_MASK_MEMTYPE);
 
-   if (mem_type == TTM_PL_SYSTEM)
-   break;
+   if (mem_type == TTM_PL_SYSTEM) {
+   mem->mem_type = mem_type;
+   mem->placement = cur_flags;
+   mem->mm_node = NULL;
+   return 0;
+   }
 
ret = (*man->func->get_node)(man, bo, place, mem);
if (unlikely(ret))
@@ -1025,16 +1029,12 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
(*man->func->put_node)(man, mem);
return ret;
}
-   break;
+   mem->mem_type = mem_type;
+   mem->placement = cur_flags;
+   return 0;
}
}
 
-   if ((type_ok && (mem_type == TTM_PL_SYSTEM)) || mem->mm_node) {
-   mem->mem_type = mem_type;
-   mem->placement = cur_flags;
-   return 0;
-   }
-
for (i = 0; i < placement->num_busy_placement; ++i) {
const struct ttm_place *place = &placement->busy_placement[i];
 
-- 
2.17.1


Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-28 Thread Koenig, Christian
Am 28.05.19 um 18:10 schrieb Emil Velikov:
> On 2019/05/28, Daniel Vetter wrote:
>> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
>>  wrote:
>>> Am 28.05.19 um 09:38 schrieb Daniel Vetter:
 [SNIP]
> Might be a good idea looking into reverting it partially, so that at
> least command submission and buffer allocation is still blocked.
 I thought the issue is a lot more than vainfo, it's pretty much every
 hacked up compositor under the sun getting this wrong one way or
 another. Thinking about this some more, I also have no idea how you'd
 want to deprecate rendering on primary nodes in general. Apparently
 that breaks -modesetting already, and probably lots more compositors.
 And it looks like we're finally achieving the goal kms set out 10
 years ago, and new compositors are sprouting up all the time. I guess
 we could just break them all (on new hardware) and tell them to all
 suck it up. But I don't think that's a great option. And just
 deprecating this on amdgpu is going to be even harder, since then
 everywhere else it'll keep working, and it's just amdgpu.ko that looks
 broken.

 Aside: I'm not supporting Emil's idea here because it fixes any issues
 Intel has - Intel doesn't care. I support it because reality sucks,
 people get this render vs. primary vs. multi-gpu prime wrong all the
 time (that's also why we have hardcoded display+gpu pairs in mesa for
 the various soc combinations out there), and this looks like a
 pragmatic solution. It'd be nice if every compositor and everything
 else would perfectly support multi gpu and only use render nodes for
 rendering, and only primary nodes for display. But reality is that
 people hack on stuff until gears on screen and then move on to more
 interesting things (to them). So I don't think we'll ever win this :-/
>>> Yeah, but this is a classic case of working around user space issues by
>>> making kernel changes instead of fixing user space.
>>>
>>> Having privileged (output control) and unprivileged (rendering control)
>>> functionality behind the same node is a mistake we have made a long time
>>> ago and render nodes finally seemed to be a way to fix that.
>>>
>>> I mean why are compositors using the primary node in the first place?
>>> Because they want to have access to privileged resources I think and in
>>> this case it is perfectly ok to do so.
>>>
>>> Now extending unprivileged access to the primary node actually sounds
>>> like a step into the wrong direction to me.
>>>
>>> I rather think that we should go down the route of completely dropping
>>> command submission and buffer allocation through the primary node for
>>> non master clients. And then as next step at some point drop support for
>>> authentication/flink.
>>>
>>> I mean we have done this with UMS as well and I don't see much other way
>>> to move forward and get rid of those ancient interface in the long term.
>> Well kms had some really good benefits that drove quick adoption, like
>> "suspend/resume actually has a chance of working" or "comes with
>> buffer management so you can run multiple gears".
>>
>> The render node thing is a lot more niche use case (prime, better priv
>> separation), plus "it's cleaner design". And the "cleaner design" part
>> is something that empirically doesn't seem to matter :-/ Just two
>> examples:
>> - KHR_display/leases just iterated display resources on the fd needed
>> for rendering (and iirc there was even a patch to expose that for
>> render nodes too so it works with DRI3), because implementing
>> protocols is too hard. Barely managed to stop that one before it
>> happened.
>> - Various video players use the vblank ioctl on directly to schedule
>> frames, without telling the compositor. I discovered that when I
>> wanted to limite the vblank ioctl to master clients only. Again,
>> apparently too hard to use the existing extensions, or fix the bugs in
>> there, or whatever. One userspace got fixed last year, but it'll
>> probably get copypasted around forever :-/
>>
>> So I don't think we'll ever manage to roll a clean split out, and best
>> we can do is give in and just hand userspace what it wants. As much as
>> that's misguided and unclean and all that. Maybe it'll result in a
>> least fewer stuff getting run as root to hack around this, because
>> fixing properly seems not to be on the table.
>>
>> The beauty of kms is that we've achieved the mission, everyone's
>> writing their own thing. Which is also terrible, and I don't think
>> it'll get better.
>
> With the risk of coming rude I will repeat my earlier comment:
>
> The problem is _neither_ Intel nor libva specific.
>
>
>
> That said, let's step back for a moment and consider:
>
>   - the "block everything but KMS via the primary node" idea is great but
> orthogonal
>
>   - the series does address issues that are vendor-agnostic
>
>   - by default this series does _not_ cause 

Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-28 Thread Emil Velikov
On 2019/05/28, Daniel Vetter wrote:
> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
>  wrote:
> >
> > Am 28.05.19 um 09:38 schrieb Daniel Vetter:
> > > [SNIP]
> > >> Might be a good idea looking into reverting it partially, so that at
> > >> least command submission and buffer allocation is still blocked.
> > > I thought the issue is a lot more than vainfo, it's pretty much every
> > > hacked up compositor under the sun getting this wrong one way or
> > > another. Thinking about this some more, I also have no idea how you'd
> > > want to deprecate rendering on primary nodes in general. Apparently
> > > that breaks -modesetting already, and probably lots more compositors.
> > > And it looks like we're finally achieving the goal kms set out 10
> > > years ago, and new compositors are sprouting up all the time. I guess
> > > we could just break them all (on new hardware) and tell them to all
> > > suck it up. But I don't think that's a great option. And just
> > > deprecating this on amdgpu is going to be even harder, since then
> > > everywhere else it'll keep working, and it's just amdgpu.ko that looks
> > > broken.
> > >
> > > Aside: I'm not supporting Emil's idea here because it fixes any issues
> > > Intel has - Intel doesn't care. I support it because reality sucks,
> > > people get this render vs. primary vs. multi-gpu prime wrong all the
> > > time (that's also why we have hardcoded display+gpu pairs in mesa for
> > > the various soc combinations out there), and this looks like a
> > > pragmatic solution. It'd be nice if every compositor and everything
> > > else would perfectly support multi gpu and only use render nodes for
> > > rendering, and only primary nodes for display. But reality is that
> > > people hack on stuff until gears on screen and then move on to more
> > > interesting things (to them). So I don't think we'll ever win this :-/
> >
> > Yeah, but this is a classic case of working around user space issues by
> > making kernel changes instead of fixing user space.
> >
> > Having privileged (output control) and unprivileged (rendering control)
> > functionality behind the same node is a mistake we have made a long time
> > ago and render nodes finally seemed to be a way to fix that.
> >
> > I mean why are compositors using the primary node in the first place?
> > Because they want to have access to privileged resources I think and in
> > this case it is perfectly ok to do so.
> >
> > Now extending unprivileged access to the primary node actually sounds
> > like a step into the wrong direction to me.
> >
> > I rather think that we should go down the route of completely dropping
> > command submission and buffer allocation through the primary node for
> > non master clients. And then as next step at some point drop support for
> > authentication/flink.
> >
> > I mean we have done this with UMS as well and I don't see much other way
> > to move forward and get rid of those ancient interface in the long term.
> 
> Well kms had some really good benefits that drove quick adoption, like
> "suspend/resume actually has a chance of working" or "comes with
> buffer management so you can run multiple gears".
> 
> The render node thing is a lot more niche use case (prime, better priv
> separation), plus "it's cleaner design". And the "cleaner design" part
> is something that empirically doesn't seem to matter :-/ Just two
> examples:
> - KHR_display/leases just iterated display resources on the fd needed
> for rendering (and iirc there was even a patch to expose that for
> render nodes too so it works with DRI3), because implementing
> protocols is too hard. Barely managed to stop that one before it
> happened.
> - Various video players use the vblank ioctl on directly to schedule
> frames, without telling the compositor. I discovered that when I
> wanted to limite the vblank ioctl to master clients only. Again,
> apparently too hard to use the existing extensions, or fix the bugs in
> there, or whatever. One userspace got fixed last year, but it'll
> probably get copypasted around forever :-/
> 
> So I don't think we'll ever manage to roll a clean split out, and best
> we can do is give in and just hand userspace what it wants. As much as
> that's misguided and unclean and all that. Maybe it'll result in at
> least fewer stuff getting run as root to hack around this, because
> fixing properly seems not to be on the table.
> 
> The beauty of kms is that we've achieved the mission, everyone's
> writing their own thing. Which is also terrible, and I don't think
> it'll get better.


With the risk of coming rude I will repeat my earlier comment:

The problem is _neither_ Intel nor libva specific.



That said, let's step back for a moment and consider:

 - the "block everything but KMS via the primary node" idea is great but
orthogonal

 - the series does address issues that are vendor-agnostic

 - by default this series does _not_ cause any regression be that for
new or old userspace

 - there 

Re: [PATCH v15 05/17] arm64: untag user pointers passed to memory syscalls

2019-05-28 Thread Dave Martin
On Tue, May 28, 2019 at 04:40:58PM +0100, Catalin Marinas wrote:

[...]

> My thoughts on allowing tags (quick look):
>
> brk - no

[...]

> mlock, mlock2, munlock - yes
> mmap - no (we may change this with MTE but not for TBI)

[...]

> mprotect - yes

I haven't been following this discussion closely... what's the rationale for
the inconsistencies here (feel free to refer me back to the discussion
if it's elsewhere).

Cheers
---Dave


Re: [PATCH v15 05/17] arm64: untag user pointers passed to memory syscalls

2019-05-28 Thread Catalin Marinas
On Tue, May 28, 2019 at 03:54:11PM +0100, Andrew Murray wrote:
> On Mon, May 27, 2019 at 03:37:20PM +0100, Catalin Marinas wrote:
> > On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> > > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > > pass tagged user pointers (with the top byte set to something else other
> > > than 0x00) as syscall arguments.
> > > 
> > > This patch allows tagged pointers to be passed to the following memory
> > > syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2,
> > > mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
> > > remap_file_pages, shmat and shmdt.
> > > 
> > > This is done by untagging pointers passed to these syscalls in the
> > > prologues of their handlers.
> > > 
> > > Signed-off-by: Andrey Konovalov 
> > 
> > Actually, I don't think any of these wrappers get called (have you
> > tested this patch?). Following commit 4378a7d4be30 ("arm64: implement
> > syscall wrappers"), I think we have other macro names for overriding the
> > sys_* ones.
> 
> What is the value in adding these wrappers?

Not much value, initially proposed just to keep the core changes small.
I'm fine with moving them back in the generic code (but see below).

I think another aspect is how we define the ABI. Is allowing tags to
mlock() for example something specific to arm64 or would sparc ADI need
the same? In the absence of other architectures defining such ABI, my
preference would be to keep the wrappers in the arch code.

Assuming sparc won't implement untagged_addr(), we can place the macros
back in the generic code but, as per the review here, we need to be more
restrictive on where we allow tagged addresses. For example, if mmap()
gets a tagged address with MAP_FIXED, is it expected to return the tag?

My thoughts on allowing tags (quick look):

brk - no
get_mempolicy - yes
madvise - yes
mbind - yes
mincore - yes
mlock, mlock2, munlock - yes
mmap - no (we may change this with MTE but not for TBI)
mmap_pgoff - not used on arm64
mprotect - yes
mremap - yes for old_address, no for new_address (on par with mmap)
msync - yes
munmap - probably no (mmap does not return tagged ptrs)
remap_file_pages - no (also deprecated syscall)
shmat, shmdt - shall we allow tagged addresses on shared memory?

The above is only about the TBI ABI while ignoring hardware MTE. For the
latter, we may want to change the mmap() to allow pre-colouring on page
fault which means that munmap()/mprotect() should also support tagged
pointers. Possibly mremap() as well but we need to decide whether it
should allow re-colouring the page (probably no, in which case
old_address and new_address should have the same tag). For some of these
we'll end up with arm64 specific wrappers again, unless sparc ADI adopts
exactly the same ABI restrictions.

-- 
Catalin


Re: [PATCH v15 05/17] arm64: untag user pointers passed to memory syscalls

2019-05-28 Thread Andrew Murray
On Mon, May 27, 2019 at 03:37:20PM +0100, Catalin Marinas wrote:
> On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > pass tagged user pointers (with the top byte set to something else other
> > than 0x00) as syscall arguments.
> > 
> > This patch allows tagged pointers to be passed to the following memory
> > syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2,
> > mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
> > remap_file_pages, shmat and shmdt.
> > 
> > This is done by untagging pointers passed to these syscalls in the
> > prologues of their handlers.
> > 
> > Signed-off-by: Andrey Konovalov 
> 
> Actually, I don't think any of these wrappers get called (have you
> tested this patch?). Following commit 4378a7d4be30 ("arm64: implement
> syscall wrappers"), I think we have other macro names for overriding the
> sys_* ones.

What is the value in adding these wrappers?

The untagged_addr macro is defined for all in linux/mm.h and these patches
already use untagged_addr in generic code. Thus adding a few more
untagged_addr in the generic syscall handlers (which turn to a nop for most)
is surely better than adding wrappers?

Even if other architectures implement untagged_addr in the future it would
be more consistent if they untagged in the same places and thus not adding
these wrappers enforces that.

Thanks,

Andrew Murray

> 
> -- 
> Catalin
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-28 Thread Andrey Konovalov
Thanks for a lot of valuable input! I've read through all the replies
and got somewhat lost. What are the changes I need to do to this
series?

1. Should I move untagging for memory syscalls back to the generic
code so other arches would make use of it as well, or should I keep
the arm64 specific memory syscalls wrappers and address the comments
on that patch?

2. Should I make untagging opt-in and controlled by a command line argument?

3. Should I "add Documentation/core-api/user-addresses.rst to describe
proper care and handling of user space pointers with untagged_addr(),
with examples based on all the cases seen so far in this series"?
Which examples specifically should it cover?

Is there something else?


RX 580 and 5K displays, bandwidth validation failed with multiple monitors

2019-05-28 Thread Gaël HERMET

Hi,

I have been facing an issue with my 5K display (iiyama ProLite 
XB2779QQS-S1).


It works fine as long as it is the only active monitor; as soon as I
activate another monitor, the main one (5K) can't display more than 4K.


Debugging with "echo 0x4 > /sys/module/drm/parameters/debug" shows this:
mai 23 09:01:22 bureau-gael /usr/lib/gdm3/gdm-x-session[3465]: (EE) 
AMDGPU(0): failed to set mode: Invalid argument
mai 23 09:01:22 bureau-gael kernel: [drm:dce112_validate_bandwidth 
[amdgpu]] dce112_validate_bandwidth: Bandwidth validation failed!


I disabled the check by forcing is_display_configuration_supported to 
return true in dce_calcs.c and it works fine.


Is there anything I can do to correct this bandwidth calculation?



Re: [PATCH 02/12] dma-buf: add dma_buf_(un)map_attachment_locked variants v3

2019-05-28 Thread Hillf Danton

On Mon, 27 May 2019 18:56:20 +0800 Christian Koenig wrote:
> Thanks for the comments, but you are looking at a completely outdated 
> patchset.
> 
> If you are interested in the newest one please ping me and I'm going to CC you
> when I send out the next version.
> 
Ping...

Thanks
Hillf


[PATCH] drm/amd/amdgpu: Remove duplicate header include

2019-05-28 Thread Hariprasad Kelam
Remove the duplicate include of soc15.h. Issue identified by includecheck.

Signed-off-by: Hariprasad Kelam 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index c763733..d723332 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -34,7 +34,6 @@
 #include "vega10_enum.h"
 #include "hdp/hdp_4_0_offset.h"
 
-#include "soc15.h"
 #include "soc15_common.h"
 #include "clearstate_gfx9.h"
 #include "v9_structs.h"
-- 
2.7.4


Re: [PATCH v15 05/17] arm64: untag user pointers passed to memory syscalls

2019-05-28 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
>  /*
>   * Wrappers to pass the pt_regs argument.
>   */
>  #define sys_personality		sys_arm64_personality
> +#define sys_mmap_pgoff		sys_arm64_mmap_pgoff
> +#define sys_mremap		sys_arm64_mremap
> +#define sys_munmap		sys_arm64_munmap
> +#define sys_brk			sys_arm64_brk
> +#define sys_get_mempolicy	sys_arm64_get_mempolicy
> +#define sys_madvise		sys_arm64_madvise
> +#define sys_mbind		sys_arm64_mbind
> +#define sys_mlock		sys_arm64_mlock
> +#define sys_mlock2		sys_arm64_mlock2
> +#define sys_munlock		sys_arm64_munlock
> +#define sys_mprotect		sys_arm64_mprotect
> +#define sys_msync		sys_arm64_msync
> +#define sys_mincore		sys_arm64_mincore
> +#define sys_remap_file_pages	sys_arm64_remap_file_pages
> +#define sys_shmat		sys_arm64_shmat
> +#define sys_shmdt		sys_arm64_shmdt

This hunk should be (I sent a separate patch for sys_personality):

@@ -160,23 +163,23 @@ SYSCALL_DEFINE1(arm64_shmdt, char __user *, shmaddr)
 /*
  * Wrappers to pass the pt_regs argument.
  */
-#define sys_personality		sys_arm64_personality
-#define sys_mmap_pgoff		sys_arm64_mmap_pgoff
-#define sys_mremap		sys_arm64_mremap
-#define sys_munmap		sys_arm64_munmap
-#define sys_brk			sys_arm64_brk
-#define sys_get_mempolicy	sys_arm64_get_mempolicy
-#define sys_madvise		sys_arm64_madvise
-#define sys_mbind		sys_arm64_mbind
-#define sys_mlock		sys_arm64_mlock
-#define sys_mlock2		sys_arm64_mlock2
-#define sys_munlock		sys_arm64_munlock
-#define sys_mprotect		sys_arm64_mprotect
-#define sys_msync		sys_arm64_msync
-#define sys_mincore		sys_arm64_mincore
-#define sys_remap_file_pages	sys_arm64_remap_file_pages
-#define sys_shmat		sys_arm64_shmat
-#define sys_shmdt		sys_arm64_shmdt
+#define __arm64_sys_personality		__arm64_sys_arm64_personality
+#define __arm64_sys_mmap_pgoff		__arm64_sys_arm64_mmap_pgoff
+#define __arm64_sys_mremap		__arm64_sys_arm64_mremap
+#define __arm64_sys_munmap		__arm64_sys_arm64_munmap
+#define __arm64_sys_brk			__arm64_sys_arm64_brk
+#define __arm64_sys_get_mempolicy	__arm64_sys_arm64_get_mempolicy
+#define __arm64_sys_madvise		__arm64_sys_arm64_madvise
+#define __arm64_sys_mbind		__arm64_sys_arm64_mbind
+#define __arm64_sys_mlock		__arm64_sys_arm64_mlock
+#define __arm64_sys_mlock2		__arm64_sys_arm64_mlock2
+#define __arm64_sys_munlock		__arm64_sys_arm64_munlock
+#define __arm64_sys_mprotect		__arm64_sys_arm64_mprotect
+#define __arm64_sys_msync		__arm64_sys_arm64_msync
+#define __arm64_sys_mincore		__arm64_sys_arm64_mincore
+#define __arm64_sys_remap_file_pages	__arm64_sys_arm64_remap_file_pages
+#define __arm64_sys_shmat		__arm64_sys_arm64_shmat
+#define __arm64_sys_shmdt		__arm64_sys_arm64_shmdt
 
 asmlinkage long sys_ni_syscall(const struct pt_regs *);
 #define __arm64_sys_ni_syscall sys_ni_syscall

-- 
Catalin


Re: [PATCH] drm/amdgpu/display: Fix reload driver error

2019-05-28 Thread Kazlauskas, Nicholas
On 5/28/19 6:28 AM, Emily Deng wrote:
> Issue:
> The following error shows up when reloading the driver:
> [ 3986.567739] sysfs: cannot create duplicate filename 
> '/devices/pci:00/:00:07.0/drm_dp_aux_dev'
> [ 3986.567743] CPU: 6 PID: 1767 Comm: modprobe Tainted: G   OE 
> 5.0.0-rc1-custom #1
> [ 3986.567745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> [ 3986.567746] Call Trace:
> ..
> [ 3986.567808]  drm_dp_aux_register_devnode+0xdc/0x140 [drm_kms_helper]
> ..
> [ 3986.569081] kobject_add_internal failed for drm_dp_aux_dev with -EEXIST, 
> don't try to register things with the same name in the same directory.
> 
> Reproduce sequences:
> 1.modprobe amdgpu
> 2.modprobe -r amdgpu
> 3.modprobe amdgpu
> 
> Root cause:
> When unloading the driver, the aux is not unregistered.
> 
> Signed-off-by: Emily Deng 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h   |  1 +
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 18 
> +-
>   .../drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c|  6 +-
>   3 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> index e48fd35..720955b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> @@ -525,6 +525,7 @@ enum amdgpu_connector_dither {
>   struct amdgpu_dm_dp_aux {
>   struct drm_dp_aux aux;
>   struct ddc_service *ddc_service;
> + bool has_aux;
>   };
>   
>   struct amdgpu_i2c_adapter {
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 8fe1685..de369ae 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3760,6 +3760,16 @@ int amdgpu_dm_connector_atomic_get_property(struct 
> drm_connector *connector,
>   return ret;
>   }
>   
> +static void amdgpu_dm_connector_unregister(struct drm_connector *connector)
> +{
> + struct amdgpu_dm_connector *amdgpu_dm_connector = 
> to_amdgpu_dm_connector(connector);
> +
> + if (amdgpu_dm_connector->dm_dp_aux.has_aux) {
> +	drm_dp_aux_unregister(&amdgpu_dm_connector->dm_dp_aux.aux);
> + amdgpu_dm_connector->dm_dp_aux.has_aux = false;
> + }
> +}

This is a good catch, but I'm pretty sure we don't need the has_aux 
state here for this.

You should be able to call drm_dp_aux_unregister regardless of whether 
drm_dp_aux_register succeeded or not.

Nicholas Kazlauskas

> +
>   static void amdgpu_dm_connector_destroy(struct drm_connector *connector)
>   {
>   struct amdgpu_dm_connector *aconnector = 
> to_amdgpu_dm_connector(connector);
> @@ -3788,6 +3798,11 @@ static void amdgpu_dm_connector_destroy(struct 
> drm_connector *connector)
> drm_dp_cec_unregister_connector(&aconnector->dm_dp_aux.aux);
>   drm_connector_unregister(connector);
>   drm_connector_cleanup(connector);
> + if (aconnector->i2c) {
> +	i2c_del_adapter(&aconnector->i2c->base);
> + kfree(aconnector->i2c);
> + }
> +
>   kfree(connector);
>   }
>   
> @@ -3846,7 +3861,8 @@ static const struct drm_connector_funcs 
> amdgpu_dm_connector_funcs = {
>   .atomic_duplicate_state = amdgpu_dm_connector_atomic_duplicate_state,
>   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
>   .atomic_set_property = amdgpu_dm_connector_atomic_set_property,
> - .atomic_get_property = amdgpu_dm_connector_atomic_get_property
> + .atomic_get_property = amdgpu_dm_connector_atomic_get_property,
> + .early_unregister = amdgpu_dm_connector_unregister
>   };
>   
>   static int get_modes(struct drm_connector *connector)
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> index 6e205ee..190e92c 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> @@ -387,12 +387,16 @@ static const struct drm_dp_mst_topology_cbs dm_mst_cbs 
> = {
>   void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
>  struct amdgpu_dm_connector *aconnector)
>   {
> + int ret;
>   aconnector->dm_dp_aux.aux.name = "dmdc";
>   aconnector->dm_dp_aux.aux.dev = dm->adev->dev;
>   aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
>   aconnector->dm_dp_aux.ddc_service = aconnector->dc_link->ddc;
>   
> -	drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
> +	ret = drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
> + if (!ret)
> + aconnector->dm_dp_aux.has_aux = true;
> +
> drm_dp_cec_register_connector(&aconnector->dm_dp_aux.aux,
> aconnector->base.name, dm->adev->dev);
> aconnector->mst_mgr.cbs = &dm_mst_cbs;
> 


[PATCH] drm/amdgpu/display: Fix reload driver error

2019-05-28 Thread Emily Deng
Issue:
The following error shows up when reloading the driver:
[ 3986.567739] sysfs: cannot create duplicate filename 
'/devices/pci:00/:00:07.0/drm_dp_aux_dev'
[ 3986.567743] CPU: 6 PID: 1767 Comm: modprobe Tainted: G   OE 
5.0.0-rc1-custom #1
[ 3986.567745] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3986.567746] Call Trace:
..
[ 3986.567808]  drm_dp_aux_register_devnode+0xdc/0x140 [drm_kms_helper]
..
[ 3986.569081] kobject_add_internal failed for drm_dp_aux_dev with -EEXIST, 
don't try to register things with the same name in the same directory.

Reproduce sequences:
1.modprobe amdgpu
2.modprobe -r amdgpu
3.modprobe amdgpu

Root cause:
When unloading the driver, the aux is not unregistered.

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h   |  1 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 18 +-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c|  6 +-
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index e48fd35..720955b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -525,6 +525,7 @@ enum amdgpu_connector_dither {
 struct amdgpu_dm_dp_aux {
struct drm_dp_aux aux;
struct ddc_service *ddc_service;
+   bool has_aux;
 };
 
 struct amdgpu_i2c_adapter {
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 8fe1685..de369ae 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3760,6 +3760,16 @@ int amdgpu_dm_connector_atomic_get_property(struct 
drm_connector *connector,
return ret;
 }
 
+static void amdgpu_dm_connector_unregister(struct drm_connector *connector)
+{
+   struct amdgpu_dm_connector *amdgpu_dm_connector = 
to_amdgpu_dm_connector(connector);
+
+   if (amdgpu_dm_connector->dm_dp_aux.has_aux) {
+		drm_dp_aux_unregister(&amdgpu_dm_connector->dm_dp_aux.aux);
+   amdgpu_dm_connector->dm_dp_aux.has_aux = false;
+   }
+}
+
 static void amdgpu_dm_connector_destroy(struct drm_connector *connector)
 {
struct amdgpu_dm_connector *aconnector = 
to_amdgpu_dm_connector(connector);
@@ -3788,6 +3798,11 @@ static void amdgpu_dm_connector_destroy(struct 
drm_connector *connector)
	drm_dp_cec_unregister_connector(&aconnector->dm_dp_aux.aux);
drm_connector_unregister(connector);
drm_connector_cleanup(connector);
+   if (aconnector->i2c) {
+		i2c_del_adapter(&aconnector->i2c->base);
+   kfree(aconnector->i2c);
+   }
+
kfree(connector);
 }
 
@@ -3846,7 +3861,8 @@ static const struct drm_connector_funcs 
amdgpu_dm_connector_funcs = {
.atomic_duplicate_state = amdgpu_dm_connector_atomic_duplicate_state,
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
.atomic_set_property = amdgpu_dm_connector_atomic_set_property,
-   .atomic_get_property = amdgpu_dm_connector_atomic_get_property
+   .atomic_get_property = amdgpu_dm_connector_atomic_get_property,
+   .early_unregister = amdgpu_dm_connector_unregister
 };
 
 static int get_modes(struct drm_connector *connector)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 6e205ee..190e92c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -387,12 +387,16 @@ static const struct drm_dp_mst_topology_cbs dm_mst_cbs = {
 void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
   struct amdgpu_dm_connector *aconnector)
 {
+   int ret;
aconnector->dm_dp_aux.aux.name = "dmdc";
aconnector->dm_dp_aux.aux.dev = dm->adev->dev;
aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
aconnector->dm_dp_aux.ddc_service = aconnector->dc_link->ddc;
 
-	drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
+	ret = drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
+   if (!ret)
+   aconnector->dm_dp_aux.has_aux = true;
+
	drm_dp_cec_register_connector(&aconnector->dm_dp_aux.aux,
  aconnector->base.name, dm->adev->dev);
	aconnector->mst_mgr.cbs = &dm_mst_cbs;
-- 
2.7.4


RE: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

2019-05-28 Thread Deng, Emily
Ping ..

Best wishes
Emily Deng



>-Original Message-
>From: amd-gfx  On Behalf Of Emily
>Deng
>Sent: Tuesday, May 28, 2019 4:06 PM
>To: amd-gfx@lists.freedesktop.org
>Cc: Deng, Emily 
>Subject: [PATCH] drm/amdgpu:Fix the unpin warning about csb buffer
>
>The clear_state_obj is already unpinned by the time gfx_v9_0_sw_fini destroys
>it, so calling amdgpu_bo_free_kernel there would unpin it again and trigger an
>unpin warning; use amdgpu_bo_unref instead.
>
>Signed-off-by: Emily Deng 
>---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>index c763733..cc5a382 100644
>--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>@@ -1794,9 +1794,7 @@ static int gfx_v9_0_sw_fini(void *handle)
>
>   gfx_v9_0_mec_fini(adev);
>   gfx_v9_0_ngg_fini(adev);
>-	amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj,
>-			&adev->gfx.rlc.clear_state_gpu_addr,
>-			(void **)&adev->gfx.rlc.cs_ptr);
>+	amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
>   if (adev->asic_type == CHIP_RAVEN) {
>		amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj,
>				&adev->gfx.rlc.cp_table_gpu_addr,
>--
>2.7.4
>

RE: [PATCH] drm/amdgpu: reserve stolen vram for raven series

2019-05-28 Thread Huang, Ray
> -Original Message-
> From: amd-gfx  On Behalf Of Cui,
> Flora
> Sent: Tuesday, May 28, 2019 3:55 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Cui, Flora 
> Subject: [PATCH] drm/amdgpu: reserve stolen vram for raven series
> 
> to avoid screen corruption during modprobe.
> 
> Change-Id: I8671de6ed46285585dbe866832c6d2b835ca37f3
> Signed-off-by: Flora Cui 

Thanks. Yes, on the Raven series we still see corruption at the start of VRAM.

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 602593b..dbc83a1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -624,9 +624,8 @@ static bool gmc_v9_0_keep_stolen_memory(struct
> amdgpu_device *adev)
>*/
>   switch (adev->asic_type) {
>   case CHIP_VEGA10:
> - return true;
>   case CHIP_RAVEN:
> - return (adev->pdev->device == 0x15d8);
> + return true;
>   case CHIP_VEGA12:
>   case CHIP_VEGA20:
>   default:
> --
> 2.7.4
> 

Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-28 Thread Daniel Vetter
On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
 wrote:
>
> Am 28.05.19 um 09:38 schrieb Daniel Vetter:
> > [SNIP]
> >> Might be a good idea looking into reverting it partially, so that at
> >> least command submission and buffer allocation is still blocked.
> > I thought the issue is a lot more than vainfo, it's pretty much every
> > hacked up compositor under the sun getting this wrong one way or
> > another. Thinking about this some more, I also have no idea how you'd
> > want to deprecate rendering on primary nodes in general. Apparently
> > that breaks -modesetting already, and probably lots more compositors.
> > And it looks like we've finally achieved the goal kms set out 10
> > years ago, and new compositors are sprouting up all the time. I guess
> > we could just break them all (on new hardware) and tell them to all
> > suck it up. But I don't think that's a great option. And just
> > deprecating this on amdgpu is going to be even harder, since then
> > everywhere else it'll keep working, and it's just amdgpu.ko that looks
> > broken.
> >
> > Aside: I'm not supporting Emil's idea here because it fixes any issues
> > Intel has - Intel doesn't care. I support it because reality sucks,
> > people get this render vs. primary vs. multi-gpu prime wrong all the
> > time (that's also why we have hardcoded display+gpu pairs in mesa for
> > the various soc combinations out there), and this looks like a
> > pragmatic solution. It'd be nice if every compositor and everything
> > else would perfectly support multi gpu and only use render nodes for
> > rendering, and only primary nodes for display. But reality is that
> > people hack on stuff until gears on screen and then move on to more
> > interesting things (to them). So I don't think we'll ever win this :-/
>
> Yeah, but this is a classic case of working around user space issues by
> making kernel changes instead of fixing user space.
>
> Having privileged (output control) and unprivileged (rendering control)
> functionality behind the same node is a mistake we have made a long time
> ago and render nodes finally seemed to be a way to fix that.
>
> I mean why are compositors using the primary node in the first place?
> Because they want to have access to privileged resources I think and in
> this case it is perfectly ok to do so.
>
> Now extending unprivileged access to the primary node actually sounds
> like a step into the wrong direction to me.
>
> I rather think that we should go down the route of completely dropping
> command submission and buffer allocation through the primary node for
> non master clients. And then as next step at some point drop support for
> authentication/flink.
>
> I mean we have done this with UMS as well and I don't see much other way
> to move forward and get rid of those ancient interface in the long term.

Well kms had some really good benefits that drove quick adoption, like
"suspend/resume actually has a chance of working" or "comes with
buffer management so you can run multiple gears".

The render node thing is a lot more niche use case (prime, better priv
separation), plus "it's cleaner design". And the "cleaner design" part
is something that empirically doesn't seem to matter :-/ Just two
examples:
- KHR_display/leases just iterated display resources on the fd needed
for rendering (and iirc there was even a patch to expose that for
render nodes too so it works with DRI3), because implementing
protocols is too hard. Barely managed to stop that one before it
happened.
- Various video players use the vblank ioctl on directly to schedule
frames, without telling the compositor. I discovered that when I
wanted to limit the vblank ioctl to master clients only. Again,
apparently too hard to use the existing extensions, or fix the bugs in
there, or whatever. One userspace got fixed last year, but it'll
probably get copypasted around forever :-/

So I don't think we'll ever manage to roll a clean split out, and best
we can do is give in and just hand userspace what it wants. As much as
that's misguided and unclean and all that. Maybe it'll result in a
least fewer stuff getting run as root to hack around this, because
fixing properly seems not to be on the table.

The beauty of kms is that we've achieved the mission, everyone's
writing their own thing. Which is also terrible, and I don't think
it'll get better.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[PATCH] drm/amdgpu:Fix the unpin warning about csb buffer

2019-05-28 Thread Emily Deng
The clear_state_obj is already unpinned by the time gfx_v9_0_sw_fini destroys
it, so there is no need to call amdgpu_bo_free_kernel there; doing so would
unpin it again and trigger an unpin warning. Use amdgpu_bo_unref instead.

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index c763733..cc5a382 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1794,9 +1794,7 @@ static int gfx_v9_0_sw_fini(void *handle)
 
gfx_v9_0_mec_fini(adev);
gfx_v9_0_ngg_fini(adev);
-	amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj,
-			&adev->gfx.rlc.clear_state_gpu_addr,
-			(void **)&adev->gfx.rlc.cs_ptr);
+	amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
if (adev->asic_type == CHIP_RAVEN) {
		amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj,
				&adev->gfx.rlc.cp_table_gpu_addr,
-- 
2.7.4


Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-28 Thread Koenig, Christian
Am 28.05.19 um 09:38 schrieb Daniel Vetter:
> [SNIP]
>> Might be a good idea looking into reverting it partially, so that at
>> least command submission and buffer allocation is still blocked.
> I thought the issue is a lot more than vainfo, it's pretty much every
> hacked up compositor under the sun getting this wrong one way or
> another. Thinking about this some more, I also have no idea how you'd
> want to deprecate rendering on primary nodes in general. Apparently
> that breaks -modesetting already, and probably lots more compositors.
> And it looks like we've finally achieved the goal kms set out 10
> years ago, and new compositors are sprouting up all the time. I guess
> we could just break them all (on new hardware) and tell them to all
> suck it up. But I don't think that's a great option. And just
> deprecating this on amdgpu is going to be even harder, since then
> everywhere else it'll keep working, and it's just amdgpu.ko that looks
> broken.
>
> Aside: I'm not supporting Emil's idea here because it fixes any issues
> Intel has - Intel doesn't care. I support it because reality sucks,
> people get this render vs. primary vs. multi-gpu prime wrong all the
> time (that's also why we have hardcoded display+gpu pairs in mesa for
> the various soc combinations out there), and this looks like a
> pragmatic solution. It'd be nice if every compositor and everything
> else would perfectly support multi gpu and only use render nodes for
> rendering, and only primary nodes for display. But reality is that
> people hack on stuff until gears on screen and then move on to more
> interesting things (to them). So I don't think we'll ever win this :-/

Yeah, but this is a classic case of working around user space issues by 
making kernel changes instead of fixing user space.

Having privileged (output control) and unprivileged (rendering control) 
functionality behind the same node is a mistake we have made a long time 
ago and render nodes finally seemed to be a way to fix that.

I mean why are compositors using the primary node in the first place? 
Because they want to have access to privileged resources I think and in 
this case it is perfectly ok to do so.

Now extending unprivileged access to the primary node actually sounds 
like a step into the wrong direction to me.

I rather think that we should go down the route of completely dropping 
command submission and buffer allocation through the primary node for 
non master clients. And then as next step at some point drop support for 
authentication/flink.

I mean we have done this with UMS as well and I don't see much other way 
to move forward and get rid of those ancient interface in the long term.

Regards,
Christian.

> -Daniel


[PATCH] drm/amdgpu: reserve stolen vram for raven series

2019-05-28 Thread Cui, Flora
to avoid screen corruption during modprobe.

Change-Id: I8671de6ed46285585dbe866832c6d2b835ca37f3
Signed-off-by: Flora Cui 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 602593b..dbc83a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -624,9 +624,8 @@ static bool gmc_v9_0_keep_stolen_memory(struct 
amdgpu_device *adev)
 */
switch (adev->asic_type) {
case CHIP_VEGA10:
-   return true;
case CHIP_RAVEN:
-   return (adev->pdev->device == 0x15d8);
+   return true;
case CHIP_VEGA12:
case CHIP_VEGA20:
default:
-- 
2.7.4


RE: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin

2019-05-28 Thread Deng, Emily
>-Original Message-
>From: Koenig, Christian 
>Sent: Tuesday, May 28, 2019 3:43 PM
>To: Deng, Emily ; Quan, Evan
>; amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>
>Am 28.05.19 um 09:38 schrieb Deng, Emily:
>>> -Original Message-
>>> From: Koenig, Christian 
>>> Sent: Tuesday, May 28, 2019 3:04 PM
>>> To: Quan, Evan ; Deng, Emily
>;
>>> amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>>>
>>> Ok in this case the patch is a NAK.
>>>
>>> The correct solution is to stop using amdgpu_bo_free_kernel in
>>> gfx_v9_0_sw_fini.
>> So we would just leak the memory here and not destroy the bo? I don't think
>> it is correct.
>
>Oh, no. That's not what I meant.
>
>We should stop using amdgpu_bo_free_kernel and instead use
>amdgpu_bo_free!

>Sorry for not being clear here,
>Christian.
Thanks for your good suggestion.  Will revert this patch, and submit another 
patch.

Best wishes
Emily Deng
>
>>> BTW: Are we using the kernel pointer somewhere? Cause that one
>became
>>> completely invalid because of patch "drm/amdgpu: pin the csb buffer
>>> on hw init".
>>>
>>> Christian.
>>>
>>> Am 28.05.19 um 03:42 schrieb Quan, Evan:
 The original unpin in hw_fini was introduced by
 https://lists.freedesktop.org/archives/amd-gfx/2018-July/023681.html

 Evan
> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian K?nig
> Sent: Monday, May 27, 2019 7:02 PM
> To: Deng, Emily ; amd-
>g...@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>
> Am 27.05.19 um 10:41 schrieb Emily Deng:
>> As it will destroy clear_state_obj, and also will unpin it in the
>> gfx_v9_0_sw_fini, so don't need to call csb_vram unpin in
>> gfx_v9_0_hw_fini, or it will have unpin warning.
>>
>> v2: For suspend, still need to do unpin
>>
>> Signed-off-by: Emily Deng 
>> ---
>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> index 5eb70e8..5b1ff48 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> @@ -3395,7 +3395,8 @@ static int gfx_v9_0_hw_fini(void *handle)
>>  gfx_v9_0_cp_enable(adev, false);
>>  adev->gfx.rlc.funcs->stop(adev);
>>
>> -	gfx_v9_0_csb_vram_unpin(adev);
>> +	if (adev->in_suspend)
>> +		gfx_v9_0_csb_vram_unpin(adev);
> That doesn't looks like a good idea to me.
>
> Why do we have unpin both in the sw_fini as well as the hw_fini
> code
>>> paths?
> Regards,
> Christian.
>
>>  return 0;
>> }


Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin

2019-05-28 Thread Koenig, Christian
Am 28.05.19 um 09:38 schrieb Deng, Emily:
>> -Original Message-
>> From: Koenig, Christian 
>> Sent: Tuesday, May 28, 2019 3:04 PM
>> To: Quan, Evan ; Deng, Emily
>> ; amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>>
>> Ok in this case the patch is a NAK.
>>
>> The correct solution is to stop using amdgpu_bo_free_kernel in
>> gfx_v9_0_sw_fini.
> So we would just leak the memory here and not destroy the bo? I don't think it
> is correct.

Oh, no. That's not what I meant.

We should stop using amdgpu_bo_free_kernel and instead use amdgpu_bo_free!

Sorry for not being clear here,
Christian.

>> BTW: Are we using the kernel pointer somewhere? Cause that one became
>> completely invalid because of patch "drm/amdgpu: pin the csb buffer on hw
>> init".
>>
>> Christian.
>>
>> Am 28.05.19 um 03:42 schrieb Quan, Evan:
>>> The original unpin in hw_fini was introduced by
>>> https://lists.freedesktop.org/archives/amd-gfx/2018-July/023681.html
>>>
>>> Evan
 -Original Message-
 From: amd-gfx  On Behalf Of
 Christian K?nig
 Sent: Monday, May 27, 2019 7:02 PM
 To: Deng, Emily ; amd-gfx@lists.freedesktop.org
 Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin

 Am 27.05.19 um 10:41 schrieb Emily Deng:
> As it will destroy clear_state_obj, and also will unpin it in the
> gfx_v9_0_sw_fini, so don't need to call csb_vram unpin in
> gfx_v9_0_hw_fini, or it will have unpin warning.
>
> v2: For suspend, still need to do unpin
>
> Signed-off-by: Emily Deng 
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index 5eb70e8..5b1ff48 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -3395,7 +3395,8 @@ static int gfx_v9_0_hw_fini(void *handle)
>   gfx_v9_0_cp_enable(adev, false);
>   adev->gfx.rlc.funcs->stop(adev);
>
> -	gfx_v9_0_csb_vram_unpin(adev);
> +	if (adev->in_suspend)
> +		gfx_v9_0_csb_vram_unpin(adev);
 That doesn't looks like a good idea to me.

 Why do we have unpin both in the sw_fini as well as the hw_fini code
>> paths?
 Regards,
 Christian.

>   return 0;
> }


Re: [PATCH] drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu

2019-05-28 Thread Christian König

Reviewed-by: Christian König 

Am 28.05.19 um 05:05 schrieb Cui, Flora:

the patch is Reviewed-by: Flora Cui 

On 5/28/2019 10:52 AM, Chunming Zhou wrote:

Change-Id: I2b1af1478fbddbb5084b90b3ff85c2eb964bd217
Signed-off-by: Chunming Zhou 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 78706dfa753a..1f38d6fc1fe3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1307,7 +1307,8 @@ static struct drm_driver kms_driver = {
.driver_features =
DRIVER_USE_AGP | DRIVER_ATOMIC |
DRIVER_GEM |
-   DRIVER_PRIME | DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ,
+   DRIVER_PRIME | DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ |
+   DRIVER_SYNCOBJ_TIMELINE,
.load = amdgpu_driver_load_kms,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,




Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-28 Thread Daniel Vetter
On Tue, May 28, 2019 at 8:58 AM Koenig, Christian
 wrote:
>
> On 27.05.19 at 17:26, Daniel Vetter wrote:
> > On Mon, May 27, 2019 at 3:42 PM Koenig, Christian
> >  wrote:
> >> On 27.05.19 at 15:26, Emil Velikov wrote:
> >>> On 2019/05/27, Daniel Vetter wrote:
>  On Mon, May 27, 2019 at 10:47:39AM +, Koenig, Christian wrote:
> > On 27.05.19 at 10:17, Emil Velikov wrote:
> >> From: Emil Velikov 
> >>
> >> Currently one can circumvent DRM_AUTH, when the ioctl is exposed via 
> >> the
> >> render node. A seemingly deliberate design decision.
> >>
> >> Hence we can drop the DRM_AUTH all together (details in follow-up 
> >> patch)
> >> yet not all userspace checks if it's authenticated, but instead uses
> >> uncommon assumptions.
> >>
> >> After days of digging through git log and testing, only a single 
> >> (ab)use
> >> was spotted - the Mesa RADV driver, using the AMDGPU_INFO ioctl and
> >> assuming that failure implies lack of authentication.
> >>
> >> Affected versions are:
> >> - the whole 18.2.x series, which is EOL
> >> - the whole 18.3.x series, which is EOL
> >> - the 19.0.x series, prior to 19.0.4
> > Well you could have saved your time, cause this is still a NAK.
> >
> > For the record: I strongly think that we don't want to expose any render
> > functionality on the primary node.
> >
> > To even go a step further I would say that at least on AMD hardware we
> > should completely disable DRI2 for one of the next generations.
> >
> > As a follow up I would then completely disallow the DRM authentication
> > for amdgpu, so that the command submission interface on the primary node
> > can only be used by the display server.
>  So amdgpu is running in one direction, while everyone else is running in
>  the other direction? Please note that your patch to change i915 landed
>  already, so that ship is sailing (but we could ofc revert that back
>  again).
> 
>  Imo really not a great idea if we do a amdgpu vs. everyone else split
>  here. If we want to deprecate dri2/flink/rendering on primary nodes 
>  across
>  the stack, then that should be a stack wide decision, not an amdgpu one.
> 
> >>> Cannot agree more - I would love to see drivers stay consistent.
> >> Yeah, completely agree to that. That's why I think we should not do this
> >> at all and just let Intel fix it's userspace bugs :P
> > So you're planning to submit that revert? You did jump the gun with
> > sending out that patch after all too ... (aside from it got merged
> > because of some other mixup with r-b tags and what not).
>
> Well already regretting submitting that. On the other hand what are the
> minimum IOCTLs we need to get working to fix the vainfo issue?

We have a bit more time, it's only going to be in 5.3. But yeah if we
don't bottom out on any of the options here I think revert it has to
be.

> Might be a good idea looking into reverting it partially, so that at
> least command submission and buffer allocation is still blocked.

I thought the issue is a lot more than vainfo, it's pretty much every
hacked up compositor under the sun getting this wrong one way or
another. Thinking about this some more, I also have no idea how you'd
want to deprecate rendering on primary nodes in general. Apparently
that breaks -modesetting already, and probably lots more compositors.
And it looks like we're finally achieving the goal KMS set out 10
years ago, and new compositors are sprouting up all the time. I guess
we could just break them all (on new hardware) and tell them to all
suck it up. But I don't think that's a great option. And just
deprecating this on amdgpu is going to be even harder, since then
everywhere else it'll keep working, and it's just amdgpu.ko that looks
broken.

Aside: I'm not supporting Emil's idea here because it fixes any issues
Intel has - Intel doesn't care. I support it because reality sucks,
people get this render vs. primary vs. multi-gpu prime wrong all the
time (that's also why we have hardcoded display+gpu pairs in mesa for
the various soc combinations out there), and this looks like a
pragmatic solution. It'd be nice if every compositor and everything
else would perfectly support multi gpu and only use render nodes for
rendering, and only primary nodes for display. But reality is that
people hack on stuff until gears on screen and then move on to more
interesting things (to them). So I don't think we'll ever win this :-/
-Daniel

> Christian.
>
> > -Daniel
> >
> >> Anyway my concern is really that we should stop extending functionality
> >> on the primary node.
> >>
> >> E.g. the render node is for use by the clients and the primary node for
> >> mode setting and use by the display server only.
> >>
> >> Regards,
> >> Christian.
> >>
> >>> Fwiw, this series consistently paves the way toward nuking DRM_AUTH ;-)
> >>>
> >>> 

RE: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin

2019-05-28 Thread Deng, Emily
>-Original Message-
>From: Koenig, Christian 
>Sent: Tuesday, May 28, 2019 3:04 PM
>To: Quan, Evan ; Deng, Emily
>; amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>
>Ok in this case the patch is a NAK.
>
>The correct solution is to stop using amdgpu_bo_free_kernel in
>gfx_v9_0_sw_fini.
So we would just leak the memory here and not destroy the bo? I don't think it 
is correct.
>
>BTW: Are we using the kernel pointer somewhere? Cause that one became
>completely invalid because of patch "drm/amdgpu: pin the csb buffer on hw
>init".
>
>Christian.
>
>On 28.05.19 at 03:42, Quan, Evan wrote:
>> The original unpin in hw_fini was introduced by
>> https://lists.freedesktop.org/archives/amd-gfx/2018-July/023681.html
>>
>> Evan
>>> -Original Message-
>>> From: amd-gfx  On Behalf Of
>>> Christian König
>>> Sent: Monday, May 27, 2019 7:02 PM
>>> To: Deng, Emily ; amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>>>
>>> On 27.05.19 at 10:41, Emily Deng wrote:
 As it will destroy clear_state_obj, and also will unpin it in the
 gfx_v9_0_sw_fini, so don't need to call csb_vram unpin in
 gfx_v9_0_hw_fini, or it will have unpin warning.

 v2: For suspend, still need to do unpin

 Signed-off-by: Emily Deng 
 ---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
 b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
 index 5eb70e8..5b1ff48 100644
 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
 +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
 @@ -3395,7 +3395,8 @@ static int gfx_v9_0_hw_fini(void *handle)
gfx_v9_0_cp_enable(adev, false);
adev->gfx.rlc.funcs->stop(adev);

 -  gfx_v9_0_csb_vram_unpin(adev);
 +  if (adev->in_suspend)
 +  gfx_v9_0_csb_vram_unpin(adev);
>>> That doesn't look like a good idea to me.
>>>
>>> Why do we have unpin in both the sw_fini and the hw_fini code
>>> paths?
>>>
>>> Regards,
>>> Christian.
>>>
return 0;
}


Re: [PATCH] drm/amdgpu: enable PCIE atomics ops support

2019-05-28 Thread Christian König

On 27.05.19 at 20:23, Kuehling, Felix wrote:

On 2019-05-27 7:51 a.m., Christian König wrote:

That idea sounds sane to me as well.

By the way, do we somewhere signal to userspace if atomics are
supported or not?

Yes. KFD topology (a flag in the iolink) provides that information to
user mode.


Yeah, but we are enabling this for the render node as well.

Question is now if we have that in the info IOCTL or not? I strongly 
suspect that's not present there.


Regards,
Christian.



Regards,
    Felix



I mean would be nice to keep the state inside adev if this fails for
some reason.

Christian.

On 27.05.19 at 13:16, Zhang, Hawking wrote:

How about put pci_enable_atomic_ops_to_root ahead of
amdgpu_device_ip_early_init, while move pci_atomic_requested from kfd
device to kgd device ? In such way, we can avoid duplicate atomic
request from both amdgpu and amdkfd.

Regards,
Hawking
-Original Message-
From: amd-gfx  On Behalf Of
Xiao, Jack
Sent: May 27, 2019 18:19
To: amd-gfx@lists.freedesktop.org
Cc: Xiao, Jack 
Subject: [PATCH] drm/amdgpu: enable PCIE atomics ops support

GPU atomics operation depends on PCIE atomics support.
Always enable PCIE atomics ops support in case that it hasn't been
enabled.

Signed-off-by: Jack Xiao 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
   1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bdd1fe73..a2c6064 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2562,6 +2562,13 @@ int amdgpu_device_init(struct amdgpu_device
*adev,
   if (adev->rio_mem == NULL)
   DRM_INFO("PCI I/O BAR is not found.\n");
   +    /* enable PCIE atomic ops */
+    r = pci_enable_atomic_ops_to_root(adev->pdev,
+    PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+    PCI_EXP_DEVCAP2_ATOMIC_COMP64);
+    if (r)
+    DRM_INFO("PCIE atomic ops is not supported\n");
+
   amdgpu_device_get_pcie_info(adev);
     /* early init functions */
--
1.9.1
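Hawking's and Christian's suggestion above (request the PCIe atomics once early in device init and cache the outcome in the device struct so amdgpu and amdkfd don't issue duplicate requests) could be sketched roughly as follows. This is a userspace toy model: `enable_pcie_atomics`, the `pci_atomic_requested` field, and the stand-in for `pci_enable_atomic_ops_to_root()` are illustrative names, not the real driver API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device struct caching whether the root port granted atomics. */
struct toy_amdgpu_device {
    bool pci_atomic_requested;
};

/* Stand-in for pci_enable_atomic_ops_to_root(); returns 0 on success,
 * a negative value when the root port does not support atomics. */
static int fake_enable_atomic_ops_to_root(bool port_supports_atomics)
{
    return port_supports_atomics ? 0 : -1;
}

/* Request atomics once during early init and remember the outcome,
 * so later consumers (e.g. KFD topology) can just read the cached flag. */
static void enable_pcie_atomics(struct toy_amdgpu_device *adev,
                                bool port_supports_atomics)
{
    adev->pci_atomic_requested =
        (fake_enable_atomic_ops_to_root(port_supports_atomics) == 0);
}
```

The point of the cached flag is that a failed request is not fatal (the driver only logs it), but the state still needs to survive so user mode can be told whether atomics work.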

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 3.0

2019-05-28 Thread Christian König

Wow, really good catch!

The underlying problem is most likely that the VCE block is either power- or 
clock-gated, and because of this the readptr read always returns zero.


Now amdgpu_ring_alloc() informs the power management code that the block 
is about to be used and so the gating is turned off.


Mhm, that is probably wrong in a whole bunch of other places; at least 
the UVD and VCN code comes to mind.


I agree with Leo that you should remove the original read (so as not to 
read twice) and it would be really nice if you could double-check the 
other code (UVD/VCN) for similar problems as well.


Regards,
Christian.
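The failure mode described above (while the block is gated, the read pointer reads back as zero; amdgpu_ring_alloc() tells power management the block is in use, which turns the gating off) can be illustrated with a toy userspace model. All names here are made up for the sketch and are not the real amdgpu API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy ring: while gated, reading the read-pointer returns 0. */
struct toy_ring {
    bool gated;
    uint32_t rptr;
};

static uint32_t toy_get_rptr(const struct toy_ring *ring)
{
    return ring->gated ? 0 : ring->rptr;   /* gated reads return zero */
}

/* Stand-in for amdgpu_ring_alloc(): informs PM, which ungates the block. */
static int toy_ring_alloc(struct toy_ring *ring)
{
    ring->gated = false;
    return 0;
}

/* Ring test in the fixed order: alloc first, then sample rptr. */
static uint32_t toy_ring_test(struct toy_ring *ring)
{
    if (toy_ring_alloc(ring))
        return 0;
    return toy_get_rptr(ring);
}
```

Sampling rptr before toy_ring_alloc() yields the stale zero (the bug seen after S3 resume); sampling it after yields the real value, which is exactly what moving the read below amdgpu_ring_alloc() achieves.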

On 27.05.19 at 19:20, Li, Ching-shih (Louis) wrote:


I don't mean to read it twice. The solution is to perform the first read 
later. I didn't remove the original read, in order to keep the code 
difference small and simple. I guess it should work to remove the 
original read there.


*From:*Liu, Leo 
*Sent:* Tuesday, May 28, 2019 12:40 AM
*To:* Li, Ching-shih (Louis) ; S, Shirish 
; Grodzovsky, Andrey ; 
Zhang, Jerry ; Deng, Emily ; 
Deucher, Alexander 

*Cc:* amd-gfx@lists.freedesktop.org
*Subject:* Re: [PATCH] drm/amdgpu: fix ring test failure issue during 
s3 in vce 3.0


int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
{
    struct amdgpu_device *adev = ring->adev;

    uint32_t rptr = amdgpu_ring_get_rptr(ring);

    unsigned i;
    int r, timeout = adev->usec_timeout;

    /* skip ring test for sriov*/
    if (amdgpu_sriov_vf(adev))
        return 0;

    r = amdgpu_ring_alloc(ring, 16);
    if (r)
        return r;

    amdgpu_ring_write(ring, VCE_CMD_END);
    amdgpu_ring_commit(ring);

Above is the original code, where rptr is read when the function is called, 
and below is your patch. My question is: why do you need to get rptr twice?


@@ -1084,6 +1084,8 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
     if (r)
    return r;
 
+   rptr = amdgpu_ring_get_rptr(ring);
+
    amdgpu_ring_write(ring, VCE_CMD_END);
    amdgpu_ring_commit(ring);


On 5/27/19 12:22 PM, Li, Ching-shih (Louis) wrote:

Hi Leo,

Yes, I confirm it is the root cause *for the Chrome S3 issue*.
Whenever the system is resumed, the original read always returns
zero. However, I have no idea why it fails, and I haven't verified
this problem on a CRB or any other Linux platform yet.

I think the ideal solution would be an indicator, e.g. a register,
for the driver to check whether the related firmware and hardware
are ready to work, so the driver can make sure it is OK to read
rptr. Without any reference document, I can only try to solve the
problem by modifying the driver. Debug traces reveal that only the
first rptr read fails, while the read in the check loop is fine.
Therefore, a solution comes to mind: read rptr later to get the
initial value. Tests prove it works on Chrome platforms. FYI~

BR,

Louis

*From:*Liu, Leo  
*Sent:* Monday, May 27, 2019 9:01 PM
*To:* S, Shirish  ;
Grodzovsky, Andrey 
; Zhang, Jerry
 ; Deng, Emily
 ; Deucher,
Alexander 

*Cc:* amd-gfx@lists.freedesktop.org
; Li, Ching-shih (Louis)
 
*Subject:* Re: [PATCH] drm/amdgpu: fix ring test failure issue
during s3 in vce 3.0

On 5/27/19 3:42 AM, S, Shirish wrote:

From: Louis Li  

  


[What]

vce ring test fails consistently during resume in the s3 cycle, due to
mismatched read & write pointers.

On debug/analysis it was found that the rptr to be compared is not being
correctly updated/read, which leads to this failure.

Below is the failure signature:

   [drm:amdgpu_vce_ring_test_ring] *ERROR* amdgpu: ring 12 test failed
   [drm:amdgpu_device_ip_resume_phase2] *ERROR* resume of IP block  failed -110
   [drm:amdgpu_device_resume] *ERROR* amdgpu_device_ip_resume failed (-110).

  


[How]

Fetch rptr appropriately, meaning move its read location further down
in the code flow.

With this patch applied the s3 failure is no longer seen for >5k s3 cycles,
which otherwise is pretty consistent.

  


Signed-off-by: Louis Li  


---

  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 ++

  1 file changed, 2 insertions(+)

  


diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c

index c021b11..92f9d46 100644

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c

+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c

@@ -1084,6 +1084,8 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring 
*ring)

   if (r)

    

Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin

2019-05-28 Thread Koenig, Christian
Ok in this case the patch is a NAK.

The correct solution is to stop using amdgpu_bo_free_kernel in 
gfx_v9_0_sw_fini.

BTW: Are we using the kernel pointer somewhere? Cause that one became 
completely invalid because of patch "drm/amdgpu: pin the csb buffer on 
hw init".

Christian.
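The imbalance Christian is pointing at (the CSB buffer is pinned once on hw init, but unpinned in both the hw_fini and the sw_fini paths) is what produces the unpin warning. A toy userspace model, with purely illustrative names, shows why the counts must be balanced:

```c
#include <assert.h>

/* Toy buffer object: pin_count must stay balanced. An unpin without a
 * matching pin is flagged, mirroring the kernel's unpin WARN. */
struct toy_bo {
    int pin_count;
    int warned;
};

static void toy_bo_pin(struct toy_bo *bo)
{
    bo->pin_count++;
}

static void toy_bo_unpin(struct toy_bo *bo)
{
    if (bo->pin_count == 0) {
        bo->warned = 1;   /* the kernel would WARN here */
        return;
    }
    bo->pin_count--;
}
```

One pin (hw init) matched by one unpin (hw_fini) is fine; a second unpin from the sw_fini/free path underflows the count and trips the warning, which is why only one of the two teardown paths should drop the pin.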

On 28.05.19 at 03:42, Quan, Evan wrote:
> The original unpin in hw_fini was introduced by
> https://lists.freedesktop.org/archives/amd-gfx/2018-July/023681.html
>
> Evan
>> -Original Message-
>> From: amd-gfx  On Behalf Of
>> Christian König
>> Sent: Monday, May 27, 2019 7:02 PM
>> To: Deng, Emily ; amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: Don't need to call csb_vram_unpin
>>
>> On 27.05.19 at 10:41, Emily Deng wrote:
>>> As it will destroy clear_state_obj, and also will unpin it in the
>>> gfx_v9_0_sw_fini, so don't need to call csb_vram unpin in
>>> gfx_v9_0_hw_fini, or it will have unpin warning.
>>>
>>> v2: For suspend, still need to do unpin
>>>
>>> Signed-off-by: Emily Deng 
>>> ---
>>>drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++-
>>>1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> index 5eb70e8..5b1ff48 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> @@ -3395,7 +3395,8 @@ static int gfx_v9_0_hw_fini(void *handle)
>>> gfx_v9_0_cp_enable(adev, false);
>>> adev->gfx.rlc.funcs->stop(adev);
>>>
>>> -   gfx_v9_0_csb_vram_unpin(adev);
>>> +   if (adev->in_suspend)
>>> +   gfx_v9_0_csb_vram_unpin(adev);
>> That doesn't look like a good idea to me.
>>
>> Why do we have unpin in both the sw_fini and the hw_fini code paths?
>>
>> Regards,
>> Christian.
>>
>>> return 0;
>>>}


Re: [PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

2019-05-28 Thread Koenig, Christian
On 27.05.19 at 17:26, Daniel Vetter wrote:
> On Mon, May 27, 2019 at 3:42 PM Koenig, Christian
>  wrote:
>> On 27.05.19 at 15:26, Emil Velikov wrote:
>>> On 2019/05/27, Daniel Vetter wrote:
 On Mon, May 27, 2019 at 10:47:39AM +, Koenig, Christian wrote:
> On 27.05.19 at 10:17, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> Currently one can circumvent DRM_AUTH, when the ioctl is exposed via the
>> render node. A seemingly deliberate design decision.
>>
>> Hence we can drop the DRM_AUTH all together (details in follow-up patch)
>> yet not all userspace checks if it's authenticated, but instead uses
>> uncommon assumptions.
>>
>> After days of digging through git log and testing, only a single (ab)use
>> was spotted - the Mesa RADV driver, using the AMDGPU_INFO ioctl and
>> assuming that failure implies lack of authentication.
>>
>> Affected versions are:
>> - the whole 18.2.x series, which is EOL
>> - the whole 18.3.x series, which is EOL
>> - the 19.0.x series, prior to 19.0.4
> Well you could have saved your time, cause this is still a NAK.
>
> For the record: I strongly think that we don't want to expose any render
> functionality on the primary node.
>
> To even go a step further I would say that at least on AMD hardware we
> should completely disable DRI2 for one of the next generations.
>
> As a follow up I would then completely disallow the DRM authentication
> for amdgpu, so that the command submission interface on the primary node
> can only be used by the display server.
 So amdgpu is running in one direction, while everyone else is running in
 the other direction? Please note that your patch to change i915 landed
 already, so that ship is sailing (but we could ofc revert that back
 again).

 Imo really not a great idea if we do a amdgpu vs. everyone else split
 here. If we want to deprecate dri2/flink/rendering on primary nodes across
 the stack, then that should be a stack wide decision, not an amdgpu one.

>>> Cannot agree more - I would love to see drivers stay consistent.
>> Yeah, completely agree to that. That's why I think we should not do this
>> at all and just let Intel fix it's userspace bugs :P
> So you're planning to submit that revert? You did jump the gun with
> sending out that patch after all too ... (aside from it got merged
> because of some other mixup with r-b tags and what not).

Well already regretting submitting that. On the other hand what are the 
minimum IOCTLs we need to get working to fix the vainfo issue?

Might be a good idea looking into reverting it partially, so that at 
least command submission and buffer allocation is still blocked.

Christian.

> -Daniel
>
>> Anyway my concern is really that we should stop extending functionality
>> on the primary node.
>>
>> E.g. the render node is for use by the clients and the primary node for
>> mode setting and use by the display server only.
>>
>> Regards,
>> Christian.
>>
>>> Fwiw, this series consistently paves the way toward nuking DRM_AUTH ;-)
>>>
>>> Thanks
>>> Emil
>
