答复: 答复: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

2017-02-08 Thread Liu, Monk
that's because some customer modified QEMU and trap each VF's memory access, 
which cost too much time if using CPU access FB,


emmm, I guess we have no better choice by far,  do as you suggested,


to kill risk the negative effect that we may have no console if ib test failed, 
I suggest we don't block driver proceeding after ib test failure detected , 
doable ?


发件人: Christian König <deathsim...@vodafone.de>
发送时间: 2017年2月8日 23:59:10
收件人: Liu, Monk; Michel Dänzer
抄送: amd-gfx@lists.freedesktop.org
主题: Re: 答复: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

Am 08.02.2017 um 16:53 schrieb Liu, Monk:

agreed, why not just use cpu to clear it ? is it because performance ?

Pixel Ding removed the CPU clear because "There's a failure caused by this is 
that handshaking gets timeout of SRIOV virtual function."

I can only assume that this is really adding to much delay at bootup.



发件人: Michel Dänzer <mic...@daenzer.net><mailto:mic...@daenzer.net>
发送时间: 2017年2月8日 23:52:02
收件人: Christian König; Liu, Monk
抄送: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
主题: Re: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

On 09/02/17 12:30 AM, Christian König wrote:
> The IB test make the decision if the hardware is working or not.
>
> So they should be the first commands (except for the ring tests) we send
> to the hardware.
>
> When we allocate the fb before the test we send the clear command to the
> hardware without knowing if the hardware really works or not.
>
> Not a big issue, but I think the order makes more sense that way.

I just wonder if it's worth all the trouble, just to clear the fbcon
buffer[0], if the result is that the console output is delayed, possibly
indefinitely.

Actually that change is quite beneficial because the IB tests where usually 
revealing any problem.

Not 100% sure, but when we initialize the fb later that might actually allow us 
to better track such problems down.

Going to check that,
Christian.


[0] We don't have any other hardware acceleration for fbcon, so its BO
is only accessed by the CPU and display hardware after this, and has to
be pinned in the CPU visible area of VRAM at all times anyway.


--
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: 答复: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

2017-02-08 Thread Christian König

Am 08.02.2017 um 16:53 schrieb Liu, Monk:


agreed, why not just use cpu to clear it ? is it because performance ?



Pixel Ding removed the CPU clear because "There's a failure caused by 
this is that handshaking gets timeout of SRIOV virtual function."


I can only assume that this is really adding to much delay at bootup.



*发件人:* Michel Dänzer <mic...@daenzer.net>
*发送时间:* 2017年2月8日 23:52:02
*收件人:* Christian König; Liu, Monk
*抄送:* amd-gfx@lists.freedesktop.org
*主题:* Re: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error
On 09/02/17 12:30 AM, Christian König wrote:
> The IB test make the decision if the hardware is working or not.
>
> So they should be the first commands (except for the ring tests) we send
> to the hardware.
>
> When we allocate the fb before the test we send the clear command to the
> hardware without knowing if the hardware really works or not.
>
> Not a big issue, but I think the order makes more sense that way.

I just wonder if it's worth all the trouble, just to clear the fbcon
buffer[0], if the result is that the console output is delayed, possibly
indefinitely.


Actually that change is quite beneficial because the IB tests where 
usually revealing any problem.


Not 100% sure, but when we initialize the fb later that might actually 
allow us to better track such problems down.


Going to check that,
Christian.



[0] We don't have any other hardware acceleration for fbcon, so its BO
is only accessed by the CPU and display hardware after this, and has to
be pinned in the CPU visible area of VRAM at all times anyway.


--
Earthling Michel Dänzer   | http://www.amd.com 
<http://www.amd.com>

Libre software enthusiast | Mesa and X developer



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


答复: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

2017-02-08 Thread Liu, Monk
agreed, why not just use cpu to clear it ? is it because performance ?


发件人: Michel Dänzer <mic...@daenzer.net>
发送时间: 2017年2月8日 23:52:02
收件人: Christian König; Liu, Monk
抄送: amd-gfx@lists.freedesktop.org
主题: Re: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

On 09/02/17 12:30 AM, Christian König wrote:
> The IB test make the decision if the hardware is working or not.
>
> So they should be the first commands (except for the ring tests) we send
> to the hardware.
>
> When we allocate the fb before the test we send the clear command to the
> hardware without knowing if the hardware really works or not.
>
> Not a big issue, but I think the order makes more sense that way.

I just wonder if it's worth all the trouble, just to clear the fbcon
buffer[0], if the result is that the console output is delayed, possibly
indefinitely.

[0] We don't have any other hardware acceleration for fbcon, so its BO
is only accessed by the CPU and display hardware after this, and has to
be pinned in the CPU visible area of VRAM at all times anyway.


--
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

2017-02-08 Thread Michel Dänzer
On 09/02/17 12:30 AM, Christian König wrote:
> The IB test make the decision if the hardware is working or not.
> 
> So they should be the first commands (except for the ring tests) we send
> to the hardware.
> 
> When we allocate the fb before the test we send the clear command to the
> hardware without knowing if the hardware really works or not.
> 
> Not a big issue, but I think the order makes more sense that way.

I just wonder if it's worth all the trouble, just to clear the fbcon
buffer[0], if the result is that the console output is delayed, possibly
indefinitely.

[0] We don't have any other hardware acceleration for fbcon, so its BO
is only accessed by the CPU and display hardware after this, and has to
be pinned in the CPU visible area of VRAM at all times anyway.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: 答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

2017-02-08 Thread Christian König

The IB test make the decision if the hardware is working or not.

So they should be the first commands (except for the ring tests) we send 
to the hardware.


When we allocate the fb before the test we send the clear command to the 
hardware without knowing if the hardware really works or not.


Not a big issue, but I think the order makes more sense that way.

Regards,
Christian.

Am 08.02.2017 um 16:21 schrieb Liu, Monk:


yeah, make sense

@Christian, why move fbdev_init() further down after ib test ? any 
consideration ?



*发件人:* Michel Dänzer 
*发送时间:* 2017年2月8日 23:19:23
*收件人:* Christian König; Liu, Monk
*抄送:* amd-gfx@lists.freedesktop.org
*主题:* Re: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error
On 08/02/17 07:52 PM, Christian König wrote:
> Am 08.02.2017 um 10:41 schrieb Monk Liu:
>> ib_pool init should prior to fbdev_init, otherwise
>> there will be error from amdgpu_sa_bo_new
>> (amdgpu_sa.c:323)
>>
>> fbdev_init will call ttm_validate which further call
>> amdgpu_sa_bo_new.
>>
>> Change-Id: I3a969570d443f61a44f67b0d76b3871ca5c3ea81
>> Signed-off-by: Monk Liu 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index afcae15..4169bb1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1918,14 +1918,14 @@ int amdgpu_device_init(struct amdgpu_device
>> *adev,
>>   /* Get a log2 for easy divisions. */
>>   adev->mm_stats.log2_max_MBps = ilog2(max(1u, max_MBps));
>>   -amdgpu_fbdev_init(adev);
>> -
>>   r = amdgpu_ib_pool_init(adev);
>>   if (r) {
>>   dev_err(adev->dev, "IB initialization failed (%d).\n", r);
>>   goto failed;
>>   }
>>   +amdgpu_fbdev_init(adev);
>> +
>
> As noted internally as well, please more that one more further down
> behind the call to amdgpu_ib_ring_tests().

If I understand correctly, that could result in no console output in the
worst case if there's trouble with the ring & IB tests? That doesn't
seem very good.


--
Earthling Michel Dänzer   | http://www.amd.com 


Libre software enthusiast | Mesa and X developer


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


答复: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

2017-02-08 Thread Liu, Monk
yeah, make sense

@Christian, why move fbdev_init() further down after ib test ? any 
consideration ?


发件人: Michel Dänzer 
发送时间: 2017年2月8日 23:19:23
收件人: Christian König; Liu, Monk
抄送: amd-gfx@lists.freedesktop.org
主题: Re: [PATCH] drm/amdgpu:fix amdgpu_sa_bo_new error

On 08/02/17 07:52 PM, Christian König wrote:
> Am 08.02.2017 um 10:41 schrieb Monk Liu:
>> ib_pool init should prior to fbdev_init, otherwise
>> there will be error from amdgpu_sa_bo_new
>> (amdgpu_sa.c:323)
>>
>> fbdev_init will call ttm_validate which further call
>> amdgpu_sa_bo_new.
>>
>> Change-Id: I3a969570d443f61a44f67b0d76b3871ca5c3ea81
>> Signed-off-by: Monk Liu 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index afcae15..4169bb1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1918,14 +1918,14 @@ int amdgpu_device_init(struct amdgpu_device
>> *adev,
>>   /* Get a log2 for easy divisions. */
>>   adev->mm_stats.log2_max_MBps = ilog2(max(1u, max_MBps));
>>   -amdgpu_fbdev_init(adev);
>> -
>>   r = amdgpu_ib_pool_init(adev);
>>   if (r) {
>>   dev_err(adev->dev, "IB initialization failed (%d).\n", r);
>>   goto failed;
>>   }
>>   +amdgpu_fbdev_init(adev);
>> +
>
> As noted internally as well, please more that one more further down
> behind the call to amdgpu_ib_ring_tests().

If I understand correctly, that could result in no console output in the
worst case if there's trouble with the ring & IB tests? That doesn't
seem very good.


--
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx