Re: [PATCH] drm/amdgpu: For the virtual_display feature, the vblank_get_counter hook always returns 0 when there's no hardware frame counter that can be used.

2016-08-16 Thread Michel Dänzer
On 16/08/16 07:15 PM, Emily Deng wrote:
> Signed-off-by: Emily Deng 

Please change the shortlog to be no longer than ~72 characters. Maybe
something like this for the commit log:

drm/amdgpu: Hardcode virtual DCE vblank / scanout position return values

By hardcoding 0 for the vblank counter and -EINVAL for the scanout
position return value, we signal to the core DRM code that there are no
hardware counters we can use for these.


> diff --git a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> index 2ce5f90..85f14a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> +++ b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> @@ -55,10 +55,7 @@ static void dce_virtual_vblank_wait(struct amdgpu_device *adev, int crtc)
>  
>  static u32 dce_virtual_vblank_get_counter(struct amdgpu_device *adev, int crtc)
>  {
> -	if (crtc >= adev->mode_info.num_crtc)
> -		return 0;
> -	else
> -		return adev->ddev->vblank[crtc].count;
> +	return 0;
>  }
>  
>  static void dce_virtual_page_flip(struct amdgpu_device *adev,
> @@ -70,13 +67,10 @@ static void dce_virtual_page_flip(struct amdgpu_device *adev,
>  static int dce_virtual_crtc_get_scanoutpos(struct amdgpu_device *adev, int crtc,
>  					   u32 *vbl, u32 *position)
>  {
> -	if ((crtc < 0) || (crtc >= adev->mode_info.num_crtc))
> -		return -EINVAL;
> -
>  	*vbl = 0;
>  	*position = 0;
>  
> -	return 0;
> +	return -EINVAL;
>  }

Would it be possible to add short-circuits for the virtual display case
in amdgpu_get_crtc_scanoutpos and amdgpu_get_vblank_counter_kms, so they
don't do any unnecessary work, and remove dce_virtual_vblank_get_counter
and dce_virtual_crtc_get_scanoutpos?
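
Something like this is what I have in mind for the vblank counter path
(untested sketch; the enable_virtual_display flag name is from memory and
may differ on your branch, and amdgpu_get_crtc_scanoutpos would get a
similar early return of -EINVAL):

u32 amdgpu_get_vblank_counter_kms(struct drm_device *dev, unsigned int pipe)
{
	struct amdgpu_device *adev = dev->dev_private;

	/* Virtual DCE has no hardware frame counter; returning 0 right away
	 * lets the DRM core fall back to software vblank counting without
	 * calling into the display IP code. */
	if (adev->enable_virtual_display)
		return 0;

	/* ... existing hardware counter query ... */
}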


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


RE: Random short freezes due to TTM buffer migrations

2016-08-16 Thread Zhou, David(ChunMing)
Add his email.

> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Zhou, David(ChunMing)
> Sent: Wednesday, August 17, 2016 9:57 AM
> To: Kuehling, Felix ; Christian König
> ; Marek Olšák ;
> amd-gfx@lists.freedesktop.org
> Subject: RE: Random short freezes due to TTM buffer migrations
> 
> +David Mao,
> 
> Well, our Vulkan stack also encountered this problem before; the
> performance is very low when migration happens often. At the time, we
> wanted to add some algorithm for the eviction LRU, but failed to find
> an appropriate generic way, so the UMD reduced some of its VRAM usage
> instead.
> Hope we can get a solution for full VRAM usage this time.
> 
> Regards,
> David Zhou
> 
> > -Original Message-
> > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> > Of Felix Kuehling
> > Sent: Wednesday, August 17, 2016 2:34 AM
> > To: Christian König ; Marek Olšák
> > ; amd-gfx@lists.freedesktop.org
> > Subject: Re: Random short freezes due to TTM buffer migrations
> >
> > Very nice. I'm looking forward to this for KFD as well.
> >
> > One question: Will it be possible to share these split BOs as dmabufs?
> >
> > Regards,
> >   Felix
> >
> >
> > On 16-08-16 11:27 AM, Christian König wrote:
> > > Hi Marek,
> > >
> > > I'm already working on this.
> > >
> > > My current approach is to use a custom BO manager for VRAM with TTM
> > > and so split allocations into chunks of 4MB.
> > >
> > > Large BOs are still swapped out as one, but it makes it much more
> > > likely that you can allocate 1/2 of VRAM as one buffer.
> > >
> > > Give me till the end of the week to finish this and then we can test
> > > if that's sufficient or if we need to do more.
> > >
> > > Regards,
> > > Christian.
> > >
> > > Am 16.08.2016 um 16:33 schrieb Marek Olšák:
> > >> Hi,
> > >>
> > >> I'm seeing random temporary freezes (up to 2 seconds) under memory
> > >> pressure. Before I describe the exact circumstances, I'd like to
> > >> say that this is a serious issue affecting playability of certain
> > >> AAA Linux games.
> > >>
> > >> In order to reproduce this, an application should:
> > >> - allocate a few very large buffers (256-512 MB per buffer)
> > >> - allocate more memory than there is available VRAM. The issue also
> > >> occurs (but at a lower frequency) if the app needs only 80% of VRAM.
> > >>
> > >> Example: ttm_bo_validate needs to migrate a 512 MB buffer. The
> > >> total size of moved memory for that call can be as high as 1.5 GB.
> > >> This is always followed by a big temporary drop in VRAM usage.
> > >>
> > >> The game I'm testing needs 3.4 GB of VRAM.
> > >>
> > >> Setups:
> > >> Tonga - 2 GB: It's nearly unplayable, because freezes occur too often.
> > >> Fiji - 4 GB: There is one freeze at the beginning (which is
> > >> annoying too), after that it's smooth.
> > >>
> > >> So even 4 GB is not enough.
> > >>
> > >> Workarounds:
> > >> - Split buffers into smaller pieces in the kernel. It's not
> > >> necessary to manage memory at page granularity (64KB). Splitting
> > >> buffers into 16MB-large pieces might not be optimal but it would be
> > >> a significant improvement.
> > >> - Or do the same in Mesa. This would prevent inter-process and
> > >> inter-API buffer sharing for split buffers (DRI, OpenCL), but we
> > >> would at least verify how much the situation improves.
> > >>
> > >> Other issues sharing the same cause:
> > >> - Allocations requesting 1/3 or more VRAM have a high chance of
> > >> failing. It's generally not possible to allocate 1/2 or more VRAM
> > >> as one buffer.
> > >>
> > >> Comments welcome,
> > >>
> > >> Marek


Re: Reverted another change to fix buffer move hangs (was Re: [PATCH] drm/ttm: partial revert "cleanup ttm_tt_(unbind|destroy)" v2)

2016-08-16 Thread Felix Kuehling
Thank you. Sorry, I already pushed it with Alex's R-B, without yours.


On 16-08-16 03:53 AM, Christian König wrote:
> Am 15.08.2016 um 23:03 schrieb Alex Deucher:
>> On Mon, Aug 15, 2016 at 3:06 PM, Felix Kuehling
>>  wrote:
>>> Patch against current amd-staging-4.6 is attached.
>> Reviewed-by: Alex Deucher 
>
> Reviewed-by: Christian König .
>
>>
>>> Regards,
>>>Felix
>>>
>>>
>>> On 16-08-13 05:25 AM, Christian König wrote:
 Am 13.08.2016 um 01:22 schrieb Felix Kuehling:
> [CC Kent FYI]
>
> On 16-08-11 04:31 PM, Deucher, Alexander wrote:
>>> -Original Message-
>>> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On
>>> Behalf
>>> Of Felix Kuehling
>>> Sent: Thursday, August 11, 2016 3:52 PM
>>> To: Michel Dänzer; Christian König
>>> Cc: amd-gfx@lists.freedesktop.org
>>> Subject: Reverted another change to fix buffer move hangs (was Re:
>>> [PATCH] drm/ttm: partial revert "cleanup
>>> ttm_tt_(unbind|destroy)" v2)
>>>
>>> We had to revert another change on the KFD branch to fix a buffer
>>> move problem: 8b6b79f43801f00ddcdc10a4d5719eba4b2e32aa (drm/amdgpu:
>>> group BOs by log2 of the size on the LRU v2)
>> That makes sense.  I think you may want a different LRU scheme for
>> KFD or at least special handling for KFD buffers.
> [FK] But I think the patch shouldn't cause hangs, regardless.
>
> I eventually found what the problem was. The "group BOs by log2 of the
> size on the LRU v2" patch exposed a latent bug related to the GART size.
> On our KFD branch, we calculate the GART size differently, and it can
> easily go above 4GB. I think on amd-staging-4.6 the GART size can also
> go above 4GB on cards with lots of VRAM.
>
> However, the offset parameter in amdgpu_gart_bind and unbind is only
> 32-bit. With the patch, our test ended up using GART offsets beyond 4GB
> for the first time. Changing the offset parameter to uint64_t fixes the
> problem.
 Nice catch, please provide a patch to fix this.

> Our test also demonstrates a potential flaw in the log2 grouping patch:
> When a buffer of a previously unused size is added to the LRU, it gets
> added to the front of the list, rather than the tail. So an application
> that allocates a very large buffer after a bunch of smaller buffers is
> very likely to have that buffer evicted over and over again before any
> smaller buffers are considered for eviction. I believe this can result
> in thrashing of large buffers.
>
> Some other observations: When the last BO of a given size is removed
> from the LRU list, the LRU tail for that size is left "floating" in the
> middle of the LRU list. So the next BO of that size that is added will
> be added at an arbitrary position in the list. It may even end up in the
> middle of a block of pages of a different size, so a log2 grouping may
> end up being split.
 Yeah, those are more or less known issues.

 Keep in mind that we only added the grouping by log2 of the size to
 have a justification to push the TTM changes upstream for the coming
 KFD fences.

 E.g. so that we are able to have this upstream before we try to push
 on the fence code.

I will take a look at fixing those issues when I have time; it shouldn't
be too complicated to set the entries to zero when they aren't used, or
adjust other entries as well when some are added.

 Regards,
 Christian.

> Regards,
> Felix
>


underclocking support rx480

2016-08-16 Thread Jarkko Korpi
I haven't yet tried the overclocking feature, which is limited to 20% at
the command line. But please make it possible to downclock too.


RE: underclocking support rx480

2016-08-16 Thread Deucher, Alexander
You can already limit the clock levels as I described previously.

Alex

From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Jarkko Korpi
Sent: Tuesday, August 16, 2016 2:37 PM
To: amd-gfx@lists.freedesktop.org
Subject: underclocking support rx480


I haven't yet tried the overclocking feature, which is limited to 20% at
the command line. But please make it possible to downclock too.


Re: Random short freezes due to TTM buffer migrations

2016-08-16 Thread Christian König

Am 16.08.2016 um 17:56 schrieb Marek Olšák:
> On Tue, Aug 16, 2016 at 5:27 PM, Christian König
>  wrote:
>> Hi Marek,
>>
>> I'm already working on this.
>>
>> My current approach is to use a custom BO manager for VRAM with TTM and
>> so split allocations into chunks of 4MB.
>>
>> Large BOs are still swapped out as one, but it makes it much more likely
>> that you can allocate 1/2 of VRAM as one buffer.
>
> Do you mean GTT->swap migrations or VRAM->GTT?

VRAM->GTT. In other words, the BO is always either completely in VRAM or
completely in GTT space, not partially in GTT and partially in VRAM.
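
To give a rough idea of the approach (just a sketch of the idea, not the
actual manager code):

/* VRAM backing store is handed out in fixed 4MB chunks, so a large BO no
 * longer needs one contiguous hole of its full size. */
#define VRAM_CHUNK_SIZE	(4ULL << 20)	/* 4MB */

/* number of chunks backing a BO of the given byte size (rounded up) */
static inline unsigned long long bo_num_chunks(unsigned long long size)
{
	return (size + VRAM_CHUNK_SIZE - 1) / VRAM_CHUNK_SIZE;
}

Eviction and swap-out still treat the BO as one unit; only the placement
in VRAM is done per chunk.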




> In Mesa, I can at least split MSAA color and depth buffers, because:
> - I don't have to support sharing.
> - They are big.
> - I need different BO priorities for FMASK+CMASK, then the first 2
> samples in one buffer, then the next 2 samples in another buffer, etc.
> - I need to allow different locations for each of those.

That sounds like you want to split such allocations up into multiple BOs
in Mesa anyway.

The only problem is the increased overhead during command submission.

Regards,
Christian.

> Marek





Random short freezes due to TTM buffer migrations

2016-08-16 Thread Marek Olšák
Hi,

I'm seeing random temporary freezes (up to 2 seconds) under memory
pressure. Before I describe the exact circumstances, I'd like to say
that this is a serious issue affecting playability of certain AAA
Linux games.

In order to reproduce this, an application should:
- allocate a few very large buffers (256-512 MB per buffer)
- allocate more memory than there is available VRAM. The issue also
occurs (but at a lower frequency) if the app needs only 80% of VRAM.

Example: ttm_bo_validate needs to migrate a 512 MB buffer. The total
size of moved memory for that call can be as high as 1.5 GB. This is
always followed by a big temporary drop in VRAM usage.

The game I'm testing needs 3.4 GB of VRAM.

Setups:
Tonga - 2 GB: It's nearly unplayable, because freezes occur too often.
Fiji - 4 GB: There is one freeze at the beginning (which is annoying
too), after that it's smooth.

So even 4 GB is not enough.

Workarounds:
- Split buffers into smaller pieces in the kernel (see the sketch after
this list). It's not necessary to manage memory at page granularity
(64KB). Splitting buffers into 16MB pieces might not be optimal, but it
would be a significant improvement.
- Or do the same in Mesa. This would prevent inter-process and
inter-API buffer sharing for split buffers (DRI, OpenCL), but we would
at least verify how much the situation improves.
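
As a back-of-envelope sketch of the granularity argument (rough,
hypothetical numbers in standalone C, not driver code):

#include <stdio.h>

int main(void)
{
	const unsigned long long MB = 1ULL << 20;
	const unsigned long long piece = 16 * MB; /* proposed split granularity */
	const unsigned long long need = 100 * MB; /* room a new allocation needs */

	/* With 16MB pieces, eviction can stop as soon as enough pieces have
	 * been moved, instead of migrating a whole 512MB BO in one go. */
	unsigned long long pieces = (need + piece - 1) / piece;
	printf("move %llu pieces (%llu MB) instead of 512 MB\n",
	       pieces, pieces * piece / MB);
	return 0;
}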

Other issues sharing the same cause:
- Allocations requesting 1/3 or more VRAM have a high chance of
failing. It's generally not possible to allocate 1/2 or more VRAM as
one buffer.

Comments welcome,

Marek


Re: DCE wait for idle

2016-08-16 Thread Alex Deucher
On Tue, Aug 16, 2016 at 7:53 AM, StDenis, Tom  wrote:
> In these functions
>
>
> static bool dce_v11_0_is_idle(void *handle)
> {
> return true;
> }
>
> static int dce_v11_0_wait_for_idle(void *handle)
> {
> return 0;
> }
>
> Shouldn't they wait on the GUI bit of the GRBM_STATUS register?

There is no DCE state in the SRBM_STATUS registers.  GRBM_STATUS is
only the 3d/compute engine.  The GUI bit just means something in the
3d/compute block is busy.  It's not DCE status.  There are a lot of
components in the DCE engine that could be busy (grph, crtc,
transmitters, encoders, line buffers, etc.) and if a display path is
active, they are likely to be.  That said, the status of the DCE block
is generally not interesting for most things we use the idle callbacks
for.

Alex


Re: fix possible bad kref_put in amdgpu_uvd_ring_end_use

2016-08-16 Thread Christian König
NAK, we already merged a patch to avoid the fence_put() in general when 
the ring test fails.


Regards,
Christian.

Am 16.08.2016 um 08:33 schrieb Matthew Macy:

Clang identified this when I was merging up 4.8-rc1/rc2. I usually just
disable warnings as they pop up, since I treat the drivers as vendor code
and FreeBSD's default clang settings are a bit on the anal side. However,
this appears to be a legitimate bug. I pointed the problem out on
#dri-devel and was asked to send a patch here.

I haven't submitted patches before, but this is a trivial fix so bear with me.

 From 89ea7621c52ff9d3b6e48fa315609a042f2f5e0d Mon Sep 17 00:00:00 2001
From: Matt Macy 
Date: Mon, 15 Aug 2016 23:22:49 -0700
Subject: [PATCH] drm/amdgpu: fix possible kref_put on random stack value

If amdgpu_uvd_get_create_msg fails, fence_put(fence) will be called with
fence uninitialized, possibly leading to kref_put being called on whatever
value happens to be on the stack. Initializing fence to NULL precludes
this.
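
Roughly, the error path in question looks like this (simplified sketch,
not the exact upstream code):

	struct fence *fence = NULL;	/* previously uninitialized */
	long r;

	r = amdgpu_uvd_get_create_msg(ring, 1, NULL);
	if (r)
		goto error;	/* fence has not been assigned yet */
	/* ... */
error:
	fence_put(fence);	/* fence_put(NULL) is a harmless no-op */
	return r;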

Signed-off-by: Matt Macy mm...@nextbsd.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index b11f4e8..59931d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -1161,7 +1161,7 @@ void amdgpu_uvd_ring_end_use(struct amdgpu_ring *ring)
  */
 int amdgpu_uvd_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 {
-	struct fence *fence;
+	struct fence *fence = NULL;
 	long r;
 
 	r = amdgpu_uvd_get_create_msg(ring, 1, NULL);


