Re: [PATCH] drm/amdgpu: For virtual_display feature, the vblank_get_counter hook is always return 0 when there's no hardware frame counter which can be used.
On 16/08/16 07:15 PM, Emily Deng wrote:
> Signed-off-by: Emily Deng

Please change the shortlog to be no longer than ~72 characters. Maybe something like this for the commit log:

    drm/amdgpu: Hardcode virtual DCE vblank / scanout position return values

    By hardcoding 0 for the vblank counter and -EINVAL for the scanout
    position return value, we signal to the core DRM code that there are
    no hardware counters we can use for these.

> diff --git a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> index 2ce5f90..85f14a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> +++ b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> @@ -55,10 +55,7 @@ static void dce_virtual_vblank_wait(struct amdgpu_device *adev, int crtc)
>
>  static u32 dce_virtual_vblank_get_counter(struct amdgpu_device *adev, int crtc)
>  {
> -	if (crtc >= adev->mode_info.num_crtc)
> -		return 0;
> -	else
> -		return adev->ddev->vblank[crtc].count;
> +	return 0;
>  }
>
>  static void dce_virtual_page_flip(struct amdgpu_device *adev,
> @@ -70,13 +67,10 @@ static void dce_virtual_page_flip(struct amdgpu_device *adev,
>  static int dce_virtual_crtc_get_scanoutpos(struct amdgpu_device *adev, int crtc,
>  					    u32 *vbl, u32 *position)
>  {
> -	if ((crtc < 0) || (crtc >= adev->mode_info.num_crtc))
> -		return -EINVAL;
> -
>  	*vbl = 0;
>  	*position = 0;
>
> -	return 0;
> +	return -EINVAL;
>  }

Would it be possible to add short-circuits for the virtual display case in amdgpu_get_crtc_scanoutpos and amdgpu_get_vblank_counter_kms, so they don't do any unnecessary work, and remove dce_virtual_vblank_get_counter and dce_virtual_crtc_get_scanoutpos?

--
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
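[Editor's note] The short-circuit Michel proposes can be sketched as a toy model: check the virtual-display flag up front so the core never calls into the stubbed DCE hook at all. The structure and function names below are illustrative only, not the real amdgpu types.

```c
#include <stdbool.h>

/* Toy stand-in for amdgpu_device; only the fields the sketch needs. */
struct toy_adev {
	bool enable_virtual_display;
	unsigned int (*hw_get_counter)(int crtc); /* DCE hook */
};

/* Hypothetical short-circuit in the KMS-level counter query: with a
 * virtual display there is no hardware frame counter, so report 0
 * immediately instead of dispatching to the per-DCE hook. */
static unsigned int toy_get_vblank_counter(struct toy_adev *adev, int crtc)
{
	if (adev->enable_virtual_display)
		return 0; /* no HW counter to read */
	return adev->hw_get_counter(crtc);
}
```

With such a check in amdgpu_get_vblank_counter_kms (and the analogous -EINVAL return in amdgpu_get_crtc_scanoutpos), the dce_virtual stubs could indeed be dropped.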
RE: Random short freezes due to TTM buffer migrations
Add his email.

> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Zhou, David(ChunMing)
> Sent: Wednesday, August 17, 2016 9:57 AM
> To: Kuehling, Felix; Christian König; Marek Olšák; amd-g...@lists.freedesktop.org
> Subject: RE: Random short freezes due to TTM buffer migrations
>
> +David Mao,
>
> Well, our Vulkan stack also ran into this problem; performance is very
> low when migrations happen often. At the time we wanted to add some
> algorithm to the eviction LRU, but failed to find an appropriate generic
> way, so in the end the UMD reduced its VRAM usage. Hope we can get a
> solution for full VRAM usage this time.
>
> Regards,
> David Zhou
>
> > -----Original Message-----
> > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> > Of Felix Kuehling
> > Sent: Wednesday, August 17, 2016 2:34 AM
> > To: Christian König; Marek Olšák; amd-gfx@lists.freedesktop.org
> > Subject: Re: Random short freezes due to TTM buffer migrations
> >
> > Very nice. I'm looking forward to this for KFD as well.
> >
> > One question: Will it be possible to share these split BOs as dmabufs?
> >
> > Regards,
> >   Felix
> >
> > On 16-08-16 11:27 AM, Christian König wrote:
> > > Hi Marek,
> > >
> > > I'm already working on this.
> > >
> > > My current approach is to use a custom BO manager for VRAM with TTM
> > > and so split allocations into chunks of 4MB.
> > >
> > > Large BOs are still swapped out as one, but it makes it much more
> > > likely that you can allocate 1/2 of VRAM as one buffer.
> > >
> > > Give me till the end of the week to finish this and then we can test
> > > if that's sufficient or if we need to do more.
> > >
> > > Regards,
> > > Christian.
> > >
> > > Am 16.08.2016 um 16:33 schrieb Marek Olšák:
> > >> Hi,
> > >>
> > >> I'm seeing random temporary freezes (up to 2 seconds) under memory
> > >> pressure. Before I describe the exact circumstances, I'd like to
> > >> say that this is a serious issue affecting playability of certain
> > >> AAA Linux games.
> > >>
> > >> In order to reproduce this, an application should:
> > >> - allocate a few very large buffers (256-512 MB per buffer)
> > >> - allocate more memory than there is available VRAM. The issue also
> > >>   occurs (but at a lower frequency) if the app needs only 80% of VRAM.
> > >>
> > >> Example: ttm_bo_validate needs to migrate a 512 MB buffer. The
> > >> total size of moved memory for that call can be as high as 1.5 GB.
> > >> This is always followed by a big temporary drop in VRAM usage.
> > >>
> > >> The game I'm testing needs 3.4 GB of VRAM.
> > >>
> > >> Setups:
> > >> Tonga - 2 GB: It's nearly unplayable, because freezes occur too often.
> > >> Fiji - 4 GB: There is one freeze at the beginning (which is
> > >> annoying too), after that it's smooth.
> > >>
> > >> So even 4 GB is not enough.
> > >>
> > >> Workarounds:
> > >> - Split buffers into smaller pieces in the kernel. It's not
> > >>   necessary to manage memory at page granularity (64KB). Splitting
> > >>   buffers into 16MB-large pieces might not be optimal but it would
> > >>   be a significant improvement.
> > >> - Or do the same in Mesa. This would prevent inter-process and
> > >>   inter-API buffer sharing for split buffers (DRI, OpenCL), but we
> > >>   would at least verify how much the situation improves.
> > >>
> > >> Other issues sharing the same cause:
> > >> - Allocations requesting 1/3 or more VRAM have a high chance of
> > >>   failing. It's generally not possible to allocate 1/2 or more VRAM
> > >>   as one buffer.
> > >>
> > >> Comments welcome,
> > >>
> > >> Marek
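[Editor's note] The chunking arithmetic behind Christian's 4MB proposal (and Marek's 16MB alternative) is simple: a BO is carved into fixed-size pieces so the VRAM manager never needs one huge contiguous range. A minimal sketch, with illustrative names:

```c
#include <stdint.h>

/* Number of fixed-size chunks needed to back a BO of bo_size bytes.
 * Rounds up, so a BO that is not a multiple of the chunk size still
 * gets fully covered. Purely illustrative of the split granularity
 * discussed in the thread; not the real TTM manager code. */
static uint64_t num_chunks(uint64_t bo_size, uint64_t chunk_size)
{
	return (bo_size + chunk_size - 1) / chunk_size;
}
```

A 512 MB buffer thus becomes 128 independently placeable 4 MB pieces (or 32 pieces at 16 MB granularity), which is why allocating half of VRAM as "one" buffer becomes far more likely to succeed.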
Re: Reverted another change to fix buffer move hangs (was Re: [PATCH] drm/ttm: partial revert "cleanup ttm_tt_(unbind|destroy)" v2)
Thank you. Sorry, I already pushed it with Alex's R-B, without yours.

On 16-08-16 03:53 AM, Christian König wrote:
> Am 15.08.2016 um 23:03 schrieb Alex Deucher:
>> On Mon, Aug 15, 2016 at 3:06 PM, Felix Kuehling wrote:
>>> Patch against current amd-staging-4.6 is attached.
>> Reviewed-by: Alex Deucher
>
> Reviewed-by: Christian König
>
>>> Regards,
>>>   Felix
>>>
>>> On 16-08-13 05:25 AM, Christian König wrote:
>>>> Am 13.08.2016 um 01:22 schrieb Felix Kuehling:
>>>>> [CC Kent FYI]
>>>>>
>>>>> On 16-08-11 04:31 PM, Deucher, Alexander wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
>>>>>>> Of Felix Kuehling
>>>>>>> Sent: Thursday, August 11, 2016 3:52 PM
>>>>>>> To: Michel Dänzer; Christian König
>>>>>>> Cc: amd-gfx@lists.freedesktop.org
>>>>>>> Subject: Reverted another change to fix buffer move hangs (was Re:
>>>>>>> [PATCH] drm/ttm: partial revert "cleanup ttm_tt_(unbind|destroy)" v2)
>>>>>>>
>>>>>>> We had to revert another change on the KFD branch to fix a buffer move
>>>>>>> problem: 8b6b79f43801f00ddcdc10a4d5719eba4b2e32aa (drm/amdgpu: group
>>>>>>> BOs by log2 of the size on the LRU v2)
>>>>>> That makes sense. I think you may want a different LRU scheme for
>>>>>> KFD or at least special handling for KFD buffers.
>>>>> [FK] But I think the patch shouldn't cause hangs, regardless.
>>>>>
>>>>> I eventually found what the problem was. The "group BOs by log2 of the
>>>>> size on the LRU v2" patch exposed a latent bug related to the GART size.
>>>>> On our KFD branch, we calculate the GART size differently, and it can
>>>>> easily go above 4GB. I think on amd-staging-4.6 the GART size can also
>>>>> go above 4GB on cards with lots of VRAM.
>>>>>
>>>>> However, the offset parameter in amdgpu_gart_bind and unbind is only
>>>>> 32-bit. With the patch our test ended up using GART offsets beyond 4GB
>>>>> for the first time. Changing the offset parameter to uint64_t fixes the
>>>>> problem.
>>>> Nice catch, please provide a patch to fix this.
>>>>
>>>>> Our test also demonstrates a potential flaw in the log2 grouping patch:
>>>>> When a buffer of a previously unused size is added to the LRU, it gets
>>>>> added to the front of the list, rather than the tail. So an application
>>>>> that allocates a very large buffer after a bunch of smaller buffers is
>>>>> very likely to have that buffer evicted over and over again before any
>>>>> smaller buffers are considered for eviction. I believe this can result
>>>>> in thrashing of large buffers.
>>>>>
>>>>> Some other observations: When the last BO of a given size is removed
>>>>> from the LRU list, the LRU tail for that size is left "floating" in the
>>>>> middle of the LRU list. So the next BO of that size that is added will
>>>>> be added at an arbitrary position in the list. It may even end up in the
>>>>> middle of a block of pages of a different size. So a log2 grouping may
>>>>> end up being split.
>>>> Yeah, those are more or less known issues.
>>>>
>>>> Keep in mind that we only added the grouping by log2 of the size to have
>>>> a justification to push the TTM changes upstream for the coming KFD
>>>> fences. E.g. so that we are able to have this upstream before we try to
>>>> push on the fence code.
>>>>
>>>> I will take a look at fixing those issues when I have time; it shouldn't
>>>> be too complicated to set the entries to zero when they aren't used, or
>>>> adjust other entries as well when some are added.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Regards,
>>>>>   Felix
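[Editor's note] The latent GART bug Felix describes is a classic silent-truncation problem: a 64-bit offset passed through a 32-bit parameter wraps modulo 4GB. The toy functions below (not the real amdgpu_gart_bind() signature) show how an offset just past 4GB collapses to a small value with a u32 parameter and survives intact with u64:

```c
#include <stdint.h>

/* Stand-ins for the old (32-bit offset) and fixed (64-bit offset)
 * parameter types; each just reports the offset it actually received. */
static uint64_t bind_offset_u32(uint32_t offset) { return offset; }
static uint64_t bind_offset_u64(uint64_t offset) { return offset; }
```

With a GART larger than 4GB, the first binding above the 4GB mark would thus land at the wrong GART page with the 32-bit parameter, which matches the hangs observed only once the log2-grouping patch reordered allocations.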
underclocking support rx480
I haven't tried the overclocking feature yet, which is limited to 20% on the command line. But please make it possible to downclock too.
RE: underclocking support rx480
You can already limit the clock levels as I described previously.

Alex

From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Jarkko Korpi
Sent: Tuesday, August 16, 2016 2:37 PM
To: amd-gfx@lists.freedesktop.org
Subject: underclocking support rx480

> I haven't tried the overclocking feature yet, which is limited to 20% on
> the command line. But please make it possible to downclock too.
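[Editor's note] Alex's earlier description isn't quoted here, but he is presumably referring to the powerplay sysfs interface, which on kernels of this era exposes pp_dpm_sclk and power_dpm_force_performance_level. A hedged sketch (the card path and level numbers depend on your system; check `cat pp_dpm_sclk` for the levels your GPU actually reports):

```shell
# Restrict which sclk DPM levels the driver may select (effectively an
# underclock cap). Assumes the amdgpu powerplay sysfs interface; the
# device path and level indices are examples, not universal values.
limit_sclk_levels() {
    dev="$1"     # e.g. /sys/class/drm/card0/device
    levels="$2"  # e.g. "0 1" to allow only the two lowest levels
    echo manual > "$dev/power_dpm_force_performance_level"
    echo "$levels" > "$dev/pp_dpm_sclk"
}

# Example (as root):
# limit_sclk_levels /sys/class/drm/card0/device "0 1"
```

Writing `auto` back to power_dpm_force_performance_level restores the default behavior.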
Re: Random short freezes due to TTM buffer migrations
Am 16.08.2016 um 17:56 schrieb Marek Olšák:
> On Tue, Aug 16, 2016 at 5:27 PM, Christian König wrote:
>> Hi Marek,
>>
>> I'm already working on this.
>>
>> My current approach is to use a custom BO manager for VRAM with TTM and
>> so split allocations into chunks of 4MB.
>>
>> Large BOs are still swapped out as one, but it makes it much more likely
>> that you can allocate 1/2 of VRAM as one buffer.
>
> Do you mean GTT->swap migrations or VRAM->GTT?

VRAM->GTT. In other words the BO is always either completely in VRAM or
completely in GTT space, never partially in GTT and partially in VRAM.

> In Mesa, I can at least split MSAA color and depth buffers, because:
> - I don't have to support sharing.
> - They are big.
> - I need different BO priorities for FMASK+CMASK, then the first 2
>   samples in one buffer, then the next 2 samples in another buffer, etc.
> - I need to allow different locations for each of those.

That sounds like you want to split such allocations up into multiple BOs
in Mesa anyway. The only problem is the increased overhead during command
submission.

Regards,
Christian.

> Marek
Random short freezes due to TTM buffer migrations
Hi,

I'm seeing random temporary freezes (up to 2 seconds) under memory pressure.
Before I describe the exact circumstances, I'd like to say that this is a
serious issue affecting playability of certain AAA Linux games.

In order to reproduce this, an application should:
- allocate a few very large buffers (256-512 MB per buffer)
- allocate more memory than there is available VRAM. The issue also occurs
  (but at a lower frequency) if the app needs only 80% of VRAM.

Example: ttm_bo_validate needs to migrate a 512 MB buffer. The total size of
moved memory for that call can be as high as 1.5 GB. This is always followed
by a big temporary drop in VRAM usage.

The game I'm testing needs 3.4 GB of VRAM.

Setups:
Tonga - 2 GB: It's nearly unplayable, because freezes occur too often.
Fiji - 4 GB: There is one freeze at the beginning (which is annoying too),
after that it's smooth.

So even 4 GB is not enough.

Workarounds:
- Split buffers into smaller pieces in the kernel. It's not necessary to
  manage memory at page granularity (64KB). Splitting buffers into
  16MB-large pieces might not be optimal but it would be a significant
  improvement.
- Or do the same in Mesa. This would prevent inter-process and inter-API
  buffer sharing for split buffers (DRI, OpenCL), but we would at least
  verify how much the situation improves.

Other issues sharing the same cause:
- Allocations requesting 1/3 or more VRAM have a high chance of failing.
  It's generally not possible to allocate 1/2 or more VRAM as one buffer.

Comments welcome,

Marek
Re: DCE wait for idle
On Tue, Aug 16, 2016 at 7:53 AM, StDenis, Tom wrote:
> In these functions
>
> static bool dce_v11_0_is_idle(void *handle)
> {
>         return true;
> }
>
> static int dce_v11_0_wait_for_idle(void *handle)
> {
>         return 0;
> }
>
> Shouldn't they wait on the GUI bit of the GRBM_STATUS register?

There is no DCE state in the SRBM_STATUS registers. GRBM_STATUS is only the
3d/compute engine. The GUI bit just means something in the 3d/compute block
is busy. It's not DCE status. There are a lot of components in the DCE
engine that could be busy (grph, crtc, transmitters, encoders, line buffers,
etc.) and if a display path is active, they are likely to be. That said, the
status of the DCE block is generally not interesting for most things we use
the idle callbacks for.

Alex
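[Editor's note] For blocks where a busy bit is actually meaningful, a non-trivial wait_for_idle usually follows the bounded-poll pattern below. The register read is abstracted behind a callback here, and BUSY_BIT is a hypothetical bit, not a real DCE register field; this only sketches the shape of the pattern, which the DCE stubs deliberately skip.

```c
#include <stdint.h>

#define BUSY_BIT (1u << 31) /* hypothetical busy flag */

/* Poll a status register until the busy bit clears, giving up after
 * a fixed number of reads. Returns 0 on idle, -1 (standing in for
 * -ETIMEDOUT) if the budget is exhausted. A real kernel version would
 * also delay between reads (e.g. udelay). */
static int toy_wait_for_idle(uint32_t (*read_status)(void),
			     unsigned int retries)
{
	unsigned int i;

	for (i = 0; i < retries; i++) {
		if (!(read_status() & BUSY_BIT))
			return 0;
	}
	return -1;
}
```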
Re: fix possible bad kref_put in amdgpu_uvd_ring_end_use
NAK, we already merged a patch to avoid the fence_put() in general when the
ring test fails.

Regards,
Christian.

Am 16.08.2016 um 08:33 schrieb Matthew Macy:
> Clang identified this when I was merging up 4.8-rc1/rc2. I usually just
> disable warnings as they pop up, since I treat the drivers as vendor code
> and FreeBSD's default clang settings are fairly strict. However, this
> appears to be a legitimate bug. I pointed the problem out on #dri-devel
> and was asked to send a patch here. I haven't submitted patches before,
> but this is a trivial fix so bear with me.
>
> From 89ea7621c52ff9d3b6e48fa315609a042f2f5e0d Mon Sep 17 00:00:00 2001
> From: Matt Macy
> Date: Mon, 15 Aug 2016 23:22:49 -0700
> Subject: [PATCH] drm/amdgpu: fix possible kref_put on random stack value
>
> If amdgpu_uvd_get_create_msg fails, fence_put(fence) will be called with
> fence uninitialized - possibly leading to kref_put being called on
> whatever value happens to be on the stack. Initializing fence to NULL
> precludes this.
>
> Signed-off-by: Matt Macy mm...@nextbsd.org
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index b11f4e8..59931d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -1161,7 +1161,7 @@ void amdgpu_uvd_ring_end_use(struct amdgpu_ring *ring)
>   */
>  int amdgpu_uvd_ring_test_ib(struct amdgpu_ring *ring, long timeout)
>  {
> -	struct fence *fence;
> +	struct fence *fence = NULL;
>  	long r;
>
>  	r = amdgpu_uvd_get_create_msg(ring, 1, NULL);
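[Editor's note] The hazard Matt's patch addresses is a general C pattern worth spelling out: when the function that fills in an out-pointer fails before writing it, a shared cleanup path acts on uninitialized stack contents. A toy reproduction (names are illustrative, not the real amdgpu API):

```c
#include <stddef.h>

struct toy_fence { int refs; };

/* Like the kernel's fence_put(), a NULL pointer is a safe no-op. */
static void toy_fence_put(struct toy_fence *f)
{
	if (f)
		f->refs--;
}

static struct toy_fence global_fence = { 1 };

/* On failure *out is deliberately left untouched, mirroring
 * amdgpu_uvd_get_create_msg's early-error behavior. */
static int toy_create_msg(int fail, struct toy_fence **out)
{
	if (fail)
		return -22; /* -EINVAL */
	*out = &global_fence;
	return 0;
}

static int toy_ring_test(int fail)
{
	struct toy_fence *fence = NULL; /* the fix: never stack garbage */
	int r = toy_create_msg(fail, &fence);

	/* Shared cleanup path: safe on the error path only because
	 * fence was initialized to NULL above. */
	toy_fence_put(fence);
	return r;
}
```

Without the `= NULL`, the error path would hand whatever bytes happened to be on the stack to the put/kref_put machinery, which is exactly the bug Clang flagged.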