Re: [PATCH xserver] dri2: Invalidate DRI2 buffers for all windows with the same pixmap on swap.
On Mar 25, 2011, at 3:32 PM, Michel Dänzer wrote: On Fre, 2011-03-25 at 14:47 +0200, Ville Syrjälä wrote: On Fri, Mar 25, 2011 at 12:35:37PM +0100, ext Michel Dänzer wrote: From: Michel Dänzer daen...@vmware.com Without this, when a compositing manager unredirects a fullscreen window which uses DRI2 and page flipping, the DRI2 buffer information for the compositing manager's output window (typically the Composite Overlay Window or root window) may become stale, resulting in all kinds of hilarity. Make sense to me. Thanks, does that mean I have your Reviewed-by: ? fwiw (haven't touched that part of the code for multiple months), you can also have a ... Reviewed-by: Mario Kleiner mario.klei...@tuebingen.mpg.de Acked-by: Mario Kleiner mario.klei...@tuebingen.mpg.de ... from me. I was suffering the same problems and was just about to spend my weekend on hunting this bug in the xserver, so thanks for saving my weekend! -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: Why no activity on this (nasty X segfault) bug?
On Apr 27, 2011, at 2:45 PM, Ian Pilcher wrote: On 04/23/2011 09:25 PM, Ian Pilcher wrote: If there's a different iteration of the patch that you would like me to test, please point me to it. Otherwise, I would urge you to get this thing pushed. https://bugs.freedesktop.org/show_bug.cgi?id=35452 Anybody out there? What is needed to get this pushed? Keith was asking for testers for his refined patch, maybe cc'ing him, as i do here, helps. Original thread was this, between Michel and Keith: http://lists.x.org/archives/xorg-devel/2011-March/020716.html The refined patch of Keith that needs testing: http://www.mail-archive.com/xorg-devel@lists.x.org/msg20650.html Maybe the fix should also be merged into the 1.9 series? Currently shipping distros, e.g., Ubuntu 10.10 all suffer it. thanks, -mario Thanks! -- == == Ian Pilcher arequip...@gmail.com If you're going to shift my paradigm ... at least buy me dinner first. == == ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: Why no activity on this (nasty X segfault) bug?
On Apr 27, 2011, at 5:12 PM, Ian Pilcher wrote: On 04/27/2011 10:00 AM, Mario Kleiner wrote: Original thread was this, between Michel and Keith: http://lists.x.org/archives/xorg-devel/2011-March/020716.html The refined patch of Keith that needs testing: http://www.mail-archive.com/xorg-devel@lists.x.org/msg20650.html The two patches in the second thread are just the patch from the first thread split in two, right? No, Keith patch is more efficient. It avoids walking parts of the window tree. ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [RFC] swap event handling fixes
On Apr 29, 2011, at 11:37 PM, Jesse Barnes wrote: On Thu, 28 Apr 2011 13:27:18 -0700 Jesse Barnes jbar...@virtuousgeek.org wrote: I obviously failed to count the swap event structure size after adding and removing fields a few times, and didn't even account for padding. The end result is that clients today won't receive the sbc_lo field at all, and so will likely stuff junk into that field on the client side (or zero at best). This patchset fixes up the structure definitions, bumps the proto levels, and adds server and client handling code for it all. It should be forward and backward compatible, but as always review and testing appreciated. I think the event_type checking on the client side still needs work; the field is split now so I need to check the right one on old servers. I'll also add swap support for the new requests in case people ever want to run this stuff between big and little endian machines. Btw, I've been trying to test these fixes but I can't make the existing code break; I'm getting a nicely incrementing sbc count using the piglit event test (after adding some code to dump the swap event fields), so somehow the bits are getting through. I just don't know how yet... One thing you could try as another quick test is to use xtrace on a opengl app. I remember seeing weird results for the msc or ust fields (i believe) of decoded and printed DRI2SwapComplete events. I always thought part of my toolchain was just out of sync, but maybe it was just the result of the wrong size/padding. -mario -- Jesse Barnes, Intel Open Source Technology Center ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: xf86-video-ati page flipping fixes
On May 5, 2011, at 4:23 PM, Ville Syrjälä wrote: On Thu, May 05, 2011 at 11:46:45AM +0200, Michel Dänzer wrote: On Mit, 2011-05-04 at 23:51 +0300, Ville Syrjala wrote: I came to the conclusion that the xserver DRI2 invalidate patches that have been discussed aren't really fixing the problem. I suppose they may make the problem slightly less likely to happen, but at least for me that likelyhood is still very high. The whole mess looks like a simple driver bug to me. I think the xserver patches are still necessary, otherwise how are the cached DRI2 pPriv-buffers updated for other windows sharing the same pixmap? As the real front buffers are not handed out to clients, there isn't that much reason to update them. Making sure the fake front buffer contain a more recent snapshot of the real front would be one reason though. Could it be that your patch fixes a closely related, but different bug? During OpenGL rendering, Mesa/Gallium caches the backbuffer assignment and only queries the current assignment from the x-server if its drawable has been invalidated, which is what the x-server patches do. I'd also think you will need both patch sets. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: xf86-video-ati page flipping fixes
On May 5, 2011, at 6:06 PM, Ville Syrjälä wrote: On Thu, May 05, 2011 at 05:09:56PM +0200, Mario Kleiner wrote: On May 5, 2011, at 4:23 PM, Ville Syrjälä wrote: On Thu, May 05, 2011 at 11:46:45AM +0200, Michel Dänzer wrote: On Mit, 2011-05-04 at 23:51 +0300, Ville Syrjala wrote: I came to the conclusion that the xserver DRI2 invalidate patches that have been discussed aren't really fixing the problem. I suppose they may make the problem slightly less likely to happen, but at least for me that likelyhood is still very high. The whole mess looks like a simple driver bug to me. I think the xserver patches are still necessary, otherwise how are the cached DRI2 pPriv-buffers updated for other windows sharing the same pixmap? As the real front buffers are not handed out to clients, there isn't that much reason to update them. Making sure the fake front buffer contain a more recent snapshot of the real front would be one reason though. Could it be that your patch fixes a closely related, but different bug? During OpenGL rendering, Mesa/Gallium caches the backbuffer assignment and only queries the current assignment from the x-server if its drawable has been invalidated, which is what the x-server patches do. I'd also think you will need both patch sets. Back buffers are private so a swap for one drawable doesn't need to invalidate them for other drawables. Are you sure this is also the case even when kms page flipping is used for fullscreen drawables? The workaround i implemented in my toolkit for the page flipping problem when using fullscreen drawables on current x servers was to make sure that it always ends a session with an even number of completed page-flipping swaps, by checking 'sbc' and issuing an extra swap before closing its window if the sbc wasn't an even number. This makes sure that the same buffer objects are assigned as front- and backbuffer at the time the window closes as when the application started and opened its fullscreen window. This fixed any desktop corruption i'd usually get when running compiz with non-redirected fullscreen windows after quitting my application. I could be totally wrong, but at least the symptoms and the working workaround suggested fullscreen drawables do share their back buffers when page flipping is on. -mario -- Ville Syrjälä syrj...@sci.fi http://www.sci.fi/~syrjala/ ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 2/2] DRI2: Add error message when working around driver bug
On Oct 6, 2010, at 1:05 PM, Pauli Nieminen wrote: There isn't API that allows application atomically query for msc changes and schedule swaps. If msc changes dramatically between query and scheduling application would schedule swap to happen at wrong time. Because of API limitations driver has to make msc increment for each vblank affecting the drawable. Hi Paul, this is not a driver bug which should be reported as an error, but simply handling of what happens if a drawable is moved by the user from one display head to a different display head, which likely has a different msc count because it runs at a different refresh rate or was enabled later than the first crtc. Therefore i don't think this error printout should be added. It is arguably not perfect, but probably good enough for the most common cases. The correct solution would be to virtualize the msc counter for each drawable and keep track of crtc changes and compensate for those at each change. I have some ideas on how to do this properly, just didn't have the time to test them. best, -mario Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com --- hw/xfree86/dri2/dri2.c |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index d9b9d57..d70c115 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -860,9 +860,12 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, if (!(*ds-GetMSC)(pDraw, ust, current_msc)) pPriv-last_swap_target = 0; - if (current_msc pPriv-last_swap_target) + if (current_msc pPriv-last_swap_target) { pPriv-last_swap_target = current_msc; - + xf86DrvMsg(pScreen-myNum, X_ERROR, + [DRI2] %s: GetMSC returned swap count that is in + past. Working around driver bug.\n, __func__); + } } /* -- 1.7.0.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 1/2] DRI2: Avoid call to NULL pointer
On Oct 6, 2010, at 1:05 PM, Pauli Nieminen wrote: DDX driver may implement schedule swap without GetMSC. In that case we can't call GetMSC in DRI2SwapBuffers. I don't think this check is neccessary. Afaik if the ds-GetMSC entry point isn't defined then ds-ScheduleSwap isn't defined either, so the !ds-ScheduleSwap check at the beginning of the routine triggers and the fallback path is taken, no? -mario Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com --- hw/xfree86/dri2/dri2.c | 11 +++ 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index 228c0e0..d9b9d57 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -856,11 +856,14 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, * is moved to a crtc with a lower refresh rate, or a crtc that just * got enabled. */ - if (!(*ds-GetMSC)(pDraw, ust, current_msc)) - pPriv-last_swap_target = 0; + if (ds-GetMSC) { + if (!(*ds-GetMSC)(pDraw, ust, current_msc)) + pPriv-last_swap_target = 0; - if (current_msc pPriv-last_swap_target) - pPriv-last_swap_target = current_msc; + if (current_msc pPriv-last_swap_target) + pPriv-last_swap_target = current_msc; + + } /* * Swap target for this swap is last swap target + swap interval since -- 1.7.0.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 1/2] DRI2: Avoid call to NULL pointer
On Oct 7, 2010, at 10:06 AM, Pauli Nieminen wrote: On 07/10/10 00:42 +0200, ext Mario Kleiner wrote: On Oct 6, 2010, at 1:05 PM, Pauli Nieminen wrote: DDX driver may implement schedule swap without GetMSC. In that case we can't call GetMSC in DRI2SwapBuffers. I don't think this check is neccessary. Afaik if the ds-GetMSC entry point isn't defined then ds-ScheduleSwap isn't defined either, so the !ds-ScheduleSwap check at the beginning of the routine triggers and the fallback path is taken, no? no. It is perfectly valid to have ScheduleSwap without GetMSC. This is important for driver that can implement page flipping but doesn't have simple way to expose msc count. Ok, i'm not sure if such drivers should exist, and this snippet of code from DRI2ScreenInit... if (info-version = 4) { ds-ScheduleSwap = info-ScheduleSwap; ds-ScheduleWaitMSC = info-ScheduleWaitMSC; ds-GetMSC = info-GetMSC; cur_minor = 3; } else { cur_minor = 1; } ... implied for me that ScheduleSwap and GetMSC come in one all or nothing package. But GetMSC could be a null pointer anyway, so it doesn't hurt to protect against such configs and i think your patch does the right thing in this case. You can add a... Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de to your patch if you like. As far as i can see, such a driver would be limited to simple glXSwapBuffers() calls though if it doesn't know its own msc. It would also have to ignore the target_msc, divisor and remainder arguments to ScheduleSwap and only take swapInterval into account, as neither a glx client nor the server could provide any meaningful values for these arguments without knowing the msc. -mario -mario Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com --- hw/xfree86/dri2/dri2.c | 11 +++ 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index 228c0e0..d9b9d57 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -856,11 +856,14 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, * is moved to a crtc with a lower refresh rate, or a crtc that just * got enabled. */ - if (!(*ds-GetMSC)(pDraw, ust, current_msc)) - pPriv-last_swap_target = 0; + if (ds-GetMSC) { + if (!(*ds-GetMSC)(pDraw, ust, current_msc)) + pPriv-last_swap_target = 0; - if (current_msc pPriv-last_swap_target) - pPriv-last_swap_target = current_msc; + if (current_msc pPriv-last_swap_target) + pPriv-last_swap_target = current_msc; + + } /* * Swap target for this swap is last swap target + swap interval since -- 1.7.0.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 1/2] DRI2: Avoid call to NULL pointer
You can add a... Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de to your patch if you like. As far as i can see, such a driver would be limited to simple glXSwapBuffers() calls though if it doesn't know its own Oops, of course i meant a Reviewed-by: Mario Kleiner mario.klei...@tuebingen.mpg.de -mario msc. It would also have to ignore the target_msc, divisor and remainder arguments to ScheduleSwap and only take swapInterval into account, as neither a glx client nor the server could provide any meaningful values for these arguments without knowing the msc. -mario -mario Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com --- hw/xfree86/dri2/dri2.c | 11 +++ 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index 228c0e0..d9b9d57 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -856,11 +856,14 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, * is moved to a crtc with a lower refresh rate, or a crtc that just * got enabled. */ - if (!(*ds-GetMSC)(pDraw, ust, current_msc)) - pPriv-last_swap_target = 0; + if (ds-GetMSC) { + if (!(*ds-GetMSC)(pDraw, ust, current_msc)) + pPriv-last_swap_target = 0; - if (current_msc pPriv-last_swap_target) - pPriv-last_swap_target = current_msc; + if (current_msc pPriv-last_swap_target) + pPriv-last_swap_target = current_msc; + + } /* * Swap target for this swap is last swap target + swap interval since -- 1.7.0.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] DRI2: Expose API to set drawable swap limit.
in a couple of weeks or months, go for it: 1. We maintain a little fifo queue / list of pending swap requests for each drawable. 2. The DRI2ScheduleSwap() function just converts the incoming swapbuffers request into a little struct with all the parameters (target_msc, divisor, remainder, ...) and inserts it at the tail of that queue, after checking that swaps_pending current swaplimit. If (++swaps_pending) == 1, it kicks off a little DRI2ReallyScheduleOneSwap() function which dequeues this first request from the queue and calls into the ddx-ScheduleSwap() routine. At the end of the DRI2SwapComplete() function, when a swap really got completed, that DRI2ReallyScheduleOneSwap() function gets called and checks if more swap requests are pending in the queue and dequeues and processes the oldest request. If the queue is empty, it is a no op and the whole thing goes idle again. Maybe DRI2ReallyScheduleOneSwap() would also take care of msc jumps due to drawables being moved between crtc's etc. This can't perfectly solve delays in one swapbuffer request due to traffic jams in the command stream, but it could prevent that one misscheduled swap will trigger the followup ones to go ugly in random ways. The problems with multiple clients blocking on glXWaitForMscOML or glXWaitForSbcOML() is a separate issue which would need implementation of additional wait queues. And then, as we discussed on xds, there's the need for proper timestamping of copyswaps in the kernel, otherwise this proposal would only fix the pageflipped case, not the windowed case. thanks, -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 2/2] DRI2: Add error message when working around driver bug
On Oct 25, 2010, at 6:52 PM, Jesse Barnes wrote: On Mon, 25 Oct 2010 17:13:58 +0300 Pauli Nieminen ext-pauli.niemi...@nokia.com wrote: There isn't API that allows application atomically query for msc changes and schedule swaps. If msc changes dramatically between query and scheduling application would schedule swap to happen at wrong time. Because of API limitations driver has to make msc increment for each vblank affecting the drawable. Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com CC: Kristian Høgsberg k...@bitplanet.net --- hw/xfree86/dri2/dri2.c |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index d9b9d57..d70c115 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -860,9 +860,12 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, if (!(*ds-GetMSC)(pDraw, ust, current_msc)) pPriv-last_swap_target = 0; - if (current_msc pPriv-last_swap_target) + if (current_msc pPriv-last_swap_target) { pPriv-last_swap_target = current_msc; - + xf86DrvMsg(pScreen-myNum, X_ERROR, + [DRI2] %s: GetMSC returned swap count that is in + past. Working around driver bug.\n, __func__); + } } This one scares me a little. We added this so we could also catch drawables moving between screens with different msc bases, so this patch could cause a lot of false positives (no question that the specs could use some additions here though). Making it a debug message that only shows up with -verbose would be fine though. -- Jesse Barnes, Intel Open Source Technology Center I agree with Jesse, as a debug message at -verbose it would be fine. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 2/2] DRI2: Add error message when working around driver bug
bit?? Or even have some dedicated function that the app needs to call to acknowledge it realizes what just happened? Or a way for apps to tell the server how they want to be treated in such cases? E.g., one could have a transition counter for each drawable which increments at each crtc transition and a way for the app to query the current count and to pass the count with each blocking call, basically as a cookie? The implementation could compare counts to know if an app is aware at function call time that the crtc has changed and fail a call if the counts mismatch. c) Extend the spec with new possible error cases to watch out: Unblock such calls in some slightly ugly manner to at least avoid an application hang, e.g., just . Somehow notify the application of what happened (error return code, or deriveable from the returned timestamps/counts/intel_swap_events?), maybe print some warning to the server log? How to decide which call after a transition should be unblocked and which are valid calls again? Ideas? Thanks again for all your work on this. These are good improvements. Indeed! -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] DRI2: Expose API to set drawable swap limit.
On Oct 28, 2010, at 6:10 PM, Pauli Nieminen wrote: I think we should have some way for the drivers to back out of this gracefully or at least cover their tails. E.g., allow the ddx to set a hard upper limit for the swap_limit, in a new field max_swap_limit. Your patch could make sure that swap_limit is never set to a value higher than that. Then the intel and radeon ddx could set max_swap_limit = 1 unless users somehow (xorg.conf option? dri.conf?) opt into risky higher max_swap_limits. True. I intended DRI2SwapLimit to be called from DDX only. DDX only entry wouldn't need any checks for limit changes. If we assume that all calls to DRI2SwapLimit are anyway driver initiated then max_swap_limits would only guard driver against it self. I can add the max_swap_limit if there is real need for it. Or some part of the implementation (your patch? the ddx itself?) should output some warning in the x log if a value is selected that it can't reliably handle, so app developers aren't confused to death about weird behaviour. DDX would have to implement the option and warning. Ok, sounds good. As long as the ddx has control over the settable swap_limit, i'm perfectly fine with it. Good point. I wasn't aware of that. So something like sketched above should/could be implemented inside the ddx instead of the common server code. But there, where it can already exchange to the new front, it should work, right? yes. It would work. Good. So that's one way to do it in the ddx. Thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 2/2] DRI2: Add error message when working around driver bug
On Oct 28, 2010, at 6:02 PM, Jesse Barnes wrote: On Thu, 28 Oct 2010 18:47:09 +0300 Pauli Nieminen ext-pauli.niemi...@nokia.com wrote: Most of what you have in (b) is pretty straightfoward; even the shared drawable case shouldn't be too bad, since each X connection could have bits indicating whether the counter has been picked up after a CRTC move. One option would be adding crct id parameter to calls. glXGetMscBaseRateOML would return rate, base msc and pipe id where this msc value is valid. Now all MSC calls would take the returned pipe id as parameter. If pipe id doesn't match current crtc any more then call would fail. This would allow complex applications to pass same pipe id to different context. Negative side is that API would have to be changed to include extra parameter. Yeah that would be a good extension though; we may as well expose the fact that different display pipes exist on the system, and have corresponding MSCs. Old applications using SGI_video_sync or existing OML behavior would work like they do today (with an MSC value that may jump, which we could fix with the virtualize count), and new ones would be pipe aware. I also like option b) the most, define new spec and api instead of working around the old specs limitations. Another way of doing it, in the same spirit, would be some generation counter. Starts with 1, increments each time that something changes in the configuration which could influence the display timing and mess with the schedule the app had in mind. If the drawable changes crtc's. If the crtc changes its video mode (esp. refresh rate) or configuration (mirrored/extended desktop, synchronized to other crtc's etc.), dpms change etc. A get call could return the current count, other calls could return the count that was valid at time of their processing. E.g., intel_swap_events could code more info like generation count, crtcid. Apps could pass in the count to the msc related functions and those would fail on mismatch to the current count. One could also have one special don't care value (e.g., 0) that says I don't care about an isolated glitch, because i'm not prepared to handle this anyway. Just do something to make sure i don't hang, e.g., fall through a blocking glXWaitMscOml() call or swap the buffers immediately on glXSwapBuffersMscOML(). I'm also for exposing rather more than less information, like pipe configuration, or the ability for the app to decide what it wants, e.g., * If the drawable covers multiple crtc's, or is in mirror display mode, should one of them define when to swap and the other should show tearing, or should each of them sync its swap separately, which looks nice, but can throttle redraw rates or possibly exhaust resources if the crtc's run and largely different rates. Windows has the concept of a primary display which defines the swap timing on extended desktops. The non-primary display just shows tearing. Mac OS behaves similar, except that you don't have control over which is the primary display, and some (sometimes buggy) heuristic decides for you and gives you the fun of working around it by replugging monitors and other fun things. I like control. The current intel and radeon ddx in page flipped mode will swap each crtc separately. Tear free, but with throttled framerate, as swap completion == swap completion of the last involved crtc. This btw. is a problem for the returned timestamps and timing if the crtc's run at different refresh rates, as the app doesn't know to which crtc the swap completion timestamp belongs. And it changes over time. For blitted copy-swaps you get tearfree on the assigned crtc for a drawable and tearing on the other one. Another approach would be to define swap times in system (gettimeofday () time). Specify a swap deadline tWhen and the system tries to swap at the earliest vblank with a time tNow = tWhen. Then one doesn't have to care too much about changes in msc rates. The NV_present_video http://www.opengl.org/registry/specs/NV/ present_video.txt extension does something similar for presentatio of video buffers. My own toolkit does this as well and for user code it's a natural way to specify presentation times, especially if it has to synchronize presentation with other modalities like sound, digital i/o, eye tracking etc. My code just uses the glXGetSyncValuesOML() call to translate a user-provided system time tWhen into a corresponding target_msc for glXSwapBuffersMscOML(). When we're at defining new api (christmas time is coming, i got lots of wishes), a new swapbuffers call could also define what to do if a presentation deadline can't be met. E.g., instead of a delayed swap it could drop the swap and completely skip a bufferswap request to get presentation timing back on schedule, so skipped frame errors can't accumulate. Something like that could be interesting for
Re: [PATCH v2 5/5] DRI2: Allow DDX to validate swap_limit changes
The whole patch series looks now good to me. For all of them: Reviewed-by: Mario Kleiner mario.klei...@tuebingen.mpg.de thanks, -mario On Nov 1, 2010, at 3:22 PM, Pauli Nieminen wrote: DDX can now implement validation for swap_limit changes to prevent configurations that are not support in driver. Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com CC: Mario Kleiner mario.klei...@tuebingen.mpg.de --- hw/xfree86/dri2/dri2.c | 18 +- hw/xfree86/dri2/dri2.h | 14 ++ 2 files changed, 31 insertions(+), 1 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index 255fed0..7c6a0e2 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -102,6 +102,7 @@ typedef struct _DRI2Screen { DRI2ScheduleWaitMSCProcPtr ScheduleWaitMSC; DRI2AuthMagicProcPtrAuthMagic; DRI2ReuseBufferNotifyProcPtr ReuseBufferNotify; +DRI2SwapLimitValidateProcPtr SwapLimitValidate; HandleExposuresProcPtr HandleExposures; @@ -191,13 +192,23 @@ DRI2AllocateDrawable(DrawablePtr pDraw) return pPriv; } +static Bool +DRI2DefaultSwapLimitValidate(DrawablePtr pDraw, int swap_limit) +{ +return swap_limit == 1; +} + Bool DRI2SwapLimit(DrawablePtr pDraw, int swap_limit) { DRI2DrawablePtr pPriv = DRI2GetDrawable(pDraw); +DRI2ScreenPtr ds = pPriv-dri2_screen; if (!pPriv) return FALSE; +if (!ds-SwapLimitValidate(pDraw, swap_limit)) + return FALSE; + pPriv-swap_limit = swap_limit; /* Check throttling */ @@ -1134,8 +1145,10 @@ DRI2ScreenInit(ScreenPtr pScreen, DRI2InfoPtr info) ds-AuthMagic = info-AuthMagic; } -if (info-version = 6) +if (info-version = 6) { ds-ReuseBufferNotify = info-ReuseBufferNotify; + ds-SwapLimitValidate = info-SwapLimitValidate; +} /* * if the driver doesn't provide an AuthMagic function or the info struct @@ -1148,6 +1161,9 @@ DRI2ScreenInit(ScreenPtr pScreen, DRI2InfoPtr info) goto err_out; #endif +if (!ds-SwapLimitValidate) + ds-SwapLimitValidate = DRI2DefaultSwapLimitValidate; + /* Initialize minor if needed and set to minimum provied by DDX */ if (!dri2_minor || dri2_minor cur_minor) dri2_minor = cur_minor; diff --git a/hw/xfree86/dri2/dri2.h b/hw/xfree86/dri2/dri2.h index 3d01c55..a4bba02 100644 --- a/hw/xfree86/dri2/dri2.h +++ b/hw/xfree86/dri2/dri2.h @@ -169,6 +169,19 @@ typedef void (*DRI2InvalidateProcPtr) (DrawablePtr pDraw, void *data); /** + * DRI2 calls this hook when ever swap_limit is going to be changed. Default + * implementation for the hook only accepts one as swap_limit. If driver can + * support other swap_limits it has to implement supported limits with this + * callback. + * + * \param pDraw drawable whos swap_limit is going to be changed + * \param swap_limit new swap_limit that going to be set + * \return TRUE if limit is support, FALSE if not. + */ +typedef Bool (*DRI2SwapLimitValidateProcPtr)(DrawablePtr pDraw, + int swap_limit); + +/** * Version of the DRI2InfoRec structure defined in this header */ #define DRI2INFOREC_VERSION 6 @@ -203,6 +216,7 @@ typedef struct { /* added in version 6 */ DRI2ReuseBufferNotifyProcPtr ReuseBufferNotify; +DRI2SwapLimitValidateProcPtr SwapLimitValidate; } DRI2InfoRec, *DRI2InfoPtr; extern _X_EXPORT int DRI2EventBase; -- 1.7.0.4 * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] glx: Remove swap barrier and hyperpipe support
On 11/19/2010 07:52 PM, Adam Jackson wrote: Never implemented in any open source driver. The implementation assumed explicit DDX driver knowledge of how the client-side driver worked, since at the time the server's GL renderer was not a DRI driver. But now, it is, so any implementation of these should be done with additional DRI driver API, like the swap control extension. I assume this is only about removing the current unused implementation, not about removing support completely from the xserver, right? At some point i'd like to implement some dri2 support for at least swap groups and possibly swap barriers. With multi-head cards like the amd eyefinity stuff (6 crtc's) we will need swap groups et al. to make good use of so many heads. As the x-server's dri2/ddx part is responsible for swap scheduling in dri2, i assume there will be some need for interaction between the client and server. I was hoping to tinker with this stuff within maybe a year or so. thanks, -mario Signed-off-by: Adam Jacksona...@redhat.com --- glx/Makefile.am |1 - glx/g_disptab.h | 52 -- glx/glxcmds.c| 233 -- glx/glxcmdsswap.c|1 - glx/glxdri.c |1 - glx/glxdri2.c|1 - glx/glxdriswrast.c |1 - glx/glxext.c | 19 + glx/glxscreens.c | 27 -- glx/glxscreens.h | 18 glx/indirect_table.c |1 - glx/xfont.c |1 - hw/xquartz/GL/indirect.c |2 - hw/xwin/glx/indirect.c |2 - 14 files changed, 1 insertions(+), 359 deletions(-) delete mode 100644 glx/g_disptab.h diff --git a/glx/Makefile.am b/glx/Makefile.am index 9d9fa3c..d708872 100644 --- a/glx/Makefile.am +++ b/glx/Makefile.am @@ -68,7 +68,6 @@ libglx_la_SOURCES = \ indirect_program.c \ indirect_table.h \ indirect_texture_compression.c \ -g_disptab.h \ glxbyteorder.h \ glxcmds.c \ glxcmdsswap.c \ diff --git a/glx/g_disptab.h b/glx/g_disptab.h deleted file mode 100644 index 9b4308b..000 --- a/glx/g_disptab.h +++ /dev/null @@ -1,52 +0,0 @@ -/* DO NOT EDIT - THIS FILE IS AUTOMATICALLY GENERATED */ -#ifdef HAVE_DIX_CONFIG_H -#includedix-config.h -#endif - -#ifndef _GLX_g_disptab_h_ -#define _GLX_g_disptab_h_ -/* - * SGI FREE SOFTWARE LICENSE B (Version 2.0, Sept. 18, 2008) - * Copyright (C) 1991-2000 Silicon Graphics, Inc. All Rights Reserved. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the Software), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice including the dates of first publication and - * either this permission notice or a reference to - * http://oss.sgi.com/projects/FreeB/ - * shall be included in all copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS - * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * SILICON GRAPHICS, INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, - * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF - * OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - * Except as contained in this notice, the name of Silicon Graphics, Inc. - * shall not be used in advertising or otherwise to promote the sale, use or - * other dealings in this Software without prior written authorization from - * Silicon Graphics, Inc. - */ - -extern int __glXDisp_BindSwapBarrierSGIX(__GLXclientState *cl, GLbyte *pc); -extern int __glXDisp_QueryMaxSwapBarriersSGIX(__GLXclientState *cl, GLbyte *pc); -extern int __glXDisp_QueryHyperpipeNetworkSGIX(__GLXclientState *cl, GLbyte *pc); -extern int __glXDisp_DestroyHyperpipeConfigSGIX (__GLXclientState *cl, GLbyte *pc); -extern int __glXDisp_QueryHyperpipeConfigSGIX(__GLXclientState *cl, GLbyte *pc); -extern int __glXDisp_HyperpipeConfigSGIX(__GLXclientState *cl, GLbyte *pc); - -extern int __glXDispSwap_BindSwapBarrierSGIX(__GLXclientState *cl, GLbyte *pc); -extern int __glXDispSwap_QueryMaxSwapBarriersSGIX(__GLXclientState *cl, GLbyte *pc); -extern int __glXDispSwap_QueryHyperpipeNetworkSGIX(__GLXclientState *cl, GLbyte *pc); -extern int __glXDispSwap_DestroyHyperpipeConfigSGIX (__GLXclientState *cl, GLbyte *pc); -extern int __glXDispSwap_QueryHyperpipeConfigSGIX(__GLXclientState *cl, GLbyte *pc); -extern int __glXDispSwap_HyperpipeConfigSGIX(__GLXclientState *cl, GLbyte *pc); - -#endif
Re: Swap limit
On 12/08/2010 05:15 PM, Pauli Nieminen wrote: On 08/12/10 16:55 +0100, ext Alex Deucher wrote: One other thing that might be worth adding to DRI2 is a way for the driver to access the swap interval. If we could, then the driver could dynamically disable things like vline waits for buffer blits or do non-vsynced pageflipping more easily if the swap interval was 0. Actually I tough DRI2 was already telling SwapInteval with *swap_target = pPriv-last_swap_target + pPriv-swap_interval; If swap_target doesn't advance then driver would know to flip as fast as possible. But then I noticed a bug. First one is that we always blit with swapinterval is zero when it should be possible to flip (even tear free). Let it tear :). I think swap_interval == 0 should cause immediate swaps with tearing. All OpenGL implementations i know (Windows, MacOS classic, MacOS/X, Linux with proprietary drivers, old SGI's) interpret a swapinterval of zero as swap as soon as rendering is finished, immediately, don't sync to scanout cycle and i think that makes the most sense. It's good for benchmarking how fast your system can go if not throttled by the monitor (useful), for crazy gamers that trade visual quality for fps and for special applications that need to sometimes control swap timing themselves (like my toolkit). Code from DRI2ScheduleSwap: /* Old DDX or no swap interval, just blit */ if (!ds-ScheduleSwap || !pPriv-swap_interval) { Not a bug, just a half-done way of achieving the expected swap_interval zero ;-) [ok, maybe that is a bug]. It schedules an immediate copy-swap via blitting. Unfortunately the ddx doesn't know about the swap_interval, so it still synchronizes the execution of the blit to vsync via vline waits. That's tear-free, but it depends on the location and size of the drawable and the current position of the scanout if this will cause an immediate swap (if scanout is outside the drawables area) or a vsync'ed swap. It's a bit undefined behaviour for non-fullscreen drawables and it effectively enforces a minimum swap interval of 1 for fullscreen drawables, which is not what we want. Both the intel ddx and (soon) the ati ddx have a xorg.conf parameter SwapBuffersWait to disable vsync completely, then you'd get copy-swaps as fast as possible for swap_interval zero, but for a non-zero swap_interval you'd have some chance of tearing, so this is alos just a band-aid. If we wanted this for page-flipping we'd need a shortcut similar to this one which would bypass the vblank scheduling and call the ddx pageflip routine directly, plus an extension to the kernel's pageflip ioctl() to allow non-vsynced flips. And we need a new interface to tell the ddx that a swap should be non-vsync'ed. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: Swap limit
On Dec 13, 2010, at 11:21 AM, Michel Dänzer wrote: On Mon, 2010-12-13 at 11:35 +0200, Pauli Nieminen wrote: On 12/12/10 19:49 -0500, ext Jerome Glisse wrote: On Fri, Dec 10, 2010 at 9:40 AM, Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: On 12/08/2010 05:15 PM, Pauli Nieminen wrote: On 08/12/10 16:55 +0100, ext Alex Deucher wrote: One other thing that might be worth adding to DRI2 is a way for the driver to access the swap interval. If we could, then the driver could dynamically disable things like vline waits for buffer blits or do non-vsynced pageflipping more easily if the swap interval was 0. Actually I tough DRI2 was already telling SwapInteval with *swap_target = pPriv-last_swap_target + pPriv-swap_interval; If swap_target doesn't advance then driver would know to flip as fast as possible. But then I noticed a bug. First one is that we always blit with swapinterval is zero when it should be possible to flip (even tear free). Let it tear :). I think swap_interval == 0 should cause immediate swaps with tearing. All OpenGL implementations i know (Windows, MacOS classic, MacOS/X, Linux with proprietary drivers, old SGI's) interpret a swapinterval of zero as swap as soon as rendering is finished, immediately, don't sync to scanout cycle and i think that makes the most sense. It's good for benchmarking how fast your system can go if not throttled by the monitor (useful), for crazy gamers that trade visual quality for fps and for special applications that need to sometimes control swap timing themselves (like my toolkit). Code from DRI2ScheduleSwap: /* Old DDX or no swap interval, just blit */ if (!ds-ScheduleSwap || !pPriv-swap_interval) { Not a bug, just a half-done way of achieving the expected swap_interval zero ;-) [ok, maybe that is a bug]. It schedules an immediate copy-swap via blitting. Unfortunately the ddx doesn't know about the swap_interval, so it still synchronizes the execution of the blit to vsync via vline waits. That's tear-free, but it depends on the location and size of the drawable and the current position of the scanout if this will cause an immediate swap (if scanout is outside the drawables area) or a vsync'ed swap. It's a bit undefined behaviour for non-fullscreen drawables and it effectively enforces a minimum swap interval of 1 for fullscreen drawables, which is not what we want. Both the intel ddx and (soon) the ati ddx have a xorg.conf parameter SwapBuffersWait to disable vsync completely, then you'd get copy-swaps as fast as possible for swap_interval zero, but for a non-zero swap_interval you'd have some chance of tearing, so this is alos just a band-aid. If we wanted this for page-flipping we'd need a shortcut similar to this one which would bypass the vblank scheduling and call the ddx pageflip routine directly, plus an extension to the kernel's pageflip ioctl() to allow non-vsynced flips. And we need a new interface to tell the ddx that a swap should be non-vsync'ed. Unless i miss read hw spec all pageflip are always vsync no matter what you do. IIRC the crtc sancout address is updated during vblank from the register, so what is scanned next is what ever address was written last. If I remember correctly RADEON_CRTC_OFFSET_FLIP_CNTL had some effect on that. Correct, if that bit is 1 the flip happens on the next horizontal blank (the non-KMS radeon DRM always sets this bit for flipping). There's probably something similar for the AVIVO display engine. On Radeons you can select if flip should happen at next vblank or hblank. Alex's initial pageflip patches for Radeon used this even for vblank sync'd flips, so we know it works, but we changed to a different method for higher overall robustness. I assume other gpu's aren't different, as on os/x and windows, all gpu's will swap immediately (with tearing) when you set the swap_interval to zero. Adding this to the implementation wouldn't be difficult (at least for radeon's), it's just that the current pageflip ioctl() doesn't support an option to pass this to the kms-driver. Same with the ddx. Currently there isn't a specific interface for the x-server to communicate vsync vs. non-vsync to the ddx. I have to read the other e-mails in this thread first, esp. Pauli's, i'm a bit slow at the moment. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: Swap limit
On Dec 10, 2010, at 8:00 PM, Jesse Barnes wrote: On Fri, 10 Dec 2010 15:40:38 +0100 Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: It schedules an immediate copy-swap via blitting. Unfortunately the ddx doesn't know about the swap_interval, so it still synchronizes the execution of the blit to vsync via vline waits. That's tear-free, but it depends on the location and size of the drawable and the current position of the scanout if this will cause an immediate swap (if scanout is outside the drawables area) or a vsync'ed swap. It's a bit undefined behaviour for non-fullscreen drawables and it effectively enforces a minimum swap interval of 1 for fullscreen drawables, which is not what we want. That's actually the problem; swapbufferswait isn't *quite* equivalent to swap interval = 1, since you can potentially do multiple blits before the scanline intercepts your rect(s). But regardless, we should add an API option to allow blits to happen immediately when swap_interval = 0 both with and without tear avoidance. I think Pauli's followup covers this (tearing swap limit 0 vs tear free swap limit 0). Ja, if i understand Pauli's proposal correctly, it's non-vsync'ed flips/copies for the tearing case and some kind of limited n- buffering + dropping frames. I mostly like it, but i think clients should have control over the desired behaviour. The non-vsync'ed flips as i proposed it would be as easy as communicating swap_interval to the ddx (or the ddx querying it) and then the ddx not vsync'ing its copies or flips. I think that's the same as what Pauli proposes for the tearing case? I don't think using target_msc == current_msc is a good way of choosing this. Usually you want a flip to be delayed in this case, not a complete rendered frame to be dropped. At least i would panic if dropping frames would be suddenly the default behaviour of the ddx in case of target_msc == current_msc, unless there's extra api to control this. The reason i like this swap_interval == 0 -- tearing immedate flip is mostly because a) It's consistent with what all other os'es and non-free drivers do. As a toolkit developer i like it if the default behaviour of different platforms is somewhat consistent for a similar existing api, it makes porting and maintenance simpler. b) It can be used as a hack for userspace toolkits to allow somewhat synchronized swaps across drawables if there isn't proper api to control this. E.g., if you have two drawables on two display heads and you want to swap them simultaneously and at least somewhat tear- free, you can do glFinish on both drawables contexts, so the drawables are swap-ready, then schedule a swap_interval = 1 swap on one drawable, wait for it to complete its swap and then immediately schedule a swap_interval = 0 swap on the 2nd drawable. You will get a tear-free update of the 1st drawable and depending on latencies and how well the displays are synchronized, you may get a tear-free update on the 2nd head as well. Ideally we should have api to control this, maybe some OML_sync_control_2 extension. Ideally, i'd like to have a swapbuffers call that allows to... ...define the swap deadline by target_msc or target system time. ...vsync or non-vsync ...If we have n-buffering, what to do for each queued swap if we fall behind the target presentation time. Present the buffer, tear-free and just accumulate more delay, or drop the buffer to catch up. ...synchronize multiple drawables in their swap behaviour - the sgi_swap_group extension could handle that. For a extension of the pageflip ioctl() it would be good if we could pass some flags or more info to the kms driver. Currently we have a bitfield to pass flags to the drm's ioctl() entry point, but they aren't passed through to the kms drivers implementation of the ioctl. E.g.: - A flag to ask for non-vsynced pageflip - A flag or mask to implement synchronized flips across crtc's, to make mirror mode, swap_group support and drawables that span multiple crtc's robust. - Possible future flags to allow to implement frame-sequential stereo, e.g., to tell if you're queuing a new future frontbuffer for the left-eye or right-eye. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH xserver/dri2 2/3] DRI2WaitSbc(): Fixes for correct semantic of glXWaitForSbcOML()
Added implementation for case target_sbc == 0. In that case, the function shall schedule a wait until all pending swaps for the drawable have completed. Fix for non-blocking case. Old implementation returned random, uninitialized values for (ust,msc,sbc) if it returned immediately without scheduling a wait due to sbc = target_sbc. Now if function doesn't schedule a wait, but returns immediately, it returns the (ust,msc,sbc) of the most recently completed swap, i.e., the UST and MSC corresponding to the time when the returned current SBC was reached. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- hw/xfree86/dri2/dri2.c | 25 +++-- 1 files changed, 23 insertions(+), 2 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index baf0df8..7f40d28 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -64,6 +64,8 @@ typedef struct _DRI2Drawable { CARD64 swap_count; CARD64 target_sbc; /* -1 means no SBC wait outstanding */ CARD64 last_swap_target; /* most recently queued swap target */ +CARD64 last_swap_msc; /* msc at completion of most recent swap */ +CARD64 last_swap_ust; /* ust at completion of most recent swap */ int swap_limit; /* for N-buffering */ } DRI2DrawableRec, *DRI2DrawablePtr; @@ -142,6 +144,8 @@ DRI2CreateDrawable(DrawablePtr pDraw) pPriv-swap_interval = 1; pPriv-last_swap_target = -1; pPriv-swap_limit = 1; /* default to double buffering */ +pPriv-last_swap_msc = 0; +pPriv-last_swap_ust = 0; if (pDraw-type == DRAWABLE_WINDOW) { @@ -518,6 +522,9 @@ DRI2SwapComplete(ClientPtr client, DrawablePtr pDraw, int frame, if (swap_complete) swap_complete(client, swap_data, type, ust, frame, pPriv-swap_count); +pPriv-last_swap_msc = frame; +pPriv-last_swap_ust = ust; + DRI2WakeClient(client, pDraw, frame, tv_sec, tv_usec); } @@ -714,8 +721,22 @@ DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, CARD64 target_sbc, if (pPriv == NULL) return BadDrawable; -if (pPriv-swap_count = target_sbc) - return Success; +/* target_sbc == 0 means to block until all pending swaps are + * finished. Recalculate target_sbc to get that behaviour. + */ +if (target_sbc == 0) +target_sbc = pPriv-swap_count + pPriv-swapsPending; + +/* If current swap count already = target_sbc, + * return immediately with (ust, msc, sbc) triplet of + * most recent completed swap. + */ +if (pPriv-swap_count = target_sbc) { +*sbc = pPriv-swap_count; +*msc = pPriv-last_swap_msc; +*ust = pPriv-last_swap_ust; +return Success; +} pPriv-target_sbc = target_sbc; DRI2BlockClient(client, pDraw); -- 1.6.6 ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH xserver/dri2] Improvements to DRI2 2.2 protocol bits.
The following series of 3 patches contains some bug fixes and adds some missing functionality to the new DRI2 2.2 protocol requests for implementation of the SGI_swap_interval and OML_sync_control GLX extensions. Jesse Barnes already had a positive look at these patches, but i couldn't test any of the patches, not even if they compile, due to lack of suitable hardware. Could somebody please test these and apply them if appropriate? Thanks, mario ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH xserver/dri2] Fixes and Improvements for DRI 2.2 protocol bits
Hi, the patch series against the xserver i just sent out contains bug fixes and adds missing functionality for the new DRI2 2.2 protocol requests. It should improve the handling of the SGI_swap_interval and OML_video_sync GLX extensions. Jesse Barnes already had a positive look at the patches. Due to lack of suitable hardware i wasn't able to test any of these, not even if they compile. Could somebody please test these and apply them if appropriate? Thanks, mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Improvements to Intel DDX for implementation of DRI2 OML_sync_control extensions.
This series of 2 patches fixes issues in the Intel DDX implementation of I830DRI2ScheduleWaitMSC() and I830DRI2ScheduleSwap(). The previous implementation should mostly work in the simple glXSwapBuffers() case, although it may reduce the maximum swaprate to half the video refresh interval, e.g., max 30 fps at a refresh of 60 Hz. It won't work correctly for the glXSwapBuffersMscOML(target_msc, divisor, remainder) case if divisor or remainder are non-zero or target_msc is not 2 frames ahead of the current vblank count. It doesn't handle the (msc % divisor) == remainder case correctly. A similar problem exists for calls to glXWaitForMscOML(). These patches should fix above issues, or at least make handling more robust. Jesse Barnes already had a look at these patches, but due to lack of suitable hardware, i haven't tested them myself, not even if they compile cleanly. Could somebody please give them some good testing and merge them if appropriate? Thanks, -mario ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH xf86-video-intel 2/2] Fix handling of target_msc, divisor, remainder in I830DRI2ScheduleSwap()
The current code in I830DRI2ScheduleSwap() only schedules the correct vblank events for the case divisor == 0, i.e., the simple glXSwapBuffers() case. In a glXSwapBuffersMscOML() request, divisor can be 0, which would go wrong. This modified code should handle target_msc, divisor, remainder and the different cases defined in the OML_sync_control extension correctly for the divisor 0 case. It also tries to make sure that the effective framecount of swap satisfies all constraints, taking the 1 frame delay in pageflipping mode and possible delays in blitting/exchange mode due to DRM_VBLANK_NEXTONMISS into account. The swap_interval logic in the X-Servers DRI2SwapBuffers() call expects the returned swap_target from the DDX to be reasonably accurate, otherwise implementation of swap_interval for the glXSwapBuffers() as defined in the SGI_swap_interval extension may become unreliable. For non-pageflipped mode, the returned swap_target is always correct due to the adjustments done by drmWaitVBlank(), as DRM_VBLANK_NEXTONMISS is set. In pageflipped mode, DRM_VBLANK_NEXTONMISS can't be used without severe impact on performance, so the code in I830DRI2ScheduleSwap() must make manual adjustments to the returned vbl.reply.sequence number. This patch adds the needed adjustments. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/i830_dri.c | 110 +++ 1 files changed, 70 insertions(+), 40 deletions(-) diff --git a/src/i830_dri.c b/src/i830_dri.c index d5e085a..f3cd739 100644 --- a/src/i830_dri.c +++ b/src/i830_dri.c @@ -586,6 +586,7 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, DRI2BufferPtr front, int ret, pipe = I830DRI2DrawablePipe(draw), flip = 0; DRI2FrameEventPtr swap_info; enum DRI2FrameEventType swap_type = DRI2_SWAP; + CARD64 current_msc; swap_info = xcalloc(1, sizeof(DRI2FrameEventRec)); @@ -629,6 +630,8 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, DRI2BufferPtr front, return FALSE; } + current_msc = vbl.reply.sequence; + /* Flips need to be submitted one frame before */ if (DRI2CanFlip(draw) !intel-shadow_present intel-use_pageflipping) { @@ -638,55 +641,81 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, DRI2BufferPtr front, swap_info-type = swap_type; + /* Correct target_msc by 'flip' if swap_type == DRI2_FLIP. +* Do it early, so handling of different timing constraints +* for divisor, remainder and msc vs. target_msc works. +*/ + if (*target_msc 0) + *target_msc -= flip; + /* -* If divisor is zero, we just need to make sure target_msc passes -* before waking up the client. +* If divisor is zero, or current_msc is smaller than target_msc, +* we just need to make sure target_msc passes before initiating the swap. */ - if (divisor == 0) { - vbl.request.type = DRM_VBLANK_NEXTONMISS | - DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT; - if (pipe 0) - vbl.request.type |= DRM_VBLANK_SECONDARY; - - vbl.request.sequence = *target_msc; - vbl.request.sequence -= flip; - vbl.request.signal = (unsigned long)swap_info; - ret = drmWaitVBlank(intel-drmSubFD, vbl); - if (ret) { - xf86DrvMsg(scrn-scrnIndex, X_WARNING, - divisor 0 get vblank counter failed: %s\n, - strerror(errno)); - return FALSE; - } - - *target_msc = vbl.reply.sequence; - swap_info-frame = *target_msc; - - return TRUE; + if (divisor == 0 || current_msc *target_msc) { + vbl.request.type = DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT; + /* If non-pageflipping, but blitting/exchanging, we need to use +* DRM_VBLANK_NEXTONMISS to avoid unreliable timestamping later on. +*/ + if (flip == 0) + vbl.request.type |= DRM_VBLANK_NEXTONMISS; + if (pipe 0) + vbl.request.type |= DRM_VBLANK_SECONDARY; + + /* If target_msc already reached or passed, set it to +* current_msc to ensure we return a reasonable value back +* to the caller. This makes swap_interval logic more robust. +*/ + if (current_msc = *target_msc) + *target_msc = current_msc; + + vbl.request.sequence = *target_msc; + vbl.request.signal = (unsigned long)swap_info; + ret = drmWaitVBlank(intel-drmSubFD, vbl); + if (ret) { + xf86DrvMsg(scrn-scrnIndex, X_WARNING, + divisor 0 get vblank counter failed: %s\n
[PATCH xf86-video-intel 1/2] Fix I830DRI2ScheduleWaitMSC() to correctly handle target_msc, divisor and remainder.
Previous code only handled divisor == 0 case correctly. This should honor a given target_msc for the divisor 0 case and handle the (msc % divisor) == remainder constraint correctly. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/i830_dri.c | 19 +++ 1 files changed, 11 insertions(+), 8 deletions(-) diff --git a/src/i830_dri.c b/src/i830_dri.c index e1c1470..d5e085a 100644 --- a/src/i830_dri.c +++ b/src/i830_dri.c @@ -756,6 +756,7 @@ I830DRI2ScheduleWaitMSC(ClientPtr client, DrawablePtr draw, CARD64 target_msc, DRI2FrameEventPtr wait_info; drmVBlank vbl; int ret, pipe = I830DRI2DrawablePipe(draw); +CARD64 current_msc; /* Drawable not visible, return immediately */ if (pipe == -1) { @@ -785,11 +786,13 @@ I830DRI2ScheduleWaitMSC(ClientPtr client, DrawablePtr draw, CARD64 target_msc, return FALSE; } +current_msc = vbl.reply.sequence; + /* - * If divisor is zero, we just need to make sure target_msc passes - * before waking up the client. + * If divisor is zero, or current_msc is smaller than target_msc, + * we just need to make sure target_msc passes before waking up the client. */ -if (divisor == 0) { +if (divisor == 0 || current_msc target_msc) { vbl.request.type = DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT; if (pipe 0) vbl.request.type |= DRM_VBLANK_SECONDARY; @@ -815,16 +818,16 @@ I830DRI2ScheduleWaitMSC(ClientPtr client, DrawablePtr draw, CARD64 target_msc, if (pipe 0) vbl.request.type |= DRM_VBLANK_SECONDARY; +vbl.request.sequence = current_msc - (current_msc % divisor) + remainder; + /* - * If we have no remainder and the condition isn't satisified, it means + * If calculated remainder is larger than requested remainder, it means * we've passed the last point where seq % divisor == remainder, so we need * to wait for the next time that will happen. */ -if (((vbl.reply.sequence % divisor) != remainder) !remainder) - vbl.request.sequence += divisor; +if ((current_msc % divisor) remainder) + vbl.request.sequence += divisor; -vbl.request.sequence = vbl.reply.sequence - (vbl.reply.sequence % divisor) + - remainder; vbl.request.signal = (unsigned long)wait_info; ret = drmWaitVBlank(intel-drmSubFD, vbl); if (ret) { -- 1.6.6 ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: Improvements to Intel DDX for implementation of DRI2 OML_sync_control extensions.
They are against Jesse Barnes xf86-video-intel tree: http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/log/ git://people.freedesktop.org/~jbarnes/xf86-video-intel master branch The tree is in the same state as when i wrote the patches - they are following commit http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/commit/? id=dd1930673565e70ec247f7f96ca0bdc575eecfb0 Ideas to fix? I'm a git newbie. thanks, -mario On Mar 4, 2010, at 1:28 AM, Eric Anholt wrote: On Sun, 21 Feb 2010 18:45:46 +0100, Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: This series of 2 patches fixes issues in the Intel DDX implementation of I830DRI2ScheduleWaitMSC() and I830DRI2ScheduleSwap(). The previous implementation should mostly work in the simple glXSwapBuffers() case, although it may reduce the maximum swaprate to half the video refresh interval, e.g., max 30 fps at a refresh of 60 Hz. It won't work correctly for the glXSwapBuffersMscOML(target_msc, divisor, remainder) case if divisor or remainder are non-zero or target_msc is not 2 frames ahead of the current vblank count. It doesn't handle the (msc % divisor) == remainder case correctly. A similar problem exists for calls to glXWaitForMscOML(). These patches should fix above issues, or at least make handling more robust. Jesse Barnes already had a look at these patches, but due to lack of suitable hardware, i haven't tested them myself, not even if they compile cleanly. Could somebody please give them some good testing and merge them if appropriate? These don't seem to apply, and I can't quite work out what code they're against * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] DRI2: fixup handling of last_swap_target
Jesse, i think this patch breaks the patch series i sent out a week ago, the one we discussed in length offlist and which you reviewed and recommended for inclusion. How to proceed? Could you try to merge my patches to your tree, then reapply your slightly modified patch again? That seems to require less shuffling around of code and less time to check everything's still sane. My patch already breaks stuff up similar to yours, you'd only need to move a block of setup code fro last_swap_target out of DRI2SwapBuffers and into DRI2CreateDrawable again. -mario On Mar 4, 2010, at 6:23 PM, Jesse Barnes wrote: [Sorry for the resend, forgot to cc Keith] We need to initialize the swap target, which is passed to the driver to schedule events. Rather than using -1 to indicate that the field is uninitialized, just make sure we initialize it at drawable creation time. Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index cd69ca0..301f4fd 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -116,9 +116,12 @@ DRI2GetDrawable(DrawablePtr pDraw) int DRI2CreateDrawable(DrawablePtr pDraw) { +DRI2ScreenPtr ds = DRI2GetScreen(pDraw-pScreen); WindowPtr pWin; PixmapPtr pPixmap; DRI2DrawablePtr pPriv; +CARD64 ust; +int ret; pPriv = DRI2GetDrawable(pDraw); if (pPriv != NULL) @@ -141,7 +144,10 @@ DRI2CreateDrawable(DrawablePtr pDraw) pPriv-swap_count = 0; pPriv-target_sbc = -1; pPriv-swap_interval = 1; -pPriv-last_swap_target = -1; +ret = (*ds-GetMSC)(pDraw, ust, pPriv-last_swap_target); +if (!ret) + pPriv-last_swap_target = 0; + pPriv-swap_limit = 1; /* default to double buffering */ if (pDraw-type == DRAWABLE_WINDOW) @@ -575,7 +581,6 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, DRI2ScreenPtr ds = DRI2GetScreen(pDraw-pScreen); DRI2DrawablePtr pPriv; DRI2BufferPtr pDestBuffer = NULL, pSrcBuffer = NULL; -CARD64 ust; int ret, i; pPriv = DRI2GetDrawable(pDraw); @@ -617,27 +622,6 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, } /* - * In the simple glXSwapBuffers case, all params will be 0, and we just - * need to schedule a swap for the last swap target + the swap interval. - * If the last swap target hasn't been set yet, call into the driver - * to get the current count. - */ -if (target_msc == 0 divisor == 0 remainder == 0 - pPriv-last_swap_target 0) { - ret = (*ds-GetMSC)(pDraw, ust, target_msc); - if (!ret) { - xf86DrvMsg(pScreen-myNum, X_ERROR, - [DRI2] %s: driver failed to return current MSC\n, - __func__); - return BadDrawable; - } -} - -/* First swap needs to initialize last_swap_target */ -if (pPriv-last_swap_target 0) - pPriv-last_swap_target = target_msc; - -/* * Swap target for this swap is last swap target + swap interval since * we have to account for the current swap count, interval, and the * number of pending swaps. ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: Improvements to Intel DDX for implementation of DRI2 OML_sync_control extensions.
Is there a simple git way to switch to that tree, or would i need to checkout the whole tree again and then copy paste chunks of code from the old tree to the new one manually? -mario On Mar 4, 2010, at 4:56 PM, Jesse Barnes wrote: Oh sorry, that tree is out of date since I pushed everything to master. The tree to generate fixes against is git://git.freedesktop.org/git/xorg/driver/xf86-video-intel. Jesse On Thu, 4 Mar 2010 05:24:38 +0100 Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: They are against Jesse Barnes xf86-video-intel tree: http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/log/ git://people.freedesktop.org/~jbarnes/xf86-video-intel master branch The tree is in the same state as when i wrote the patches - they are following commit http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/commit/? id=dd1930673565e70ec247f7f96ca0bdc575eecfb0 Ideas to fix? I'm a git newbie. thanks, -mario On Mar 4, 2010, at 1:28 AM, Eric Anholt wrote: On Sun, 21 Feb 2010 18:45:46 +0100, Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: This series of 2 patches fixes issues in the Intel DDX implementation of I830DRI2ScheduleWaitMSC() and I830DRI2ScheduleSwap(). The previous implementation should mostly work in the simple glXSwapBuffers() case, although it may reduce the maximum swaprate to half the video refresh interval, e.g., max 30 fps at a refresh of 60 Hz. It won't work correctly for the glXSwapBuffersMscOML(target_msc, divisor, remainder) case if divisor or remainder are non-zero or target_msc is not 2 frames ahead of the current vblank count. It doesn't handle the (msc % divisor) == remainder case correctly. A similar problem exists for calls to glXWaitForMscOML(). These patches should fix above issues, or at least make handling more robust. Jesse Barnes already had a look at these patches, but due to lack of suitable hardware, i haven't tested them myself, not even if they compile cleanly. Could somebody please give them some good testing and merge them if appropriate? These don't seem to apply, and I can't quite work out what code they're against * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) -- Jesse Barnes, Intel Open Source Technology Center * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] DRI2: fixup handling of last_swap_target
On Mar 4, 2010, at 7:06 PM, Jesse Barnes wrote: Yeah, I'll try to apply your series on top and re-post it. Thanks. I'll just wait for updates from you then. Can i simply pull from your own xserver branch again to get a consistent view of the new situation? I'd then look at the thing again and see if anything still clashes or not. I know this one affects some of the things we discussed, since it makes offscreen drawing vs. swap interval even more broken than before (i.e. swap interval and throttling will both be broken when a drawable isn't being displayed), but on the other hand it'll prevent hangs, so that's good. Don't think the combined patches will introduce more offscreen brokenness than before, but reduce brokenness for the onscreen/non redirected cases quite a bit. I just meant that you changed lines of code very close to my changes, so my stuff will probably fail to merge and looking at it it seems that the amount of breakage/merge conflicts to fix will be much smaller if my patch is applied first, then yours on top of it instead of the other way round. -mario Thanks, Jesse On Thu, 4 Mar 2010 19:03:15 +0100 Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: Jesse, i think this patch breaks the patch series i sent out a week ago, the one we discussed in length offlist and which you reviewed and recommended for inclusion. How to proceed? Could you try to merge my patches to your tree, then reapply your slightly modified patch again? That seems to require less shuffling around of code and less time to check everything's still sane. My patch already breaks stuff up similar to yours, you'd only need to move a block of setup code fro last_swap_target out of DRI2SwapBuffers and into DRI2CreateDrawable again. -mario On Mar 4, 2010, at 6:23 PM, Jesse Barnes wrote: [Sorry for the resend, forgot to cc Keith] We need to initialize the swap target, which is passed to the driver to schedule events. Rather than using -1 to indicate that the field is uninitialized, just make sure we initialize it at drawable creation time. Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index cd69ca0..301f4fd 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -116,9 +116,12 @@ DRI2GetDrawable(DrawablePtr pDraw) int DRI2CreateDrawable(DrawablePtr pDraw) { +DRI2ScreenPtr ds = DRI2GetScreen(pDraw-pScreen); WindowPtr pWin; PixmapPtr pPixmap; DRI2DrawablePtr pPriv; +CARD64 ust; +int ret; pPriv = DRI2GetDrawable(pDraw); if (pPriv != NULL) @@ -141,7 +144,10 @@ DRI2CreateDrawable(DrawablePtr pDraw) pPriv-swap_count = 0; pPriv-target_sbc = -1; pPriv-swap_interval = 1; -pPriv-last_swap_target = -1; +ret = (*ds-GetMSC)(pDraw, ust, pPriv-last_swap_target); +if (!ret) + pPriv-last_swap_target = 0; + pPriv-swap_limit = 1; /* default to double buffering */ if (pDraw-type == DRAWABLE_WINDOW) @@ -575,7 +581,6 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, DRI2ScreenPtr ds = DRI2GetScreen(pDraw-pScreen); DRI2DrawablePtr pPriv; DRI2BufferPtr pDestBuffer = NULL, pSrcBuffer = NULL; -CARD64 ust; int ret, i; pPriv = DRI2GetDrawable(pDraw); @@ -617,27 +622,6 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, } /* - * In the simple glXSwapBuffers case, all params will be 0, and we just - * need to schedule a swap for the last swap target + the swap interval. - * If the last swap target hasn't been set yet, call into the driver - * to get the current count. - */ -if (target_msc == 0 divisor == 0 remainder == 0 - pPriv-last_swap_target 0) { - ret = (*ds-GetMSC)(pDraw, ust, target_msc); - if (!ret) { - xf86DrvMsg(pScreen-myNum, X_ERROR, - [DRI2] %s: driver failed to return current MSC\n, - __func__); - return BadDrawable; - } -} - -/* First swap needs to initialize last_swap_target */ -if (pPriv-last_swap_target 0) - pPriv-last_swap_target = target_msc; - -/* * Swap target for this swap is last swap target + swap interval since * we have to account for the current swap count, interval, and the * number of pending swaps. ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm
Re: [PATCH] DRI2: fixup handling of last_swap_target
it was blocked, otherwise there are a race conditions. thanks, -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH xf86-video-intel 2/2] Fix handling of target_msc, divisor, remainder in I830DRI2ScheduleSwap()
On Mar 5, 2010, at 10:09 PM, Jesse Barnes wrote: I just fixed these up and pushed them into the tree along with another fix for handling offscreen drawables better. Tests indicate that OML swap divisor/remainder stuff is working correctly now. Thanks for doing that. But stuff got seriously garbled in I830ScheduleWaitMSC() inside src/i830_dri.c - this won't work correctly for any glXWaitForMscOML(target_msc, divisor, remainder) call, except for the special cases with divisor == 0! This passage... /* * If divisor is zero, or current_msc is smaller than target_msc, * we just need to make sure target_msc passes before waking up the * client. */ if (divisor == 0) { ... should read as... if (divisor == 0 || current_msc target_msc) { This passage ... /* * If the calculated remainder and the condition isn't satisified, it * means we've passed the last point where seq % divisor == remainder, * so we need to wait for the next time that will happen. */ if ((current_msc % divisor) != remainder) vbl.request.sequence += divisor; ... should be replaced by ... vbl.request.sequence = current_msc - (current_msc % divisor) + remainder; /* * If calculated remainder is larger than requested remainder, it means * we've passed the last point where seq % divisor == remainder, so we need * to wait for the next time that will happen. */ if ((current_msc % divisor) remainder) vbl.request.sequence += divisor; Other than that, it's fine. Oh and in I830DRI2ScheduleSwap() , this statement ... if (pipe 0) vbl.request.type |= DRM_VBLANK_SECONDARY; ... accidentally got copied twice into the if () clause, which is not harmful, but a bit redundant :-) best, -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] DRI2: fixup handling of last_swap_target
. So its not just something totally far fetched. So i think you should have some call to DRI2Throttleclient at the beginning of DRI2ScheduleSwap() as well for more determinstic behaviour? I thought quite a bit about such effects while looking at your code. I think a really cool and more bullet-proof implementation of the OML_sync_control spec that would allow a client to render multiple frames ahead / have multiple swaps pending without getting into trouble with a few race-conditions in the current code is possible, but it would require a more dynamic scheduling of what happens when. Maybe we can discuss such refinements when your first iteration of the dri2 swap bits is finished and out there. -mario On Mar 7, 2010, at 6:10 PM, Jesse Barnes wrote: On Sun, 7 Mar 2010 08:44:51 +0100 Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: On Mar 5, 2010, at 6:50 PM, Jesse Barnes wrote: Ok pushed fixes for these issues to the repo above. I know you've put a lot of time into this already, but can you take another look to make sure I got everything? No problem. Note I left the swap interval default at 1 since GLX_MESA_swap_control should let us set it to 0 to get unthrottled behavior, which I just fixed and tested. Agreed, that's ok. GLX_MESA_swap_control is a way to do it. The current state looks almost good, esp. your fixes to DRI2SwapBuffers for swap_interval == 0 and for target_msc == 0. But a few more things: In DRI2SwapBuffers, in the body of ... /* Old DDX or no swap interval, just blit */ if (!ds-ScheduleSwap || !pPriv-swap_interval) { ... ... it calls DRI2SwapComplete and passes target_msc as the 'msc' value of swap completion. Would be probably better to pass a zero value, just as it is done in the Intel DDX when the swap doesn't happen at all at the requested target_msc, but rather immediately due to some problems. If ds-ScheduleSwap isn't available then the 'msc' is basically undefined and unknown, so zero is a better return value. In the !pPriv-swap_interval case one could call a *ds-getMSC() to get a more meaningful return value for 'msc'. Don't know if it's worth the effort, but a zero return would be better than 'target_msc' imho. Right, good idea. That way software using the new event won't get bogus values with old DDXes. We need to add another patch to Mesa. Mesa translates a glXSwapBuffers () call into a DRI2SwapBuffers() call with target_msc, divisor and remainder all zero and DRI2SwapBuffers takes this as hint that glXSwapBuffers behaviour is requested. However these are also legal arguments to a glXSwapBuffersMscOML(..., ,0,0,0) call, which has to behave differently than a glXSwapBuffers call -- In this case, DRI2SwapBuffers would confuse this with a glXSwapBuffers request and do the wrong thing. I'd propose adding this line to the __glXSwapBuffersMscOML() routine in mesa's /src/glx/x11/glxcmds.c file: if (target_msc == 0 divisor == 0 remainder == 0) remainder = 1; - if target_msc and divisor are zero, then the remainder is ignored inside the DRI implementation, so the actual value of remainder doesn't matter for swap behaviour, but can be used to disambiguate glXSwapBuffers vs. glXSwapBuffersMsc... for a glXSwapBuffersMscOML (0,0,0) call. Ok, I'll have to check the behaviors again, but this sounds reasonable. Your deferred pPriv deletion logic looks now good to me. But if deletion is deferred then i think almost all public functions inside dri2.c should return 'BadDrawable' not only for pPriv == NULL but also for pPriv-refCount == 0, e.g., DRI2SwapBuffers, DRI2GetMSC, ... From the viewpoint of client code, a refCount == 0 is the same as a destroyed drawable. For all practical purposes it is already gone. Maybe it would make sense to consolidate the (pPriv == NULL || pPriv-refCount == 0) check into a macro or function, something like DRI2IsDrawableAlive(pPriv); and then call that in most DRI2xxx functions? Yeah, that would make things cleaner, I'll do that. I believe there's also another problem with the pPriv-blockedClient logic. The implementation only keeps track of blockedClient, but not for the reason why the client blocked. This can cause trouble in DRI2WakeClient. Consider this example which i think would cause the client to hang: 1. Assume current msc is 1000 and there are 2 threads in the client. // Thread 1: Request swap at target_msc 1010: 2. glXSwapBuffersMscOML(dpy, drawable, 1010, 0, 0); // Thread 2: Request wait until target_msc 2000: 3. glXWaitForMscOML(dpu, drawable, 2000, 0, 0, ); 4. Thread 1: Calls XDestroyWindow(dpy, drawable); Step 2 will schedule a swap for msc = 1010. Step 3 will pPriv-blockedClient = client and IgnoreClient(client) the client and schedule a wakeup at msc = 2000. The xconnection dpy is now blocked. Step 4 doesn't execute yet, because the xconnection is blocked. Now at msc = 1010 - swap completes - DRI2SwapComplete - DRI2WakeClient - executes
Re: [PATCH xf86-video-intel 2/2] Fix handling of target_msc, divisor, remainder in I830DRI2ScheduleSwap()
On Mar 7, 2010, at 6:18 PM, Jesse Barnes wrote: Arg, I did botch that patch. And of course I only tested the swap buffers behavior and not OML's WaitMSC so I didn't catch it. I'll improve the test and push the fix. No problem. Just fyi: I noticed you added a test in I830DRI2ScheduleSwap() that outputs X_Warnings if a swap is scheduled more than 100 vblanks ahead. At least for the users of my toolkit, scheduling a swap more than 100 vblanks ahead wouldn't be something noteworthy but quite a typical case of normal use of the system. At the end of a work day some users would probably find ten thousands of such warnings in some log, assuming those warnings get logged at the normal log level. So this usage case is less weird than one would think :-) -mario ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] DRI2: fixup handling of last_swap_target
* Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] DRI2: fixup handling of last_swap_target
On Mar 8, 2010, at 6:07 PM, Jesse Barnes wrote: On Sun, 7 Mar 2010 21:16:21 +0100 Florian Mickler flor...@mickler.org wrote: On Sun, 7 Mar 2010 09:10:51 -0800 Jesse Barnes jbar...@virtuousgeek.org wrote: On Sun, 7 Mar 2010 08:44:51 +0100 Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: Yeah, this could also happen without OML I think, if a few swaps were queued (resulting in a block) and then a SGI_video_sync call occurred. I'll fix it up. Thanks, could this also happen with a single glxgears instance and no other 3d clients running? I don't *think* glxgears does this; it's been running fine for me at least. But you could definitely be hitting one of the MSC races; you can add some debug output to I830DRI2ScheduleSwap in src/i830_dri.c in the DDX driver to see what's going on. Presumably one of the paths there is putting the client to sleep and never waking it. The glxgears.c of mesa only does draw - glXSwapBuffers - draw - glXSwapBuffers - ... It doesn't use any of the new api's, so the only way it can block is probably via the DRI2ThrottleClient call when requesting a new backbuffer. I see at least one problematic thing. In the xserver's dri2.c file in DRI2SwapBuffers(), the pPriv-swapsPending++ statement is *after* the call into the ddx's I830DRI2ScheduleSwap() function, instead of *before* it. If a swap completes inside the ddx, it will call DRI2SwapComplete() which will do a pPriv-swapsPending-- before it was incremented to mark a swap as pending. That is, if swapsPending was zero before, it will now wrap around to 0x. This will happen if the fallback_blit path inside the ddx is used, e.g., if the drawable is not visible (yet). Not sure if this by itself is sufficient for the hang, because the call to pPriv-swapsPending++ after return from I830DRI2ScheduleSwap () should fix this, but if something else goes wrong and an error path is taken or an event not delivered, then swapsPending could get stuck at 0x or a similar high number, which would trigger DRI2ThrottleClient and block the client connection without ever waking it up again, as clients are only woken up if a swap completes, which can't happen if no swap is pending. - hang in _XReply on the client side. So we should probable move pPriv-swapsPending++ before the call to... ret = (*ds-ScheduleSwap)(client, pDraw, pDestBuffer, pSrcBuffer, swap_target, divisor, remainder, func, data); ... in DRI2SwapBuffers() and probably add a pPriv-swapsPending-- into the error handling path directly after the call to ds- ScheduleSwap. -mario ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH xf86-video-intel 2/2] Fix handling of target_msc, divisor, remainder in I830DRI2ScheduleSwap()
On Mar 8, 2010, at 8:34 PM, Jesse Barnes wrote: Ok just pushed these fixes; omlsync seems to do the right thing now with both waits and swaps. Sorry to torture you further, almost there ;-) -- the devil is in the details. In I830DRI2ScheduleWaitMSC(): At the end, in this if statement... if ((current_msc % divisor) remainder) vbl.request.sequence += divisor; ... the comparison should be = instead of , that is: if ((current_msc % divisor) = remainder) I reread the spec and realized the true meaning of this little snippet ...Otherwise, the function will block until the MSC value is incremented to a value such that MSC % divisor = remainder... It doesn't say until the msc *is* such that msc % divisor = remainder, but it says until the *msc is **incremented** to a value*. That means they don't want it to return immediately if the current msc already satisfies the constraint, but they want it to return basically at the start of the vblank interval when the msc reaches a value that satisfies the constraint. The idea is to synchronize the client's execution to the vblank interval. If a client just wants to wait for a certain msc without precise sync to the vblank interval, it can simply pass a divisor==0 and then will get that behaviour. Changing the comparison from a to a = above achieves that goal. This makes lots of sense if i think about how i would actually use that function in my toolkit or similar apps. I'd mostly want it to synchronize to the exact *start* of certain video refesh cycles. Then we should add this check to the if(divisor == 0 || current_msc target_msc) { } branch: if (current_msc = target_msc) target_msc = current_msc; This is similar to the check in I830DRI2ScheduleSwap. It guarantees that the target_msc can't fall behind the current_msc. This is important for all scheduled vblank events because the DRM will do the wrong thing if the requested vblank count is more than 2^23 counts behind the current vblank count. Events would get stuck forever if this happened due to a 32 bit wraparound, not good. What else? I rethought the idea of virtualizing the msc into a 64 bit value from the 32 count of the kernel. It is a bit more tricky than one would expect, so probably not a quick thing do to correctly - and not a thing to do now. But we could add some simple masking to the very top of I830DRI2ScheduleSwap() and I830DRI2ScheduleWaitMSC() to truncate all msc related input values from the server to 32 bits, ie. in I830DRI2ScheduleWaitMSC() target_msc = 0x; divisor = 0x; remainder = 0x; in I830DRI2ScheduleSwap() *target_msc = 0x; divisor = 0x; remainder = 0x; At least the few simple cases my brain can handle with paper pencil testing seem to do the right thing if a 32 bit vblank counter wraparound happens. At worst, there would be a small glitch every time the counter wraps around - about once every 4-12 months. Inbetween everything should work. I doubt that an app will regularly schedule swaps or waits multiple months ahead of time and still expect it to work perfectly ;-). Finally in I830DRI2ScheduleWaitMSC() the error handling is a bit inconsistent. Sometimes it returns failure to calling code, which would provoke a client hang due to a missing _XReply, sometimes it recovers and unstucks the client by providing a fake response. Similar to the blit_fallback: path in I830DRI2ScheduleSwap() you could maybe have a common error handler that at least calls DRI2WaitMSCComplete(client, draw, 0, 0, 0); to prevent the client from hanging? Ok, i'm hopefully running out of more ideas for now ;-) -mario ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
Re: DRI2 fixes
On Mar 22, 2010, at 11:03 PM, Jesse Barnes wrote: This is a collection of fixes from my personal server tree targeting the 1.8 release. They're mostly small fixes, but they fix a few important (i.e. common) cases with the new protocol code. Please review; I'll make any necessary changes, add the Reviewed-by tags, and push to Keith. Hi Jesse, done. You can add a reviewed-by from me on all relevant patches if you like -- i pulled them from your xserver-tree and checked on that. I didn't test any of them (no suitable hardware) but did a paper + pencil + brain review and looks good to me. thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[Patches][nouveau/ddx]: Improvements to bufferswap implementation and timestamping
Hi, here a set of patches against the nouveau-ddx. This is an extended and revised set, based on Francisco Jerez feedback from autumn last year. [1/9] Makes pageflipping work again on X-Server 1.12rc. It apparently stopped working somewhere around Xorg 1.11+. [2/9] Implements handling of pageflip completion events from the kernel. Francisco Jerez argument against including it was that the x-server didn't have a swaplimit api, so this couldn't be applied at the same time as the pseudo triple-buffering hack which is in place in the current ddx. Now we have the swaplimit api in 1.12, so this problem should be solved. [3/9] Makes use of the swaplimit api. A new xorg.conf option SwapLimit allows to select a swaplimit of 1 or 2 for double-buffering or triple-buffering. It defaults to 2 for Xorg 1.12+ and 1 for older servers. This way, on 1.12+ nouveau retains the kind of triple buffering behaviour it currently has, but swap completion timestamping (OML_sync_control, INTEL_swap_events, and client swap throttling) works conforming to the specs. On older servers it removes triple-buffering but makes nouveau conform to the specs. This is important for apps that need precise and reliable swap scheduling and timestamping. [4/9] A bug fix against corrupted desktop when switching from redirected to non-redirected fullscreen windows under a desktop compositor. Fixes FDO bug #35452. [5/9] Some fixes to swap scheduling, revised according to Francisco's review. [6/9] Fixes swap throttling for non-fullscreen windows under a desktop compositor. Split into a separate patch according to Francisco's feedback. [7/9] An attempt to provide more sane swap completion events for non- fullscreen windows. This is a bit of cheating, as it delivers the events for the earliest point in time one would expect the swap to complete, assuming only a lightly loaded gpu. Real completion could be later. I think this is an improvement because the current implementation delivers swap complete events with timestamps and counts that signal swap completion before the client even requested the swap. [8/9] This one adds Francisco's original triple-buffering hack back for Xorg 1.11 and earlier servers if the xorg.conf option requests a swaplimit 1. Users can choose between triple-buffering but broken timestamping or double-buffering with sane timestamping. [9/9] Fixes a corner case which could cause the ddx to segfault with its current triple-buffering implementation. I've tested these on single-display and dual-display setups, with/without compositor, with/without redirection and checked the robustness and precision of the timestamps with special measurement equipment, on XOrg 1.12-rc2 and 1.10. They work pretty well for me and finally make nouveau very useable for the kind of scientific applications that require precise swap scheduling and timestamping, so i'd love to see them reviewed and hopefully included into the ddx soon. A couple of things are a bit of hacks: [3/9] I think the setup of a default swap limit would be better done in the x-server itself instead of the ddx. The setup code is a bit awkward, hijacking the ddx-CreateBuffer function to apply the swaplimit. A DRI2GetSwapLimit() function is also missing from the server, which could help in the future to track swaplimit changes by other clients than the ddx itself. Unfortunately it is too late for the 1.12 release to do this. [8/9] Don't know if this is still wanted/needed or not? [9/9] It fixes the problem and doesn't affect performance, but is somewhat of a hack. I don't know how to do better, maybe somebody else has a better solution? Thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 1/9] dri2: Fix can_exchange() to allow page-flipping on new servers.
can_exchange() fails on at least Xorg 1.12+. This fixes it in the same way it was fixed in the ati intel ddx. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 3aa5ec5..5b62425 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -160,7 +160,7 @@ can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr src_pix) return ((DRI2CanFlip(draw) pNv-has_pageflip)) dst_pix-drawable.width == src_pix-drawable.width dst_pix-drawable.height == src_pix-drawable.height - dst_pix-drawable.depth == src_pix-drawable.depth + dst_pix-drawable.bitsPerPixel == src_pix-drawable.bitsPerPixel dst_pix-devKind == src_pix-devKind; } -- 1.7.5.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 2/9] dri2: Implement handling of pageflip completion events.
Requests pageflip completion events from the kernel. Implements pageflip completion handler to finalize and timestamp swaps. Completion handler includes a consistency check, and disambiguation if multiple crtc's are involved in a pageflip (e.g., clone mode, extendend desktop). Only the timestamp of the crtc whose vblank event initially triggered the swap is used, but handler waits for flip completion on all involved crtc's before completing the swap and releasing the old framebuffer. This code is almost identical to the code used in the ati/radeon ddx and intel ddx. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/drmmode_display.c | 107 +++-- src/nouveau_dri2.c| 89 ++-- src/nv_proto.h|5 ++- 3 files changed, 191 insertions(+), 10 deletions(-) diff --git a/src/drmmode_display.c b/src/drmmode_display.c index 75ef6dd..9e15c29 100644 --- a/src/drmmode_display.c +++ b/src/drmmode_display.c @@ -83,6 +83,21 @@ typedef struct { drmmode_prop_ptr props; } drmmode_output_private_rec, *drmmode_output_private_ptr; +typedef struct { +drmmode_ptr drmmode; +unsigned old_fb_id; +int flip_count; +void *event_data; +unsigned int fe_frame; +unsigned int fe_tv_sec; +unsigned int fe_tv_usec; +} drmmode_flipdata_rec, *drmmode_flipdata_ptr; + +typedef struct { +drmmode_flipdata_ptr flipdata; +Bool dispatch_me; +} drmmode_flipevtcarrier_rec, *drmmode_flipevtcarrier_ptr; + static void drmmode_output_dpms(xf86OutputPtr output, int mode); static drmmode_ptr @@ -1245,13 +1260,17 @@ drmmode_cursor_init(ScreenPtr pScreen) } Bool -drmmode_page_flip(DrawablePtr draw, PixmapPtr back, void *priv) +drmmode_page_flip(DrawablePtr draw, PixmapPtr back, void *priv, + unsigned int ref_crtc_hw_id) { ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum]; xf86CrtcConfigPtr config = XF86_CRTC_CONFIG_PTR(scrn); drmmode_crtc_private_ptr crtc = config-crtc[0]-driver_private; drmmode_ptr mode = crtc-drmmode; int ret, i, old_fb_id; + int emitted = 0; + drmmode_flipdata_ptr flipdata; + drmmode_flipevtcarrier_ptr flipcarrier; old_fb_id = mode-fb_id; ret = drmModeAddFB(mode-fd, scrn-virtualX, scrn-virtualY, @@ -1264,24 +1283,64 @@ drmmode_page_flip(DrawablePtr draw, PixmapPtr back, void *priv) return FALSE; } + flipdata = calloc(1, sizeof(drmmode_flipdata_rec)); + if (!flipdata) { + xf86DrvMsg(scrn-scrnIndex, X_WARNING, + flip queue: data alloc failed.\n); + goto error_undo; + } + + flipdata-event_data = priv; + flipdata-drmmode = mode; + for (i = 0; i config-num_crtc; i++) { crtc = config-crtc[i]-driver_private; if (!config-crtc[i]-enabled) continue; + flipdata-flip_count++; + + flipcarrier = calloc(1, sizeof(drmmode_flipevtcarrier_rec)); + if (!flipcarrier) { + xf86DrvMsg(scrn-scrnIndex, X_WARNING, + flip queue: carrier alloc failed.\n); + if (emitted == 0) + free(flipdata); + goto error_undo; + } + + /* Only the reference crtc will finally deliver its page flip +* completion event. All other crtc's events will be discarded. +*/ + flipcarrier-dispatch_me = ((1 i) == ref_crtc_hw_id); + flipcarrier-flipdata = flipdata; + ret = drmModePageFlip(mode-fd, crtc-mode_crtc-crtc_id, - mode-fb_id, 0, priv); + mode-fb_id, DRM_MODE_PAGE_FLIP_EVENT, + flipcarrier); if (ret) { xf86DrvMsg(scrn-scrnIndex, X_WARNING, flip queue failed: %s\n, strerror(errno)); - return FALSE; + + free(flipcarrier); + if (emitted == 0) + free(flipdata); + goto error_undo; } + + emitted++; } - drmModeRmFB(mode-fd, old_fb_id); + /* Will release old fb after all crtc's completed flip. */ + flipdata-old_fb_id = old_fb_id; return TRUE; + +error_undo: + drmModeRmFB(mode-fd, mode-fb_id); + mode-fb_id = old_fb_id; + return FALSE; } #ifdef HAVE_LIBUDEV @@ -1347,6 +1406,42 @@ drmmode_uevent_fini(ScrnInfoPtr scrn) } static void +drmmode_flip_handler(int fd, unsigned int frame, unsigned int tv_sec, +unsigned int tv_usec, void *event_data) +{ + drmmode_flipevtcarrier_ptr flipcarrier
[PATCH 3/9] dri2: Add support for DRI2SwapLimit() API.
Uses the new DRI2SwapLimit() API of X-Server 1.12+ to allow to change the maximum number of pending swaps on a drawable before the OpenGL client is throttled by the server. The new optional xorg.conf parameter SwapLimit allows to select a new swap limit = 1. The default swap limit is 2 for triple-buffering on XOrg 1.12+, 1 for double-buffering on older servers, as we can't change the swap limit there. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- man/nouveau.man| 11 +++ src/nouveau_dri2.c | 29 +++-- src/nv_const.h |2 ++ src/nv_driver.c| 34 ++ src/nv_type.h |3 +++ 5 files changed, 77 insertions(+), 2 deletions(-) diff --git a/man/nouveau.man b/man/nouveau.man index dd9d938..59f6c1a 100644 --- a/man/nouveau.man +++ b/man/nouveau.man @@ -93,6 +93,17 @@ will assign xrandr outputs LVDS and VGA-0 to this instance of the driver. .TP .BI Option \*qPageFlip\*q \*q boolean \*q Enable DRI2 page flipping. Default: on. +.TP +.BI Option \*qSwapLimit\*q \*q integer \*q +Set maximum allowed number of pending OpenGL double-buffer swaps for +a drawable before a client is blocked. +.br +A value of 1 corresponds to double-buffering. A value of 2 corresponds +to triple-buffering. Higher values may allow higher framerate, but also +increase lag for interactive applications, e.g., games. Nouveau currently +supports a maximum value of 2 on XOrg 1.12+ and a maximum of 1 on older servers. +.br +Default: 2 for XOrg 1.12+, 1 for older servers. .SH SEE ALSO __xservername__(__appmansuffix__), __xconfigfile__(__filemansuffix__), Xserver(__appmansuffix__), X(__miscmansuffix__) .SH AUTHORS diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index acef08a..2908e56 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -42,6 +42,11 @@ nouveau_dri2_create_buffer(DrawablePtr pDraw, unsigned int attachment, } else { WindowPtr pwin = (WindowPtr)pDraw; ppix = pScreen-GetWindowPixmap(pwin); + +#if DRI2INFOREC_VERSION = 6 + /* Set initial swap limit on drawable. */ + DRI2SwapLimit(pDraw, pNv-swap_limit); +#endif } ppix-refcnt++; @@ -208,6 +213,20 @@ nouveau_wait_vblank(DrawablePtr draw, int type, CARD64 msc, return 0; } +#if DRI2INFOREC_VERSION = 6 +static Bool +nouveau_dri2_swap_limit_validate(DrawablePtr draw, int swap_limit) +{ + ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum]; + NVPtr pNv = NVPTR(scrn); + + if ((swap_limit 1 ) || (swap_limit pNv-max_swap_limit)) + return FALSE; + + return TRUE; +} +#endif + static void nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, unsigned int tv_sec, unsigned int tv_usec, @@ -293,8 +312,10 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, * not, to prevent it from blocking the client on the next * GetBuffers request (and let the client do triple-buffering). * -* XXX - The DRI2SwapLimit() API will allow us to move this to -* the flip handler with no FPS hit. +* XXX - The DRI2SwapLimit() API allowed us to move this to +* the flip handler with no FPS hit for page flipped swaps. +* It is still needed for copy swaps as we lack a method +* to detect true swap completion for DRI2_BLIT_COMPLETE. */ DRI2SwapComplete(s-client, draw, frame, tv_sec, tv_usec, type, s-func, s-data); @@ -534,6 +555,10 @@ nouveau_dri2_init(ScreenPtr pScreen) dri2.ScheduleWaitMSC = nouveau_dri2_schedule_wait; dri2.GetMSC = nouveau_dri2_get_msc; +#if DRI2INFOREC_VERSION = 6 + dri2.SwapLimitValidate = nouveau_dri2_swap_limit_validate; +#endif + return DRI2ScreenInit(pScreen, dri2); } diff --git a/src/nv_const.h b/src/nv_const.h index a27a951..5c232d4 100644 --- a/src/nv_const.h +++ b/src/nv_const.h @@ -15,6 +15,7 @@ typedef enum { OPTION_GLX_VBLANK, OPTION_ZAPHOD_HEADS, OPTION_PAGE_FLIP, +OPTION_SWAP_LIMIT, } NVOpts; @@ -28,6 +29,7 @@ static const OptionInfoRec NVOptions[] = { { OPTION_GLX_VBLANK, GLXVBlank,OPTV_BOOLEAN, {0}, FALSE }, { OPTION_ZAPHOD_HEADS, ZaphodHeads, OPTV_STRING,{0}, FALSE }, { OPTION_PAGE_FLIP,PageFlip, OPTV_BOOLEAN, {0}, FALSE }, +{ OPTION_SWAP_LIMIT, SwapLimit,OPTV_INTEGER, {0}, FALSE }, { -1, NULL, OPTV_NONE, {0}, FALSE } }; diff --git a/src/nv_driver.c b/src/nv_driver.c index 87ef2c4..5def531 100644 --- a/src/nv_driver.c +++ b/src/nv_driver.c @@ -31,6 +31,9 @@ #include xf86drm.h #include xf86drmMode.h #include nouveau_drm.h +#ifdef DRI2 +#include dri2.h +#endif /* * Forward definitions for the functions
[PATCH 4/9] dri2: Update front buffer pixmap and name before exchanging buffers
Buffer exchange assumes that the front buffer pixmap and name information is accurate. That may not be the case eg. if the window has been (un)redirected since the buffer was created. This is a translation to nouveau of a fix that was originally developed by Ville Syrjala syrj...@sci.fi for the ati/radeon ddx to fix the same bug there. See thread at: http://lists.x.org/archives/xorg-devel/2011-May/021908.html Fixes FDO bug #35452. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c | 45 ++--- 1 files changed, 42 insertions(+), 3 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 2908e56..8608678 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -149,6 +149,35 @@ struct nouveau_dri2_vblank_state { }; static Bool +update_front(DrawablePtr draw, DRI2BufferPtr front) +{ + int r; + PixmapPtr pixmap; + struct nouveau_dri2_buffer *nvbuf = nouveau_dri2_buffer(front); + + if (draw-type == DRAWABLE_PIXMAP) + pixmap = (PixmapPtr)draw; + else + pixmap = (*draw-pScreen-GetWindowPixmap)((WindowPtr)draw); + + pixmap-refcnt++; + + exaMoveInPixmap(pixmap); + r = nouveau_bo_handle_get(nouveau_pixmap_bo(pixmap), front-name); + if (r) { + (*draw-pScreen-DestroyPixmap)(pixmap); + return FALSE; + } + + (*draw-pScreen-DestroyPixmap)(nvbuf-ppix); + front-pitch = pixmap-devKind; + front-cpp = pixmap-drawable.bitsPerPixel / 8; + nvbuf-ppix = pixmap; + + return TRUE; +} + +static Bool can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr src_pix) { ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum]; @@ -234,13 +263,14 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, { ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum]; NVPtr pNv = NVPTR(scrn); - PixmapPtr dst_pix = nouveau_dri2_buffer(s-dst)-ppix; + PixmapPtr dst_pix; PixmapPtr src_pix = nouveau_dri2_buffer(s-src)-ppix; - struct nouveau_bo *dst_bo = nouveau_pixmap_bo(dst_pix); + struct nouveau_bo *dst_bo; struct nouveau_bo *src_bo = nouveau_pixmap_bo(src_pix); struct nouveau_channel *chan = pNv-chan; RegionRec reg; int type, ret; + Bool front_updated; REGION_INIT(0, reg, ((BoxRec){ 0, 0, draw-width, draw-height }), 0); REGION_TRANSLATE(0, reg, draw-x, draw-y); @@ -257,6 +287,15 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, if (ref_crtc_hw_id 1) ref_crtc_hw_id = 1; + /* Update frontbuffer pixmap and name: Could have changed due to +* window (un)redirection as part of compositing. +*/ + front_updated = update_front(draw, s-dst); + + /* Assign frontbuffer pixmap, after update in update_front() */ + dst_pix = nouveau_dri2_buffer(s-dst)-ppix; + dst_bo = nouveau_pixmap_bo(dst_pix); + /* Throttle on the previous frame before swapping */ nouveau_bo_map(dst_bo, NOUVEAU_BO_RD); nouveau_bo_unmap(dst_bo); @@ -275,7 +314,7 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, FIRE_RING(chan); } - if (can_exchange(draw, dst_pix, src_pix)) { + if (front_updated can_exchange(draw, dst_pix, src_pix)) { type = DRI2_EXCHANGE_COMPLETE; DamageRegionAppend(draw, reg); -- 1.7.5.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 5/9] dri2: Fixes to swap scheduling.
Fix some small off-by-one errors and a mismatch between 32 bit kernel interfaces for vblank count and 64 bit dri2 interfaces for target_msc et al. Return corrected target_msc to swap scheduling in x-server. A revised version of the patch discussed here: http://lists.freedesktop.org/archives/nouveau/2011-September/009143.html Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c | 23 +-- 1 files changed, 21 insertions(+), 2 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 8608678..719b3bb 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -387,11 +387,22 @@ nouveau_dri2_schedule_swap(ClientPtr client, DrawablePtr draw, if (ret) goto fail; + /* Truncate to match kernel interfaces; means occasional overflow +* misses, but that's generally not a big deal. +*/ + *target_msc = 0x; + divisor = 0x; + remainder = 0x; + /* Calculate a swap target if we don't have one */ if (current_msc = *target_msc divisor) *target_msc = current_msc + divisor - (current_msc - remainder) % divisor; + /* Avoid underflow of unsigned value below */ + if (*target_msc == 0) + *target_msc = 1; + /* Request a vblank event one frame before the target */ ret = nouveau_wait_vblank(draw, DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT, @@ -399,7 +410,8 @@ nouveau_dri2_schedule_swap(ClientPtr client, DrawablePtr draw, expect_msc, NULL, s); if (ret) goto fail; - s-frame = (unsigned int) expect_msc 0x; + s-frame = 1 + ((unsigned int) expect_msc 0x); + *target_msc = 1 + expect_msc; } else { /* We can't/don't want to sync to vblank, just swap. */ nouveau_dri2_finish_swap(draw, 0, 0, 0, s); @@ -420,6 +432,13 @@ nouveau_dri2_schedule_wait(ClientPtr client, DrawablePtr draw, CARD64 current_msc; int ret; + /* Truncate to match kernel interfaces; means occasional overflow +* misses, but that's generally not a big deal. +*/ + target_msc = 0x; + divisor = 0x; + remainder = 0x; + if (!can_sync_to_vblank(draw)) { DRI2WaitMSCComplete(client, draw, target_msc, 0, 0); return TRUE; @@ -439,7 +458,7 @@ nouveau_dri2_schedule_wait(ClientPtr client, DrawablePtr draw, goto fail; /* Calculate a wait target if we don't have one */ - if (current_msc target_msc divisor) + if (current_msc = target_msc divisor) target_msc = current_msc + divisor - (current_msc - remainder) % divisor; -- 1.7.5.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 6/9] dri2: Allow vblank controlled swaps for redirected windows. Part I
Make sure that swaps for redirected windows under a compositor are still scheduled via vblank events, to avoid effects like 2900 fps swaps under a compositor. See discussion with Francisco Jerez at: http://lists.freedesktop.org/archives/nouveau/2011-September/009278.html http://lists.freedesktop.org/archives/nouveau/2011-September/009292.html This is part I of the agreed upon band-aid, in a separate patch. It allows to use vblank related functions on redirected windows and thereby fixes functions from sgi_sync_control and oml_sync_control extension, e.g., glXWaitForMscOML(), glXGetSyncValuesOML(), glXWaitVideoSyncSGI, ... Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c |2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 719b3bb..6a0800c 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -204,10 +204,8 @@ can_sync_to_vblank(DrawablePtr draw) { ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum]; NVPtr pNv = NVPTR(scrn); - PixmapPtr pix = NVGetDrawablePixmap(draw); return pNv-glx_vblank - nouveau_exa_pixmap_is_onscreen(pix) nv_window_belongs_to_crtc(scrn, draw-x, draw-y, draw-width, draw-height); } -- 1.7.5.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 7/9] dri2: Allow vblank controlled swaps for redirected windows. Part II
This part implements proper throttling for clients. For vblank synchronized blits, it defers DRI2SwapComplete() until 1 vblank after the framebuffer blit is submitted to the gpu. Rationale: For unredirected windows, this is the earliest time the blit swap can complete, as blits are submitted one vblank before the target vblank and synchronized with vblank in the gpu. This makes swap completion timestamps at least reasonable. For redirected windows, the compositor will probably pick up the blit swapped frontbuffer pixmap of the window quickly, but defer its own recomposition to the next vblank, at least if sync to vblank for the compositor is on. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c | 32 +--- 1 files changed, 29 insertions(+), 3 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 6a0800c..fdc5148 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -135,6 +135,7 @@ nouveau_dri2_copy_region(DrawablePtr pDraw, RegionPtr pRegion, struct nouveau_dri2_vblank_state { enum { SWAP, + BLIT, WAIT } action; @@ -342,6 +343,22 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, REGION_TRANSLATE(0, reg, -draw-x, -draw-y); nouveau_dri2_copy_region(draw, reg, s-dst, s-src); + + if (can_sync_to_vblank(draw)) { + /* Request a vblank event one vblank from now, the most +* likely (optimistic?) time a direct framebuffer blit +* will complete or a desktop compositor will update its +* screen. This defers DRI2SwapComplete() to the earliest +* likely time of real swap completion. +*/ + s-action = BLIT; + ret = nouveau_wait_vblank(draw, DRM_VBLANK_EVENT | + DRM_VBLANK_RELATIVE, 1, + NULL, NULL, s); + /* Done, if success. Otherwise use fallback below. */ + if (!ret) + return; + } } /* @@ -351,8 +368,9 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, * * XXX - The DRI2SwapLimit() API allowed us to move this to * the flip handler with no FPS hit for page flipped swaps. -* It is still needed for copy swaps as we lack a method -* to detect true swap completion for DRI2_BLIT_COMPLETE. +* It is still needed as a fallback for some copy swaps as +* we lack a method to detect true swap completion for +* DRI2_BLIT_COMPLETE. */ DRI2SwapComplete(s-client, draw, frame, tv_sec, tv_usec, type, s-func, s-data); @@ -505,8 +523,10 @@ nouveau_dri2_vblank_handler(int fd, unsigned int frame, ret = dixLookupDrawable(draw, s-draw, serverClient, M_ANY, DixWriteAccess); - if (ret) + if (ret) { + free(s); return; + } switch (s-action) { case SWAP: @@ -517,6 +537,12 @@ nouveau_dri2_vblank_handler(int fd, unsigned int frame, DRI2WaitMSCComplete(s-client, draw, frame, tv_sec, tv_usec); free(s); break; + + case BLIT: + DRI2SwapComplete(s-client, draw, frame, tv_sec, tv_usec, +DRI2_BLIT_COMPLETE, s-func, s-data); + free(s); + break; } } -- 1.7.5.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 8/9] dri2: Reimplement hack for triple-buffering on old X-Servers.
X-Servers before 1.12.0 don't have the DRI2SwapLimit() API. On these, default to a swaplimit of 1 - double-buffering. This patch implements support for swap limit of 2, triple-buffering, on old x-servers via Francisco Jerez previous hack: Return DRI2SwapComplete() before the swap has completed, so clients don't get blocked on the pending swap. This allows for a triple-buffering look-alike behaviour, but breaks the swap scheduling and timestamping defined in the OML_sync_control spec, so applications which rely on conformant behaviour will break with a swap limit of 2 on pre 1.12.0 x-servers. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- man/nouveau.man|6 +- src/nouveau_dri2.c | 32 +--- src/nv_driver.c| 11 ++- 3 files changed, 40 insertions(+), 9 deletions(-) diff --git a/man/nouveau.man b/man/nouveau.man index 59f6c1a..7c72907 100644 --- a/man/nouveau.man +++ b/man/nouveau.man @@ -101,7 +101,11 @@ a drawable before a client is blocked. A value of 1 corresponds to double-buffering. A value of 2 corresponds to triple-buffering. Higher values may allow higher framerate, but also increase lag for interactive applications, e.g., games. Nouveau currently -supports a maximum value of 2 on XOrg 1.12+ and a maximum of 1 on older servers. +reliably supports a maximum value of 2 on XOrg 1.12+. A maximum setting of 2 +on older x-servers is allowed, but it will break conformance with the +OpenML OML_sync_control specification and will cause failure of software +that relies on correct presentation timing behaviour as defined in that +specification. .br Default: 2 for XOrg 1.12+, 1 for older servers. .SH SEE ALSO diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index fdc5148..f0c7fec 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -255,6 +255,18 @@ nouveau_dri2_swap_limit_validate(DrawablePtr draw, int swap_limit) } #endif +/* Shall we intentionally violate the OML_sync_control spec to + * get some sort of triple-buffering behaviour on a pre 1.12.0 + * x-server? + */ +static Bool violate_oml(DrawablePtr draw) +{ + ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum]; + NVPtr pNv = NVPTR(scrn); + + return (DRI2INFOREC_VERSION 6) (pNv-swap_limit 1); +} + static void nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, unsigned int tv_sec, unsigned int tv_usec, @@ -319,7 +331,9 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, if (DRI2CanFlip(draw)) { type = DRI2_FLIP_COMPLETE; - ret = drmmode_page_flip(draw, src_pix, s, ref_crtc_hw_id); + ret = drmmode_page_flip(draw, src_pix, + violate_oml(draw) ? NULL : s, + ref_crtc_hw_id); if (!ret) goto out; } @@ -330,7 +344,7 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, DamageRegionProcessPending(draw); /* If it is a page flip, finish it in the flip event handler. */ - if (type == DRI2_FLIP_COMPLETE) + if ((type == DRI2_FLIP_COMPLETE) !violate_oml(draw)) return; } else { type = DRI2_BLIT_COMPLETE; @@ -344,7 +358,7 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, REGION_TRANSLATE(0, reg, -draw-x, -draw-y); nouveau_dri2_copy_region(draw, reg, s-dst, s-src); - if (can_sync_to_vblank(draw)) { + if (can_sync_to_vblank(draw) !violate_oml(draw)) { /* Request a vblank event one vblank from now, the most * likely (optimistic?) time a direct framebuffer blit * will complete or a desktop compositor will update its @@ -361,6 +375,14 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, } } + /* Special triple-buffering hack for old pre 1.12.0 x-servers used? */ + if (violate_oml(draw)) { + /* Signal to client that swap completion timestamps and counts +* are invalid - they violate the specification. +*/ + frame = tv_sec = tv_usec = 0; + } + /* * Tell the X server buffers are already swapped even if they're * not, to prevent it from blocking the client on the next @@ -371,6 +393,10 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame, * It is still needed as a fallback for some copy swaps as * we lack a method to detect true swap completion for * DRI2_BLIT_COMPLETE. +* +* It is also used if triple-buffering is requested on +* old x-servers which don't
[PATCH 9/9] dri2: Fix corner case crash for swaplimit 1
If a swaplimit 1 is set on a server which supports the swaplimit api (XOrg 1.12.0+), the following can happen: 1. Client calls glXSwapBuffersMscOML() with a swap target 1 vblank in the future, or a client calls glXSwapbuffers() while the swap interval is set to 1 (unusual but possible). 2. nouveau_dri2_finish_swap() is therefore called only at the target vblank, instead of immediately. 3. Because of the deferred execution of nouveu_dri2_finish_swap(), the OpenGL client can call x-servers DRI2GetBuffersWithFormat() before nouveau_dri2_finish_swap() executes and it deletes pixmaps that would be needed by nouveau_dri2_finish_swap() -- Segfault -- Crash. Prevent this: When a swap is scheduled into the future, we temporarily reduce the swaplimit to 1 until nouveau_dri2_finish_swap() is done, then restore it to its original value. This throttles the client inside the server in DRI2ThrottleClient() before it can call the evil DRI2GetbuffersWithFormat(). The client will still be released one video refresh interval before swap completion, so there is still some potential win. This doesn't affect the common case of swapping at the next vblank, where this throttling is not needed or done. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c | 26 ++ 1 files changed, 26 insertions(+), 0 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index f0c7fec..7878a5a 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -445,6 +445,26 @@ nouveau_dri2_schedule_swap(ClientPtr client, DrawablePtr draw, if (*target_msc == 0) *target_msc = 1; +#if DRI2INFOREC_VERSION = 6 + /* Is this a swap in the future, ie. the vblank event will +* not be immediately dispatched, but only at a future vblank? +* If so, we need to temporarily lower the swaplimit to 1, so +* that DRI2GetBuffersWithFormat() requests from the client get +* deferred in the x-server until the vblank event has been +* dispatched to us and nouveau_dri2_finish_swap() is done. If +* we wouldn't do this, DRI2GetBuffersWithFormat() would operate +* on wrong (pre-swap) buffers, and cause a segfault later on in +* nouveau_dri2_finish_swap(). Our vblank event handler restores +* the old swaplimit immediately after nouveau_dri2_finish_swap() +* is done, so we still get 1 video refresh cycle worth of +* triple-buffering. For a swap at next vblank, dispatch of the +* vblank event happens immediately, so there isn't any need +* for this lowered swaplimit. +*/ + if (current_msc *target_msc - 1) + DRI2SwapLimit(draw, 1); +#endif + /* Request a vblank event one frame before the target */ ret = nouveau_wait_vblank(draw, DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT, @@ -557,6 +577,12 @@ nouveau_dri2_vblank_handler(int fd, unsigned int frame, switch (s-action) { case SWAP: nouveau_dri2_finish_swap(draw, frame, tv_sec, tv_usec, s); +#if DRI2INFOREC_VERSION = 6 + /* Restore real swap limit on drawable, now that it is safe. */ + ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum]; + DRI2SwapLimit(draw, NVPTR(scrn)-swap_limit); +#endif + break; case WAIT: -- 1.7.5.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 1/9] dri2: Fix can_exchange() to allow page-flipping on new servers.
On 02/16/2012 11:04 AM, Michel Dänzer wrote: On Don, 2012-02-16 at 00:45 +0100, Mario Kleiner wrote: can_exchange() fails on at least Xorg 1.12+. This fixes it in the same way it was fixed in the ati intel ddx. Signed-off-by: Mario Kleinermario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 3aa5ec5..5b62425 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -160,7 +160,7 @@ can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr src_pix) return ((DRI2CanFlip(draw) pNv-has_pageflip)) dst_pix-drawable.width == src_pix-drawable.width dst_pix-drawable.height == src_pix-drawable.height - dst_pix-drawable.depth == src_pix-drawable.depth + dst_pix-drawable.bitsPerPixel == src_pix-drawable.bitsPerPixel dst_pix-devKind == src_pix-devKind; } Actually, it seems like the pixmap depths really should match, otherwise one could end up with the front pixmap depth not matching the window depth. Not sure that's a real problem right now, but it seems wonky at least... Have you investigated why the depths don't match? Depends on the meaning of investigated: One of the pixmaps has depth 24 bits (the pixmap of the root window) the other 32 bits (as requested from the client via DRI2GetBuffersWithFormat for RGBA8 visuals). Both have 32 bpp. I checked what the intel and ati ddx do. The ati ddx always checked for matching drawable.bitsPerPixel since kms pageflip support was implemented. The intel ddx does the same, but the code and comments suggests they tried both and checking for matching depths probably didn't work: cd xorg/drivers/xf86-video-intel/ git log -p e2615cdeef078dbd2e834b68c437f098a92b941d So everybody does it like this currently, and it seems to work. I also just tried setting DefaultDepth to 30 bits on a NV50 card which supports this. xdpyinfo confirms a screen depth of 30 bit == 10 bits per red/green/blue channel. glxinfo shows that only RGBA8 visuals are supported. Here we have a mismatch of depths 30 vs. 24 and the driver reverts correctly from page flipping to copy swaps, although i can't see immediately what causes it to correctly switch back to copy swaps. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 9/9] dri2: Fix corner case crash for swaplimit 1
On 02/16/2012 10:46 AM, Michel Dänzer wrote: On Don, 2012-02-16 at 00:45 +0100, Mario Kleiner wrote: If a swaplimit 1 is set on a server which supports the swaplimit api (XOrg 1.12.0+), the following can happen: 1. Client calls glXSwapBuffersMscOML() with a swap target 1 vblank in the future, or a client calls glXSwapbuffers() while the swap interval is set to 1 (unusual but possible). 2. nouveau_dri2_finish_swap() is therefore called only at the target vblank, instead of immediately. 3. Because of the deferred execution of nouveu_dri2_finish_swap(), the OpenGL client can call x-servers DRI2GetBuffersWithFormat() before nouveau_dri2_finish_swap() executes and it deletes pixmaps that would be needed by nouveau_dri2_finish_swap() -- Segfault -- Crash. Pixmaps are reference counted, so it should be possible to fix this via proper reference counting. Thanks for the tip. I'll try that. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] dri2: Pass AsyncSwap [swap_interval=0] requests to the drivers
On Feb 21, 2012, at 10:55 AM, Chris Wilson wrote: Currently, the midlayer dri2 code intercepts swap_interval=0 (ala vblank_mode=0) SwapBuffers and converts it to a CopyRegion request. This prevents the backend from doing anything meaningful in this case and typically ends up being vsync'ed since the drivers cannot distinguish it from a regular CopyRegion request. v2: Only invalidate the drawable on the behest of the backend, so that existing behaviour of not invalidating for async blit copies is preserved, suggested by Simon Farnsworh. And rebase for the intervening 12 months since v1. ...snip... int DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, CARD64 divisor, CARD64 remainder, CARD64 *swap_target, @@ -876,20 +900,11 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, /* Old DDX or no swap interval, just blit */ if (!ds-ScheduleSwap || !pPriv-swap_interval) { - BoxRec box; - RegionRec region; - - box.x1 = 0; - box.y1 = 0; - box.x2 = pDraw-width; - box.y2 = pDraw-height; - RegionInit(region, box, 0); - pPriv-swapsPending++; - - (*ds-CopyRegion)(pDraw, region, pDestBuffer, pSrcBuffer); - DRI2SwapComplete(client, pDraw, target_msc, 0, 0, DRI2_BLIT_COMPLETE, -func, data); + if ((*ds-AsyncSwap)(client, pDraw, +pDestBuffer, pSrcBuffer, +func, data)) + DRI2InvalidateDrawable(pDraw); return Success; } One thing i'm not sure about, maybe just a false alert, maybe a tiny race: The ddx could call DRI2SwapComplete() in *ds-AsyncSwap() before the server calls DRI2InvalidateDrawable() afterwards. Normally DRI2InvalidateDrawable() - As soon as client tries to render again - invalidated buffers - Client calls DRI2GetBuffersWithFormat() - DRI2ThrottleClient() - Server usually waits for DRI2SwapComplete(). With the current implementation the server could only process DRI2SwapComplete() [as triggered by a received vblank event or pageflip completion event from the kernel], after DRI2InvalidateDrawable() at the end of DRI2SwapBuffers(). a) A single threaded client can't start rendering before return from glXSwapBuffers[MscOML](), so no problem. b) A multi-threaded gl client could wait/poll for swap completion via glXWaitForSbcOML(), glXGetSyncValuesOML() or via the INTEL_swap_events extension, so it could be that a 2nd rendering thread would get released by DRI2SwapComplete() and start rendering again before receiving the buffer invalidate events. I'm not sure if this is a real problem, or if i'm missing something, but it could be prevented if the ds-AsyncSwap code itself would call DRI2InvalidateDrawable() at the proper time instead of the server calling it afterwards. ... snip ... + +/* added in version 7 */ + +/* Used when the client requests swap_interval=0, i.e. swap immediately + * with no throttling. Whether to tear or not is left up to the driver. I wonder if there is any case when the option of not tearing would make sense for a swaplimit of 0? The only when i set a swaplimit of zero in my apps is if i want to benchmark how fast the whole rendering loop could go in the best case, so i want it to tear. Same for benchmarks. Other than that Reviewed-by: Mario Kleiner mario.klei...@tuebingen.mpg.de thanks, -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 1/9] dri2: Fix can_exchange() to allow page-flipping on new servers.
On 02/20/2012 11:27 AM, Michel Dänzer wrote: On Mon, 2012-02-20 at 05:59 +0100, Mario Kleiner wrote: On 02/16/2012 11:04 AM, Michel Dänzer wrote: On Don, 2012-02-16 at 00:45 +0100, Mario Kleiner wrote: can_exchange() fails on at least Xorg 1.12+. This fixes it in the same way it was fixed in the ati intel ddx. Signed-off-by: Mario Kleinermario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 3aa5ec5..5b62425 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -160,7 +160,7 @@ can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr src_pix) return ((DRI2CanFlip(draw) pNv-has_pageflip)) dst_pix-drawable.width == src_pix-drawable.width dst_pix-drawable.height == src_pix-drawable.height - dst_pix-drawable.depth == src_pix-drawable.depth + dst_pix-drawable.bitsPerPixel == src_pix-drawable.bitsPerPixel dst_pix-devKind == src_pix-devKind; } Actually, it seems like the pixmap depths really should match, otherwise one could end up with the front pixmap depth not matching the window depth. Not sure that's a real problem right now, but it seems wonky at least... Have you investigated why the depths don't match? Depends on the meaning of investigated: One of the pixmaps has depth 24 bits (the pixmap of the root window) the other 32 bits (as requested from the client via DRI2GetBuffersWithFormat for RGBA8 visuals). Both have 32 bpp. I checked what the intel and ati ddx do. The ati ddx always checked for matching drawable.bitsPerPixel since kms pageflip support was implemented. The intel ddx does the same, but the code and comments suggests they tried both and checking for matching depths probably didn't work: cd xorg/drivers/xf86-video-intel/ git log -p e2615cdeef078dbd2e834b68c437f098a92b941d So everybody does it like this currently, and it seems to work. Right. I must have been incorrectly thinking of flipping as changing the window pixmap. I still have some doubts — e.g. why is an RGBA visual chosen for a fullscreen window, and does this really have anything to do with changes in xserver 1.12, or rather in Mesa — but I think the change itself is okay. You are right, it is not the xserver, but more likely Mesa. I tried the patch again with some debug output against xserver 1.10.4 (Ubuntu 11.10 standard x-server) and the same happens. The FRONT_LEFT buffer is taken from the windows pixmap, which is depth 24, the BACK_LEFT buffer is allocated as a pixmap of 32 bits. So now i'm surprised it ever worked before... Original testing of this nouveau patchset was around August 2011, since then i've also updated mesa to 7.11 as part of a distribution upgrade from Ubuntu 11.04 - 11.10. Mesa requests 32 bit buffers as part of the BACK_LEFT attachment in DRI2GetBuffersWithFormat(). As far as i could trace it back, the code in src/gallium/state_trackers/dri/drm/dri2.c, dri2_drawable_get_buffers() mapped the PIPE_FORMAT_B8G8R8A8_UNORM (as requested by my toolkit) and also PIPE_FORMAT_B8G8R8X8_UNORM (from glxgears) to a DRI2 buffer format of 32 bits and it seems that that is used as .depth for the pixmaps drawable as well. There's this related commit in mesa: git log -p 576161289df68eedade591fbca4013329c9e5ded which changed the logic a bit after Mesa 7.12. Also this one: git log -p c72d7df16879e3210946ba92a7edc823815b6f16 So to support page flipping on different mesa versions i guess can_exchange needs to be lenient and check for the matching bitsPerPixel instead. I will change the commit message and title of this patch to be less misleading about the cause of the problem. I wonder if the depth field is still a good way to describe the buffer format, especially with things like depth 30 RGB10 scanout. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 01/10] dri2: Fix can_exchange() to allow page-flipping on more mesa versions.
can_exchange() returns false and thereby prevents page flipping on some drawables where page flipping would work fine. This due to non-matching drawable depths values between front buffer pixmap and back buffer pixmap, because front buffer pixmaps inherit the depth of the screen, typically 24 bits, whereas the depth value of back buffer pixmaps for a given RGB8 or RGBA8 visual depends on the mesa version in use, either 24 bits or 32 bits. Use bitsPerPixel instead of depth to decide if drawable is flippable. This will still catch really incompatible formats like 32 bpp vs. 16 bpp buffers. Tested for screen DefaultDepth 24 and also 30 bits (for RGB10 framebuffers) on NV-50. The problem was fixed in the same way in the ati intel ddx. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 3aa5ec5..5b62425 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -160,7 +160,7 @@ can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr src_pix) return ((DRI2CanFlip(draw) pNv-has_pageflip)) dst_pix-drawable.width == src_pix-drawable.width dst_pix-drawable.height == src_pix-drawable.height - dst_pix-drawable.depth == src_pix-drawable.depth + dst_pix-drawable.bitsPerPixel == src_pix-drawable.bitsPerPixel dst_pix-devKind == src_pix-devKind; } -- 1.7.5.4 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 10/10] dri2: Testpatch: Fix corner case crash but not the problem.
Do not apply - Just for illustration: Introduce reference counting for nouveau_dri2_buffers and temporarily increase refcount on nouveau_dri2_buffer frontbuffer to protect it against premature deletion by x-server as part of the DRI2GetBuffers[WithFormat]() path while the vblank event for a scheduled swap is still pending and nouveau_dri2_finish_swap() hasn't completed for this swap. This prevents server/ddx crash on deferred swaps with swaplimit 1 but doesn't fix the problem: We try to treat a setup which is only double-buffered as if it is truly triple-buffered. This causes flickering, so not useful in practice. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- src/nouveau_dri2.c | 57 +++- 1 files changed, 34 insertions(+), 23 deletions(-) diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c index 7878a5a..eb008c9 100644 --- a/src/nouveau_dri2.c +++ b/src/nouveau_dri2.c @@ -14,6 +14,7 @@ struct nouveau_dri2_buffer { DRI2BufferRec base; PixmapPtr ppix; + unsigned int refcnt; }; static inline struct nouveau_dri2_buffer * @@ -79,6 +80,7 @@ nouveau_dri2_create_buffer(DrawablePtr pDraw, unsigned int attachment, nvbuf-base.format = format; nvbuf-base.flags = 0; nvbuf-ppix = ppix; + nvbuf-refcnt++; nvpix = nouveau_pixmap(ppix); if (!nvpix || !nvpix-bo || @@ -100,6 +102,10 @@ nouveau_dri2_destroy_buffer(DrawablePtr pDraw, DRI2BufferPtr buf) if (!nvbuf) return; + nvbuf-refcnt--; + if (nvbuf-refcnt) + return; + pDraw-pScreen-DestroyPixmap(nvbuf-ppix); free(nvbuf); } @@ -178,6 +184,17 @@ update_front(DrawablePtr draw, DRI2BufferPtr front) return TRUE; } +static void +ref_front(DrawablePtr draw, DRI2BufferPtr front, Bool doref) +{ + struct nouveau_dri2_buffer *nvbuf = nouveau_dri2_buffer(front); + + if (doref) + nvbuf-refcnt++; + else + nouveau_dri2_destroy_buffer(draw, front); +} + static Bool can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr src_pix) { @@ -445,25 +462,21 @@ nouveau_dri2_schedule_swap(ClientPtr client, DrawablePtr draw, if (*target_msc == 0) *target_msc = 1; -#if DRI2INFOREC_VERSION = 6 - /* Is this a swap in the future, ie. the vblank event will -* not be immediately dispatched, but only at a future vblank? -* If so, we need to temporarily lower the swaplimit to 1, so -* that DRI2GetBuffersWithFormat() requests from the client get -* deferred in the x-server until the vblank event has been -* dispatched to us and nouveau_dri2_finish_swap() is done. If -* we wouldn't do this, DRI2GetBuffersWithFormat() would operate -* on wrong (pre-swap) buffers, and cause a segfault later on in -* nouveau_dri2_finish_swap(). Our vblank event handler restores -* the old swaplimit immediately after nouveau_dri2_finish_swap() -* is done, so we still get 1 video refresh cycle worth of -* triple-buffering. For a swap at next vblank, dispatch of the -* vblank event happens immediately, so there isn't any need -* for this lowered swaplimit. + /* Increase refcount on frontbuffer to protect it against +* premature deletion by x-server as part of the +* DRI2GetBuffersWithFormat() path while the vblank event +* is still pending and nouveau_dri2_finish_swap() hasn't +* completed for this swap. +* +* XXX: This prevents server/ddx crash on deferred swaps, +* but doesn't fix the problem: We try to treat a setup +* which is only double-buffered as if it is truly triple- +* buffered. This causes flickering, so not useful in practice. +* +* Tinkering with the DRI2SwapLimit() is ugly but provides +* proper behaviour. */ - if (current_msc *target_msc - 1) - DRI2SwapLimit(draw, 1); -#endif + ref_front(draw, s-dst, true); /* Request a vblank event one frame before the target */ ret = nouveau_wait_vblank(draw, DRM_VBLANK_ABSOLUTE | @@ -577,12 +590,10 @@ nouveau_dri2_vblank_handler(int fd, unsigned int frame, switch (s-action) { case SWAP: nouveau_dri2_finish_swap(draw, frame, tv_sec, tv_usec, s); -#if DRI2INFOREC_VERSION = 6 - /* Restore real swap limit on drawable, now that it is safe. */ - ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum]; - DRI2SwapLimit(draw, NVPTR(scrn)-swap_limit); -#endif
Re: [Nouveau] [Patches][nouveau/ddx]: Improvements to bufferswap implementation and timestamping
On 02/29/2012 08:17 AM, Ben Skeggs wrote: On Thu, 2012-02-16 at 00:45 +0100, Mario Kleiner wrote: Hi, Hey Mario, What's your plan with this patchset? Do you intend on taking Michel's comments into account? CC'ing Francisco as he had some comments on IRC. I'd like to get this all sorted and merged so I can rebase the ported-to-new-libdrm ddx on top of it all and get that ready. Hi Ben, just sent out two updated patches wrt. to Michels feedback. [1/10] Replaces the original [1/9] patch. The code is the same, but the commit message is changed to be less misleading about the cause of the problem. See the e-mail i think i sent about a week ago. [10/10] Just for demonstration what i tried, not for application. The old [9/9] patch is still the best i can do at the moment. Following Michels proposal, with 10/10 i added refcounting to nouveau_dri2_buffer objects and keep a reference while the swap is still pending. This prevents deletion of the buffer by the x-server and thereby prevents the server crash. Problem: Now we have a display that doesn't crash, but flickers horribly while pageflipping with deferred swaps. I think it is because we still only have two buffers for double-buffering but pretend we're doing triple-buffering, but i could be totally wrong about the cause of the problem. The original patch [9/9] throttles the client in the special case where the pseudo-triplebuffering would result in flicker or crash, so it fixes the problem although in a rather hackish way. Not sure if this can be done cleanly without a real n-buffering implementation in the x-server? I think other than that there were no further comments and Francisco's original feedback from last year is already implemented in the patchset. More feedback etc. welcome, but i usually can only work on this part time on weekends if it needs further improvements before merging. Can the kms kernel patches be already merged? I think they are in a good shape now. Thanks, -mario Ben. here a set of patches against the nouveau-ddx. This is an extended and revised set, based on Francisco Jerez feedback from autumn last year. [1/9] Makes pageflipping work again on X-Server 1.12rc. It apparently stopped working somewhere around Xorg 1.11+. [2/9] Implements handling of pageflip completion events from the kernel. Francisco Jerez argument against including it was that the x-server didn't have a swaplimit api, so this couldn't be applied at the same time as the pseudo triple-buffering hack which is in place in the current ddx. Now we have the swaplimit api in 1.12, so this problem should be solved. [3/9] Makes use of the swaplimit api. A new xorg.conf option SwapLimit allows to select a swaplimit of 1 or 2 for double-buffering or triple-buffering. It defaults to 2 for Xorg 1.12+ and 1 for older servers. This way, on 1.12+ nouveau retains the kind of triple buffering behaviour it currently has, but swap completion timestamping (OML_sync_control, INTEL_swap_events, and client swap throttling) works conforming to the specs. On older servers it removes triple-buffering but makes nouveau conform to the specs. This is important for apps that need precise and reliable swap scheduling and timestamping. [4/9] A bug fix against corrupted desktop when switching from redirected to non-redirected fullscreen windows under a desktop compositor. Fixes FDO bug #35452. [5/9] Some fixes to swap scheduling, revised according to Francisco's review. [6/9] Fixes swap throttling for non-fullscreen windows under a desktop compositor. Split into a separate patch according to Francisco's feedback. [7/9] An attempt to provide more sane swap completion events for non- fullscreen windows. This is a bit of cheating, as it delivers the events for the earliest point in time one would expect the swap to complete, assuming only a lightly loaded gpu. Real completion could be later. I think this is an improvement because the current implementation delivers swap complete events with timestamps and counts that signal swap completion before the client even requested the swap. [8/9] This one adds Francisco's original triple-buffering hack back for Xorg 1.11 and earlier servers if the xorg.conf option requests a swaplimit 1. Users can choose between triple-buffering but broken timestamping or double-buffering with sane timestamping. [9/9] Fixes a corner case which could cause the ddx to segfault with its current triple-buffering implementation. I've tested these on single-display and dual-display setups, with/without compositor, with/without redirection and checked the robustness and precision of the timestamps with special measurement equipment, on XOrg 1.12-rc2 and 1.10. They work pretty well for me and finally make nouveau very useable for the kind of scientific applications that require precise swap scheduling and timestamping, so i'd love to see them reviewed and hopefully included into the ddx soon. A couple of things are a bit
Re: [PATCH 01/10] dri2: Fix can_exchange() to allow page-flipping on more mesa versions.
On Mar 1, 2012, at 9:37 PM, Daniel Stone wrote: Hi, On 1 March 2012 18:11, Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: can_exchange() returns false and thereby prevents page flipping on some drawables where page flipping would work fine. This due to non-matching drawable depths values between front buffer pixmap and back buffer pixmap, because front buffer pixmaps inherit the depth of the screen, typically 24 bits, whereas the depth value of back buffer pixmaps for a given RGB8 or RGBA8 visual depends on the mesa version in use, either 24 bits or 32 bits. Use bitsPerPixel instead of depth to decide if drawable is flippable. This will still catch really incompatible formats like 32 bpp vs. 16 bpp buffers. In theory, shouldn't this be a format-compatibility check, so you don't (unlikely though it is) attempt to flip to a BGR drawable on an RGB configuration, or a drawable on 0565? Cheers, Daniel Yes, i think that would be the best solution, in theory. But i don't think the info is easily available to the ddx? However, i didn't search very hard. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH] DRI2: Don't return junk reply instead of blocking in glXWaitForSbcOML()
DRI2WaitSBC() didn't block if requested targetSBC wasn't yet reached. Instead it returned a xreply with uninitialized junk return values, then blocked the connection until targetSBC was reached. Therefore the client didn't block, but continued with bogus return values from glXWaitForSbcOML. This patch fixes the problem by implementing DRI2WaitSBC similar to the clean and proven DRI2WaitMSC implementation. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- hw/xfree86/dri2/dri2.c| 10 -- hw/xfree86/dri2/dri2.h|3 +-- hw/xfree86/dri2/dri2ext.c | 16 +++- 3 files changed, 8 insertions(+), 21 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index 9ec4caa..23ed3e5 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -909,8 +909,7 @@ DRI2WaitMSC(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, } int -DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, CARD64 target_sbc, - CARD64 *ust, CARD64 *msc, CARD64 *sbc) +DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, CARD64 target_sbc) { DRI2DrawablePtr pPriv; @@ -924,14 +923,13 @@ DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, CARD64 target_sbc, if (target_sbc == 0) target_sbc = pPriv-swap_count + pPriv-swapsPending; -/* If current swap count already = target_sbc, +/* If current swap count already = target_sbc, reply and * return immediately with (ust, msc, sbc) triplet of * most recent completed swap. */ if (pPriv-swap_count = target_sbc) { -*sbc = pPriv-swap_count; -*msc = pPriv-last_swap_msc; -*ust = pPriv-last_swap_ust; +ProcDRI2WaitMSCReply(client, pPriv-last_swap_ust, + pPriv-last_swap_msc, pPriv-swap_count); return Success; } diff --git a/hw/xfree86/dri2/dri2.h b/hw/xfree86/dri2/dri2.h index f08ca29..fe0bf6c 100644 --- a/hw/xfree86/dri2/dri2.h +++ b/hw/xfree86/dri2/dri2.h @@ -265,8 +265,7 @@ extern _X_EXPORT int DRI2WaitMSC(ClientPtr client, DrawablePtr pDrawable, extern _X_EXPORT int ProcDRI2WaitMSCReply(ClientPtr client, CARD64 ust, CARD64 msc, CARD64 sbc); extern _X_EXPORT int DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, -CARD64 target_sbc, CARD64 *ust, CARD64 *msc, -CARD64 *sbc); +CARD64 target_sbc); extern _X_EXPORT Bool DRI2ThrottleClient(ClientPtr client, DrawablePtr pDraw); extern _X_EXPORT Bool DRI2CanFlip(DrawablePtr pDraw); diff --git a/hw/xfree86/dri2/dri2ext.c b/hw/xfree86/dri2/dri2ext.c index db04268..8016edb 100644 --- a/hw/xfree86/dri2/dri2ext.c +++ b/hw/xfree86/dri2/dri2ext.c @@ -520,9 +520,8 @@ static int ProcDRI2WaitSBC(ClientPtr client) { REQUEST(xDRI2WaitSBCReq); -xDRI2MSCReply rep; DrawablePtr pDrawable; -CARD64 target, ust, msc, sbc; +CARD64 target; int status; REQUEST_SIZE_MATCH(xDRI2WaitSBCReq); @@ -532,18 +531,9 @@ ProcDRI2WaitSBC(ClientPtr client) return status; target = vals_to_card64(stuff-target_sbc_lo, stuff-target_sbc_hi); -status = DRI2WaitSBC(client, pDrawable, target, ust, msc, sbc); -if (status != Success) - return status; +status = DRI2WaitSBC(client, pDrawable, target); -rep.type = X_Reply; -rep.length = 0; -rep.sequenceNumber = client-sequence; -load_msc_reply(rep, ust, msc, sbc); - -WriteToClient(client, sizeof(xDRI2MSCReply), rep); - -return Success; +return status; } static int -- 1.6.6 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH] DRI2/xserver: Don't hang in glXSwapBuffers if drawable moves between crtc's
Hi This patch fixes a hang in glXSwapBuffers if a user moves a drawable from a fast running crtc, e.g., 60 Hz to a slower running crtc, e.g., 50 Hz, when using the new DRI2 sync swap bits. It should fix Bugzilla bug #28383. https://bugs.freedesktop.org/show_bug.cgi?id=28383 I've tested this on a R600 against master. I assume the poster of that bug report will test it as well. He has tested a previous version of the patch. Jesse could you review it? Would be good if this could make it into master and 1.8.2 as well. thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH] DRI2/xserver: Don't hang in glXSwapBuffers if drawable moves between crtc's
Detect if a drawable has been moved from an original crtc to a new crtc with a lower current vblank count than the original crtc inbetween glXSwapBuffers() calls. Reinitialize drawable's last_swap_target before scheduling next swap if such a move has taken place. last_swap_target defines the baseline for scheduling the next swap. If a movement between crtc's is not taken into account, the swap may schedule for a vblank count on the new crtc far in the future, resulting in a apparent hang of the drawable for a long time. Fixes Bugzilla bug #28383. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- hw/xfree86/dri2/dri2.c | 15 +++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index d33b0d1..1f80022 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -756,6 +756,7 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, DRI2DrawablePtr pPriv; DRI2BufferPtr pDestBuffer = NULL, pSrcBuffer = NULL; int ret, i; +CARD64 ust, current_msc; pPriv = DRI2GetDrawable(pDraw); if (pPriv == NULL) { @@ -800,12 +801,26 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, * need to schedule a swap for the last swap target + the swap interval. */ if (target_msc == 0 divisor == 0 remainder == 0) { + /* If the current vblank count of the drawable's crtc is lower +* than the count stored in last_swap_target from a previous swap +* then reinitialize last_swap_target to the current crtc's msc, +* otherwise the swap will hang. This will happen if the drawable +* is moved to a crtc with a lower refresh rate, or a crtc that just +* got enabled. +*/ + if (!(*ds-GetMSC)(pDraw, ust, current_msc)) + pPriv-last_swap_target = 0; + + if (current_msc pPriv-last_swap_target) + pPriv-last_swap_target = current_msc; + /* * Swap target for this swap is last swap target + swap interval since * we have to account for the current swap count, interval, and the * number of pending swaps. */ *swap_target = pPriv-last_swap_target + pPriv-swap_interval; + } else { /* glXSwapBuffersMscOML could have a 0 target_msc, honor it */ *swap_target = target_msc; -- 1.6.6 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] DRI2/xserver: Don't hang in glXSwapBuffers if drawable moves between crtc's
Hi Keith and Peter, this patch has been reviewed by Jesse Barnes and now successfully tested by the guy who submitted the bug report and by me. Fixes Bugzilla bug #28383, a quite ugly hang of application windows bufferswapping, including crashes when moving them between different crtc's of different refresh rate. It would be good to apply it to 1.9 and 1.8.2 if possible, for a more enjoyable multi-display OpenGL experience. thanks, -mario On Jun 14, 2010, at 6:13 PM, Jesse Barnes wrote: On Sun, 13 Jun 2010 18:05:26 +0200 Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: Detect if a drawable has been moved from an original crtc to a new crtc with a lower current vblank count than the original crtc inbetween glXSwapBuffers() calls. Reinitialize drawable's last_swap_target before scheduling next swap if such a move has taken place. last_swap_target defines the baseline for scheduling the next swap. If a movement between crtc's is not taken into account, the swap may schedule for a vblank count on the new crtc far in the future, resulting in a apparent hang of the drawable for a long time. Fixes Bugzilla bug #28383. Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de --- hw/xfree86/dri2/dri2.c | 15 +++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index d33b0d1..1f80022 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -756,6 +756,7 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, DRI2DrawablePtr pPriv; DRI2BufferPtr pDestBuffer = NULL, pSrcBuffer = NULL; int ret, i; +CARD64 ust, current_msc; pPriv = DRI2GetDrawable(pDraw); if (pPriv == NULL) { @@ -800,12 +801,26 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 target_msc, * need to schedule a swap for the last swap target + the swap interval. */ if (target_msc == 0 divisor == 0 remainder == 0) { + /* If the current vblank count of the drawable's crtc is lower +* than the count stored in last_swap_target from a previous swap +* then reinitialize last_swap_target to the current crtc's msc, +* otherwise the swap will hang. This will happen if the drawable + * is moved to a crtc with a lower refresh rate, or a crtc that just +* got enabled. +*/ + if (!(*ds-GetMSC)(pDraw, ust, current_msc)) + pPriv-last_swap_target = 0; + + if (current_msc pPriv-last_swap_target) + pPriv-last_swap_target = current_msc; + /* * Swap target for this swap is last swap target + swap interval since * we have to account for the current swap count, interval, and the * number of pending swaps. */ *swap_target = pPriv-last_swap_target + pPriv-swap_interval; + } else { /* glXSwapBuffersMscOML could have a 0 target_msc, honor it */ *swap_target = target_msc; Yeah, this looks ok. Really we should write up GLX extension that clarifies behavior in multi-head and display on/off situations too. Is that something you could do? Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org -- Jesse Barnes, Intel Open Source Technology Center ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: IgnoreClient question
On Jun 29, 2010, at 12:27 AM, Jesse Barnes wrote: In GL code, we have two subsystems using IgnoreClient these days: GLX and DRI2. GLX uses it to suspend clients while the server is VT switched away (not sure why, maybe some drivers can't handle it?). DRI2 uses it to implement swap throttling and various GLX extensions and DRI2 protocol requests. Jesse, do we really need IgnoreClient() for throttling in DRI2? If i remember correctly, the DRI2 requests i've seen are all like this: 1. Some request (DRI2GetBuffersWithFormat() / DRI2SwapBuffers() / glXWaitForMscOML() / glXWaitForSBCOML(). 2. Request needs to wait for some event, either a scheduled vblank event to be delivered from the kernel, or a swap to complete. 3. Server responds to the client which was waiting for a xreply in response to completion of the request. The client's calling thread always has to wait for a reply from the server before it can continue doing anything, so that thread would block anyway in xlib, waiting for a xreply from the server at step 3. Because we only have one wait slot per drawable, we block the whole xdisplay connection via IgnoreClient() to prevent other threads in the same client from issuing such blocking requests to the same drawable - it would overflow our single wait slot. This implementation isn't safe if multiple display connections address a single drawable, e.g., 1 connection per client thread. It also doesn't allow to handle multipe or parallel outstanding requests, something that would be useful with some of the new DRI2 protocol requests. If we could replace that single wait slot with one wait list per drawable and protocol request (1 list for glXWaitForMscOML, 1 list for glXWaitForSbcOML and maybe 1 list for swap related throttling), then we probably wouldn't need to block the connection via IgnoreClient at all and could have multiple outstanding / parallel requests. What i assume but didn't check is that xlib doesn't have a problem with getting replies out of sequence, e.g., getting a reply to a request with seq nr 15 before another reply with nr 12 on the same connection? Just neccessary for the sequence number in the reply matching the corresponding sequence number from the request? xlib doesn't time out if a request doesn't quickly get a reply? If we could implement such wait lists then we could ignore IgnoreClient and solve the multi-threading / multiple drawable issues i believe exist. -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: IgnoreClient question
On Jun 29, 2010, at 2:24 AM, Keith Packard wrote: On Tue, 29 Jun 2010 02:18:31 +0200, Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: What i assume but didn't check is that xlib doesn't have a problem with getting replies out of sequence The X protocol is purely sequential -- you can't execute requests out of order (and queuing replies is part of the request execution). In particular, no X library would ever deal with replies being returned out of order; they count on the sequence number from the X server being monotonic. Ok, thanks. That makes live for multi-threaded clients a bit more difficult if they want to use the new extensions, because they would need to maintain multiple x connections, one for each thread, to circumvent this. I guess then xcb doesn't handle this either? Then i think the idea of multiple wait queues doesn't allow to get rid of IgnoreClient(), but we will still need those wait queues in addition to IgnoreClient() to handle multiple blocking requests for one drawable from different clients / x-connections correctly. The current code assumes exactly one client / x-connection per drawable. I think if we have multiple threads accessing the drawable over separate connections then the current IgnoreClient() method will not work and we will end up with undefined behaviour for some of the new dri2 protocol requests. E.g., what should work according to the spec is that i have multiple threads, each of them using its own dedicated x-connection to perform a glXWaitForSbcOML() or glXWaitForMscOML() or glXSwapBuffers()/ glXSwapBuffersMscOML() call for the same drawable and getting its request retired when the corresponding event happens. Currently random stuff would happen if i tried this. If i remember correctly the first thread making a request would be correctly handled and all other threads would hang forever - their requests would disappear into nirvana. Anyway, this is a separate issue from IgnoreClient(), so i'll discuss it separately with Jesse. Another question: If i have separate x-connections, e.g., one for each drawable and each connection has an opengl rendering context attached to the drawable, then i assume these context still can share state (display lists, shaders, texture objects, fbo's etc.), right? thanks, -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: IgnoreClient question
On Jul 2, 2010, at 11:43 PM, Jesse Barnes wrote: On Tue, 29 Jun 2010 04:07:48 +0200 Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: ... Then i think the idea of multiple wait queues doesn't allow to get rid of IgnoreClient(), but we will still need those wait queues in addition to IgnoreClient() to handle multiple blocking requests for one drawable from different clients / x-connections correctly. Right, we have lists in the server now too, so it shouldn't be too hard. Another question: If i have separate x-connections, e.g., one for each drawable and each connection has an opengl rendering context attached to the drawable, then i assume these context still can share state (display lists, shaders, texture objects, fbo's etc.), right? Anything associated with the drawable would be shared, but context specific information would still be separate per thread. Display lists can be shared explicitly, likewise for some of the other objects, so you should be able to get the sharing you want. Good. I will switch our client toolkit to a 1 x-connection per thread model. If a wait queue implementation in the dri2 sync swap bits hasn't magically shown up by the time i hit the limits of the current implementation, i'll give implementing it myself a try. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: glXSwapBuffers fix for moving between crtcs is not following the OML_sync_control specification
On Jul 9, 2010, at 3:40 PM, Michel Dänzer wrote: On Fre, 2010-07-09 at 16:20 +0300, Pauli Nieminen wrote: This commit assumes that MSC maybe different depending on where the drawable is. While specification says For a multi-monitor system, the monitor used to determine MSC is screen 0 of display. which IMO means that there should be single MSC everywhere were target drawable can move. So correct fix would be making MSC value in-depend of CRTC where drawable is. That commit only fixes the glXSwapBuffers() case, which is not defined in the OML_sync_control extension. I couldn't find any specification of how glXSwapBuffers should behave wrt. drawable on different crtc's, e.g., SGI_swap_control doesn't talk about such multi-monitor issues. In that sense the patch doesn't violate a specification, because the behaviour is unspecified. Currently we have proper behaviour while the drawable is on a given crtc - it obeys swap_interval. For the swap that happens between a move of the drawable from one crtc to the other, we have a small timing glitch: The swap_interval setting is ignored for that swap - it swaps on the next retrace, as if swap_interval is 1. But then a swap_interval of 1 is the default setting and probably what almost all standard client apps use anyway. We could do a bit better and avoid that single glitch by mapping msc count from the crtc where the drawable was to the crtc where the drawable is now. But that would need us to keep track of the previous crtc of a drawable and we'd need new DDX entry points to query the current msc of any given crtc and the current crtc of a given drawable from within the server. Currently this info is only available in the DDX itself. Something like (pseudo-code): if (current_crtc != previous_crtc) last_swap_target = last_swap_target + (msc(current_crtc) - msc (previous_crtc)); Not sure if avoiding this small glitch is worth the hassle of two new entry points? If you use the glXSwapBuffersMscOML() command or other OML_sync_control commands then the commit doesn't apply. Currently it is up to the client application to find out if a drawable moved between crtc's inbetween swaps and handle this in a app specific manner. From the perspective of a toolkit developer i like this do it yourself behaviour for glXSwapBuffersMscOML(). OML_sync_control provides all the neccessary tools for a client app to detect and handle such situations in a way that leaves the application in control. Apps that use OML_sync_control will likely care very much about very precise visual timing. Some automagic corrections like the above for glXSwapBuffers() could spoil the day of someone (e.g., me ;) ) that needs very precise timing control and uses OML_sync_control for that purpose. I agree there *should* be such a single MSC, the question is how it should be calculated when the CRTCs don't all have the same vertical refresh rate. I guess OML_sync_control was specified at a time when systems with multiple-displays for one screen didn't exist, at least not ones with different refresh rates for the same screen. If you think about how applications would use the extension to precisely schedule swaps, i think the assumption is that the drawable will usually not move between crtc's or that all crtc's work perfectly synchronized with the same refresh rate, e.g., for multi-display virtual reality applications. For the old intel DRI1 swap scheduling hack, I solved this by making the MSC not correspond to any specific CRTC counter directly but making it a 'virtual' counter which increases at the same rate as the CRTC the window is currently being synchronized to. Looking at the GLX_OML_sync_control spec again, I think this solution should still be feasible in general, as all the extension calls take a drawable parameter. Hmm. I'm not sure if i as a toolkit developer would like that kind of virtualization. What i do is i have a given target system time at which the swap should happen - as close to that time as possible. I use the msc count and the associated ust timestamp together with the known video refresh rate and my given target time to translate the target time into a target_msc. I use these counts and timestamps to synchronize with other modalities like sound, various i/o etc. and often need sub-millisecond accurate timing. Sometimes i need to synchronize swaps across multiple displays. If such a virtualized counter wouldn't be very accurate or if the associated ust timestamps wouldn't be very accurate, i'd be in trouble and the OML_sync_control extension would lose most of its value to me. * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www
Re: glXSwapBuffers fix for moving between crtcs is not following the OML_sync_control specification
On Jul 9, 2010, at 3:20 PM, Pauli Nieminen wrote: But actual patch has two problems. First, GetMSC is not NULL checked before calling it. This causes crash with driver that is using flipping without OML_sync_control. Second, same freeze would happen I avoided a ds-GetMSC() NULL check because that code-path wouldn't be taken if OML_sync_control support in the DDX is missing. If you look at the top of DRI2SwapBuffers() there is a fallback path that is taken if those entry points are null. for application that is using OML_sync_control if it was moved between areas where is different MSC. In my option actual patch should be reverted because it is not correct fix for the bug. See my other response. thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: glXSwapBuffers fix for moving between crtcs is not following the OML_sync_control specification
On Jul 9, 2010, at 6:15 PM, Michel Dänzer wrote: [...] For the old intel DRI1 swap scheduling hack, I solved this by making the MSC not correspond to any specific CRTC counter directly but making it a 'virtual' counter which increases at the same rate as the CRTC the window is currently being synchronized to. Looking at the GLX_OML_sync_control spec again, I think this solution should still be feasible in general, as all the extension calls take a drawable parameter. Hmm. I'm not sure if i as a toolkit developer would like that kind of virtualization. What i do is i have a given target system time at which the swap should happen - as close to that time as possible. I use the msc count and the associated ust timestamp together with the known video refresh rate and my given target time to translate the target time into a target_msc. I use these counts and timestamps to synchronize with other modalities like sound, various i/o etc. and often need sub-millisecond accurate timing. Sometimes i need to synchronize swaps across multiple displays. If such a virtualized counter wouldn't be very accurate or if the associated ust timestamps wouldn't be very accurate, i'd be in trouble and the OML_sync_control extension would lose most of its value to me. The accuracy should be the same, and sophisticated apps / toolkits such as yours can still do the same things when the window moves between CRTCs. However, I suspect most apps / toolkits won't be as sophisticated, and would be more likely to work sensibly. I can see that maintaining a per-drawable virtual msc counter could work, although you'd need to do very careful forward and backward mappings from virtual - real crtc in various places, e.g., translate to true vblank counts for scheduling vblank events in the kernel and then translating back at swap completion for glXWaitForSbcOML and the new INTEL_swap_events extension, taking possible movements of the drawable between scheduling the swap and completing the swap into account. I think the asynchronous nature of the new dri2 sync swap bits can cause a couple of new headaches here. One problem i can see is if you have multiple drawables synced to/on the same crtc and you want to schedule them to swap in sync or at specific msc offsets to each other. As long as all msc values refer to the same real msc of the crtc, everything is fine. If each drawable has its own virtualized msc, e.g., initialized to zero at drawable creation time and then monotonically incrementing, you have different reference points - msc counts are no longer comparable and meaningful between drawables. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Bug: Deadlock for multi-threaded glx apps inside __glXInitialize()
Hi Kristian Testing with current mesa master, my toolkit deadlocks on the first call to a glX function (glXChooseVisual()). The deadlock was probably introduced by your recent commit: glx: Use _Xglobal_lock for protecting extension display list ab434f6b7641a64d30725a9ac24929240362d466 The problem is that the _Xglobal_lock is locked twice inside the __glXInitialize() function of mesa/src/glx/glxext.c, once inside __glXInitialize(), and then as part of dri2CreateDisplay() - ... - XextFindDisplay. The 2nd locking call on the already held lock deadlocks. Attached a backtrace with the problem and a patch/hack that fixes it for me, but introduces a race-condition itself, so this is obviously not the correct solution. The race condition would trigger if two threads would simultaneously do their first call to a glX function for the same Display* dpy handle, rather unlikely to happen in practice? If so, then the patch might be an acceptable fix until a better solution is found? Other applications (glxgears, games etc.) work correctly. The difference is that my toolkit calls XInitThreads() at startup to use xlib multi-threaded and afaik the locking calls only get enabled if xlib is switched to thread-safe mode, otherwise they are no-ops? thanks, -mario 0001-mesa-glx-Work-around-deadlock-on-_Xglobal_lock-in-__.patch Description: Binary data #0 0x0012d422 in __kernel_vsyscall () #1 0x031ccaf9 in __lll_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0 #2 0x031c813b in _L_lock_748 () from /lib/tls/i686/cmov/libpthread.so.0 #3 0x031c7f61 in pthread_mutex_lock () from /lib/tls/i686/cmov/libpthread.so.0 #4 0x0310fba6 in pthread_mutex_lock () from /lib/tls/i686/cmov/libc.so.6 #5 0x0292f41f in _XLockMutex (lip=0x2a25e34) at ../../src/locking.c:107 #6 0x034729af in XextFindDisplay (extinfo=0x0, dpy=0x84debb0) at ../../src/extutil.c:232 #7 0x0344faff in DRI2FindDisplay (dpy=0x84debf0) at dri2.c:81 #8 0x0344fd6e in DRI2QueryExtension (dpy=0x84debf0, eventBase=0xbfffd9cc, errorBase=0xbfffd9c8) at dri2.c:184 #9 0x0344e653 in dri2CreateDisplay (dpy=0x84debf0) at dri2_glx.c:900 #10 0x0342a6ce in __glXInitialize (dpy=0x84debf0) at glxext.c:848 #11 0x0342764a in GetGLXPrivScreenConfig (dpy=0x84debf0, scrn=0, ppriv=0xbfffdbdc, ppsc=0xbfffdbd8) at glxcmds.c:183 #12 0x03427d46 in glXChooseVisual (dpy=0x84debf0, screen=0, attribList=0xbfffdcd8) at glxcmds.c:1495 #13 0x032dae3f in PsychOSOpenOnscreenWindow (screenSettings=0xbfffe114, windowRecord=0x8501010, numBuffers=2, stereomode=0, conserveVRAM=0) at Linux/Screen/PsychWindowGlue.c:288 #14 0x032ee883 in PsychOpenOnscreenWindow (screenSettings=0xbfffe114, windowRecord=0xbfffe110, numBuffers=2, stereomode=0, rect=0xbfffe1b0, multiSample=0, sharedContextWindow=0x0) at Common/Screen/PsychWindowSupport.c:295 #15 0x0330e57c in SCREENOpenWindow () at Common/Screen/SCREENOpenWindow.c:301 #16 0x0334b79b in mexFunction (nlhs=1, plhs=0xb5ee7018, nrhs=4, prhs=0xb5ee7008) at Common/Base/PsychScriptingGlue.cc:959 #17 0x004e0f25 in call_mex(bool, void*, octave_value_list const, int, octave_mex_function*) () from /usr/lib/octave-3.2.3/liboctinterp.so #18 0x006f8868 in octave_mex_function::do_multi_index_op(int, octave_value_list const) () from /usr/lib/octave-3.2.3/liboctinterp.so * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [RFC] DRI2 synchronization and swap bits
Hello everybody My name is Mario Kleiner and i'm new to this list, so i apologize beforehand should i violate some rules of netiquette, state the totally obvious, or if this post is somehow considered off-topic or way too long. Please tell me if so, and how to do better next time. First some background to why i am posting, then some proposals more to the point of this RFC. I read this RFC and i'm very excited about the prospect of having well working support for the OML_sync_control extension in DRI2 on Linux/X11. I was hoping for this to happen since years, so a big thank you in advance! This is why i hope to provide some input from the perspective of future power-users of functions like glXGetSyncValuesOML(), glXSwapBuffersMscOML(), glXWaitForSbcOML. I'm the co-developer of a popular free-software toolkit (Psychtoolbox) that is used mostly in the neuroscience / cognitive science community by scientist to find out how the different senses (visual, auditory, haptic, ...) work and how they work together. Our requirements to graphics are often much more demanding than what a videogame, typical vr-environment or a mediaplayer has. Our users often have very strict requirements for scheduling frame- accurate and tear-free visual stimulus display, synchronizing bufferswaps across display-heads, and low-latency returns from swap- completion. Often they need swap-completion timestamps which are available with the shortest possible delay after a successfull swap and accurately tied to the vblank at which scanout of a swapped frame started. The need for timestamps with sub-millisecond accuracy is not uncommon. Therefore, well working OML_sync_control support would be basically a dream come true and a very compelling feature for Linux as a platform for cognitive science. I spent the last 12 hours reading the CompositeSwap page at the DRI- Wiki and through Jesse Barnes git-tree and the drivers/gpu/drm/ drm_irq.c file in the linux-next git-tree at kernel org, which i assume (correctly?) is the current state of art wrt. to the DRM, and have some thoughts or wishes. 1. Wrt to 2) DRI2WaitMSC/SBC a) Concern about blocking the client on the server side as opposed to a client side wait. I'm not sure about the extra latency involved by blocking the client on the server side, instead of a client side wait, but i can assure you that for our applications, 1 millisecond extra delay between swap- completion and unblocking can make a significant difference. Quite often certain actions need to be triggered in sync with swap completion. Examples are starting recording equipment for brain activity (fMRI, EEG, MEG, eye-trackers) or other physiological responses, starting sound playback or recording, sending trigger packets over a network, driving special digital/analog I/O boards, driving motion simulators etc. So low-latency unblocking would be much appreciated from our side. 2. On the CompositePage in the DRM Wiki, there is this comment: ...It seems that composited apps should never need to know about real world screen vblank issues, ... When dealing with a redirected window it seems it would be acceptable to come up with an entirely fake number for all existing extensions that care about vblanks.. I don't like this idea about entirely fake numbers and like to vote for a solution that is as close as possible to the non-redirected case. Most of our applications run in non-redirected, full-screen, undecorated, page-flipped windows, ie., without a compositor being involved. I can think of a couple future usage cases though where reasonably well working redirected/composited windows would be very useful for us, but only if we get meaningful timestamps and vblank counts that are tied to the actual display onset. 3. The Wiki also mentions The direct rendered cases outlined in the implementation notes above are complete, but there's a bug in the async glXSwapBuffers that sometimes causes clients to hang after swapping rather than continue. Looking through the code of http:// cgit.freedesktop.org/~jbarnes/xf86-video-intel/tree/src/i830_dri.c? id=a0e2e624c47516273fa3d260b86d8c293e2519e4 i can see that in I830DRI2SetupSwap() and I830DRI2SetupWaitMSC(), in the if (divisor == 0) { ...} path, the functions return after DRM_VBLANK_EVENT submission without assigning *event_frame = vbl.reply.sequence; This looks problematic to me, as the xserver is later submitting event_frame in the call to DRI2AddFrameEvent() inside DRI2SwapBuffers () as a cookie to find the right events for clients to wait on? Could this be a reason for clients hanging after swap? I found a few other spots where i other misunderstood something or there are small bugs. What is the appropriate way to report these? 4. According to spec, the different OML_sync_control functions do return a UST timestamp which is supposed to reflect the exact
Re: [RFC] DRI2 synchronization and swap bits
this beamposition timestamping in userspace to get rid of most of the scheduling/wakeup delay and jitter after swap completion on Windows and MacOS/X. Both systems provide a function to query the current scanout position. It is so far the only method that allows us to achieve sub-millisecond precision for our timestamps. It also allows us to provide a timestamp for vbl onset and a timestamp for start of scanout of a new frame. If Linux could do this within its timestamping code by default, it would be even more accurate. Our userspace code can get preempted at the wrong moment, but kernel code inside the irq handler should be more robust. Also it would allow a more precise implementation of the UST timestamp according to the OML_sync_control spec, as that spec asks for UST being the time of start of scanout of the first scanline of a new video frame, instead of start of vblank. Let me know what you think about this, -mario * Mario Kleiner Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tuebingen Germany e-mail: mario.klei...@tuebingen.mpg.de office: +49 (0)7071/601-1623 fax:+49 (0)7071/601-616 www:http://www.kyb.tuebingen.mpg.de/~kleinerm * For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. (Richard Feynman) ___ xorg-devel mailing list xorg-devel@lists.x.org http://lists.x.org/mailman/listinfo/xorg-devel
DRI3/Present fixes for XServers 1.16 and 1.17rc
Hi, i finally figured out why dri3/present failed so miserably with my neuro-science application, which relies on OML_sync_control and INTEL_swap_events extensively and worked nicely under dri2 for years. I found so far two bugs in the x-server and 3 bugs in mesas dri/present backend. I'll send out the mesa patches with fixes separately. The following two patches fix the x-server part for me. First one only caused pain during debugging - a bug in the DebugPresent macros, causing server crash on logout, closing of unredirected fullscreen windows, or toggling composition on/off whenever the debug macros were compiled in. The second patch fixes nouveau's massive tearing and various OML_sync_control related failures caused by lack of vsynced pageflips. The problem doesn't happen on intel, at least with the tested ddx versions, because intel-ddx (as of 2.99.914) never reports AsyncFlipCapability on my Intel HD Ironlake and Intel HD 4000 IvyBridge, so possibly a bug in the intel driver which cancels out with the server bug? These patches were tested to apply cleanly on master and on top of the 1.16.2 branch. They were tested on Intel HD, Intel HD-4000 and with nouveau on a NVidia gpu on top of 1.16.2. Could these be applied to 1.16 and later? thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 1/2] present: Avoid crashes in DebugPresent(), a bit more info.
DebugPresent() crashed the server when a dri3 drawable was closed while a pageflipped present was still pending, due to vblank-window- Null-Ptr deref, so debug builds caused new problems to debug. E.g., glXSwapBuffers(...); glXDestroyWindow(...); - Pageflip for non-existent window completes - boom. Also often happens when switching desktop compositor on/off due to Present unflips, or when logging out of session. Also add info if a Present is queued for copyswap or pageflip, if the present is vsynced, and the serial no of the Present request, to aid debugging of pageflip and vsync issues. The serial number is useful as Mesa's dri3/present backend encodes its sendSBC in the serial number, so one can easily correlate server debug output with Mesa and with the SBC values returned to actual OpenGL client applications via OML_sync_control and INTEL_swap_events extension, makes debugging quite a bit more easy. Please also cherry-pick this for a 1.16.x stable update. Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/present/present.c b/present/present.c index cf283f4..d84bdef 100644 --- a/present/present.c +++ b/present/present.c @@ -435,7 +435,7 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t ust, uint64_t crtc_msc) DebugPresent((\tn %lld %p %8lld: %08lx - %08lx\n, vblank-event_id, vblank, vblank-target_msc, vblank-pixmap ? vblank-pixmap-drawable.id : 0, - vblank-window-drawable.id)); + vblank-window ? vblank-window-drawable.id : 0)); assert (vblank == screen_priv-flip_pending); @@ -851,10 +851,10 @@ present_pixmap(WindowPtr window, } if (pixmap) -DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p)\n, +DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p) flip %d vsync %d serial %d\n, vblank-event_id, vblank, target_msc, vblank-pixmap-drawable.id, vblank-window-drawable.id, - target_crtc)); + target_crtc, vblank-flip, vblank-sync_flip, vblank-serial)); xorg_list_add(vblank-event_queue, present_exec_queue); vblank-queued = TRUE; @@ -946,7 +946,7 @@ present_vblank_destroy(present_vblank_ptr vblank) DebugPresent((\td %lld %p %8lld: %08lx - %08lx\n, vblank-event_id, vblank, vblank-target_msc, vblank-pixmap ? vblank-pixmap-drawable.id : 0, - vblank-window-drawable.id)); + vblank-window ? vblank-window-drawable.id : 0)); /* Drop pixmap reference */ if (vblank-pixmap) -- 2.1.0 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync.
Pageflips for Pixmap presents were not synchronized to vblank by default, as they should be, due to some missing init for vblank-sync_flips. The PresentOptionAsync flag was completely ignored for controlling this. Vsynced flips only worked by accident on the intel-ddx, as that driver doesn't report PresentCapabilityAsync support (tested on Intel HD Ironlake and Intel HD 4000 IvyBridge + Linux 3.17). This lack might be a bug in the intel-ddx itself? On nouveau-ddx, which properly reports PresentCapabilityAsync, this always caused non-vsynced pageflips and pretty ugly tearing. This patch fixes the problem, as tested on top of XOrg 1.16.2 on nouveau and intel. Please also apply to XOrg 1.16 stable. Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/present/present.c b/present/present.c index d84bdef..f24a218 100644 --- a/present/present.c +++ b/present/present.c @@ -737,6 +737,10 @@ present_pixmap(WindowPtr window, */ window_priv-msc = crtc_msc; +/* Clear async request if driver can't support it for pixmap present */ +if (pixmap (!screen_priv-info || !(screen_priv-info-capabilities PresentCapabilityAsync))) +options = ~PresentOptionAsync; + /* Adjust target_msc to match modulus */ if (crtc_msc = target_msc) { @@ -828,9 +832,10 @@ present_pixmap(WindowPtr window, vblank-msc_offset = window_priv-msc_offset; vblank-notifies = notifies; vblank-num_notifies = num_notifies; +vblank-sync_flip = TRUE; -if (!screen_priv-info || !(screen_priv-info-capabilities PresentCapabilityAsync)) -vblank-sync_flip = TRUE; +if (options PresentOptionAsync) +vblank-sync_flip = FALSE; if (pixmap present_check_flip (target_crtc, window, pixmap, vblank-sync_flip, valid, x_off, y_off)) { vblank-flip = TRUE; -- 2.1.0 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] dri2: add initial prime support. (v1.2)
On 03.07.2012, at 16:22, Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This adds the initial prime support for dri2 offload. The main thing is when we get a connection from a prime client, we stored the information and mark all drawables from that client as prime. We then create all buffers for that drawable on the prime device dri2screen. Then DRI2UpdatePrime is provided which drivers can call to get a shared pixmap which they can use as the front buffer. The driver is then responsible for doing the back-front copy to the shared buffer. prime requires a compositing manager be run, but it handles the case where a window get un-redirected by allocating a new pixmap and pointing the crtc at it while the client is in that state. Currently prime can't handle pageflipping, so always does straight copy swap, v1.1: renumber on top of master. v1.2: fix auth on top of master. Signed-off-by: Dave Airlie airl...@redhat.com --- glx/glxdri2.c |2 +- hw/xfree86/dri2/dri2.c| 265 + hw/xfree86/dri2/dri2.h| 25 - hw/xfree86/dri2/dri2ext.c |4 +- 4 files changed, 267 insertions(+), 29 deletions(-) diff --git a/glx/glxdri2.c b/glx/glxdri2.c index 7b76c3a..984f115 100644 --- a/glx/glxdri2.c +++ b/glx/glxdri2.c @@ -840,7 +840,7 @@ __glXDRIscreenProbe(ScreenPtr pScreen) return NULL; if (!xf86LoaderCheckSymbol(DRI2Connect) || -!DRI2Connect(pScreen, DRI2DriverDRI, +!DRI2Connect(serverClient, pScreen, DRI2DriverDRI, screen-fd, driverName, deviceName)) { LogMessage(X_INFO, AIGLX: Screen %d is not DRI2 capable\n, pScreen-myNum); diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c index d3b3c73..0f87820 100644 --- a/hw/xfree86/dri2/dri2.c +++ b/hw/xfree86/dri2/dri2.c @@ -45,7 +45,7 @@ #include dixstruct.h #include dri2.h #include xf86VGAarbiter.h - +#include damage.h #include xf86.h CARD8 dri2_major; /* version of DRI2 supported by DDX */ @@ -63,6 +63,17 @@ static DevPrivateKeyRec dri2PixmapPrivateKeyRec; #define dri2PixmapPrivateKey (dri2PixmapPrivateKeyRec) +static DevPrivateKeyRec dri2ClientPrivateKeyRec; + +#define dri2ClientPrivateKey (dri2ClientPrivateKeyRec) + +#define dri2ClientPrivate(_pClient) (dixLookupPrivate((_pClient)-devPrivates, \ + dri2ClientPrivateKey)) + +typedef struct _DRI2Client { +int prime_id; +} DRI2ClientRec, *DRI2ClientPtr; + static RESTYPE dri2DrawableRes; typedef struct _DRI2Screen *DRI2ScreenPtr; @@ -87,6 +98,9 @@ typedef struct _DRI2Drawable { int swap_limit; /* for N-buffering */ unsigned long serialNumber; Bool needInvalidate; +int prime_id; +PixmapPtr prime_slave_pixmap; +PixmapPtr redirectpixmap; } DRI2DrawableRec, *DRI2DrawablePtr; typedef struct _DRI2Screen { @@ -113,14 +127,47 @@ typedef struct _DRI2Screen { HandleExposuresProcPtr HandleExposures; ConfigNotifyProcPtr ConfigNotify; +DRI2CreateBuffer2ProcPtr CreateBuffer2; +DRI2DestroyBuffer2ProcPtr DestroyBuffer2; +DRI2CopyRegion2ProcPtr CopyRegion2; } DRI2ScreenRec; +static void +destroy_buffer(DrawablePtr pDraw, DRI2BufferPtr buffer, int prime_id); + static DRI2ScreenPtr DRI2GetScreen(ScreenPtr pScreen) { return dixLookupPrivate(pScreen-devPrivates, dri2ScreenPrivateKey); } +static ScreenPtr +GetScreenPrime(ScreenPtr master, int prime_id) +{ +ScreenPtr slave; +int i; + +if (prime_id == 0 || xorg_list_is_empty(master-offload_slave_list)) { +return master; +} +i = 0; +xorg_list_for_each_entry(slave, master-offload_slave_list, offload_head) { +if (i == (prime_id - 1)) +break; +i++; +} +if (!slave) +return master; +return slave; +} + +static DRI2ScreenPtr +DRI2GetScreenPrime(ScreenPtr master, int prime_id) +{ +ScreenPtr slave = GetScreenPrime(master, prime_id); +return DRI2GetScreen(slave); +} + static DRI2DrawablePtr DRI2GetDrawable(DrawablePtr pDraw) { @@ -187,7 +234,8 @@ DRI2AllocateDrawable(DrawablePtr pDraw) xorg_list_init(pPriv-reference_list); pPriv-serialNumber = DRI2DrawableSerial(pDraw); pPriv-needInvalidate = FALSE; - +pPriv-redirectpixmap = NULL; +pPriv-prime_slave_pixmap = NULL; if (pDraw-type == DRAWABLE_WINDOW) { pWin = (WindowPtr) pDraw; dixSetPrivate(pWin-devPrivates, dri2WindowPrivateKey, pPriv); @@ -286,6 +334,7 @@ DRI2CreateDrawable(ClientPtr client, DrawablePtr pDraw, XID id, DRI2InvalidateProcPtr invalidate, void *priv) { DRI2DrawablePtr pPriv; +DRI2ClientPtr dri2_client = dri2ClientPrivate(client); XID dri2_id; int rc; @@ -295,6 +344,8 @@ DRI2CreateDrawable(ClientPtr client,
Re: [PATCH] dri2: add initial prime support. (v1.2)
On 03.07.2012, at 20:47, Dave Airlie wrote: On Tue, Jul 3, 2012 at 7:39 PM, Mario Kleiner mario.klei...@tuebingen.mpg.de wrote: On 03.07.2012, at 16:22, Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This adds the initial prime support for dri2 offload. The main thing is when we get a connection from a prime client, we stored the information and mark all drawables from that client as prime. We then create all buffers for that drawable on the prime device dri2screen. ... +DrawablePtr DRI2UpdatePrime(DrawablePtr pDraw, DRI2BufferPtr pDest) +{ +DRI2DrawablePtr pPriv = DRI2GetDrawable(pDraw); +PixmapPtr spix; +PixmapPtr mpix = GetDrawablePixmap(pDraw); +ScreenPtr master, slave; +Bool ret; + +master = mpix-drawable.pScreen; + +if (pDraw-type == DRAWABLE_WINDOW) { + WindowPtr pWin = (WindowPtr)pDraw; + PixmapPtr pPixmap = pDraw-pScreen-GetWindowPixmap(pWin); + + if (pDraw-pScreen-GetScreenPixmap(pDraw-pScreen) == pPixmap) { + if (pPriv-redirectpixmap + pPriv-redirectpixmap-drawable.width == pDraw-width + pPriv-redirectpixmap-drawable.height == pDraw-height + pPriv-redirectpixmap-drawable.depth == pDraw-depth) { + mpix = pPriv-redirectpixmap; + } else { +if (master-ReplaceScanoutPixmap) { +mpix = (*master-CreatePixmap)(master, pDraw-width, pDraw-height, + pDraw-depth, CREATE_PIXMAP_USAGE_SHARED); +if (!mpix) +return NULL; + +ret = (*master-ReplaceScanoutPixmap)(pDraw, mpix, TRUE); +if (ret == FALSE) { +(*master-DestroyPixmap)(mpix); +return NULL; +} The following bit ok? I probably just don't understand the code well enough, but is a ... if (pPriv-redirectpixmap) (*master-DestroyPixmap)(pPriv-redirectpixmap); … missing here before the assignment of the new mpix? +pPriv-redirectpixmap = mpix; +} else +return NULL; + } + } else if (pPriv-redirectpixmap) { +(*master-ReplaceScanoutPixmap)(pDraw, pPriv-redirectpixmap, FALSE); + (*master-DestroyPixmap)(pPriv-redirectpixmap); + pPriv-redirectpixmap = NULL; + } +} + … If you define a new CopyRegion2 hook, it would be good to now also pass along the pPriv- swap_interval parameter, or a bool vsync = pPriv-swap_interval ? true : false flag. That would allow the ddx CopyRegion2() to skip sync to vblank during copy if swap_interval is zero. Other OS'es and the NVidia/AMD binary blobs on Linux treat a zero swap interval as don't sync to retrace. DRI2SwapBuffers() currently skips all swap scheduling on zero swap interval and uses CopyRegion(), but CopyRegion doesn't know it is a zero swap interval/no-vsync swap, so it syncs anyway. All ddx'es have a SwapBuffersWait xorg.conf option to disable this behaviour, but i think that was added as a workaround, not really the amount of control that applications expect. I could do that but I really would prefer we do things in iterations and not confuse the interfaces, so add a CopyRegion3 with this sort of thing. Ok. Also Chris Wilson just posted an updated patch for an AsyncSwap hook, which would solve the same problem, depending on what gets in, in what order. The thing is I've not really considered timing because its really messy with two GPUs, current offload design is offload GPU calls CopyRegion2 which uses the hw copy engine to copy from the VRAM to GTT, the GTT object is shared with the other GPU and is the backing for the redirected window pixmap. So when the first driver copies to the second, it causes damage on the backing pixmap which gets sent to compositor which copies from it. Doing sync and flipping is a lot messier, since you probably want to sync everything on the displaying GPU. Thanks for the explanation. Posting damage at the current place could also cause composition of partially copied frames, right? Just a question to see if i understand the damage tracking correctly. If so, how does a compositor in the redirected case know when the offload slave has finished its blit to the redirectPixmap? How does the offload slave sync to vblank of the masters crtc in the unredirected case when its pPriv-redirectpixmap is set as a crtc's scanout pixmap? Again sync to vblank is definitely part 2, we've been looking at inter-gpu semaphores (by we I mean mlankhorst), but I'd appreciate any input you can give since you know this stuff fairly well ;-). Not so sure how well i know this stuff, but i totally agreed to the definitely part 2 thing :). I'm happy if your current patches get in soonish. I think i will just shut up and wait
Re: Initial DRI3000 protocol specs available
On 02/20/2013 09:27 PM, Keith Packard wrote: Chris Wilson ch...@chris-wilson.co.uk writes: What I don't see here is how the client instructs the server to handle a missed swap. Right, this first pass was just trying to replicate the DRI2 semantics; figuring out how to improve those seems like a good idea. From what game developers have told us, a missed swap should just tear instead of dropping a frame. It might be nice to inform the client that they're not keeping up with the target frame rate and let them scale stuff back; I'd suggest the SwapComplete event could contain enough information to let them know what actually happened. Please make this configurable. Tearing makes sense for a game, but for the kind of scientific apps i do, we don't want it to tear ever, or bad things would happen for us. We need it to just flip the frame delayed but vsync'ed and then the app can figure out via the INTEL_swap_events or glXWaitForSbcOML() that a deadline was missed and what to do to catch up. There's http://www.opengl.org/registry/specs/EXT/glx_swap_control_tear.txt that allows apps to define if they want to tear or vsync on a missed swap deadline. For example, with the typical use of swap-interval 0, divisor = 0, target = current we can choose to either emit this SwapRegion synchronously, or asynchronously (to risk tearing but allow the client catch up to its target framerate). I haven't heard anyone asking for us to skip a frame in this case to avoid tearing. See above :-). ... In the reply, swap_hi/swap_lo form a 64-bit swap count value when the swap will actually occur (e.g. the last queued swap count + (pending swap count * swap interval)). t I'm not sure exactly what SBC is meant to be. Is it a simple seqno of the SwapRegion in this Drawable's swap queue (why then does swap_interval matter), or is it meant to correlate with the vblank counter (in which case it is merely a predicted value)? Just like DRI2, it's the planned swap time. Don't be late! SBC in DRI2 is the running count of completed swaps for a drawable, ie., current swap count + pending swap count, not the planned swap time. Essentially a reference to a just queued swap via sbc = glXSwapBuffersMscOML(...), so you can use sbc as a unique id for that swap in glXWaitForSBCOML() or to match it up with the sbc in a returned INTEL_swap_event. Very useful, so i guess this should stay backwards-compatible. thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.
Restores proper immediate tearing swap behaviour for OpenGL bufferswap under DRI3/Present. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/glx/dri3_glx.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index 5796491..c53be1b 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, xcb_connection_t *c = XGetXCBConnection(dpy); struct dri3_buffer *back; int64_t ret = 0; + uint32_t options = XCB_PRESENT_OPTION_NONE; unsigned flags = __DRI2_FLUSH_DRAWABLE; if (flush) @@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, if (target_msc == 0) target_msc = priv-msc + priv-swap_interval * (priv-send_sbc - priv-recv_sbc); + if (priv-swap_interval == 0) + options |= XCB_PRESENT_OPTION_ASYNC; + back-busy = 1; back-last_swap = priv-send_sbc; xcb_present_pixmap(c, @@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, None, /* target_crtc */ None, back-sync_fence, - XCB_PRESENT_OPTION_NONE, + options, target_msc, divisor, remainder, 0, NULL); -- 1.9.1 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 2/3] glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc
Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML() from overwriting the (ust,msc) values of the last successfull swapbuffers call (PresentPixmapCompleteNotify event), as glXWaitForSbcOML() relies on those values corresponding to the most recent completed swap, not to whatever was last returned from the server. Problematic call sequence without this patch would have been, e.g., glXSwapBuffers() ... wait ... swap completes - PresentPixmapComplete event - (ust,msc) updated to reflect swap completion time and count. ... wait for at least 1 video refresh cycle/vblank increment. glXGetSyncValuesOML() - PresentNotifyMsc event overwrites (ust,msc) of swap completion with (ust,msc) of most recent vblank glXWaitForSbcOML() - Returns sbc of last completed swap but (ust,msc) of last completed vblank, not of last completed swap. - Client is confused. Do this by tracking a separate set of (ust, msc) for the dri3_wait_for_msc() call than for the dri3_wait_for_sbc() call. This makes the glXWaitForSbcOML() call robust again and restores consistent behaviour with the DRI2 implementation. Fixes applications originally written and tested against DRI2 which also rely on this not regressing under DRI3/Present, e.g., Neuro-Science software like Psychtoolbox-3. This patch fixes the problem. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/glx/dri3_glx.c | 11 +++ src/glx/dri3_priv.h | 5 - 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index b4ac278..5796491 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -420,11 +420,14 @@ dri3_handle_present_event(struct dri3_drawable *priv, xcb_present_generic_event_ if (psc-show_fps_interval) show_fps(priv, ce-ust); + + priv-ust = ce-ust; + priv-msc = ce-msc; } else { priv-recv_msc_serial = ce-serial; + priv-vblank_ust = ce-ust; + priv-vblank_msc = ce-msc; } - priv-ust = ce-ust; - priv-msc = ce-msc; break; } case XCB_PRESENT_EVENT_IDLE_NOTIFY: { @@ -498,8 +501,8 @@ dri3_wait_for_msc(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, } } - *ust = priv-ust; - *msc = priv-msc; + *ust = priv-vblank_ust; + *msc = priv-vblank_msc; *sbc = priv-recv_sbc; return 1; diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h index 8e46640..222deb0 100644 --- a/src/glx/dri3_priv.h +++ b/src/glx/dri3_priv.h @@ -182,9 +182,12 @@ struct dri3_drawable { uint64_t send_sbc; uint64_t recv_sbc; - /* Last received UST/MSC values */ + /* Last received UST/MSC values for pixmap present complete */ uint64_t ust, msc; + /* Last received UST/MSC values for vblank */ + uint64_t vblank_ust, vblank_msc; + /* Serial numbers for tracking wait_for_msc events */ uint32_t send_msc_serial; uint32_t recv_msc_serial; -- 1.9.1 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.
On 11/25/2014 07:52 AM, Axel Davy wrote: Hi, Sorry for the late reply, seems my e-mail filters hate me - i was already worried about the lack of replies. This patch removes the tripple buffering behaviour that the GLX implementation has with DRI3. I understand your concern for Medical softwares, but perhaps this would be better handled with an user option. Axel Davy I don't think it does that? For swapinterval 0 all the triple-buffering etc. is left active as usual, and swaps are supposed to be vsynced and tear-free. swapinterval is 1 by default, so this is the case by default. For a client or user selected swapinterval = 0, the code in Mesa schedules glXSwapBuffers calls to occur immediately, and the Present implementation in the server schedules them in a way so they are supposed to occur immediately/asap - assuming no vsync, otherwise the logic there wouldn't make much sense. Due to what i'm pretty sure is a bug (see my other 2 patches on top of the xserver dri3/present implementation) the PresentOptionAsync isn't handled correctly by the server for pagelfips as it should be, and as result vsync is always on on intel-ddx, and always off on nouveau-ddx (or any future driver which exposes PresentCapabilityAsync as nouveau does), which causes massive tearing on nouveau all the time under dri3/present. On any os/graphics-stack i'm familiar with (Windows and OSX all versions and drivers, Linux proprietary drivers, DRI2 with intel-ddx and nouveau-ddx) a swapinterval zero meant to swap immediately without vsync, so this just restores what probably any OpenGL client expects to happen. The MESA_swap_control spec even states that If interval is set to a value of 0, buffer swaps are not synchronized to a video frame. For medical or neuro-science applications like mine, or multi-display stereo/VR applications it is crucial to have this for users being able to easily diagnose multi-display synchronization issues without the need to learn how to use git and compilers and how graphics stacks work. For gamers and gaming benchmarks it's the ego-shooter mode and squeeze out max fps they like so much. Btw., digging deeper into the ddx'es i realized that my posted patch for the x-server bug is both incomplete and too complex. I've written a new patch which is only a simple one-liner but solves more problems than the old one. I'll send it out after giving it some testing. thanks, -mario On 25/11/2014 04:00, Mario Kleiner wrote : Restores proper immediate tearing swap behaviour for OpenGL bufferswap under DRI3/Present. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/glx/dri3_glx.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index 5796491..c53be1b 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, xcb_connection_t *c = XGetXCBConnection(dpy); struct dri3_buffer *back; int64_t ret = 0; + uint32_t options = XCB_PRESENT_OPTION_NONE; unsigned flags = __DRI2_FLUSH_DRAWABLE; if (flush) @@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, if (target_msc == 0) target_msc = priv-msc + priv-swap_interval * (priv-send_sbc - priv-recv_sbc); + if (priv-swap_interval == 0) + options |= XCB_PRESENT_OPTION_ASYNC; + back-busy = 1; back-last_swap = priv-send_sbc; xcb_present_pixmap(c, @@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, None, /* target_crtc */ None, back-sync_fence, - XCB_PRESENT_OPTION_NONE, + options, target_msc, divisor, remainder, 0, NULL); ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.
On 11/25/2014 09:31 AM, Frank Binns wrote: Hi, I sent exactly the same patch back in April. Despite getting a review-by it was never merged. See: http://lists.freedesktop.org/archives/mesa-dev/2014-April/058347.html http://lists.freedesktop.org/archives/mesa-dev/2014-May/060432.html Thanks Frank Indeed, exactly the same patch - minus white-space differences - already reviewed. So i guess i can add your additional signed-off-by and chris wilsons reviewed-by to this one? I have a few more patches coming for this series, which fix two more mesa bugs related to dri3/present... thanks, -mario On 25/11/14 03:00, Mario Kleiner wrote: Restores proper immediate tearing swap behaviour for OpenGL bufferswap under DRI3/Present. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/glx/dri3_glx.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index 5796491..c53be1b 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, xcb_connection_t *c = XGetXCBConnection(dpy); struct dri3_buffer *back; int64_t ret = 0; + uint32_t options = XCB_PRESENT_OPTION_NONE; unsigned flags = __DRI2_FLUSH_DRAWABLE; if (flush) @@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, if (target_msc == 0) target_msc = priv-msc + priv-swap_interval * (priv-send_sbc - priv-recv_sbc); + if (priv-swap_interval == 0) + options |= XCB_PRESENT_OPTION_ASYNC; + back-busy = 1; back-last_swap = priv-send_sbc; xcb_present_pixmap(c, @@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, None, /* target_crtc */ None, back-sync_fence, - XCB_PRESENT_OPTION_NONE, + options, target_msc, divisor, remainder, 0, NULL); ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)
Pageflips for Pixmap presents were not synchronized to vblank on drivers with support for PresentCapabilityAsync, due to some missing init for vblank-sync_flips. The PresentOptionAsync flag was completely ignored for pageflipped presents. Vsynced flips only worked by accident on the intel-ddx, as that driver doesn't have PresentCapabilityAsync support. On nouveau-ddx, which supports PresentCapabilityAsync, this always caused non-vsynced pageflips with pretty ugly tearing. This patch fixes the problem, as tested on top of XOrg 1.16.2 on nouveau and intel. Please also apply to XOrg 1.17 and XOrg 1.16.2 stable. Applying on top of XOrg 1.16.2 may require cherry-picking commit 2051514652481a83bd7cf22e57cb0fcd40333f33 which trivially fixes lack of support for protocol option PresentOptionCopy - get two bug fixes for the price of one! Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/present/present.c b/present/present.c index e5d3fd5..be1c9f1 100644 --- a/present/present.c +++ b/present/present.c @@ -834,7 +834,7 @@ present_pixmap(WindowPtr window, vblank-notifies = notifies; vblank-num_notifies = num_notifies; -if (!screen_priv-info || !(screen_priv-info-capabilities PresentCapabilityAsync)) +if (!(options PresentOptionAsync)) vblank-sync_flip = TRUE; if (!(options PresentOptionCopy) -- 1.9.1 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 1/2] present: Avoid crashes in DebugPresent(), a bit more info.
DebugPresent() crashed the server when a dri3 drawable was closed while a pageflipped present was still pending, due to vblank-window- Null-Ptr deref, so debug builds caused new problems to debug. E.g., glXSwapBuffers(...); glXDestroyWindow(...); - Pageflip for non-existent window completes - boom. Also often happens when switching desktop compositor on/off due to Present unflips, or when logging out of session. Also add info if a Present is queued for copyswap or pageflip, if the present is vsynced, and the serial no of the Present request, to aid debugging of pageflip and vsync issues. The serial number is useful as Mesa's dri3/present backend encodes its sendSBC in the serial number, so one can easily correlate server debug output with Mesa and with the SBC values returned to actual OpenGL client applications via OML_sync_control and INTEL_swap_events extension, makes debugging quite a bit more easy. Please also cherry-pick this for a 1.16.x stable update. Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/present/present.c b/present/present.c index ac9047e..e5d3fd5 100644 --- a/present/present.c +++ b/present/present.c @@ -440,7 +440,7 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t ust, uint64_t crtc_msc) DebugPresent((\tn %lld %p %8lld: %08lx - %08lx\n, vblank-event_id, vblank, vblank-target_msc, vblank-pixmap ? vblank-pixmap-drawable.id : 0, - vblank-window-drawable.id)); + vblank-window ? vblank-window-drawable.id : 0)); assert (vblank == screen_priv-flip_pending); @@ -859,10 +859,10 @@ present_pixmap(WindowPtr window, } if (pixmap) -DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p)\n, +DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p) flip %d vsync %d serial %d\n, vblank-event_id, vblank, target_msc, vblank-pixmap-drawable.id, vblank-window-drawable.id, - target_crtc)); + target_crtc, vblank-flip, vblank-sync_flip, vblank-serial)); xorg_list_add(vblank-event_queue, present_exec_queue); vblank-queued = TRUE; @@ -955,7 +955,7 @@ present_vblank_destroy(present_vblank_ptr vblank) DebugPresent((\td %lld %p %8lld: %08lx - %08lx\n, vblank-event_id, vblank, vblank-target_msc, vblank-pixmap ? vblank-pixmap-drawable.id : 0, - vblank-window-drawable.id)); + vblank-window ? vblank-window-drawable.id : 0)); /* Drop pixmap reference */ if (vblank-pixmap) -- 1.9.1 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
DRI3/Present fixes for XServers 1.16 and 1.17rc - (v2)
Hi, an updated set of patches to fix the bugs i found in the xserver dri3/present implementation and one bug in intel-ddx uxa/dri3/present implementation. Axel Davys comments made me rethink my original xserver patch and the new solution is simple and better and afaics how this was actually intended to work in the server, the server properly using the present_check_flip ddx driver function. Patch 1/2 fixes and slightly improves DebugPresent() macros for the server to avoid crashes at logout, compositor en/disable or closing windows while flips are pending when the server is compiled with debug macros on. Patch 2/2 fixes the use of PresentOptionAsync for page-flipped present, and makes Present working on nouveau without horrible tearing. These patches apply to master, 1.17rc and 1.16.2. They were tested on top of 1.16.2 with the dri3/present backends of nouveau master (glamor and exa) and intel master (sna and fixed uxa) on single-display and dual-display, also ran through my hardware timing test equipment. Patch uxa/present is a required fix for intel-ddx uxa backend, so intel_present_check_flip no longer lies to the server about its capabilities. Can the x-server patches please also be included into the 1.16 series? thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH] uxa/present: Handle sync_flip flag in intel_present_check_flip()
Make sure we reject async flips if we don't support async flips. Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/uxa/intel_present.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/uxa/intel_present.c b/src/uxa/intel_present.c index d20043f..d2aa9ee 100644 --- a/src/uxa/intel_present.c +++ b/src/uxa/intel_present.c @@ -58,6 +58,8 @@ struct intel_present_vblank_event { uint64_tevent_id; }; +static Bool intel_present_has_async_flip(ScreenPtr screen); + static uint32_t pipe_select(int pipe) { if (pipe 1) @@ -266,6 +268,9 @@ intel_present_check_flip(RRCrtcPtr crtc, if (!bo) return FALSE; +if (!sync_flip !intel_present_has_async_flip(screen)) +return FALSE; + return TRUE; } -- 1.9.1 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)
? thanks, -mario Axel Davy On 02/12/2014 20:08, Mario Kleiner wrote : Pageflips for Pixmap presents were not synchronized to vblank on drivers with support for PresentCapabilityAsync, due to some missing init for vblank-sync_flips. The PresentOptionAsync flag was completely ignored for pageflipped presents. Vsynced flips only worked by accident on the intel-ddx, as that driver doesn't have PresentCapabilityAsync support. On nouveau-ddx, which supports PresentCapabilityAsync, this always caused non-vsynced pageflips with pretty ugly tearing. This patch fixes the problem, as tested on top of XOrg 1.16.2 on nouveau and intel. Please also apply to XOrg 1.17 and XOrg 1.16.2 stable. Applying on top of XOrg 1.16.2 may require cherry-picking commit 2051514652481a83bd7cf22e57cb0fcd40333f33 which trivially fixes lack of support for protocol option PresentOptionCopy - get two bug fixes for the price of one! Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/present/present.c b/present/present.c index e5d3fd5..be1c9f1 100644 --- a/present/present.c +++ b/present/present.c @@ -834,7 +834,7 @@ present_pixmap(WindowPtr window, vblank-notifies = notifies; vblank-num_notifies = num_notifies; -if (!screen_priv-info || !(screen_priv-info-capabilities PresentCapabilityAsync)) +if (!(options PresentOptionAsync)) vblank-sync_flip = TRUE; if (!(options PresentOptionCopy) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)
On 12/05/2014 12:56 AM, Eric Anholt wrote: Mario Kleiner mario.kleiner...@gmail.com writes: Pageflips for Pixmap presents were not synchronized to vblank on drivers with support for PresentCapabilityAsync, due to some missing init for vblank-sync_flips. The PresentOptionAsync flag was completely ignored for pageflipped presents. Vsynced flips only worked by accident on the intel-ddx, as that driver doesn't have PresentCapabilityAsync support. On nouveau-ddx, which supports PresentCapabilityAsync, this always caused non-vsynced pageflips with pretty ugly tearing. This patch fixes the problem, as tested on top of XOrg 1.16.2 on nouveau and intel. Please also apply to XOrg 1.17 and XOrg 1.16.2 stable. Applying on top of XOrg 1.16.2 may require cherry-picking commit 2051514652481a83bd7cf22e57cb0fcd40333f33 which trivially fixes lack of support for protocol option PresentOptionCopy - get two bug fixes for the price of one! Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/present/present.c b/present/present.c index e5d3fd5..be1c9f1 100644 --- a/present/present.c +++ b/present/present.c @@ -834,7 +834,7 @@ present_pixmap(WindowPtr window, vblank-notifies = notifies; vblank-num_notifies = num_notifies; -if (!screen_priv-info || !(screen_priv-info-capabilities PresentCapabilityAsync)) +if (!(options PresentOptionAsync)) vblank-sync_flip = TRUE; I think I'd like to see a hunk like this in with this patch, so that each driver doesn't need to have the cap check: diff --git a/present/present.c b/present/present.c index a9f2214..ed0d734 100644 --- a/present/present.c +++ b/present/present.c @@ -838,6 +838,9 @@ present_pixmap(WindowPtr window, vblank-sync_flip = TRUE; if (!(options PresentOptionCopy) +!((options PresentOptionAsync) + (!screen_priv-info || + !(screen_priv-info-capabilities PresentCapabilityAsync))) pixmap != NULL present_check_flip (target_crtc, window, pixmap, vblank-sync_flip, valid, x_off, y_off)) { Seem reasonable? If you wanted to squash this in, then this is: I'm not sure if drivers will really avoid the cap check, as i assume the definition of the check_flip() function requires them to implement it anyway? Does some spec somewhere require them to do it? Do driver writers check all server implementations to see if they can get away with less? But then having this hunk in doesn't hurt either, and it would keep the current intel-ddx uxa backends working, so i'll integrate it - after some urgently needed sleep. Thanks for the review. These server patches are actually the critical ones for me. Without them in XOrg 1.16+, all the mesa fixes would be utterly useless for my kind of applications. -mario Reviewed-by: Eric Anholt e...@anholt.net (So's patch 1/2, regardless). ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 1/2] present: Avoid crashes in DebugPresent(), a bit more info.
DebugPresent() crashed the server when a dri3 drawable was closed while a pageflipped present was still pending, due to vblank-window- Null-Ptr deref, so debug builds caused new problems to debug. E.g., glXSwapBuffers(...); glXDestroyWindow(...); - Pageflip for non-existent window completes - boom. Also often happens when switching desktop compositor on/off due to Present unflips, or when logging out of session. Also add info if a Present is queued for copyswap or pageflip, if the present is vsynced, and the serial no of the Present request, to aid debugging of pageflip and vsync issues. The serial number is useful as Mesa's dri3/present backend encodes its sendSBC in the serial number, so one can easily correlate server debug output with Mesa and with the SBC values returned to actual OpenGL client applications via OML_sync_control and INTEL_swap_events extension, makes debugging quite a bit more easy. Please also cherry-pick this for a 1.16.x stable update. Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com Reviewed-by: Eric Anholt e...@anholt.net --- present/present.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/present/present.c b/present/present.c index ac9047e..e5d3fd5 100644 --- a/present/present.c +++ b/present/present.c @@ -440,7 +440,7 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t ust, uint64_t crtc_msc) DebugPresent((\tn %lld %p %8lld: %08lx - %08lx\n, vblank-event_id, vblank, vblank-target_msc, vblank-pixmap ? vblank-pixmap-drawable.id : 0, - vblank-window-drawable.id)); + vblank-window ? vblank-window-drawable.id : 0)); assert (vblank == screen_priv-flip_pending); @@ -859,10 +859,10 @@ present_pixmap(WindowPtr window, } if (pixmap) -DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p)\n, +DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p) flip %d vsync %d serial %d\n, vblank-event_id, vblank, target_msc, vblank-pixmap-drawable.id, vblank-window-drawable.id, - target_crtc)); + target_crtc, vblank-flip, vblank-sync_flip, vblank-serial)); xorg_list_add(vblank-event_queue, present_exec_queue); vblank-queued = TRUE; @@ -955,7 +955,7 @@ present_vblank_destroy(present_vblank_ptr vblank) DebugPresent((\td %lld %p %8lld: %08lx - %08lx\n, vblank-event_id, vblank, vblank-target_msc, vblank-pixmap ? vblank-pixmap-drawable.id : 0, - vblank-window-drawable.id)); + vblank-window ? vblank-window-drawable.id : 0)); /* Drop pixmap reference */ if (vblank-pixmap) -- 2.1.0 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
DRI3/Present fixes for XServers 1.16 and 1.17rc - reviewed
Ok, the final patchset. I made the changes Eric Anholt proposed, retested, and got them reviewed by him. With those on top of XServer 1.16.2, DRI3/Present works for me on all drivers and backends (intel sna, uxa and nouveau exa, glamor), single and dual-display fullscreen and windowed, with/without compositor, vsynced or non-vsynced etc. The patches apply cleanly to master, 1.17-rc and 1.16.2. Thanks, -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH] sna: Also fix ZaphodHeads on Linux kernels older than 3.19
Only 3.19 will have O_NONBLOCK for drm_read(), so the current ddx will still stutter in ZaphodHead mode on current kernels. Fix the problem by adding a poll() on the drm fd before potentially blocking on read(). The logic is directly transplanted from the uxa backend intel_mode_read_drm_events() function. Fixes fdo bug #84744 on older kernels. Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/sna/sna_display.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c index 163..a7ad6cc 100644 --- a/src/sna/sna_display.c +++ b/src/sna/sna_display.c @@ -7390,7 +7390,16 @@ fixup_flip: void sna_mode_wakeup(struct sna *sna) { char buffer[1024]; - int len, i; + int len, i, r; + struct pollfd p = { .fd = sna-kgem.fd, .events = POLLIN }; + + /* DRM read is blocking on old kernels, so poll first to avoid it. */ + do { + r = poll(p, 1, 0); + } while (r == -1 (errno == EINTR || errno == EAGAIN)); + + if (r = 0) + return; /* The DRM read semantics guarantees that we always get only * complete events. -- 1.9.1 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.
On 12/15/2014 06:46 AM, Keith Packard wrote: Mario Kleiner mario.kleiner...@gmail.com writes: Restores proper immediate tearing swap behaviour for OpenGL bufferswap under DRI3/Present. Hrm. I'd love for this to be controlled by the GLX_EXT_swap_control_tear extension, but that one uses negative interval values to indicate tearing, instead of providing a new API, and we can't tell the difference between 0 and -0. Are you sure you don't want GLX_EXT_swap_control_tear and an interval of -1 instead of an interval of 0 with tearing? Yes. GLX_EXT_swap_control_tear It's a different use case. Useful for games which want to avoid tearing if possible, but don't want to get in a tremor of switching frequently between say 60 fps and 30 fps if they only almost manage to do 60 fps, but not with reliable headroom. Would be also nice to have support for that, but only as an addition for swapinterval -1, not a replacement. The 0 case is good for benchmarking. In my specific case i always want vsync'ed swap for actual visual stimulation in neuroscience/medical settings, with no frame skipped ever. The bonus use for me, except for benchmarking how fast the system can go, is if one has a multi-display setup, e.g., dual-display for stereoscopic stimulation - one display per eye, or some CAVE like setup for VR with more than 2 displays. You want display updates and scanout on all of them synchronized, so the scene stays coherent. One simple way for visually testing multi-display sync is to intentionally swap all of them without vsync, e.g., timed to swap in the middle of the scanout. If the tear-lines on all displays are roughly at the same vertical position and stay there then that's a good visual test if stuff works. There are other ways to do it, but this is the one method that seems to work cross-platform, without lots of mental context switching depending on what os/gpu/server/driver combo with what settings one uses, and much more easy to grasp for scientists with no graphics background. You can see at a glance if stuff is roughly correct or not. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.
On 12/16/2014 09:23 AM, Keith Packard wrote: Mario Kleiner mario.kleiner...@gmail.com writes: The 0 case is good for benchmarking. Sure, but the current code does benchmarking just fine. In fact, because it doesn't copy queued frames that aren't the most recent before the vblank, benchmarks tend to run *faster* as a result, and people generally like that aspect of it... Hmm. For benchmarking i think i'd consider that a mild form of cheating. You get higher fps because you skip processing like the whole gpu blit overhead and host processing overhead for queuing / validating / processing the copy command in the command stream, so the benchmark numbers don't translate very well anymore in how the system would behave in a non-benchmark situation? ... but read on below ... In my specific case i always want vsync'ed swap for actual visual stimulation in neuroscience/medical settings, with no frame skipped ever. The bonus use for me, except for benchmarking how fast the system can go, is if one has a multi-display setup, e.g., dual-display for stereoscopic stimulation - one display per eye, or some CAVE like setup for VR with more than 2 displays. You want display updates and scanout on all of them synchronized, so the scene stays coherent. One simple way for visually testing multi-display sync is to intentionally swap all of them without vsync, e.g., timed to swap in the middle of the scanout. If the tear-lines on all displays are roughly at the same vertical position and stay there then that's a good visual test if stuff works. There are other ways to do it, but this is the one method that seems to work cross-platform, without lots of mental context switching depending on what os/gpu/server/driver combo with what settings one uses, and much more easy to grasp for scientists with no graphics background. You can see at a glance if stuff is roughly correct or not. It seems like you want something that the GL API doesn't express precisely; my reading of the GL spec definitely lets Present work the way it does today, and as you avoid tearing *and* improve performance in the vblank_mode=0 case, I'm very reluctant to change it. From GLX_EXT_swap_control and MESA_swap_control: If interval is set to a value of 0, buffer swaps are not synchronized to a video frame. It depends on how you interpret the not synchronized to a video frame? Can you explain your interpretation? I don't think the spec says anywhere that dropping old not most recent at vblank time frames is allowed, like Present does atm.? And the current Present implementation does synchronize the buffer swap of its most recent received Pixmap to the [onset of] a video frame, so while preventing tearing it goes a bit slower than it could go without vsync. Every past/other implementation than DRI3/Present that i have experience with interpreted the spec the way that patch tries to restore, at least everything i know of from OSX/Windows/Linux proprietary/DRI2, so my interpretation is certainly a valid interpretation, and it is the one that provides consistency and therefore the least surprise to implementers of GL clients and end users. Present could trivially offer a new bit to force tearing; I'm not sure how you'd get at that from GL though. It does already with PresentOptionAsync? It just needs to be used in accordance with the mainstream interpretation of the _swap_control spec, like this patch suggests. I'm not trying to claim here that the current behaviour of Mesa+Present isn't useful for some types of applications like games. I'm just saying it shouldn't be the default behaviour for swapinterval 0 or 0. As far as i understand the meaning, intention and origin of the EXT_swap_control_tear extension, the current Present implementation would implement a useful approximation of EXT_swap_control_tear for a swapinterval of 0. Not an exact implementation, but at least following the spirit of that extension. So i'm arguing for restoring the default behaviour any other implementation has with that patch, but providing the current behaviour via sync_control_tear? Or maybe even some new sync_control_tear2 to cover the difference between the current method and sync_control_tear. When we are at the topic, i can also send you my christmas wish list with proposals for future mesa/server releases: 1 - Another thing i'd love to have, which would require a new option PresentOptionDontSkip is the ability to not skip present requests which are late. That would allow to take advantage of mesas triple/quad buffering to queue frames for animations ahead of time for playing animations or videos and be still certain that every queued frame was shown at least for one video refresh cycle. I'd love to take advantage of the new triple-buffering behaviour, or maybe even use Present directly somehow for deeper n-buffering, but for some of my types of application i'd need to be certain that frames
Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.
On 12/17/2014 04:17 AM, Keith Packard wrote: Mario Kleiner mario.kleiner...@gmail.com writes: Hmm. For benchmarking i think i'd consider that a mild form of cheating. You get higher fps because you skip processing like the whole gpu blit overhead and host processing overhead for queuing / validating / processing the copy command in the command stream, so the benchmark numbers don't translate very well anymore in how the system would behave in a non-benchmark situation? It's still a very useful mode -- imagine wanting the lowest possible latency between the user and the display; normally, you process input and generate a frame just after vblank, then (if the rendering is really quick), end up waiting most of the frame time not doing anything before finally updating it at the next vblank. With vblank_mode=0 and DRI3, you have the ability to try and generate another frame before vblank comes; if you manage, then you get that data on the screen, rather than the older version. So, it offers latency closer to the tearing vblank_mode=0 but without any tearing. That's why I did it; the fact that it offers a small performance benefit for benchmarks is an unintentional bonus feature. It is a useful mode for some applications, no disagreement here. I can think of use cases where i would exactly want to have that, e.g., VR stuff with head-tracking - rendering - Occulus Rift style display. I actually want to tinker with something like that early next year and see if i can make use of it. It's just that i need access to both, the old behaviour i described, and the new drop frame behaviour, and i need a way to select what i want at runtime via api without the need for easily overwhelmed and confused users to change config files or environment variables. I also always need meaningful and trustworthy feedback, at least for page-flipped presents, about what really happened for a presented frame - was it flipped, copied, exchanged, skipped, or did some error happen? That's why i'd like to have an extension to INTEL_swap_events to also report some new completion type skipped and error and that one patch 5/5 of mine for mesa reviewed and included, to make sure the swap_events don't fall apart so easily. That's why i think we could (ab)use the swap_control_tear extension to implement this behaviour: swapinterval 0 - sync to vblank, swap only at most every swapinterval'th refresh. swapinterval = 0 - Don't sync to vblank (PresentOptionAsync), swap immediately. swapinterval 0 - Do what your current implementation does for swapinterval = abs(swapinterval); The first two would restore consistent behaviour with all other implementations. The last one would implement something at least close to the spirit of of swap_control_tear. Although it would be even better to think of some extension to the extension so one can select if one really wants tearing if late as swap_control_tear defines, to squeeze out a few more msecs, or rather your tear free skip old frames method. Essentially i'd love to have more control over how this stuff is done, and a default behaviour close to what all other implementations do, to reduce confusion. As some kind of stop gap measure one could also think about defining a new vblank_mode to enable the new behaviour instead of the old one. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.
On 12/17/2014 05:49 AM, Keith Packard wrote: Mario Kleiner mario.kleiner...@gmail.com writes: It's just that i need access to both, the old behaviour i described, and the new drop frame behaviour, and i need a way to select what i want at runtime via api without the need for easily overwhelmed and confused users to change config files or environment variables. I also always need meaningful and trustworthy feedback, at least for page-flipped presents, about what really happened for a presented frame - was it flipped, copied, exchanged, skipped, or did some error happen? Present reports precisely what it did with each frame; flipped, copied, or skipped. That's why i'd like to have an extension to INTEL_swap_events to also report some new completion type skipped and error and that one patch 5/5 of mine for mesa reviewed and included, to make sure the swap_events don't fall apart so easily. You can use Present events on the target drawable; they're generated to whoever requests them, so you don't need to rely on the intel swap events alone. Never thought about that. Could you show me some short example snippet of XLib/GLX code how i reliably detect at runtime if Present is present, and then enable this? That would probably do for the moment and at the same time solve the problem that i don't know how to reliably detect at runtime if i'm on DRI2 or DRI3/Present. Making good use of this will require separate code-path and a way to select the right one. It's still important to fix that wraparound handling bug from my patch 5/5 for INTEL_swap_events. As some kind of stop gap measure one could also think about defining a new vblank_mode to enable the new behaviour instead of the old one. I really don't have a good suggestion here, given that we have such a limited API available. I thought i just made a suggestion how we could wiggle through it within existing api? And we can define new one for future extensions? Need to sleep now, thanks -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [RFC 5/5] modesetting: Implement page flipping support for Present.
) +return FALSE; +ms-flip_vblank_event-event_id = event_id; + +new_front_bo.gbm = glamor_gbm_bo_from_pixmap(screen, new_front); +new_front_bo.dumb = NULL; +if (!new_front_bo.gbm) { +xf86DrvMsg(scrn-scrnIndex, X_ERROR, + Failed to get GBM bo for flip to new front.\n); +free(ms-flip_vblank_event); +ms-flip_vblank_event = NULL; +return FALSE; +} + +/* Create a new handle for the back buffer */ +if (drmModeAddFB(ms-fd, scrn-virtualX, scrn-virtualY, + scrn-depth, scrn-bitsPerPixel, + drmmode_bo_get_pitch(new_front_bo), + drmmode_bo_get_handle(new_front_bo), new_fb_id)) +goto error_out; - error_out doesn't free(ms-flip_vblank_event); and ms-flip_vblank_event = NULL; Memory leak? + +drmmode_bo_destroy(ms-drmmode, new_front_bo); + +flags = DRM_MODE_PAGE_FLIP_EVENT; +if (async) +flags |= DRM_MODE_PAGE_FLIP_ASYNC; + +/* Queue flips on all enabled CRTCs. + * + * Note that if/when we get per-CRTC buffers, we'll have to update this. + * Right now it assumes a single shared fb across all CRTCs, with the + * kernel fixing up the offset of each CRTC as necessary. + * + * Also, flips queued on disabled or incorrectly configured displays + * may never complete; this is a configuration error. + */ +ms-fe_msc = 0; +ms-fe_usec = 0; + +for (i = 0; i config-num_crtc; i++) { +xf86CrtcPtr crtc = config-crtc[i]; + +if (!ms_crtc_on(crtc)) +continue; + +if (!queue_flip_on_crtc(screen, crtc, ref_crtc_vblank_pipe, new_fb_id, +flags)) { +goto error_undo; +} +} - queue_flip_on_crtc() has at least one error path where it doesn't free(ms-flip_vblank_event). Maybe that was partially the purpose of the currently uncommented intel_crtc_apply() in error_undo? + +ms-drmmode.old_fb_id = ms-drmmode.fb_id; +ms-drmmode.fb_id = new_fb_id; + +assert(ms-flip_count = 0); +if (ms-flip_count == 0) { +ms_pageflip_complete(screen); +} + +return TRUE; + +error_undo: +drmModeRmFB(ms-fd, new_fb_id); -- drmModeRmFB() could get called here even if a few queue_flip_on_crtc() calls succeeded for a multi-crtc flip, so the FB would be in use? Not sure though... Feel free to add a Reviewed-and-Tested-by: Mario Kleiner mario.kleiner...@gmail.com for the bits i tested and reviewed if that helps. The other patches in the series seem to be fine, thanks, -mario +for (i = 0; i config-num_crtc; i++) { +if (config-crtc[i]-enabled) { +ErrorF(XXX: crtc apply\n); +/* intel_crtc_apply(config-crtc[i]); */ +} +} + +error_out: +xf86DrvMsg(scrn-scrnIndex, X_WARNING, Page flip failed: %s\n, + strerror(errno)); + +assert(ms-flip_count = 0); +ms-flip_count = 0; +return FALSE; +} + +/* + * Test to see if page flipping is possible on the target crtc + */ +static Bool +ms_present_check_flip(RRCrtcPtr crtc, + WindowPtr window, + PixmapPtr pixmap, + Bool sync_flip) +{ +ScreenPtr screen = window-drawable.pScreen; +ScrnInfoPtr scrn = xf86ScreenToScrn(screen); +modesettingPtr ms = modesettingPTR(scrn); + +/* We only support page flipping with Glamor, for now. */ +if (!ms-drmmode.glamor) +return FALSE; + +if (!scrn-vtSema) +return FALSE; + +if (ms-drmmode.shadow_enable) +return FALSE; + +if (crtc !ms_crtc_on(crtc-devPrivate)) +return FALSE; + +/* Check stride, can't change that on flip */ +if (pixmap-devKind != drmmode_bo_get_pitch(ms-drmmode.front_bo)) +return FALSE; + +/* Make sure there's a bo we can get to */ +/* XXX: actually do this. also...is it sufficient? + * if (!glamor_get_pixmap_private(pixmap)) + * return FALSE; + */ + +return TRUE; +} + +/* + * Queue a flip on 'crtc' to 'pixmap' at 'target_msc'. If 'sync_flip' is true, + * then wait for vblank. Otherwise, flip immediately + */ +static Bool +ms_present_flip(RRCrtcPtr crtc, +uint64_t event_id, +uint64_t target_msc, +PixmapPtr pixmap, +Bool sync_flip) +{ +ScreenPtr screen = crtc-pScreen; +ScrnInfoPtr scrn = xf86ScreenToScrn(screen); +xf86CrtcPtr xf86_crtc = crtc-devPrivate; +drmmode_crtc_private_ptr drmmode_crtc = xf86_crtc-driver_private; +Bool ret; + +if (!ms_present_check_flip(crtc, screen-root, pixmap, sync_flip)) +return FALSE; + +ret = ms_do_pageflip(screen, pixmap, drmmode_crtc-vblank_pipe, !sync_flip, + event_id); +if (!ret) +xf86DrvMsg(scrn-scrnIndex
Re: [PATCH 0/3] present: Improve interactions with compositing manager
Hi Keith and Owen, it's a bit too late for XOrg 1.17, but i reviewed and tested Keith's patchset, after rebasing the 3rd patch on top of 1.17-rc2. The series has my Reviewed-and-Tested-by: Mario Kleiner mario.kleiner...@gmail.com My patch 1/3 is Keith's patch 3/3, just rebased and with my review/test tags for convenience. Patch 2/3 is an extension of Keith's idea to target vblanks more than 1 frame into the future, on top of Keith's patch. There is no patch 3/3, the numbering is a credit to my never ending struggles with git send-email ;) The commit message of patch 2/3 contains some test results on both KDE/KWin and GNOME-3, the versions from KUbuntu 14.10. All done with hardware timing measurement equipment to check when frames really show up at the video output and if lying about timestamps and lag is reduced. As you can see from those results, the patchset sort of helps on an otherwise idle desktop under KDE/KWin, but not at all even with one instance of glxgears running in parallel to the test app. On GNOME-3 with my patch applied on top of Keith's it helps for vblanks 1 frame in the future, so it maybe could be useful for applications which don't need fps == full video refresh rate, e.g., video players running at 30/25/20/15 fps or similar. It doesn't work well for full 60 fps, as Owen suspected. Strangely running a glxgears in the background on GNOME does help all cases on GNOME, maybe by putting the gpu into a higher performance state? This was on Intel HD-4000. I also did some testing on nouveau with variable results. Given the variability, there's probably not much of a point in applying those, maybe at most after hiding them behind some xorg.conf option by default, so people can experiment with it if they want. I don't care one way or the other. Maybe merging the first two patches as a cleanup would make sense. -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
[PATCH 2/3] present: Extend compositor optimization to presents 1 vblank into future.
Same principle as keithp's original patch 3/3, extended to handle target_msc crtc_msc + 1 1. If non-fullscreen window is redirected, schedule its vblank trigger event for 1 frame before target vblank, so offscreen copy and posting of damage happens in the frame before, through the regular present_execute() path, giving the compositor a chance to do its thing ahead of the target vblank. 2. After vblank_copy() posted the damage, it queues a new vblank event for 1 vblank into the future, which will call present_execute again, this time just for sending out the PresentCompleteNotify at the estimated vblank of true present completion. Unless something goes wrong with queueing, in which case it triggers immediate completion to avoid hangs, just as it does for unredirected copy presents. Tested on Intel HD-4000 IvyBridge mobile with hardware timing measurement equipment under KDE's KWin compositor and GNOME-3 for non-composited window copy swaps (= compositor off classic mode), non-composited or unredirected fullscreen windows (= kms-pageflip in use), and verified to do no harm to timing in those cases. Testing the interesting case of redirected windows under compositor, results are mixed: 1. KDE/KWin: Mostly correct timestamps (about 99% marking the correct vblank of true swap completion) under idle desktop. Things go downhill quickly if some additional load is present, e.g., a single instance of glxgears will cause at least 1 frame lag and all timestamps being 1 - 2 frames too early wrt. reality. 2. GNOME-3: Mostly correct timestamps for target_msc at least 2 vblanks into the future (= 30 fps animation on 60 Hz panel), but almost always wrong - 1 frame too early - and somewhat jittery for 60 fps on 60 Hz panel. This on an otherwise idle desktop. When an instance of glxgears is running, timing *improves* to almost always correct timestamps for 60 fps, 30 fps, 20 fps, so somehow the extra load helps (gpu powermanagement upclocking, or somehow glxgears kicking the compositor in the right moment?) Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com Tested-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 90 +-- 1 file changed, 68 insertions(+), 22 deletions(-) diff --git a/present/present.c b/present/present.c index 2e7662a..b75017b 100644 --- a/present/present.c +++ b/present/present.c @@ -587,13 +587,16 @@ present_vblank_idle(present_vblank_ptr vblank) /* * Execute the present copy operation if it hasn't been done yet, then mark the - * related objects as completed + * related objects as completed. Schedule a deferred completion for composited + * windows. */ -static void +static Bool present_copy(present_vblank_ptr vblank) { if (!vblank-copied) { -WindowPtr window = vblank-window; +WindowPtr window = vblank-window; +ScreenPtr screen = vblank-screen; +present_window_priv_ptr window_priv = present_get_window_priv(window, TRUE); vblank-copied = TRUE; @@ -606,7 +609,46 @@ present_copy(present_vblank_ptr vblank) present_flush(window); present_vblank_idle(vblank); + +/* Was this a copy into a composited window? If so, queue a present + * completion 1 vblank into the future, as our best guess as to when + * the compositor will actually complete the presentation. + * + * Otherwise fallthrough to immediate completion. That's the right + * thing to do for a non-composited copy, and the best we can do for + * now in case this is called as a unredirected copy from a fallback + * caused by a failed PresentCompleteModeFlip. + */ +if (screen-GetWindowPixmap(window) != screen-GetScreenPixmap(screen)) { +int ret; +uint64_tust = 0, crtc_msc = 0; + +/* Re-add vblank to exec queue under same event_id */ +xorg_list_add(vblank-event_queue, present_exec_queue); +xorg_list_append(vblank-window_list, window_priv-vblank); + +/* Queue one vblank from now into the future */ +ret = present_get_ust_msc(screen, vblank-crtc, ust, crtc_msc); + +if (ret == Success) +ret = present_queue_vblank(screen, vblank-crtc, + vblank-event_id, crtc_msc + 1); + +/* On success, tell caller to not complete the present. Our deferred + * vblank event will do that by triggering present_execute() again. + */ +if (ret == Success) +return TRUE; + +xorg_list_del(vblank-event_queue); +xorg_list_del(vblank-window_list); + +DebugPresent((present_queue_vblank for deferred completion failed\n)); +} } + +/* Signal need for an immediate
[PATCH 1/3] present: When composited, get the bits into the window pixmap immediately (v2)
From: Keith Packard kei...@keithp.com When a window is composited, we want to notify the compositing manager of the new contents *before* the next frame, so that it can prepare them for display at the next frame, instead of doing the copy at that frame time, and notifying the compositor so that it would always be a frame behind. This change catches the common case of a operation destined for the next frame and immediately performs the copy, leaving the queue entry around so that the PresentComplete event can be delivered at the target frame time. This should give applications the right answer if the Compositor is able to get the frame constructed before that frame happens, which is at least better than the current situation where the client receives the PresentComplete notify at the target frame, but the contents will not be displayed until a subsequent frame. Signed-off-by: Keith Packard kei...@keithp.com Reviewed-by: Mario Kleiner mario.kleiner...@gmail.com v2 (mario): Trivial rebase on top of 1.17.0-rc2 Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com Tested-by: Mario Kleiner mario.kleiner...@gmail.com --- present/present.c | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/present/present.c b/present/present.c index bb7f33c..2e7662a 100644 --- a/present/present.c +++ b/present/present.c @@ -890,8 +890,25 @@ present_pixmap(WindowPtr window, vblank-queued = TRUE; if ((pixmap target_msc = crtc_msc) || (!pixmap target_msc crtc_msc)) { ret = present_queue_vblank(screen, target_crtc, vblank-event_id, target_msc); -if (ret == Success) +if (ret == Success) { +/* If the window is composited, and the contents are + * destined for the next frame, just do the copy, sending + * damage along to the compositor. + * + * Leave the vblank around to send the completion event at + * vblank time + */ +if (pixmap window vblank-mode == PresentCompleteModeCopy +(target_msc - crtc_msc) = 1 +screen-GetWindowPixmap(window) != screen-GetScreenPixmap(screen)) +{ +DebugPresent((\tC %p %8lld: %08lx - %08lx\n, vblank, crtc_msc, + vblank-pixmap-drawable.id, vblank-window-drawable.id)); +present_copy(vblank); +} + return Success; +} DebugPresent((present_queue_vblank failed\n)); } -- 2.1.0 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 5/5] modesetting: Implement page flipping support for Present.
On 04/28/2015 06:31 AM, Kenneth Graunke wrote: On Tuesday, April 21, 2015 05:58:44 PM Kenneth Graunke wrote: Based on code by Keith Packard, Eric Anholt, and Jason Ekstrand. v2: - Fix double free and flip_count underrun (caught by Mario Kleiner). - Don't leak flip_vblank_event on the error_out path (Mario). - Use the updated ms_flush_drm_events API (Mario, Ken). v3: Hack around DPMS shenanigans. If all monitors are DPMS off, then there is no active framebuffer; attempting to pageflip will hit the error_undo paths, causing us to drmModeRmFB with no framebuffer, which confuses the kernel into doing full modesets and generally breaks things. To avoid this, make ms_present_check_flip check that some CRTCs are enabled and DPMS on. This is an ugly hack that would get better with atomic modesetting, or some core Present work. v4: - Don't do pageflipping if CRTCs are rotated (caught by Jason Ekstrand). - Make pageflipping optional (Option PageFlip in xorg.conf.d), but enabled by default. Signed-off-by: Kenneth Graunke kenn...@whitecape.org Upon further testing, there are still nasty DPMS bugs affecting this patch. We no longer crash the X server, but clients appear to occasionally get stuck. I guess we just need to bite the bullet and unflip on DPMS off. Or switch to atomic modesetting... I retested v4 + the attached patch with some small improvements applied. The patch adds async page flip cap support, fixes that missing zero init which Michel also reported and fixes some small inconsistency in the ms_present_unflip routine. Everything worked nicely, except for the not quite resolved DPMS off bug. No more crashes, but page-flipped clients reliably hang after DPMS off - on. KDE with composition on and page flipping on (Setting Tearing prevention set to Full scene repaints) hangs after each off-on cycle, my own fullscreen app hangs reliably as well if page-flipped swaps are used. Looking at stack traces it seems to me as if somehow after a dpms off/on, a present_event_notify() somehow doesn't get delivered to the server / gets lost somewhere, so clients like mine which wait for swap completion or vblank events hang, and KDE hangs in Mesa when Mesa tries to get a new backbuffer for rendering, runs out of buffers and then waits infinitely for a present notify event from the server in order to reuse one of its busy buffers. -mario From 3d1e2383d7bcd35cf0838f32f26b82b8118a17d5 Mon Sep 17 00:00:00 2001 From: Mario Kleiner mario.kleiner...@gmail.com Date: Fri, 1 May 2015 09:01:41 +0200 Subject: [PATCH 8/8] modesetting: Add async flip support, fix small bugs. Detect if kms driver is async flip capable. Add missing zero init for number of crtc's on counting. Fix small inconsistency. Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- hw/xfree86/drivers/modesetting/present.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/hw/xfree86/drivers/modesetting/present.c b/hw/xfree86/drivers/modesetting/present.c index 0ec0f40..693d4f3 100644 --- a/hw/xfree86/drivers/modesetting/present.c +++ b/hw/xfree86/drivers/modesetting/present.c @@ -504,7 +504,7 @@ ms_present_check_flip(RRCrtcPtr crtc, ScrnInfoPtr scrn = xf86ScreenToScrn(screen); modesettingPtr ms = modesettingPTR(scrn); xf86CrtcConfigPtr config = XF86_CRTC_CONFIG_PTR(scrn); -int num_crtcs_on; +int num_crtcs_on = 0; int i; if (!ms-drmmode.pageflip) @@ -579,7 +579,7 @@ ms_present_unflip(ScreenPtr screen, uint64_t event_id) PixmapPtr pixmap = screen-GetScreenPixmap(screen); Bool ret; -if (!ms_present_check_flip(NULL, screen-root, pixmap, FALSE)) +if (!ms_present_check_flip(NULL, screen-root, pixmap, TRUE)) return; ret = ms_do_pageflip(screen, pixmap, -1, FALSE, event_id); @@ -609,5 +609,14 @@ static present_screen_info_rec ms_present_screen_info = { Bool ms_present_screen_init(ScreenPtr screen) { +ScrnInfoPtr scrn = xf86ScreenToScrn(screen); +modesettingPtr ms = modesettingPTR(scrn); +uint64_t value; +int ret; + +ret = drmGetCap(ms-fd, DRM_CAP_ASYNC_PAGE_FLIP, value); +if (ret == 0 value == 1) +ms_present_screen_info.capabilities |= PresentCapabilityAsync; + return present_screen_init(screen, ms_present_screen_info); } -- 1.9.1 ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] present: Fix missed notify MSC computation
On 02/16/2015 10:49 AM, Chris Wilson wrote: Only treat divisor==0 as async to immediately report the actual vblank. If the user species a non-zero divisor, we should compute the missed vblank properly or else we report too early. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- present/present.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/present/present.c b/present/present.c index 1dddcc0..9a283d4 100644 --- a/present/present.c +++ b/present/present.c @@ -961,7 +961,7 @@ present_notify_msc(WindowPtr window, 0, 0, NULL, NULL, NULL, - PresentOptionAsync, + divisor == 0 ? PresentOptionAsync : 0, target_msc, divisor, remainder, NULL, 0); } Reviewed-by: Mario Kleiner mario.kleiner...@gmail.com -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 1/2] present: Improve scaling of vblank handler
On 02/14/2015 10:58 AM, Chris Wilson wrote: With large numbers of queued vblank, the list iteration on every interupt dominates processing time. If we reorder the list to be in ascending event order, then not only is also likely to be in order for notification queries (i.e. the notification will be near the start of the list), we can also stop iterating when past the target event_id. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- present/present.c | 24 +++- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/present/present.c b/present/present.c index ba88f79..b132f4d 100644 --- a/present/present.c +++ b/present/present.c @@ -482,19 +482,22 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t ust, uint64_t crtc_msc) void present_event_notify(uint64_t event_id, uint64_t ust, uint64_t msc) { -present_vblank_ptr vblank, tmp; +present_vblank_ptr vblank; int s; if (!event_id) return; DebugPresent((\te %lld ust %lld msc %lld\n, event_id, ust, msc)); -xorg_list_for_each_entry_safe(vblank, tmp, present_exec_queue, event_queue) { -if (vblank-event_id == event_id) { +xorg_list_for_each_entry(vblank, present_exec_queue, event_queue) { +int64_t match = event_id - vblank-event_id; +if (match == 0) { present_execute(vblank, ust, msc); return; } +if (match 0) +break; } -xorg_list_for_each_entry_safe(vblank, tmp, present_flip_queue, event_queue) { +xorg_list_for_each_entry(vblank, present_flip_queue, event_queue) { if (vblank-event_id == event_id) { present_flip_notify(vblank, ust, msc); return; @@ -887,7 +890,7 @@ present_pixmap(WindowPtr window, vblank-pixmap-drawable.id, vblank-window-drawable.id, target_crtc, vblank-flip, vblank-sync_flip, vblank-serial)); -xorg_list_add(vblank-event_queue, present_exec_queue); +xorg_list_append(vblank-event_queue, present_exec_queue); vblank-queued = TRUE; if ((pixmap target_msc = crtc_msc) || (!pixmap target_msc crtc_msc)) { ret = present_queue_vblank(screen, target_crtc, vblank-event_id, target_msc); @@ -911,7 +914,7 @@ no_mem: void present_abort_vblank(ScreenPtr screen, RRCrtcPtr crtc, uint64_t event_id, uint64_t msc) { -present_vblank_ptr vblank, tmp; +present_vblank_ptr vblank; if (crtc == NULL) present_fake_abort_vblank(screen, event_id, msc); @@ -922,14 +925,17 @@ present_abort_vblank(ScreenPtr screen, RRCrtcPtr crtc, uint64_t event_id, uint64 (*screen_priv-info-abort_vblank) (crtc, event_id, msc); } -xorg_list_for_each_entry_safe(vblank, tmp, present_exec_queue, event_queue) { -if (vblank-event_id == event_id) { +xorg_list_for_each_entry(vblank, present_exec_queue, event_queue) { +int64_t match = event_id - vblank-event_id; +if (match == 0) { xorg_list_del(vblank-event_queue); vblank-queued = FALSE; return; } +if (match 0) +break; } -xorg_list_for_each_entry_safe(vblank, tmp, present_flip_queue, event_queue) { +xorg_list_for_each_entry(vblank, present_flip_queue, event_queue) { if (vblank-event_id == event_id) { xorg_list_del(vblank-event_queue); return; Looks good to me, also no problems during testing on top of current master, so Reviewed-and-tested-by: Mario Kleiner mario.kleiner...@gmail.com -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 2/2] present: Fix presentation of flips out of order
On 02/14/2015 10:58 AM, Chris Wilson wrote: The flip queue currently only holds events submitted to the driver for flipping, awaiting the completion notifier. It is short. We therefore can speed up interrupt processing by keeping the small number of events ready to be flipped on the end of the flip queue. By appending the events to the flip_queue in the order that they become ready, we also resolve one issue causing Present to display frames out of order. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- present/present.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/present/present.c b/present/present.c index b132f4d..bce41f4 100644 --- a/present/present.c +++ b/present/present.c @@ -327,10 +327,10 @@ present_re_execute(present_vblank_ptr vblank) static void present_flip_try_ready(ScreenPtr screen) { -present_vblank_ptr vblank, tmp; +present_vblank_ptr vblank; -xorg_list_for_each_entry_safe(vblank, tmp, present_exec_queue, event_queue) { -if (vblank-flip_ready) { +xorg_list_for_each_entry(vblank, present_flip_queue, event_queue) { +if (vblank-queued) { present_re_execute(vblank); return; } @@ -623,6 +623,8 @@ present_execute(present_vblank_ptr vblank, uint64_t ust, uint64_t crtc_msc) DebugPresent((\tr %lld %p (pending %p unflip %lld)\n, vblank-event_id, vblank, screen_priv-flip_pending, screen_priv-unflip_event_id)); +xorg_list_del(vblank-event_queue); +xorg_list_append(vblank-event_queue, present_flip_queue); vblank-flip_ready = TRUE; return; } @@ -938,6 +940,7 @@ present_abort_vblank(ScreenPtr screen, RRCrtcPtr crtc, uint64_t event_id, uint64 xorg_list_for_each_entry(vblank, present_flip_queue, event_queue) { if (vblank-event_id == event_id) { xorg_list_del(vblank-event_queue); +vblank-queued = FALSE; return; } } Ditto here. Looks good to me, also no problems during testing on top of current master, so Reviewed-and-tested-by: Mario Kleiner mario.kleiner...@gmail.com -mario ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel