Re: [PATCH xserver] dri2: Invalidate DRI2 buffers for all windows with the same pixmap on swap.

2011-03-25 Thread Mario Kleiner


On Mar 25, 2011, at 3:32 PM, Michel Dänzer wrote:


On Fre, 2011-03-25 at 14:47 +0200, Ville Syrjälä wrote:

On Fri, Mar 25, 2011 at 12:35:37PM +0100, ext Michel Dänzer wrote:

From: Michel Dänzer daen...@vmware.com

Without this, when a compositing manager unredirects a fullscreen  
window which
uses DRI2 and page flipping, the DRI2 buffer information for the  
compositing
manager's output window (typically the Composite Overlay Window  
or root window)

may become stale, resulting in all kinds of hilarity.


Make sense to me.


Thanks, does that mean I have your Reviewed-by: ?



fwiw (haven't touched that part of the code for multiple months), you  
can also have a ...


Reviewed-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
Acked-by: Mario Kleiner mario.klei...@tuebingen.mpg.de

... from me. I was suffering the same problems and was just about to  
spend my weekend on hunting this bug in the xserver, so thanks for  
saving my weekend!


-mario


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: Why no activity on this (nasty X segfault) bug?

2011-04-27 Thread Mario Kleiner

On Apr 27, 2011, at 2:45 PM, Ian Pilcher wrote:


On 04/23/2011 09:25 PM, Ian Pilcher wrote:
If there's a different iteration of the patch that you would like  
me to

test, please point me to it.  Otherwise, I would urge you to get this
thing pushed.


https://bugs.freedesktop.org/show_bug.cgi?id=35452

Anybody out there?

What is needed to get this pushed?



Keith was asking for testers for his refined patch, maybe cc'ing him,  
as i do here, helps.


Original thread was this, between Michel and Keith:

http://lists.x.org/archives/xorg-devel/2011-March/020716.html

The refined patch of Keith that needs testing:

http://www.mail-archive.com/xorg-devel@lists.x.org/msg20650.html

Maybe the fix should also be merged into the 1.9 series? Currently  
shipping distros, e.g., Ubuntu 10.10 all suffer it.


thanks,
-mario



Thanks!

--
== 
==
Ian Pilcher  
arequip...@gmail.com
If you're going to shift my paradigm ... at least buy me dinner  
first.
== 
==


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: Why no activity on this (nasty X segfault) bug?

2011-04-27 Thread Mario Kleiner


On Apr 27, 2011, at 5:12 PM, Ian Pilcher wrote:


On 04/27/2011 10:00 AM, Mario Kleiner wrote:


Original thread was this, between Michel and Keith:

http://lists.x.org/archives/xorg-devel/2011-March/020716.html

The refined patch of Keith that needs testing:

http://www.mail-archive.com/xorg-devel@lists.x.org/msg20650.html



The two patches in the second thread are just the patch from the first
thread split in two, right?



No, Keith patch is more efficient. It avoids walking parts of the  
window tree.


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [RFC] swap event handling fixes

2011-04-29 Thread Mario Kleiner


On Apr 29, 2011, at 11:37 PM, Jesse Barnes wrote:


On Thu, 28 Apr 2011 13:27:18 -0700
Jesse Barnes jbar...@virtuousgeek.org wrote:

I obviously failed to count the swap event structure size after  
adding
and removing fields a few times, and didn't even account for  
padding.  The
end result is that clients today won't receive the sbc_lo field at  
all,

and so will likely stuff junk into that field on the client side (or
zero at best).

This patchset fixes up the structure definitions, bumps the proto
levels, and adds server and client handling code for it all.  It  
should

be forward and backward compatible, but as always review and testing
appreciated.

I think the event_type checking on the client side still needs work;
the field is split now so I need to check the right one on old  
servers.
I'll also add swap support for the new requests in case people  
ever want

to run this stuff between big and little endian machines.


Btw, I've been trying to test these fixes but I can't make the  
existing

code break; I'm getting a nicely incrementing sbc count using the
piglit event test (after adding some code to dump the swap event
fields), so somehow the bits are getting through.  I just don't know
how yet...



One thing you could try as another quick test is to use xtrace on a  
opengl app. I remember seeing weird results for the msc or ust fields  
(i believe) of decoded and printed DRI2SwapComplete events. I always  
thought part of my toolchain was just out of sync, but maybe it was  
just the result of the wrong size/padding.


-mario


--
Jesse Barnes, Intel Open Source Technology Center
___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: xf86-video-ati page flipping fixes

2011-05-05 Thread Mario Kleiner

On May 5, 2011, at 4:23 PM, Ville Syrjälä wrote:


On Thu, May 05, 2011 at 11:46:45AM +0200, Michel Dänzer wrote:

On Mit, 2011-05-04 at 23:51 +0300, Ville Syrjala wrote:
I came to the conclusion that the xserver DRI2 invalidate patches  
that
have been discussed aren't really fixing the problem. I suppose  
they may
make the problem slightly less likely to happen, but at least for  
me that
likelyhood is still very high. The whole mess looks like a simple  
driver

bug to me.


I think the xserver patches are still necessary, otherwise how are  
the

cached DRI2 pPriv-buffers updated for other windows sharing the same
pixmap?


As the real front buffers are not handed out to clients, there isn't
that much reason to update them. Making sure the fake front buffer
contain a more recent snapshot of the real front would be one reason
though.



Could it be that your patch fixes a closely related, but different  
bug? During OpenGL rendering, Mesa/Gallium caches the backbuffer  
assignment and only queries the current assignment from the x-server  
if its drawable has been invalidated, which is what the x-server  
patches do.


I'd also think you will need both patch sets.

-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: xf86-video-ati page flipping fixes

2011-05-05 Thread Mario Kleiner


On May 5, 2011, at 6:06 PM, Ville Syrjälä wrote:


On Thu, May 05, 2011 at 05:09:56PM +0200, Mario Kleiner wrote:

On May 5, 2011, at 4:23 PM, Ville Syrjälä wrote:


On Thu, May 05, 2011 at 11:46:45AM +0200, Michel Dänzer wrote:

On Mit, 2011-05-04 at 23:51 +0300, Ville Syrjala wrote:

I came to the conclusion that the xserver DRI2 invalidate patches
that
have been discussed aren't really fixing the problem. I suppose
they may
make the problem slightly less likely to happen, but at least for
me that
likelyhood is still very high. The whole mess looks like a simple
driver
bug to me.


I think the xserver patches are still necessary, otherwise how are
the
cached DRI2 pPriv-buffers updated for other windows sharing the  
same

pixmap?


As the real front buffers are not handed out to clients, there isn't
that much reason to update them. Making sure the fake front buffer
contain a more recent snapshot of the real front would be one reason
though.



Could it be that your patch fixes a closely related, but different
bug? During OpenGL rendering, Mesa/Gallium caches the backbuffer
assignment and only queries the current assignment from the x-server
if its drawable has been invalidated, which is what the x-server
patches do.

I'd also think you will need both patch sets.


Back buffers are private so a swap for one drawable doesn't need to
invalidate them for other drawables.



Are you sure this is also the case even when kms page flipping is  
used for fullscreen drawables?


The workaround i implemented in my toolkit for the page flipping  
problem when using fullscreen drawables on current x servers was to  
make sure that it always ends a session with an even number of  
completed page-flipping swaps, by checking 'sbc' and issuing an extra  
swap before closing its window if the sbc wasn't an even number. This  
makes sure that the same buffer objects are assigned as front- and  
backbuffer at the time the window closes as when the application  
started and opened its fullscreen window. This fixed any desktop  
corruption i'd usually get when running compiz with non-redirected  
fullscreen windows after quitting my application.


I could be totally wrong, but at least the symptoms and the working  
workaround suggested fullscreen drawables do share their back buffers  
when page flipping is on.


-mario


--
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 2/2] DRI2: Add error message when working around driver bug

2010-10-06 Thread Mario Kleiner

On Oct 6, 2010, at 1:05 PM, Pauli Nieminen wrote:

There isn't API that allows application atomically query for msc  
changes

and schedule swaps. If msc changes dramatically between query and
scheduling application would schedule swap to happen at wrong time.

Because of API limitations driver has to make msc increment for each
vblank affecting the drawable.



Hi Paul,

this is not a driver bug which should be reported as an error, but  
simply handling of what happens if a drawable is moved by the user  
from one display head to a different display head, which likely has a  
different msc count because it runs at a different refresh rate or  
was enabled later than the first crtc.


Therefore i don't think this error printout should be added. It is  
arguably not perfect, but probably good enough for the most common  
cases. The correct solution would be to virtualize the msc counter  
for each drawable and keep track of crtc changes and compensate for  
those at each change. I have some ideas on how to do this properly,  
just didn't have the time to test them.


best,
-mario


Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com
---
 hw/xfree86/dri2/dri2.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index d9b9d57..d70c115 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -860,9 +860,12 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr  
pDraw, CARD64 target_msc,

if (!(*ds-GetMSC)(pDraw, ust, current_msc))
pPriv-last_swap_target = 0;

-   if (current_msc  pPriv-last_swap_target)
+   if (current_msc  pPriv-last_swap_target) {
pPriv-last_swap_target = current_msc;
-
+   xf86DrvMsg(pScreen-myNum, X_ERROR,
+   [DRI2] %s: GetMSC returned swap count that is in 
+   past. Working around driver bug.\n, __func__);
+   }
}

/*
--
1.7.0.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 1/2] DRI2: Avoid call to NULL pointer

2010-10-06 Thread Mario Kleiner

On Oct 6, 2010, at 1:05 PM, Pauli Nieminen wrote:


DDX driver may implement schedule swap without GetMSC. In that case we
can't call GetMSC in DRI2SwapBuffers.



I don't think this check is neccessary. Afaik if the ds-GetMSC entry  
point isn't defined then ds-ScheduleSwap isn't defined either, so  
the !ds-ScheduleSwap  check at the beginning of the routine triggers  
and the fallback path is taken, no?


-mario


Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com
---
 hw/xfree86/dri2/dri2.c |   11 +++
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 228c0e0..d9b9d57 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -856,11 +856,14 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr  
pDraw, CARD64 target_msc,

 * is moved to a crtc with a lower refresh rate, or a crtc that just
 * got enabled.
 */
-   if (!(*ds-GetMSC)(pDraw, ust, current_msc))
-   pPriv-last_swap_target = 0;
+   if (ds-GetMSC) {
+   if (!(*ds-GetMSC)(pDraw, ust, current_msc))
+   pPriv-last_swap_target = 0;

-   if (current_msc  pPriv-last_swap_target)
-   pPriv-last_swap_target = current_msc;
+   if (current_msc  pPriv-last_swap_target)
+   pPriv-last_swap_target = current_msc;
+
+   }

/*
 	 * Swap target for this swap is last swap target + swap interval  
since

--
1.7.0.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 1/2] DRI2: Avoid call to NULL pointer

2010-10-13 Thread Mario Kleiner

On Oct 7, 2010, at 10:06 AM, Pauli Nieminen wrote:


On 07/10/10 00:42 +0200, ext Mario Kleiner wrote:

On Oct 6, 2010, at 1:05 PM, Pauli Nieminen wrote:

DDX driver may implement schedule swap without GetMSC. In that  
case we

can't call GetMSC in DRI2SwapBuffers.



I don't think this check is neccessary. Afaik if the ds-GetMSC entry
point isn't defined then ds-ScheduleSwap isn't defined either, so
the !ds-ScheduleSwap  check at the beginning of the routine triggers
and the fallback path is taken, no?



no. It is perfectly valid to have ScheduleSwap without GetMSC. This is
important for driver that can implement page flipping but doesn't  
have simple

way to expose msc count.



Ok, i'm not sure if such drivers should exist, and this snippet of  
code from DRI2ScreenInit...


if (info-version = 4) {
ds-ScheduleSwap = info-ScheduleSwap;
ds-ScheduleWaitMSC = info-ScheduleWaitMSC;
ds-GetMSC = info-GetMSC;
cur_minor = 3;
} else {
cur_minor = 1;
}

... implied for me that ScheduleSwap and GetMSC come in one all or  
nothing package. But GetMSC could be a null pointer anyway, so it  
doesn't hurt to protect against such configs and i think your patch  
does the right thing in this case.


You can add a...

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de

to your patch if you like. As far as i can see, such a driver would  
be limited to simple glXSwapBuffers() calls though if it doesn't know  
its own msc. It would also have to ignore the target_msc, divisor and  
remainder arguments to ScheduleSwap and only take swapInterval into  
account, as neither a glx client nor the server could provide any  
meaningful values for these arguments without knowing the msc.


-mario



-mario


Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com
---
 hw/xfree86/dri2/dri2.c |   11 +++
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 228c0e0..d9b9d57 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -856,11 +856,14 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr
pDraw, CARD64 target_msc,
 	 * is moved to a crtc with a lower refresh rate, or a crtc that  
just

 * got enabled.
 */
-   if (!(*ds-GetMSC)(pDraw, ust, current_msc))
-   pPriv-last_swap_target = 0;
+   if (ds-GetMSC) {
+   if (!(*ds-GetMSC)(pDraw, ust, current_msc))
+   pPriv-last_swap_target = 0;

-   if (current_msc  pPriv-last_swap_target)
-   pPriv-last_swap_target = current_msc;
+   if (current_msc  pPriv-last_swap_target)
+   pPriv-last_swap_target = current_msc;
+
+   }

/*
 * Swap target for this swap is last swap target + swap interval
since
--
1.7.0.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 1/2] DRI2: Avoid call to NULL pointer

2010-10-13 Thread Mario Kleiner




You can add a...

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de

to your patch if you like. As far as i can see, such a driver would  
be limited to simple glXSwapBuffers() calls though if it doesn't  
know its own


Oops, of course i meant a

Reviewed-by: Mario Kleiner mario.klei...@tuebingen.mpg.de

-mario

msc. It would also have to ignore the target_msc, divisor and  
remainder arguments to ScheduleSwap and only take swapInterval into  
account, as neither a glx client nor the server could provide any  
meaningful values for these arguments without knowing the msc.


-mario



-mario


Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com
---
 hw/xfree86/dri2/dri2.c |   11 +++
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 228c0e0..d9b9d57 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -856,11 +856,14 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr
pDraw, CARD64 target_msc,
 	 * is moved to a crtc with a lower refresh rate, or a crtc  
that just

 * got enabled.
 */
-   if (!(*ds-GetMSC)(pDraw, ust, current_msc))
-   pPriv-last_swap_target = 0;
+   if (ds-GetMSC) {
+   if (!(*ds-GetMSC)(pDraw, ust, current_msc))
+   pPriv-last_swap_target = 0;

-   if (current_msc  pPriv-last_swap_target)
-   pPriv-last_swap_target = current_msc;
+   if (current_msc  pPriv-last_swap_target)
+   pPriv-last_swap_target = current_msc;
+
+   }

/*
 * Swap target for this swap is last swap target + swap interval
since
--
1.7.0.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


 
*

Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
 
*

For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] DRI2: Expose API to set drawable swap limit.

2010-10-25 Thread Mario Kleiner
 in a  
couple of weeks or months, go for it:


1. We maintain a little fifo queue / list of pending swap requests  
for each drawable.
2. The DRI2ScheduleSwap() function just converts the incoming  
swapbuffers request into a little struct with all the parameters  
(target_msc, divisor, remainder, ...) and inserts it at the tail of  
that queue, after checking that swaps_pending  current swaplimit.  
If  (++swaps_pending) == 1, it kicks off a little  
DRI2ReallyScheduleOneSwap() function which dequeues this first  
request from the queue and calls into the ddx-ScheduleSwap()  
routine. At the end of the DRI2SwapComplete() function, when a swap  
really got completed, that DRI2ReallyScheduleOneSwap() function gets  
called and checks if more swap requests are pending in the queue and  
dequeues and processes the oldest request. If the queue is empty, it  
is a no op and the whole thing goes idle again. Maybe  
DRI2ReallyScheduleOneSwap() would also take care of msc jumps due to  
drawables being moved between crtc's etc.


This can't perfectly solve delays in one swapbuffer request due to  
traffic jams in the command stream, but it could prevent that one  
misscheduled swap will trigger the followup ones to go ugly in random  
ways.


The problems with multiple clients blocking on glXWaitForMscOML or  
glXWaitForSbcOML() is a separate issue which would need  
implementation of additional wait queues.


And then, as we discussed on xds, there's the need for proper  
timestamping of copyswaps in the kernel, otherwise this proposal  
would only fix the pageflipped case, not the windowed case.


thanks,
-mario

*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 2/2] DRI2: Add error message when working around driver bug

2010-10-25 Thread Mario Kleiner

On Oct 25, 2010, at 6:52 PM, Jesse Barnes wrote:


On Mon, 25 Oct 2010 17:13:58 +0300
Pauli Nieminen ext-pauli.niemi...@nokia.com wrote:

There isn't API that allows application atomically query for msc  
changes

and schedule swaps. If msc changes dramatically between query and
scheduling application would schedule swap to happen at wrong time.

Because of API limitations driver has to make msc increment for each
vblank affecting the drawable.

Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com
CC: Kristian Høgsberg k...@bitplanet.net
---
 hw/xfree86/dri2/dri2.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index d9b9d57..d70c115 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -860,9 +860,12 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr  
pDraw, CARD64 target_msc,

if (!(*ds-GetMSC)(pDraw, ust, current_msc))
pPriv-last_swap_target = 0;

-   if (current_msc  pPriv-last_swap_target)
+   if (current_msc  pPriv-last_swap_target) {
pPriv-last_swap_target = current_msc;
-
+   xf86DrvMsg(pScreen-myNum, X_ERROR,
+   [DRI2] %s: GetMSC returned swap count that is in 
+   past. Working around driver bug.\n, __func__);
+   }
}


This one scares me a little.  We added this so we could also catch
drawables moving between screens with different msc bases, so this
patch could cause a lot of false positives (no question that the specs
could use some additions here though).

Making it a debug message that only shows up with -verbose would be
fine though.

--
Jesse Barnes, Intel Open Source Technology Center


I agree with Jesse, as a debug message at -verbose it would be fine.

-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 2/2] DRI2: Add error message when working around driver bug

2010-10-27 Thread Mario Kleiner
 bit??


Or even have some dedicated function that the app needs to call to  
acknowledge it realizes what just happened? Or a way for apps to tell  
the server how they want to be treated in such cases?


E.g., one could have a transition counter for each drawable which  
increments at each crtc transition and a way for the app to query the  
current count and to pass the count with each blocking call,  
basically as a cookie? The implementation could compare counts to  
know if an app is aware at function call time that the crtc has  
changed and fail a call if the counts mismatch.


c) Extend the spec with new possible error cases to watch out:  
Unblock such calls in some slightly ugly manner to at least avoid an  
application hang, e.g., just . Somehow notify the application of what  
happened (error return code, or deriveable from the returned  
timestamps/counts/intel_swap_events?), maybe print some warning to  
the server log? How to decide which call after a transition should be  
unblocked and which are valid calls again?


Ideas?


Thanks again for all your work on this.  These are good improvements.



Indeed!
-mario


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] DRI2: Expose API to set drawable swap limit.

2010-10-28 Thread Mario Kleiner

On Oct 28, 2010, at 6:10 PM, Pauli Nieminen wrote:


I think we should have some way for the drivers to back out of this
gracefully or at least cover their tails. E.g., allow the ddx to set
a hard upper limit for the swap_limit, in a new field max_swap_limit.
Your patch could make sure that swap_limit is never set to a value
higher than that. Then the intel and radeon ddx could set
max_swap_limit = 1 unless users somehow (xorg.conf option? dri.conf?)
opt into risky higher max_swap_limits.


True.

I intended DRI2SwapLimit to be called from DDX only. DDX only entry  
wouldn't

need any checks for limit changes.

If we assume that all calls to DRI2SwapLimit are anyway driver  
initiated then

max_swap_limits would only guard driver against it self.

I can add the max_swap_limit if there is real need for it.



Or some part of the implementation (your patch? the ddx itself?)
should output some warning in the x log if a value is selected that
it can't reliably handle, so app developers aren't confused to death
about weird behaviour.



DDX would have to implement the option and warning.



Ok, sounds good. As long as the ddx has control over the settable  
swap_limit, i'm perfectly fine with it.




Good point. I wasn't aware of that. So something like sketched above
should/could be implemented inside the ddx instead of the common
server code. But there, where it can already exchange to the new
front, it should work, right?



yes. It would work.



Good. So that's one way to do it in the ddx.

Thanks,
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 2/2] DRI2: Add error message when working around driver bug

2010-10-28 Thread Mario Kleiner

On Oct 28, 2010, at 6:02 PM, Jesse Barnes wrote:


On Thu, 28 Oct 2010 18:47:09 +0300
Pauli Nieminen ext-pauli.niemi...@nokia.com wrote:
Most of what you have in (b) is pretty straightfoward; even the  
shared
drawable case shouldn't be too bad, since each X connection could  
have

bits indicating whether the counter has been picked up after a CRTC
move.


One option would be adding crct id parameter to calls.

glXGetMscBaseRateOML would return rate, base msc and pipe id where  
this msc

value is valid. Now all MSC calls would take the returned pipe id as
parameter. If pipe id doesn't match current crtc any more then  
call would

fail.

This would allow complex applications to pass same pipe id to  
different

context.

Negative side is that API would have to be changed to include extra
parameter.


Yeah that would be a good extension though; we may as well expose the
fact that different display pipes exist on the system, and have
corresponding MSCs.  Old applications using SGI_video_sync or existing
OML behavior would work like they do today (with an MSC value that may
jump, which we could fix with the virtualize count), and new ones  
would

be pipe aware.



I also like option b) the most, define new spec and api instead of  
working around the old specs limitations.


Another way of doing it, in the same spirit, would be some generation  
counter. Starts with 1, increments each time that something changes  
in the configuration which could influence the display timing and  
mess with the schedule the app had in mind. If the drawable changes  
crtc's. If the crtc changes its video mode (esp. refresh rate) or  
configuration (mirrored/extended desktop, synchronized to other  
crtc's etc.), dpms change etc.


A get call could return the current count, other calls could return  
the count that was valid at time of their processing. E.g.,  
intel_swap_events could code more info like generation count, crtcid.


Apps could pass in the count to the msc related functions and those  
would fail on mismatch to the current count. One could also have one  
special don't care value (e.g., 0) that says I don't care about an  
isolated glitch, because i'm not prepared to handle this anyway. Just  
do something to make sure i don't hang, e.g., fall through a blocking  
glXWaitMscOml() call or swap the buffers immediately on  
glXSwapBuffersMscOML().


I'm also for exposing rather more than less information, like pipe  
configuration, or the ability for the app to decide what it wants, e.g.,


* If the drawable covers multiple crtc's, or is in mirror display  
mode, should one of them define when to swap and the other should  
show tearing, or should each of them sync its swap separately, which  
looks nice, but can throttle redraw rates or possibly exhaust  
resources if the crtc's run and largely different rates.


Windows has the concept of a primary display which defines the swap  
timing on extended desktops. The non-primary display just shows  
tearing. Mac OS behaves similar, except that you don't have control  
over which is the primary display, and some (sometimes buggy)  
heuristic decides for you and gives you the fun of working around it  
by replugging monitors and other fun things. I like control.


The current intel and radeon ddx in page flipped mode will swap each  
crtc separately. Tear free, but with throttled framerate, as swap  
completion == swap completion of the last involved crtc. This btw. is  
a problem for the returned timestamps and timing if the crtc's run at  
different refresh rates, as the app doesn't know to which crtc the  
swap completion timestamp belongs. And it changes over time. For  
blitted copy-swaps you get tearfree on the assigned crtc for a  
drawable and tearing on the other one.


Another approach would be to define swap times in system (gettimeofday 
() time). Specify a swap deadline tWhen and the system tries to swap  
at the earliest vblank with a time tNow = tWhen. Then one doesn't  
have to care too much about changes in msc rates. The  
NV_present_video http://www.opengl.org/registry/specs/NV/ 
present_video.txt extension does something similar for presentatio  
of video buffers. My own toolkit does this as well and for user code  
it's a natural way to specify presentation times, especially if it  
has to synchronize presentation with other modalities like sound,  
digital i/o, eye tracking etc. My code just uses the  
glXGetSyncValuesOML() call to translate a user-provided system time  
tWhen into a corresponding target_msc for glXSwapBuffersMscOML().


When we're at defining new api (christmas time is coming, i got lots  
of wishes), a new swapbuffers call could also define what to do if a  
presentation deadline can't be met. E.g., instead of a delayed swap  
it could drop the swap and completely skip a bufferswap request to  
get presentation timing back on schedule, so skipped frame errors  
can't accumulate. Something like that could be interesting for 

Re: [PATCH v2 5/5] DRI2: Allow DDX to validate swap_limit changes

2010-11-03 Thread Mario Kleiner

The whole patch series looks now good to me. For all of them:

Reviewed-by: Mario Kleiner mario.klei...@tuebingen.mpg.de

thanks,
-mario

On Nov 1, 2010, at 3:22 PM, Pauli Nieminen wrote:


DDX can now implement validation for swap_limit changes to prevent
configurations that are not support in driver.

Signed-off-by: Pauli Nieminen ext-pauli.niemi...@nokia.com
CC: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 hw/xfree86/dri2/dri2.c |   18 +-
 hw/xfree86/dri2/dri2.h |   14 ++
 2 files changed, 31 insertions(+), 1 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 255fed0..7c6a0e2 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -102,6 +102,7 @@ typedef struct _DRI2Screen {
 DRI2ScheduleWaitMSCProcPtr  ScheduleWaitMSC;
 DRI2AuthMagicProcPtrAuthMagic;
 DRI2ReuseBufferNotifyProcPtr ReuseBufferNotify;
+DRI2SwapLimitValidateProcPtr SwapLimitValidate;

 HandleExposuresProcPtr   HandleExposures;

@@ -191,13 +192,23 @@ DRI2AllocateDrawable(DrawablePtr pDraw)
 return pPriv;
 }

+static Bool
+DRI2DefaultSwapLimitValidate(DrawablePtr pDraw, int swap_limit)
+{
+return swap_limit == 1;
+}
+
 Bool
 DRI2SwapLimit(DrawablePtr pDraw, int swap_limit)
 {
 DRI2DrawablePtr pPriv = DRI2GetDrawable(pDraw);
+DRI2ScreenPtr ds = pPriv-dri2_screen;
 if (!pPriv)
return FALSE;

+if (!ds-SwapLimitValidate(pDraw, swap_limit))
+   return FALSE;
+
 pPriv-swap_limit = swap_limit;

 /* Check throttling */
@@ -1134,8 +1145,10 @@ DRI2ScreenInit(ScreenPtr pScreen,  
DRI2InfoPtr info)

 ds-AuthMagic = info-AuthMagic;
 }

-if (info-version = 6)
+if (info-version = 6) {
ds-ReuseBufferNotify = info-ReuseBufferNotify;
+   ds-SwapLimitValidate = info-SwapLimitValidate;
+}

 /*
  * if the driver doesn't provide an AuthMagic function or the  
info struct
@@ -1148,6 +1161,9 @@ DRI2ScreenInit(ScreenPtr pScreen, DRI2InfoPtr  
info)

 goto err_out;
 #endif

+if (!ds-SwapLimitValidate)
+   ds-SwapLimitValidate = DRI2DefaultSwapLimitValidate;
+
 /* Initialize minor if needed and set to minimum provied by  
DDX */

 if (!dri2_minor || dri2_minor  cur_minor)
dri2_minor = cur_minor;
diff --git a/hw/xfree86/dri2/dri2.h b/hw/xfree86/dri2/dri2.h
index 3d01c55..a4bba02 100644
--- a/hw/xfree86/dri2/dri2.h
+++ b/hw/xfree86/dri2/dri2.h
@@ -169,6 +169,19 @@ typedef void		(*DRI2InvalidateProcPtr) 
(DrawablePtr pDraw,

 void *data);

 /**
+ * DRI2 calls this hook when ever swap_limit is going to be  
changed. Default
+ * implementation for the hook only accepts one as swap_limit. If  
driver can
+ * support other swap_limits it has to implement supported limits  
with this

+ * callback.
+ *
+ * \param pDraw drawable whos swap_limit is going to be changed
+ * \param swap_limit new swap_limit that going to be set
+ * \return TRUE if limit is support, FALSE if not.
+ */
+typedef Bool   (*DRI2SwapLimitValidateProcPtr)(DrawablePtr pDraw,
+   int swap_limit);
+
+/**
  * Version of the DRI2InfoRec structure defined in this header
  */
 #define DRI2INFOREC_VERSION 6
@@ -203,6 +216,7 @@ typedef struct {
 /* added in version 6 */

 DRI2ReuseBufferNotifyProcPtr ReuseBufferNotify;
+DRI2SwapLimitValidateProcPtr SwapLimitValidate;
 }  DRI2InfoRec, *DRI2InfoPtr;

 extern _X_EXPORT int DRI2EventBase;
--
1.7.0.4



*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] glx: Remove swap barrier and hyperpipe support

2010-11-22 Thread Mario Kleiner

On 11/19/2010 07:52 PM, Adam Jackson wrote:

Never implemented in any open source driver.  The implementation
assumed explicit DDX driver knowledge of how the client-side driver
worked, since at the time the server's GL renderer was not a DRI driver.
But now, it is, so any implementation of these should be done with
additional DRI driver API, like the swap control extension.


I assume this is only about removing the current unused implementation, 
not about removing support completely from the xserver, right?


At some point i'd like to implement some dri2 support for at least swap 
groups and possibly swap barriers. With multi-head cards like the amd 
eyefinity stuff (6 crtc's) we will need swap groups et al. to make good 
use of so many heads.


As the x-server's dri2/ddx part is responsible for swap scheduling in 
dri2, i assume there will be some need for interaction between the 
client and server. I was hoping to tinker with this stuff within maybe a 
year or so.


thanks,
-mario



Signed-off-by: Adam Jacksona...@redhat.com
---
  glx/Makefile.am  |1 -
  glx/g_disptab.h  |   52 --
  glx/glxcmds.c|  233 --
  glx/glxcmdsswap.c|1 -
  glx/glxdri.c |1 -
  glx/glxdri2.c|1 -
  glx/glxdriswrast.c   |1 -
  glx/glxext.c |   19 +
  glx/glxscreens.c |   27 --
  glx/glxscreens.h |   18 
  glx/indirect_table.c |1 -
  glx/xfont.c  |1 -
  hw/xquartz/GL/indirect.c |2 -
  hw/xwin/glx/indirect.c   |2 -
  14 files changed, 1 insertions(+), 359 deletions(-)
  delete mode 100644 glx/g_disptab.h

diff --git a/glx/Makefile.am b/glx/Makefile.am
index 9d9fa3c..d708872 100644
--- a/glx/Makefile.am
+++ b/glx/Makefile.am
@@ -68,7 +68,6 @@ libglx_la_SOURCES = \
indirect_program.c \
indirect_table.h \
indirect_texture_compression.c \
-g_disptab.h \
  glxbyteorder.h \
  glxcmds.c \
  glxcmdsswap.c \
diff --git a/glx/g_disptab.h b/glx/g_disptab.h
deleted file mode 100644
index 9b4308b..000
--- a/glx/g_disptab.h
+++ /dev/null
@@ -1,52 +0,0 @@
-/* DO NOT EDIT - THIS FILE IS AUTOMATICALLY GENERATED */
-#ifdef HAVE_DIX_CONFIG_H
-#includedix-config.h
-#endif
-
-#ifndef _GLX_g_disptab_h_
-#define _GLX_g_disptab_h_
-/*
- * SGI FREE SOFTWARE LICENSE B (Version 2.0, Sept. 18, 2008)
- * Copyright (C) 1991-2000 Silicon Graphics, Inc. All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the Software),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice including the dates of first publication and
- * either this permission notice or a reference to
- * http://oss.sgi.com/projects/FreeB/
- * shall be included in all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS
- * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * SILICON GRAPHICS, INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
- * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
- * OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
- *
- * Except as contained in this notice, the name of Silicon Graphics, Inc.
- * shall not be used in advertising or otherwise to promote the sale, use or
- * other dealings in this Software without prior written authorization from
- * Silicon Graphics, Inc.
- */
-
-extern int __glXDisp_BindSwapBarrierSGIX(__GLXclientState *cl, GLbyte *pc);
-extern int __glXDisp_QueryMaxSwapBarriersSGIX(__GLXclientState *cl, GLbyte 
*pc);
-extern int __glXDisp_QueryHyperpipeNetworkSGIX(__GLXclientState *cl, GLbyte 
*pc);
-extern int __glXDisp_DestroyHyperpipeConfigSGIX (__GLXclientState *cl, GLbyte 
*pc);
-extern int __glXDisp_QueryHyperpipeConfigSGIX(__GLXclientState *cl, GLbyte 
*pc);
-extern int __glXDisp_HyperpipeConfigSGIX(__GLXclientState *cl, GLbyte *pc);
-
-extern int __glXDispSwap_BindSwapBarrierSGIX(__GLXclientState *cl, GLbyte *pc);
-extern int __glXDispSwap_QueryMaxSwapBarriersSGIX(__GLXclientState *cl, GLbyte 
*pc);
-extern int __glXDispSwap_QueryHyperpipeNetworkSGIX(__GLXclientState *cl, 
GLbyte *pc);
-extern int __glXDispSwap_DestroyHyperpipeConfigSGIX (__GLXclientState *cl, 
GLbyte *pc);
-extern int __glXDispSwap_QueryHyperpipeConfigSGIX(__GLXclientState *cl, GLbyte 
*pc);
-extern int __glXDispSwap_HyperpipeConfigSGIX(__GLXclientState *cl, GLbyte *pc);
-
-#endif 

Re: Swap limit

2010-12-10 Thread Mario Kleiner

On 12/08/2010 05:15 PM, Pauli Nieminen wrote:

On 08/12/10 16:55 +0100, ext Alex Deucher wrote:


One other thing that might be worth adding to DRI2 is a way for the
driver to access the swap interval.  If we could, then the driver
could dynamically disable things like vline waits for buffer blits or
do non-vsynced pageflipping more easily if the swap interval was 0.



Actually I tough DRI2 was already telling SwapInteval with

*swap_target = pPriv-last_swap_target + pPriv-swap_interval;

If swap_target doesn't advance then driver would know to flip as fast as
possible.

But then I noticed a bug.

First one is that we always blit with swapinterval is zero when it should be
possible to flip (even tear free).



Let it tear :). I think swap_interval == 0 should cause immediate swaps 
with tearing. All OpenGL implementations i know (Windows, MacOS classic, 
MacOS/X, Linux with proprietary drivers, old SGI's) interpret a 
swapinterval of zero as swap as soon as rendering is finished, 
immediately, don't sync to scanout cycle and i think that makes the 
most sense. It's good for benchmarking how fast your system can go if 
not throttled by the monitor (useful), for crazy gamers that trade 
visual quality for fps and for special applications that need to 
sometimes control swap timing themselves (like my toolkit).



Code from DRI2ScheduleSwap:
/* Old DDX or no swap interval, just blit */
if (!ds-ScheduleSwap || !pPriv-swap_interval) {



Not a bug, just a half-done way of achieving the expected 
swap_interval zero ;-) [ok, maybe that is a bug].


It schedules an immediate copy-swap via blitting. Unfortunately the ddx 
doesn't know about the swap_interval, so it still synchronizes the 
execution of the blit to vsync via vline waits. That's tear-free, but it 
depends on the location and size of the drawable and the current 
position of the scanout if this will cause an immediate swap (if scanout 
is outside the drawables area) or a vsync'ed swap. It's a bit undefined 
behaviour for non-fullscreen drawables and it effectively enforces a 
minimum swap interval of 1 for fullscreen drawables, which is not what 
we want.


Both the intel ddx and (soon) the ati ddx have a xorg.conf parameter 
SwapBuffersWait to disable vsync completely, then you'd get copy-swaps 
as fast as possible for swap_interval zero, but for a non-zero 
swap_interval you'd have some chance of tearing, so this is alos just a 
band-aid.


If we wanted this for page-flipping we'd need a shortcut similar to this 
one which would bypass the vblank scheduling and call the ddx pageflip 
routine directly, plus an extension to the kernel's pageflip ioctl() to 
allow non-vsynced flips.


And we need a new interface to tell the ddx that a swap should be 
non-vsync'ed.


-mario
___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: Swap limit

2010-12-13 Thread Mario Kleiner

On Dec 13, 2010, at 11:21 AM, Michel Dänzer wrote:


On Mon, 2010-12-13 at 11:35 +0200, Pauli Nieminen wrote:

On 12/12/10 19:49 -0500, ext Jerome Glisse wrote:

On Fri, Dec 10, 2010 at 9:40 AM, Mario Kleiner
mario.klei...@tuebingen.mpg.de wrote:

On 12/08/2010 05:15 PM, Pauli Nieminen wrote:


On 08/12/10 16:55 +0100, ext Alex Deucher wrote:


One other thing that might be worth adding to DRI2 is a way  
for the

driver to access the swap interval.  If we could, then the driver
could dynamically disable things like vline waits for buffer  
blits or
do non-vsynced pageflipping more easily if the swap interval  
was 0.




Actually I tough DRI2 was already telling SwapInteval with

*swap_target = pPriv-last_swap_target + pPriv-swap_interval;

If swap_target doesn't advance then driver would know to flip  
as fast as

possible.

But then I noticed a bug.

First one is that we always blit with swapinterval is zero when  
it should

be
possible to flip (even tear free).



Let it tear :). I think swap_interval == 0 should cause  
immediate swaps with
tearing. All OpenGL implementations i know (Windows, MacOS  
classic, MacOS/X,
Linux with proprietary drivers, old SGI's) interpret a  
swapinterval of zero
as swap as soon as rendering is finished, immediately, don't  
sync to

scanout cycle and i think that makes the most sense. It's good for
benchmarking how fast your system can go if not throttled by the  
monitor
(useful), for crazy gamers that trade visual quality for fps and  
for special
applications that need to sometimes control swap timing  
themselves (like my

toolkit).


Code from DRI2ScheduleSwap:
/* Old DDX or no swap interval, just blit */
if (!ds-ScheduleSwap || !pPriv-swap_interval) {



Not a bug, just a half-done way of achieving the expected  
swap_interval

zero ;-) [ok, maybe that is a bug].

It schedules an immediate copy-swap via blitting. Unfortunately  
the ddx
doesn't know about the swap_interval, so it still synchronizes  
the execution
of the blit to vsync via vline waits. That's tear-free, but it  
depends on
the location and size of the drawable and the current position  
of the
scanout if this will cause an immediate swap (if scanout is  
outside the
drawables area) or a vsync'ed swap. It's a bit undefined  
behaviour for
non-fullscreen drawables and it effectively enforces a minimum  
swap interval

of 1 for fullscreen drawables, which is not what we want.

Both the intel ddx and (soon) the ati ddx have a xorg.conf  
parameter
SwapBuffersWait to disable vsync completely, then you'd get  
copy-swaps as
fast as possible for swap_interval zero, but for a non-zero  
swap_interval

you'd have some chance of tearing, so this is alos just a band-aid.

If we wanted this for page-flipping we'd need a shortcut similar  
to this one
which would bypass the vblank scheduling and call the ddx  
pageflip routine
directly, plus an extension to the kernel's pageflip ioctl() to  
allow

non-vsynced flips.

And we need a new interface to tell the ddx that a swap should be
non-vsync'ed.



Unless i miss read hw spec all pageflip are always vsync no matter
what you do. IIRC the crtc sancout address is updated during vblank
from the register, so what is scanned next is what ever address was
written last.



If I remember correctly RADEON_CRTC_OFFSET_FLIP_CNTL had some  
effect on that.


Correct, if that bit is 1 the flip happens on the next horizontal  
blank

(the non-KMS radeon DRM always sets this bit for flipping). There's
probably something similar for the AVIVO display engine.



On Radeons you can select if flip should happen at next vblank or  
hblank. Alex's initial pageflip patches for Radeon used this even for  
vblank sync'd flips, so we know it works, but we changed to a  
different method for higher overall robustness. I assume other gpu's  
aren't different, as on os/x and windows, all gpu's will swap  
immediately (with tearing) when you set the swap_interval to zero.


Adding this to the implementation wouldn't be difficult (at least for  
radeon's), it's just that the current pageflip ioctl() doesn't  
support an option to pass this to the kms-driver. Same with the ddx.  
Currently there isn't a specific interface for the x-server to  
communicate vsync vs. non-vsync to the ddx.


I have to read the other e-mails in this thread first, esp. Pauli's,  
i'm a bit slow at the moment.

-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: Swap limit

2010-12-13 Thread Mario Kleiner

On Dec 10, 2010, at 8:00 PM, Jesse Barnes wrote:


On Fri, 10 Dec 2010 15:40:38 +0100
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:

It schedules an immediate copy-swap via blitting. Unfortunately  
the ddx

doesn't know about the swap_interval, so it still synchronizes the
execution of the blit to vsync via vline waits. That's tear-free,  
but it

depends on the location and size of the drawable and the current
position of the scanout if this will cause an immediate swap (if  
scanout
is outside the drawables area) or a vsync'ed swap. It's a bit  
undefined

behaviour for non-fullscreen drawables and it effectively enforces a
minimum swap interval of 1 for fullscreen drawables, which is not  
what

we want.


That's actually the problem; swapbufferswait isn't *quite* equivalent
to swap interval = 1, since you can potentially do multiple blits
before the scanline intercepts your rect(s).

But regardless, we should add an API option to allow blits to happen
immediately when swap_interval = 0 both with and without tear
avoidance.  I think Pauli's followup covers this (tearing swap limit 0
vs tear free swap limit 0).



Ja, if i understand Pauli's proposal correctly, it's non-vsync'ed  
flips/copies for the tearing case and some kind of limited n- 
buffering + dropping frames. I mostly like it, but i think clients  
should have control over the desired behaviour.


The non-vsync'ed flips as i proposed it would be as easy as  
communicating swap_interval to the ddx (or the ddx querying it) and  
then the ddx not vsync'ing its copies or flips. I think that's the  
same as what Pauli proposes for the tearing case? I don't think using  
target_msc == current_msc is a good way of choosing this. Usually you  
want a flip to be delayed in this case, not a complete rendered frame  
to be dropped. At least i would panic if dropping frames would be  
suddenly the default behaviour of the ddx in case of target_msc ==  
current_msc, unless there's extra api to control this.


The reason i like this swap_interval == 0 -- tearing immedate flip  
is mostly because


a) It's consistent with what all other os'es and non-free drivers do.  
As a toolkit developer i like it if the default behaviour of  
different platforms is somewhat consistent for a similar existing  
api, it makes porting and maintenance simpler.


b) It can be used as a hack for userspace toolkits to allow somewhat  
synchronized swaps across drawables if there isn't proper api to  
control this. E.g., if you have two drawables on two display heads  
and you want to swap them simultaneously and at least somewhat tear- 
free, you can do glFinish on both drawables contexts, so the  
drawables are swap-ready, then schedule a swap_interval = 1 swap on  
one drawable, wait for it to complete its swap and then immediately  
schedule a swap_interval = 0 swap on the 2nd drawable. You will get a  
tear-free update of the 1st drawable and depending on latencies and  
how well the displays are synchronized, you may get a tear-free  
update on the 2nd head as well.


Ideally we should have api to control this, maybe some  
OML_sync_control_2 extension. Ideally, i'd like to have a swapbuffers  
call that allows to...


...define the swap deadline by target_msc or target system time.
...vsync or non-vsync
...If we have n-buffering, what to do for each queued swap if we fall  
behind the target presentation time. Present the buffer, tear-free  
and just accumulate more delay, or drop the buffer to catch up.
...synchronize multiple drawables in their swap behaviour - the  
sgi_swap_group extension could handle that.


For a extension of the pageflip ioctl() it would be good if we could  
pass some flags or more info to the kms driver. Currently we have a  
bitfield to pass flags to the drm's ioctl() entry point, but they  
aren't passed through to the kms drivers implementation of the ioctl.


E.g.:

- A flag to ask for non-vsynced pageflip
- A flag or mask to implement synchronized flips across crtc's, to  
make mirror mode, swap_group support and drawables that span multiple  
crtc's robust.
- Possible future flags to allow to implement frame-sequential  
stereo, e.g., to tell if you're queuing a new future frontbuffer for  
the left-eye or right-eye.


-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH xserver/dri2 2/3] DRI2WaitSbc(): Fixes for correct semantic of glXWaitForSbcOML()

2010-02-20 Thread Mario Kleiner
Added implementation for case target_sbc == 0. In that case, the
function shall schedule a wait until all pending swaps for the drawable
have completed.

Fix for non-blocking case. Old implementation returned random,
uninitialized values for (ust,msc,sbc) if it returned immediately
without scheduling a wait due to sbc = target_sbc.

Now if function doesn't schedule a wait, but returns immediately,
it returns the (ust,msc,sbc) of the most recently completed swap,
i.e., the UST and MSC corresponding to the time when the returned
current SBC was reached.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 hw/xfree86/dri2/dri2.c |   25 +++--
 1 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index baf0df8..7f40d28 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -64,6 +64,8 @@ typedef struct _DRI2Drawable {
 CARD64  swap_count;
 CARD64  target_sbc; /* -1 means no SBC wait outstanding */
 CARD64  last_swap_target; /* most recently queued swap target 
*/
+CARD64  last_swap_msc; /* msc at completion of most recent 
swap */
+CARD64  last_swap_ust; /* ust at completion of most recent 
swap */
 int swap_limit; /* for N-buffering */
 } DRI2DrawableRec, *DRI2DrawablePtr;
 
@@ -142,6 +144,8 @@ DRI2CreateDrawable(DrawablePtr pDraw)
 pPriv-swap_interval = 1;
 pPriv-last_swap_target = -1;
 pPriv-swap_limit = 1; /* default to double buffering */
+pPriv-last_swap_msc = 0;
+pPriv-last_swap_ust = 0;
 
 if (pDraw-type == DRAWABLE_WINDOW)
 {
@@ -518,6 +522,9 @@ DRI2SwapComplete(ClientPtr client, DrawablePtr pDraw, int 
frame,
 if (swap_complete)
swap_complete(client, swap_data, type, ust, frame, pPriv-swap_count);
 
+pPriv-last_swap_msc = frame;
+pPriv-last_swap_ust = ust;
+
 DRI2WakeClient(client, pDraw, frame, tv_sec, tv_usec);
 }
 
@@ -714,8 +721,22 @@ DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, CARD64 
target_sbc,
 if (pPriv == NULL)
return BadDrawable;
 
-if (pPriv-swap_count = target_sbc)
-   return Success;
+/* target_sbc == 0 means to block until all pending swaps are
+ * finished. Recalculate target_sbc to get that behaviour.
+ */
+if (target_sbc == 0)
+target_sbc = pPriv-swap_count + pPriv-swapsPending;
+
+/* If current swap count already = target_sbc,
+ * return immediately with (ust, msc, sbc) triplet of
+ * most recent completed swap.
+ */
+if (pPriv-swap_count = target_sbc) {
+*sbc = pPriv-swap_count;
+*msc = pPriv-last_swap_msc;
+*ust = pPriv-last_swap_ust;
+return Success;
+}
 
 pPriv-target_sbc = target_sbc;
 DRI2BlockClient(client, pDraw);
-- 
1.6.6

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH xserver/dri2] Improvements to DRI2 2.2 protocol bits.

2010-02-20 Thread Mario Kleiner
The following series of 3 patches contains some bug fixes and adds some
missing functionality to the new DRI2 2.2 protocol requests for
implementation of the SGI_swap_interval and OML_sync_control GLX
extensions.

Jesse Barnes already had a positive look at these patches, but i couldn't test
any of the patches, not even if they compile, due to lack of suitable hardware.

Could somebody please test these and apply them if appropriate?

Thanks,
mario

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH xserver/dri2] Fixes and Improvements for DRI 2.2 protocol bits

2010-02-20 Thread Mario Kleiner
Hi,

the patch series against the xserver i just sent out contains bug  
fixes and adds missing functionality for the new DRI2 2.2 protocol  
requests. It should improve the handling of the SGI_swap_interval and  
OML_video_sync GLX extensions.

Jesse Barnes already had a positive look at the patches. Due to lack  
of suitable hardware i wasn't able to test any of these, not even if  
they compile.

Could somebody please test these and apply them if appropriate?

Thanks,
mario

*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Improvements to Intel DDX for implementation of DRI2 OML_sync_control extensions.

2010-02-21 Thread Mario Kleiner
This series of 2 patches fixes issues in the Intel DDX implementation
of I830DRI2ScheduleWaitMSC() and I830DRI2ScheduleSwap().

The previous implementation should mostly work in the simple
glXSwapBuffers() case, although it may reduce the maximum
swaprate to half the video refresh interval, e.g., max 30 fps
at a refresh of 60 Hz. It won't work correctly for the
glXSwapBuffersMscOML(target_msc, divisor, remainder) case if
divisor or remainder are non-zero or target_msc is not  2
frames ahead of the current vblank count. It doesn't handle
the (msc % divisor) == remainder case correctly.

A similar problem exists for calls to glXWaitForMscOML().

These patches should fix above issues, or at least make
handling more robust.

Jesse Barnes already had a look at these patches, but due to
lack of suitable hardware, i haven't tested them myself, not
even if they compile cleanly.

Could somebody please give them some good testing and merge
them if appropriate?

Thanks,
-mario

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH xf86-video-intel 2/2] Fix handling of target_msc, divisor, remainder in I830DRI2ScheduleSwap()

2010-02-21 Thread Mario Kleiner
The current code in I830DRI2ScheduleSwap() only schedules the
correct vblank events for the case divisor == 0, i.e., the
simple glXSwapBuffers() case.

In a glXSwapBuffersMscOML() request, divisor can be  0, which would go
wrong.

This modified code should handle target_msc, divisor, remainder and
the different cases defined in the OML_sync_control extension
correctly for the divisor  0 case.

It also tries to make sure that the effective framecount of swap satisfies
all constraints, taking the 1 frame delay in pageflipping mode and possible
delays in blitting/exchange mode due to DRM_VBLANK_NEXTONMISS into account.

The swap_interval logic in the X-Servers DRI2SwapBuffers() call expects
the returned swap_target from the DDX to be reasonably accurate, otherwise
implementation of swap_interval for the glXSwapBuffers() as defined in the
SGI_swap_interval extension may become unreliable.

For non-pageflipped mode, the returned swap_target is always correct due to the
adjustments done by drmWaitVBlank(), as DRM_VBLANK_NEXTONMISS is set.

In pageflipped mode, DRM_VBLANK_NEXTONMISS can't be used without
severe impact on performance, so the code in I830DRI2ScheduleSwap()
must make manual adjustments to the returned vbl.reply.sequence number.

This patch adds the needed adjustments.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/i830_dri.c |  110 +++
 1 files changed, 70 insertions(+), 40 deletions(-)

diff --git a/src/i830_dri.c b/src/i830_dri.c
index d5e085a..f3cd739 100644
--- a/src/i830_dri.c
+++ b/src/i830_dri.c
@@ -586,6 +586,7 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, 
DRI2BufferPtr front,
int ret, pipe = I830DRI2DrawablePipe(draw), flip = 0;
DRI2FrameEventPtr swap_info;
enum DRI2FrameEventType swap_type = DRI2_SWAP;
+   CARD64 current_msc;
 
swap_info = xcalloc(1, sizeof(DRI2FrameEventRec));
 
@@ -629,6 +630,8 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, 
DRI2BufferPtr front,
return FALSE;
}
 
+   current_msc = vbl.reply.sequence;
+
/* Flips need to be submitted one frame before */
if (DRI2CanFlip(draw)  !intel-shadow_present 
intel-use_pageflipping) {
@@ -638,55 +641,81 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, 
DRI2BufferPtr front,
 
swap_info-type = swap_type;
 
+   /* Correct target_msc by 'flip' if swap_type == DRI2_FLIP.
+* Do it early, so handling of different timing constraints
+* for divisor, remainder and msc vs. target_msc works.
+*/
+   if (*target_msc  0)
+   *target_msc -= flip;
+   
/*
-* If divisor is zero, we just need to make sure target_msc passes
-* before waking up the client.
+* If divisor is zero, or current_msc is smaller than target_msc,
+* we just need to make sure target_msc passes before initiating the 
swap.
 */
-   if (divisor == 0) {
-   vbl.request.type = DRM_VBLANK_NEXTONMISS |
-   DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT;
-   if (pipe  0)
-   vbl.request.type |= DRM_VBLANK_SECONDARY;
-
-   vbl.request.sequence = *target_msc;
-   vbl.request.sequence -= flip;
-   vbl.request.signal = (unsigned long)swap_info;
-   ret = drmWaitVBlank(intel-drmSubFD, vbl);
-   if (ret) {
-   xf86DrvMsg(scrn-scrnIndex, X_WARNING,
-  divisor 0 get vblank counter failed: %s\n,
-  strerror(errno));
-   return FALSE;
-   }
-
-   *target_msc = vbl.reply.sequence;
-   swap_info-frame = *target_msc;
-
-   return TRUE;
+   if (divisor == 0 || current_msc  *target_msc) {
+   vbl.request.type = DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT;
+   /* If non-pageflipping, but blitting/exchanging, we need to use
+* DRM_VBLANK_NEXTONMISS to avoid unreliable timestamping later on.
+*/
+   if (flip == 0)
+   vbl.request.type |= DRM_VBLANK_NEXTONMISS;
+   if (pipe  0)
+   vbl.request.type |= DRM_VBLANK_SECONDARY;
+   
+   /* If target_msc already reached or passed, set it to
+* current_msc to ensure we return a reasonable value back
+* to the caller. This makes swap_interval logic more robust.
+*/
+   if (current_msc = *target_msc)
+   *target_msc = current_msc;
+   
+   vbl.request.sequence = *target_msc;
+   vbl.request.signal = (unsigned long)swap_info;
+   ret = drmWaitVBlank(intel-drmSubFD, vbl);
+   if (ret) {
+   xf86DrvMsg(scrn-scrnIndex, X_WARNING,
+  divisor 0 get vblank counter failed: %s\n

[PATCH xf86-video-intel 1/2] Fix I830DRI2ScheduleWaitMSC() to correctly handle target_msc, divisor and remainder.

2010-02-21 Thread Mario Kleiner
Previous code only handled divisor == 0 case correctly. This should
honor a given target_msc for the divisor  0 case and handle the
(msc % divisor) == remainder constraint correctly.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/i830_dri.c |   19 +++
 1 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/i830_dri.c b/src/i830_dri.c
index e1c1470..d5e085a 100644
--- a/src/i830_dri.c
+++ b/src/i830_dri.c
@@ -756,6 +756,7 @@ I830DRI2ScheduleWaitMSC(ClientPtr client, DrawablePtr draw, 
CARD64 target_msc,
 DRI2FrameEventPtr wait_info;
 drmVBlank vbl;
 int ret, pipe = I830DRI2DrawablePipe(draw);
+CARD64 current_msc;
 
 /* Drawable not visible, return immediately */
 if (pipe == -1) {
@@ -785,11 +786,13 @@ I830DRI2ScheduleWaitMSC(ClientPtr client, DrawablePtr 
draw, CARD64 target_msc,
return FALSE;
 }
 
+current_msc = vbl.reply.sequence;
+
 /*
- * If divisor is zero, we just need to make sure target_msc passes
- * before waking up the client.
+ * If divisor is zero, or current_msc is smaller than target_msc,
+ * we just need to make sure target_msc passes before waking up the client.
  */
-if (divisor == 0) {
+if (divisor == 0 || current_msc  target_msc) {
vbl.request.type = DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT;
if (pipe  0)
vbl.request.type |= DRM_VBLANK_SECONDARY;
@@ -815,16 +818,16 @@ I830DRI2ScheduleWaitMSC(ClientPtr client, DrawablePtr 
draw, CARD64 target_msc,
 if (pipe  0)
vbl.request.type |= DRM_VBLANK_SECONDARY;
 
+vbl.request.sequence = current_msc - (current_msc % divisor) + remainder;
+
 /*
- * If we have no remainder and the condition isn't satisified, it means
+ * If calculated remainder is larger than requested remainder, it means
  * we've passed the last point where seq % divisor == remainder, so we need
  * to wait for the next time that will happen.
  */
-if (((vbl.reply.sequence % divisor) != remainder)  !remainder)
-   vbl.request.sequence += divisor;
+if ((current_msc % divisor)  remainder)
+   vbl.request.sequence += divisor;
 
-vbl.request.sequence = vbl.reply.sequence - (vbl.reply.sequence % divisor) 
+
-   remainder;
 vbl.request.signal = (unsigned long)wait_info;
 ret = drmWaitVBlank(intel-drmSubFD, vbl);
 if (ret) {
-- 
1.6.6

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: Improvements to Intel DDX for implementation of DRI2 OML_sync_control extensions.

2010-03-03 Thread Mario Kleiner

They are against Jesse Barnes xf86-video-intel tree:

http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/log/
git://people.freedesktop.org/~jbarnes/xf86-video-intel master branch

The tree is in the same state as when i wrote the patches - they are  
following commit


http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/commit/? 
id=dd1930673565e70ec247f7f96ca0bdc575eecfb0


Ideas to fix? I'm a git newbie.

thanks,
-mario

On Mar 4, 2010, at 1:28 AM, Eric Anholt wrote:

On Sun, 21 Feb 2010 18:45:46 +0100, Mario Kleiner  
mario.klei...@tuebingen.mpg.de wrote:

This series of 2 patches fixes issues in the Intel DDX implementation
of I830DRI2ScheduleWaitMSC() and I830DRI2ScheduleSwap().

The previous implementation should mostly work in the simple
glXSwapBuffers() case, although it may reduce the maximum
swaprate to half the video refresh interval, e.g., max 30 fps
at a refresh of 60 Hz. It won't work correctly for the
glXSwapBuffersMscOML(target_msc, divisor, remainder) case if
divisor or remainder are non-zero or target_msc is not  2
frames ahead of the current vblank count. It doesn't handle
the (msc % divisor) == remainder case correctly.

A similar problem exists for calls to glXWaitForMscOML().

These patches should fix above issues, or at least make
handling more robust.

Jesse Barnes already had a look at these patches, but due to
lack of suitable hardware, i haven't tested them myself, not
even if they compile cleanly.

Could somebody please give them some good testing and merge
them if appropriate?


These don't seem to apply, and I can't quite work out what code  
they're

against


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] DRI2: fixup handling of last_swap_target

2010-03-04 Thread Mario Kleiner

Jesse,

i think this patch breaks the patch series i sent out a week ago, the  
one we discussed in length offlist and which you reviewed and  
recommended for inclusion.


How to proceed? Could you try to merge my patches to your tree, then  
reapply your slightly modified patch again? That seems to require  
less shuffling around of code and less time to check everything's  
still sane.


My patch already breaks stuff up similar to yours, you'd only need to  
move a block of setup code fro last_swap_target out of  
DRI2SwapBuffers and into DRI2CreateDrawable again.


-mario


On Mar 4, 2010, at 6:23 PM, Jesse Barnes wrote:


[Sorry for the resend, forgot to cc Keith]

We need to initialize the swap target, which is passed to the  
driver to

schedule events.  Rather than using -1 to indicate that the field is
uninitialized, just make sure we initialize it at drawable creation
time.

Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index cd69ca0..301f4fd 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -116,9 +116,12 @@ DRI2GetDrawable(DrawablePtr pDraw)
 int
 DRI2CreateDrawable(DrawablePtr pDraw)
 {
+DRI2ScreenPtr   ds = DRI2GetScreen(pDraw-pScreen);
 WindowPtr  pWin;
 PixmapPtr  pPixmap;
 DRI2DrawablePtr pPriv;
+CARD64  ust;
+int ret;

 pPriv = DRI2GetDrawable(pDraw);
 if (pPriv != NULL)
@@ -141,7 +144,10 @@ DRI2CreateDrawable(DrawablePtr pDraw)
 pPriv-swap_count = 0;
 pPriv-target_sbc = -1;
 pPriv-swap_interval = 1;
-pPriv-last_swap_target = -1;
+ret = (*ds-GetMSC)(pDraw, ust, pPriv-last_swap_target);
+if (!ret)
+   pPriv-last_swap_target = 0;
+
 pPriv-swap_limit = 1; /* default to double buffering */

 if (pDraw-type == DRAWABLE_WINDOW)
@@ -575,7 +581,6 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr  
pDraw, CARD64 target_msc,

 DRI2ScreenPtr   ds = DRI2GetScreen(pDraw-pScreen);
 DRI2DrawablePtr pPriv;
 DRI2BufferPtr   pDestBuffer = NULL, pSrcBuffer = NULL;
-CARD64  ust;
 int ret, i;

 pPriv = DRI2GetDrawable(pDraw);
@@ -617,27 +622,6 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr  
pDraw, CARD64 target_msc,

 }

 /*
- * In the simple glXSwapBuffers case, all params will be 0,  
and we just
- * need to schedule a swap for the last swap target + the swap  
interval.
- * If the last swap target hasn't been set yet, call into the  
driver

- * to get the current count.
- */
-if (target_msc == 0  divisor == 0  remainder == 0 
-   pPriv-last_swap_target  0) {
-   ret = (*ds-GetMSC)(pDraw, ust, target_msc);
-   if (!ret) {
-   xf86DrvMsg(pScreen-myNum, X_ERROR,
-  [DRI2] %s: driver failed to return current MSC\n,
-  __func__);
-   return BadDrawable;
-   }
-}
-
-/* First swap needs to initialize last_swap_target */
-if (pPriv-last_swap_target  0)
-   pPriv-last_swap_target = target_msc;
-
-/*
  * Swap target for this swap is last swap target + swap  
interval since
  * we have to account for the current swap count, interval,  
and the

  * number of pending swaps.
___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: Improvements to Intel DDX for implementation of DRI2 OML_sync_control extensions.

2010-03-04 Thread Mario Kleiner
Is there a simple git way to switch to that tree, or would i need to  
checkout the whole tree again and then copy  paste chunks of code  
from the old tree to the new one manually?


-mario

On Mar 4, 2010, at 4:56 PM, Jesse Barnes wrote:


Oh sorry, that tree is out of date since I pushed everything to
master.  The tree to generate fixes against is
git://git.freedesktop.org/git/xorg/driver/xf86-video-intel.

Jesse

On Thu, 4 Mar 2010 05:24:38 +0100
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:


They are against Jesse Barnes xf86-video-intel tree:

http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/log/
git://people.freedesktop.org/~jbarnes/xf86-video-intel master branch

The tree is in the same state as when i wrote the patches - they are
following commit

http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/commit/?
id=dd1930673565e70ec247f7f96ca0bdc575eecfb0

Ideas to fix? I'm a git newbie.

thanks,
-mario

On Mar 4, 2010, at 1:28 AM, Eric Anholt wrote:


On Sun, 21 Feb 2010 18:45:46 +0100, Mario Kleiner
mario.klei...@tuebingen.mpg.de wrote:

This series of 2 patches fixes issues in the Intel DDX
implementation of I830DRI2ScheduleWaitMSC() and
I830DRI2ScheduleSwap().

The previous implementation should mostly work in the simple
glXSwapBuffers() case, although it may reduce the maximum
swaprate to half the video refresh interval, e.g., max 30 fps
at a refresh of 60 Hz. It won't work correctly for the
glXSwapBuffersMscOML(target_msc, divisor, remainder) case if
divisor or remainder are non-zero or target_msc is not  2
frames ahead of the current vblank count. It doesn't handle
the (msc % divisor) == remainder case correctly.

A similar problem exists for calls to glXWaitForMscOML().

These patches should fix above issues, or at least make
handling more robust.

Jesse Barnes already had a look at these patches, but due to
lack of suitable hardware, i haven't tested them myself, not
even if they compile cleanly.

Could somebody please give them some good testing and merge
them if appropriate?


These don't seem to apply, and I can't quite work out what code
they're
against


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)





--
Jesse Barnes, Intel Open Source Technology Center


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] DRI2: fixup handling of last_swap_target

2010-03-04 Thread Mario Kleiner

On Mar 4, 2010, at 7:06 PM, Jesse Barnes wrote:


Yeah, I'll try to apply your series on top and re-post it.


Thanks. I'll just wait for updates from you then. Can i simply pull  
from your own xserver branch again to get a consistent view of the  
new situation? I'd then look at the thing again and see if anything  
still clashes or not.




I know this one affects some of the things we discussed, since it  
makes

offscreen drawing vs. swap interval even more broken than before (i.e.
swap interval and throttling will both be broken when a drawable isn't
being displayed), but on the other hand it'll prevent hangs, so that's
good.


Don't think the combined patches will introduce more offscreen  
brokenness than before, but reduce brokenness for the onscreen/non  
redirected cases quite a bit.


I just meant that you changed lines of code very close to my changes,  
so my stuff will probably fail to merge and looking at it it seems  
that the amount of breakage/merge conflicts to fix will be much  
smaller if my patch is applied first, then yours on top of it instead  
of the other way round.


-mario



Thanks,
Jesse

On Thu, 4 Mar 2010 19:03:15 +0100
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:


Jesse,

i think this patch breaks the patch series i sent out a week ago, the
one we discussed in length offlist and which you reviewed and
recommended for inclusion.

How to proceed? Could you try to merge my patches to your tree, then
reapply your slightly modified patch again? That seems to require
less shuffling around of code and less time to check everything's
still sane.

My patch already breaks stuff up similar to yours, you'd only need to
move a block of setup code fro last_swap_target out of
DRI2SwapBuffers and into DRI2CreateDrawable again.

-mario


On Mar 4, 2010, at 6:23 PM, Jesse Barnes wrote:


[Sorry for the resend, forgot to cc Keith]

We need to initialize the swap target, which is passed to the
driver to
schedule events.  Rather than using -1 to indicate that the field is
uninitialized, just make sure we initialize it at drawable creation
time.

Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index cd69ca0..301f4fd 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -116,9 +116,12 @@ DRI2GetDrawable(DrawablePtr pDraw)
 int
 DRI2CreateDrawable(DrawablePtr pDraw)
 {
+DRI2ScreenPtr   ds = DRI2GetScreen(pDraw-pScreen);
 WindowPtr  pWin;
 PixmapPtr  pPixmap;
 DRI2DrawablePtr pPriv;
+CARD64  ust;
+int ret;

 pPriv = DRI2GetDrawable(pDraw);
 if (pPriv != NULL)
@@ -141,7 +144,10 @@ DRI2CreateDrawable(DrawablePtr pDraw)
 pPriv-swap_count = 0;
 pPriv-target_sbc = -1;
 pPriv-swap_interval = 1;
-pPriv-last_swap_target = -1;
+ret = (*ds-GetMSC)(pDraw, ust, pPriv-last_swap_target);
+if (!ret)
+   pPriv-last_swap_target = 0;
+
 pPriv-swap_limit = 1; /* default to double buffering */

 if (pDraw-type == DRAWABLE_WINDOW)
@@ -575,7 +581,6 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr
pDraw, CARD64 target_msc,
 DRI2ScreenPtr   ds = DRI2GetScreen(pDraw-pScreen);
 DRI2DrawablePtr pPriv;
 DRI2BufferPtr   pDestBuffer = NULL, pSrcBuffer = NULL;
-CARD64  ust;
 int ret, i;

 pPriv = DRI2GetDrawable(pDraw);
@@ -617,27 +622,6 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr
pDraw, CARD64 target_msc,
 }

 /*
- * In the simple glXSwapBuffers case, all params will be 0,
and we just
- * need to schedule a swap for the last swap target + the swap
interval.
- * If the last swap target hasn't been set yet, call into the
driver
- * to get the current count.
- */
-if (target_msc == 0  divisor == 0  remainder == 0 
-   pPriv-last_swap_target  0) {
-   ret = (*ds-GetMSC)(pDraw, ust, target_msc);
-   if (!ret) {
-   xf86DrvMsg(pScreen-myNum, X_ERROR,
-  [DRI2] %s: driver failed to return current MSC\n,
-  __func__);
-   return BadDrawable;
-   }
-}
-
-/* First swap needs to initialize last_swap_target */
-if (pPriv-last_swap_target  0)
-   pPriv-last_swap_target = target_msc;
-
-/*
  * Swap target for this swap is last swap target + swap
interval since
  * we have to account for the current swap count, interval,
and the
  * number of pending swaps.
___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm

Re: [PATCH] DRI2: fixup handling of last_swap_target

2010-03-06 Thread Mario Kleiner
 it was blocked, otherwise there are a race conditions.



thanks,
-mario

*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH xf86-video-intel 2/2] Fix handling of target_msc, divisor, remainder in I830DRI2ScheduleSwap()

2010-03-07 Thread Mario Kleiner

On Mar 5, 2010, at 10:09 PM, Jesse Barnes wrote:


I just fixed these up and pushed them into the tree along with another
fix for handling offscreen drawables better.  Tests indicate that OML
swap divisor/remainder stuff is working correctly now.



Thanks for doing that. But stuff got seriously garbled in  
I830ScheduleWaitMSC() inside src/i830_dri.c - this won't work  
correctly for any glXWaitForMscOML(target_msc, divisor, remainder)  
call, except for the special cases with divisor == 0!


This passage...

/*
 * If divisor is zero, or current_msc is smaller than  
target_msc,
 * we just need to make sure target_msc passes  before  
waking up the

 * client.
 */
if (divisor == 0) {
...

should read as...

if (divisor == 0 ||  current_msc  target_msc) {

This passage ...

/*
 * If the calculated  remainder and the condition isn't  
satisified, it
 * means we've passed the last point where seq % divisor ==  
remainder,

 * so we need to wait for the next time that will happen.
 */
if ((current_msc % divisor) != remainder)
vbl.request.sequence += divisor;

... should be replaced by ...

vbl.request.sequence = current_msc - (current_msc % divisor) +  
remainder;


/*
 * If calculated remainder is larger than requested remainder,  
it means
 * we've passed the last point where seq % divisor == remainder,  
so we need

 * to wait for the next time that will happen.
 */
if ((current_msc % divisor)  remainder)
   vbl.request.sequence += divisor;


Other than that, it's fine.

Oh and in I830DRI2ScheduleSwap() , this statement ...

if (pipe  0)
vbl.request.type |= DRM_VBLANK_SECONDARY;

... accidentally got copied twice into the if () clause, which is not  
harmful, but a bit redundant :-)


best,
-mario


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] DRI2: fixup handling of last_swap_target

2010-03-07 Thread Mario Kleiner
. So its not just something totally far fetched.


So i think you should have some call to DRI2Throttleclient at the  
beginning of DRI2ScheduleSwap() as well for more determinstic behaviour?



I thought quite a bit about such effects while looking at your code.  
I think a really cool and more bullet-proof implementation of the  
OML_sync_control spec that would allow a client to render multiple  
frames ahead / have multiple swaps pending without getting into  
trouble with a few race-conditions in the current code is possible,  
but it would require a more dynamic scheduling of what happens when.  
Maybe we can discuss such refinements when your first iteration of  
the dri2 swap bits is finished and out there.


-mario




On Mar 7, 2010, at 6:10 PM, Jesse Barnes wrote:


On Sun, 7 Mar 2010 08:44:51 +0100
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:


On Mar 5, 2010, at 6:50 PM, Jesse Barnes wrote:



Ok pushed fixes for these issues to the repo above.  I know  
you've put
a lot of time into this already, but can you take another look to  
make

sure I got everything?


No problem.


  Note I left the swap interval default at 1
since GLX_MESA_swap_control should let us set it to 0 to get
unthrottled behavior, which I just fixed and tested.



Agreed, that's ok. GLX_MESA_swap_control is a way to do it.

The current state looks almost good, esp. your fixes to
DRI2SwapBuffers for swap_interval == 0 and for target_msc == 0.

But a few more things:

In DRI2SwapBuffers, in the body of ...

 /* Old DDX or no swap interval, just blit */
 if (!ds-ScheduleSwap || !pPriv-swap_interval) {
...

... it calls DRI2SwapComplete and passes target_msc as the 'msc'
value of swap completion. Would be probably better to pass a zero
value, just as it is done in the Intel DDX when the swap doesn't
happen at all at the requested target_msc, but rather immediately due
to some problems. If ds-ScheduleSwap isn't available then the 'msc'
is basically undefined and unknown, so zero is a better return value.
In the !pPriv-swap_interval case one could call a *ds-getMSC() to
get a more meaningful return value for 'msc'. Don't know if it's
worth the effort, but a zero return would be better than 'target_msc'
imho.


Right, good idea.  That way software using the new event won't get
bogus values with old DDXes.

We need to add another patch to Mesa. Mesa translates a  
glXSwapBuffers

() call into a DRI2SwapBuffers() call with target_msc, divisor and
remainder all zero and DRI2SwapBuffers takes this as hint that
glXSwapBuffers behaviour is requested. However these are also legal
arguments to a glXSwapBuffersMscOML(..., ,0,0,0) call, which has to
behave differently than a glXSwapBuffers call  -- In this case,
DRI2SwapBuffers would confuse this with a glXSwapBuffers request and
do the wrong thing.

I'd propose adding this line to the __glXSwapBuffersMscOML() routine
in mesa's /src/glx/x11/glxcmds.c file:

if (target_msc == 0  divisor == 0  remainder == 0) remainder = 1;

- if target_msc and divisor are zero, then the remainder is ignored
inside the DRI implementation, so the actual value of remainder
doesn't matter for swap behaviour, but can be used to disambiguate
glXSwapBuffers vs. glXSwapBuffersMsc...  for a glXSwapBuffersMscOML
(0,0,0) call.


Ok, I'll have to check the behaviors again, but this sounds  
reasonable.



Your deferred pPriv deletion logic looks now good to me. But if
deletion is deferred then i think almost all public functions
inside dri2.c should return 'BadDrawable' not only for pPriv == NULL
but also for pPriv-refCount == 0, e.g., DRI2SwapBuffers,
DRI2GetMSC, ... From the viewpoint of client code, a refCount == 0 is
the same as a destroyed drawable. For all practical purposes it is
already gone. Maybe it would make sense to consolidate the (pPriv ==
NULL || pPriv-refCount == 0) check into a macro or function,
something like DRI2IsDrawableAlive(pPriv); and then call that in most
DRI2xxx functions?


Yeah, that would make things cleaner, I'll do that.


I believe there's also another problem with the pPriv-blockedClient
logic. The implementation only keeps track of blockedClient, but not
for the reason why the client blocked. This can cause trouble in
DRI2WakeClient. Consider this example which i think would cause the
client to hang:

1. Assume current msc is 1000 and there are 2 threads in the client.

// Thread 1: Request swap at target_msc 1010:
2. glXSwapBuffersMscOML(dpy, drawable, 1010, 0, 0);

// Thread 2: Request wait until target_msc 2000:
3. glXWaitForMscOML(dpu, drawable, 2000, 0, 0, );

4. Thread 1: Calls XDestroyWindow(dpy, drawable);

Step 2 will schedule a swap for msc = 1010.
Step 3 will pPriv-blockedClient = client and IgnoreClient(client)
the client and schedule a wakeup at msc = 2000.
The xconnection dpy is now blocked.

Step 4 doesn't execute yet, because the xconnection is blocked.

Now at msc = 1010 - swap completes - DRI2SwapComplete -
DRI2WakeClient - executes

Re: [PATCH xf86-video-intel 2/2] Fix handling of target_msc, divisor, remainder in I830DRI2ScheduleSwap()

2010-03-07 Thread Mario Kleiner

On Mar 7, 2010, at 6:18 PM, Jesse Barnes wrote:


Arg, I did botch that patch.  And of course I only tested the swap
buffers behavior and not OML's WaitMSC so I didn't catch it.  I'll
improve the test and push the fix.



No problem. Just fyi: I noticed you added a test in  
I830DRI2ScheduleSwap() that outputs X_Warnings if a swap is scheduled  
more than 100 vblanks ahead. At least for the users of my toolkit,  
scheduling a swap more than 100 vblanks ahead wouldn't be something  
noteworthy but quite a typical case of normal use of the system. At  
the end of a work day some users would probably find ten thousands of  
such warnings in some log, assuming those warnings get logged at the  
normal log level. So this usage case is less weird than one would  
think :-)


-mario


___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] DRI2: fixup handling of last_swap_target

2010-03-07 Thread Mario Kleiner


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] DRI2: fixup handling of last_swap_target

2010-03-08 Thread Mario Kleiner

On Mar 8, 2010, at 6:07 PM, Jesse Barnes wrote:


On Sun, 7 Mar 2010 21:16:21 +0100
Florian Mickler flor...@mickler.org wrote:


On Sun, 7 Mar 2010 09:10:51 -0800
Jesse Barnes jbar...@virtuousgeek.org wrote:


On Sun, 7 Mar 2010 08:44:51 +0100
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:


Yeah, this could also happen without OML I think, if a few swaps  
were
queued (resulting in a block) and then a SGI_video_sync call  
occurred.

I'll fix it up.

Thanks,


could this also happen with a single glxgears instance and no other
3d clients running?


I don't *think* glxgears does this; it's been running fine for me at
least.
But you could definitely be hitting one of the MSC races; you can add
some debug output to I830DRI2ScheduleSwap in src/i830_dri.c in the DDX
driver to see what's going on.  Presumably one of the paths there is
putting the client to sleep and never waking it.


The glxgears.c of mesa only does draw - glXSwapBuffers - draw -  
glXSwapBuffers - ...
It doesn't use any of the new api's, so the only way it can block is  
probably via the DRI2ThrottleClient call when requesting a new  
backbuffer.


I see at least one problematic thing. In the xserver's dri2.c file in  
DRI2SwapBuffers(), the pPriv-swapsPending++ statement is *after* the  
call into the ddx's I830DRI2ScheduleSwap() function, instead of  
*before* it. If a swap completes inside the ddx, it will call  
DRI2SwapComplete() which will do a pPriv-swapsPending-- before it  
was incremented to mark a swap as pending. That is, if swapsPending  
was zero before, it will now wrap around to 0x. This will  
happen if the fallback_blit path inside the ddx is used, e.g., if the  
drawable is not visible (yet).


Not sure if this by itself is sufficient for the hang, because the  
call to pPriv-swapsPending++ after return from I830DRI2ScheduleSwap  
() should fix this, but if something else goes wrong and an error  
path is taken or an event not delivered, then swapsPending could get  
stuck at 0x or a similar high number, which would trigger  
DRI2ThrottleClient and block the client connection without ever  
waking it up again, as clients are only woken up if a swap completes,  
which can't happen if no swap is pending. - hang in _XReply on the  
client side.


So we should probable move pPriv-swapsPending++ before the call to...

ret = (*ds-ScheduleSwap)(client, pDraw, pDestBuffer, pSrcBuffer,
  swap_target, divisor, remainder, func,  
data);


... in DRI2SwapBuffers() and probably add a pPriv-swapsPending--  
into the error handling path directly after the call to ds- 
ScheduleSwap.


-mario

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH xf86-video-intel 2/2] Fix handling of target_msc, divisor, remainder in I830DRI2ScheduleSwap()

2010-03-08 Thread Mario Kleiner


On Mar 8, 2010, at 8:34 PM, Jesse Barnes wrote:

Ok just pushed these fixes; omlsync seems to do the right thing now
with both waits and swaps.



Sorry to torture you further, almost there ;-) -- the devil is in the  
details.


In I830DRI2ScheduleWaitMSC():

At the end, in this if statement...

if ((current_msc % divisor)  remainder)
vbl.request.sequence += divisor;

... the comparison should be = instead of  , that is:

if ((current_msc % divisor) = remainder)

I reread the spec and realized the true meaning of this little  
snippet ...Otherwise, the function will block until the MSC value is  
incremented to a value such that MSC % divisor = remainder...


It doesn't say until the msc *is* such that msc % divisor =  
remainder, but it says until the *msc is **incremented** to a  
value*. That means they don't want it to return immediately if the  
current msc already satisfies the constraint, but they want it to  
return basically at the start of the vblank interval when the msc  
reaches a value that satisfies the constraint. The idea is to  
synchronize the client's execution to the vblank interval. If a
client just wants to wait for a certain msc without precise sync to  
the vblank interval, it can simply pass a divisor==0 and then will  
get that behaviour.


Changing the comparison from a  to a = above achieves that goal.  
This makes lots of sense if i think about how i would actually use  
that function in my toolkit or similar apps. I'd mostly want it to  
synchronize to the exact *start* of certain video refesh cycles.


Then we should add this check to the if(divisor == 0 || current_msc   
target_msc) {  } branch:


if (current_msc = target_msc)
target_msc = current_msc;

This is similar to the check in I830DRI2ScheduleSwap. It guarantees  
that the target_msc can't fall behind the current_msc. This is  
important for all scheduled vblank events because the DRM will do the  
wrong thing if the requested vblank count is more than 2^23 counts  
behind the current vblank count. Events would get stuck forever if  
this happened due to a 32 bit wraparound, not good.



What else? I rethought the idea of virtualizing the msc into a 64 bit  
value from the 32 count of the kernel. It is a bit more tricky than  
one would expect, so probably not a quick thing do to correctly - and  
not a thing to do now.


But we could add some simple masking to the very top of  
I830DRI2ScheduleSwap() and I830DRI2ScheduleWaitMSC() to truncate all  
msc related input values from the server to 32 bits, ie.


in I830DRI2ScheduleWaitMSC()

target_msc = 0x;
divisor = 0x;
remainder = 0x;

in I830DRI2ScheduleSwap()

*target_msc = 0x;
divisor = 0x;
remainder = 0x;

At least the few simple cases my brain can handle with paper   
pencil testing seem to do the right thing if a 32 bit vblank counter  
wraparound happens. At worst, there would be a small glitch every  
time the counter wraps around - about once every 4-12 months.  
Inbetween everything should work. I doubt that an app will regularly  
schedule swaps or waits multiple months ahead of time and still  
expect it to work perfectly ;-).


Finally in I830DRI2ScheduleWaitMSC() the error handling is a bit  
inconsistent. Sometimes it returns failure to calling code, which  
would provoke a client hang due to a missing _XReply, sometimes it  
recovers and unstucks the client by providing a fake response.  
Similar to the blit_fallback: path in I830DRI2ScheduleSwap() you  
could maybe have a common error handler that at least calls  
DRI2WaitMSCComplete(client, draw, 0, 0, 0); to prevent the client  
from hanging?


Ok, i'm hopefully running out of more ideas for now ;-)
-mario

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


Re: DRI2 fixes

2010-03-24 Thread Mario Kleiner

On Mar 22, 2010, at 11:03 PM, Jesse Barnes wrote:

This is a collection of fixes from my personal server tree  
targeting the

1.8 release.  They're mostly small fixes, but they fix a few important
(i.e. common) cases with the new protocol code.

Please review; I'll make any necessary changes, add the Reviewed-by
tags, and push to Keith.



Hi Jesse,

done. You can add a reviewed-by from me on all relevant patches if  
you like -- i pulled them from your xserver-tree and checked on that.  
I didn't test any of them (no suitable hardware) but did a paper +  
pencil + brain review and looks good to me.


thanks,
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[Patches][nouveau/ddx]: Improvements to bufferswap implementation and timestamping

2012-02-15 Thread Mario Kleiner
Hi,

here a set of patches against the nouveau-ddx. This is an extended and
revised set, based on Francisco Jerez feedback from autumn last year.

[1/9] Makes pageflipping work again on X-Server 1.12rc. It apparently stopped
working somewhere around Xorg 1.11+.

[2/9] Implements handling of pageflip completion events from the kernel.
Francisco Jerez argument against including it was that the x-server didn't
have a swaplimit api, so this couldn't be applied at the same time as the
pseudo triple-buffering hack which is in place in the current ddx.
Now we have the swaplimit api in 1.12, so this problem should be solved.

[3/9] Makes use of the swaplimit api. A new xorg.conf option SwapLimit
allows to select a swaplimit of 1 or 2 for double-buffering or
triple-buffering. It defaults to 2 for Xorg 1.12+ and 1 for older servers.
This way, on 1.12+ nouveau retains the kind of triple buffering behaviour it
currently has, but swap completion timestamping (OML_sync_control,
INTEL_swap_events, and client swap throttling) works conforming to
the specs. On older servers it removes triple-buffering but makes
nouveau conform to the specs. This is important for apps that need
precise and reliable swap scheduling and timestamping.

[4/9] A bug fix against corrupted desktop when switching from redirected
to non-redirected fullscreen windows under a desktop compositor.
Fixes FDO bug #35452.

[5/9] Some fixes to swap scheduling, revised according to Francisco's
review.

[6/9] Fixes swap throttling for non-fullscreen windows under a desktop
compositor. Split into a separate patch according to Francisco's
feedback.

[7/9] An attempt to provide more sane swap completion events for non-
fullscreen windows. This is a bit of cheating, as it delivers the
events for the earliest point in time one would expect the swap
to complete, assuming only a lightly loaded gpu. Real completion
could be later. I think this is an improvement because the current
implementation delivers swap complete events with timestamps and
counts that signal swap completion before the client even requested
the swap.

[8/9] This one adds Francisco's original triple-buffering hack back
for Xorg 1.11 and earlier servers if the xorg.conf option requests
a swaplimit  1. Users can choose between triple-buffering but
broken timestamping or double-buffering with sane timestamping.

[9/9] Fixes a corner case which could cause the ddx to segfault
with its current triple-buffering implementation.

I've tested these on single-display and dual-display setups,
with/without compositor, with/without redirection and checked
the robustness and precision of the timestamps with special
measurement equipment, on XOrg 1.12-rc2 and 1.10. They work
pretty well for me and finally make nouveau very useable for
the kind of scientific applications that require precise
swap scheduling and timestamping, so i'd love to see them
reviewed and hopefully included into the ddx soon.

A couple of things are a bit of hacks:

[3/9] I think the setup of a default swap limit would be
better done in the x-server itself instead of the ddx. The
setup code is a bit awkward, hijacking the ddx-CreateBuffer
function to apply the swaplimit. A DRI2GetSwapLimit() function
is also missing from the server, which could help in the
future to track swaplimit changes by other clients than the ddx
itself. Unfortunately it is too late for the 1.12 release to do
this.

[8/9] Don't know if this is still wanted/needed or not?

[9/9] It fixes the problem and doesn't affect performance, but is
somewhat of a hack. I don't know how to do better, maybe somebody
else has a better solution?

Thanks,
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH 1/9] dri2: Fix can_exchange() to allow page-flipping on new servers.

2012-02-15 Thread Mario Kleiner
can_exchange() fails on at least Xorg 1.12+. This fixes
it in the same way it was fixed in the ati  intel ddx.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/nouveau_dri2.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 3aa5ec5..5b62425 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -160,7 +160,7 @@ can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr 
src_pix)
return ((DRI2CanFlip(draw)  pNv-has_pageflip)) 
dst_pix-drawable.width == src_pix-drawable.width 
dst_pix-drawable.height == src_pix-drawable.height 
-   dst_pix-drawable.depth == src_pix-drawable.depth 
+   dst_pix-drawable.bitsPerPixel == 
src_pix-drawable.bitsPerPixel 
dst_pix-devKind == src_pix-devKind;
 }
 
-- 
1.7.5.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH 2/9] dri2: Implement handling of pageflip completion events.

2012-02-15 Thread Mario Kleiner
Requests pageflip completion events from the kernel.
Implements pageflip completion handler to finalize
and timestamp swaps.

Completion handler includes a consistency check, and
disambiguation if multiple crtc's are involved in a
pageflip (e.g., clone mode, extendend desktop). Only
the timestamp of the crtc whose vblank event initially
triggered the swap is used, but handler waits for flip
completion on all involved crtc's before completing the
swap and releasing the old framebuffer.

This code is almost identical to the code used in the
ati/radeon ddx and intel ddx.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/drmmode_display.c |  107 +++--
 src/nouveau_dri2.c|   89 ++--
 src/nv_proto.h|5 ++-
 3 files changed, 191 insertions(+), 10 deletions(-)

diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index 75ef6dd..9e15c29 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -83,6 +83,21 @@ typedef struct {
 drmmode_prop_ptr props;
 } drmmode_output_private_rec, *drmmode_output_private_ptr;
 
+typedef struct {
+drmmode_ptr drmmode;
+unsigned old_fb_id;
+int flip_count;
+void *event_data;
+unsigned int fe_frame;
+unsigned int fe_tv_sec;
+unsigned int fe_tv_usec;
+} drmmode_flipdata_rec, *drmmode_flipdata_ptr;
+
+typedef struct {
+drmmode_flipdata_ptr flipdata;
+Bool dispatch_me;
+} drmmode_flipevtcarrier_rec, *drmmode_flipevtcarrier_ptr;
+
 static void drmmode_output_dpms(xf86OutputPtr output, int mode);
 
 static drmmode_ptr
@@ -1245,13 +1260,17 @@ drmmode_cursor_init(ScreenPtr pScreen)
 }
 
 Bool
-drmmode_page_flip(DrawablePtr draw, PixmapPtr back, void *priv)
+drmmode_page_flip(DrawablePtr draw, PixmapPtr back, void *priv,
+ unsigned int ref_crtc_hw_id)
 {
ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum];
xf86CrtcConfigPtr config = XF86_CRTC_CONFIG_PTR(scrn);
drmmode_crtc_private_ptr crtc = config-crtc[0]-driver_private;
drmmode_ptr mode = crtc-drmmode;
int ret, i, old_fb_id;
+   int emitted = 0;
+   drmmode_flipdata_ptr flipdata;
+   drmmode_flipevtcarrier_ptr flipcarrier;
 
old_fb_id = mode-fb_id;
ret = drmModeAddFB(mode-fd, scrn-virtualX, scrn-virtualY,
@@ -1264,24 +1283,64 @@ drmmode_page_flip(DrawablePtr draw, PixmapPtr back, 
void *priv)
return FALSE;
}
 
+   flipdata = calloc(1, sizeof(drmmode_flipdata_rec));
+   if (!flipdata) {
+   xf86DrvMsg(scrn-scrnIndex, X_WARNING,
+   flip queue: data alloc failed.\n);
+   goto error_undo;
+   }
+
+   flipdata-event_data = priv;
+   flipdata-drmmode = mode;
+
for (i = 0; i  config-num_crtc; i++) {
crtc = config-crtc[i]-driver_private;
 
if (!config-crtc[i]-enabled)
continue;
 
+   flipdata-flip_count++;
+
+   flipcarrier = calloc(1, sizeof(drmmode_flipevtcarrier_rec));
+   if (!flipcarrier) {
+   xf86DrvMsg(scrn-scrnIndex, X_WARNING,
+  flip queue: carrier alloc failed.\n);
+   if (emitted == 0)
+   free(flipdata);
+   goto error_undo;
+   }
+
+   /* Only the reference crtc will finally deliver its page flip
+* completion event. All other crtc's events will be discarded.
+*/
+   flipcarrier-dispatch_me = ((1  i) == ref_crtc_hw_id);
+   flipcarrier-flipdata = flipdata;
+
ret = drmModePageFlip(mode-fd, crtc-mode_crtc-crtc_id,
- mode-fb_id, 0, priv);
+ mode-fb_id, DRM_MODE_PAGE_FLIP_EVENT,
+ flipcarrier);
if (ret) {
xf86DrvMsg(scrn-scrnIndex, X_WARNING,
   flip queue failed: %s\n, strerror(errno));
-   return FALSE;
+
+   free(flipcarrier);
+   if (emitted == 0)
+   free(flipdata);
+   goto error_undo;
}
+
+   emitted++;
}
 
-   drmModeRmFB(mode-fd, old_fb_id);
+   /* Will release old fb after all crtc's completed flip. */
+   flipdata-old_fb_id = old_fb_id;
 
return TRUE;
+
+error_undo:
+   drmModeRmFB(mode-fd, mode-fb_id);
+   mode-fb_id = old_fb_id;
+   return FALSE;
 }
 
 #ifdef HAVE_LIBUDEV
@@ -1347,6 +1406,42 @@ drmmode_uevent_fini(ScrnInfoPtr scrn)
 }
 
 static void
+drmmode_flip_handler(int fd, unsigned int frame, unsigned int tv_sec,
+unsigned int tv_usec, void *event_data)
+{
+   drmmode_flipevtcarrier_ptr flipcarrier

[PATCH 3/9] dri2: Add support for DRI2SwapLimit() API.

2012-02-15 Thread Mario Kleiner
Uses the new DRI2SwapLimit() API of X-Server 1.12+
to allow to change the maximum number of pending
swaps on a drawable before the OpenGL client is
throttled by the server.

The new optional xorg.conf parameter SwapLimit
allows to select a new swap limit = 1. The default
swap limit is 2 for triple-buffering on XOrg 1.12+,
1 for double-buffering on older servers, as we can't
change the swap limit there.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 man/nouveau.man|   11 +++
 src/nouveau_dri2.c |   29 +++--
 src/nv_const.h |2 ++
 src/nv_driver.c|   34 ++
 src/nv_type.h  |3 +++
 5 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/man/nouveau.man b/man/nouveau.man
index dd9d938..59f6c1a 100644
--- a/man/nouveau.man
+++ b/man/nouveau.man
@@ -93,6 +93,17 @@ will assign xrandr outputs LVDS and VGA-0 to this instance 
of the driver.
 .TP
 .BI Option \*qPageFlip\*q \*q boolean \*q
 Enable DRI2 page flipping. Default: on.
+.TP
+.BI Option \*qSwapLimit\*q \*q integer \*q
+Set maximum allowed number of pending OpenGL double-buffer swaps for
+a drawable before a client is blocked.
+.br
+A value of 1 corresponds to double-buffering. A value of 2 corresponds
+to triple-buffering. Higher values may allow higher framerate, but also
+increase lag for interactive applications, e.g., games. Nouveau currently
+supports a maximum value of 2 on XOrg 1.12+ and a maximum of 1 on older 
servers.
+.br
+Default: 2 for XOrg 1.12+, 1 for older servers.
 .SH SEE ALSO
 __xservername__(__appmansuffix__), __xconfigfile__(__filemansuffix__), 
Xserver(__appmansuffix__), X(__miscmansuffix__)
 .SH AUTHORS
diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index acef08a..2908e56 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -42,6 +42,11 @@ nouveau_dri2_create_buffer(DrawablePtr pDraw, unsigned int 
attachment,
} else {
WindowPtr pwin = (WindowPtr)pDraw;
ppix = pScreen-GetWindowPixmap(pwin);
+
+#if DRI2INFOREC_VERSION = 6
+   /* Set initial swap limit on drawable. */
+   DRI2SwapLimit(pDraw, pNv-swap_limit);
+#endif
}
 
ppix-refcnt++;
@@ -208,6 +213,20 @@ nouveau_wait_vblank(DrawablePtr draw, int type, CARD64 msc,
return 0;
 }
 
+#if DRI2INFOREC_VERSION = 6
+static Bool
+nouveau_dri2_swap_limit_validate(DrawablePtr draw, int swap_limit)
+{
+   ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum];
+   NVPtr pNv = NVPTR(scrn);
+
+   if ((swap_limit  1 ) || (swap_limit  pNv-max_swap_limit))
+   return FALSE;
+
+   return TRUE;
+}
+#endif
+
 static void
 nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame,
 unsigned int tv_sec, unsigned int tv_usec,
@@ -293,8 +312,10 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
 * not, to prevent it from blocking the client on the next
 * GetBuffers request (and let the client do triple-buffering).
 *
-* XXX - The DRI2SwapLimit() API will allow us to move this to
-*   the flip handler with no FPS hit.
+* XXX - The DRI2SwapLimit() API allowed us to move this to
+*   the flip handler with no FPS hit for page flipped swaps.
+*   It is still needed for copy swaps as we lack a method
+*   to detect true swap completion for DRI2_BLIT_COMPLETE.
 */
DRI2SwapComplete(s-client, draw, frame, tv_sec, tv_usec,
 type, s-func, s-data);
@@ -534,6 +555,10 @@ nouveau_dri2_init(ScreenPtr pScreen)
dri2.ScheduleWaitMSC = nouveau_dri2_schedule_wait;
dri2.GetMSC = nouveau_dri2_get_msc;
 
+#if DRI2INFOREC_VERSION = 6
+   dri2.SwapLimitValidate = nouveau_dri2_swap_limit_validate;
+#endif
+
return DRI2ScreenInit(pScreen, dri2);
 }
 
diff --git a/src/nv_const.h b/src/nv_const.h
index a27a951..5c232d4 100644
--- a/src/nv_const.h
+++ b/src/nv_const.h
@@ -15,6 +15,7 @@ typedef enum {
 OPTION_GLX_VBLANK,
 OPTION_ZAPHOD_HEADS,
 OPTION_PAGE_FLIP,
+OPTION_SWAP_LIMIT,
 } NVOpts;
 
 
@@ -28,6 +29,7 @@ static const OptionInfoRec NVOptions[] = {
 { OPTION_GLX_VBLANK,   GLXVBlank,OPTV_BOOLEAN,   {0}, FALSE },
 { OPTION_ZAPHOD_HEADS, ZaphodHeads,  OPTV_STRING,{0}, FALSE },
 { OPTION_PAGE_FLIP,PageFlip, OPTV_BOOLEAN,   {0}, 
FALSE },
+{ OPTION_SWAP_LIMIT,   SwapLimit,OPTV_INTEGER,   {0}, FALSE },
 { -1,   NULL,   OPTV_NONE,  {0}, FALSE }
 };
 
diff --git a/src/nv_driver.c b/src/nv_driver.c
index 87ef2c4..5def531 100644
--- a/src/nv_driver.c
+++ b/src/nv_driver.c
@@ -31,6 +31,9 @@
 #include xf86drm.h
 #include xf86drmMode.h
 #include nouveau_drm.h
+#ifdef DRI2
+#include dri2.h
+#endif
 
 /*
  * Forward definitions for the functions

[PATCH 4/9] dri2: Update front buffer pixmap and name before exchanging buffers

2012-02-15 Thread Mario Kleiner
Buffer exchange assumes that the front buffer pixmap and name
information is accurate. That may not be the case eg. if the window
has been (un)redirected since the buffer was created.

This is a translation to nouveau of a fix that was originally developed
by Ville Syrjala syrj...@sci.fi for the ati/radeon ddx to fix the
same bug there.

See thread at:

http://lists.x.org/archives/xorg-devel/2011-May/021908.html

Fixes FDO bug #35452.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/nouveau_dri2.c |   45 ++---
 1 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 2908e56..8608678 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -149,6 +149,35 @@ struct nouveau_dri2_vblank_state {
 };
 
 static Bool
+update_front(DrawablePtr draw, DRI2BufferPtr front)
+{
+   int r;
+   PixmapPtr pixmap;
+   struct nouveau_dri2_buffer *nvbuf = nouveau_dri2_buffer(front);
+
+   if (draw-type == DRAWABLE_PIXMAP)
+   pixmap = (PixmapPtr)draw;
+   else
+   pixmap = (*draw-pScreen-GetWindowPixmap)((WindowPtr)draw);
+
+   pixmap-refcnt++;
+
+   exaMoveInPixmap(pixmap);
+   r = nouveau_bo_handle_get(nouveau_pixmap_bo(pixmap), front-name);
+   if (r) {
+   (*draw-pScreen-DestroyPixmap)(pixmap);
+   return FALSE;
+   }
+
+   (*draw-pScreen-DestroyPixmap)(nvbuf-ppix);
+   front-pitch = pixmap-devKind;
+   front-cpp = pixmap-drawable.bitsPerPixel / 8;
+   nvbuf-ppix = pixmap;
+
+   return TRUE;
+}
+
+static Bool
 can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr src_pix)
 {
ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum];
@@ -234,13 +263,14 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
 {
ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum];
NVPtr pNv = NVPTR(scrn);
-   PixmapPtr dst_pix = nouveau_dri2_buffer(s-dst)-ppix;
+   PixmapPtr dst_pix;
PixmapPtr src_pix = nouveau_dri2_buffer(s-src)-ppix;
-   struct nouveau_bo *dst_bo = nouveau_pixmap_bo(dst_pix);
+   struct nouveau_bo *dst_bo;
struct nouveau_bo *src_bo = nouveau_pixmap_bo(src_pix);
struct nouveau_channel *chan = pNv-chan;
RegionRec reg;
int type, ret;
+   Bool front_updated;
 
REGION_INIT(0, reg, ((BoxRec){ 0, 0, draw-width, draw-height }), 0);
REGION_TRANSLATE(0, reg, draw-x, draw-y);
@@ -257,6 +287,15 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
if (ref_crtc_hw_id  1)
ref_crtc_hw_id = 1;
 
+   /* Update frontbuffer pixmap and name: Could have changed due to
+* window (un)redirection as part of compositing.
+*/
+   front_updated = update_front(draw, s-dst);
+
+   /* Assign frontbuffer pixmap, after update in update_front() */
+   dst_pix = nouveau_dri2_buffer(s-dst)-ppix;
+   dst_bo = nouveau_pixmap_bo(dst_pix);
+
/* Throttle on the previous frame before swapping */
nouveau_bo_map(dst_bo, NOUVEAU_BO_RD);
nouveau_bo_unmap(dst_bo);
@@ -275,7 +314,7 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
FIRE_RING(chan);
}
 
-   if (can_exchange(draw, dst_pix, src_pix)) {
+   if (front_updated  can_exchange(draw, dst_pix, src_pix)) {
type = DRI2_EXCHANGE_COMPLETE;
DamageRegionAppend(draw, reg);
 
-- 
1.7.5.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH 5/9] dri2: Fixes to swap scheduling.

2012-02-15 Thread Mario Kleiner
Fix some small off-by-one errors and a mismatch
between 32 bit kernel interfaces for vblank count
and 64 bit dri2 interfaces for target_msc et al.

Return corrected target_msc to swap scheduling in
x-server.

A revised version of the patch discussed here:
http://lists.freedesktop.org/archives/nouveau/2011-September/009143.html

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/nouveau_dri2.c |   23 +--
 1 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 8608678..719b3bb 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -387,11 +387,22 @@ nouveau_dri2_schedule_swap(ClientPtr client, DrawablePtr 
draw,
if (ret)
goto fail;
 
+   /* Truncate to match kernel interfaces; means occasional 
overflow
+* misses, but that's generally not a big deal.
+*/
+   *target_msc = 0x;
+   divisor = 0x;
+   remainder = 0x;
+
/* Calculate a swap target if we don't have one */
if (current_msc = *target_msc  divisor)
*target_msc = current_msc + divisor
- (current_msc - remainder) % divisor;
 
+   /* Avoid underflow of unsigned value below */
+   if (*target_msc == 0)
+   *target_msc = 1;
+
/* Request a vblank event one frame before the target */
ret = nouveau_wait_vblank(draw, DRM_VBLANK_ABSOLUTE |
  DRM_VBLANK_EVENT,
@@ -399,7 +410,8 @@ nouveau_dri2_schedule_swap(ClientPtr client, DrawablePtr 
draw,
  expect_msc, NULL, s);
if (ret)
goto fail;
-   s-frame = (unsigned int) expect_msc  0x;
+   s-frame = 1 + ((unsigned int) expect_msc  0x);
+   *target_msc = 1 + expect_msc;
} else {
/* We can't/don't want to sync to vblank, just swap. */
nouveau_dri2_finish_swap(draw, 0, 0, 0, s);
@@ -420,6 +432,13 @@ nouveau_dri2_schedule_wait(ClientPtr client, DrawablePtr 
draw,
CARD64 current_msc;
int ret;
 
+   /* Truncate to match kernel interfaces; means occasional overflow
+* misses, but that's generally not a big deal.
+*/
+   target_msc = 0x;
+   divisor = 0x;
+   remainder = 0x;
+
if (!can_sync_to_vblank(draw)) {
DRI2WaitMSCComplete(client, draw, target_msc, 0, 0);
return TRUE;
@@ -439,7 +458,7 @@ nouveau_dri2_schedule_wait(ClientPtr client, DrawablePtr 
draw,
goto fail;
 
/* Calculate a wait target if we don't have one */
-   if (current_msc  target_msc  divisor)
+   if (current_msc = target_msc  divisor)
target_msc = current_msc + divisor
- (current_msc - remainder) % divisor;
 
-- 
1.7.5.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH 6/9] dri2: Allow vblank controlled swaps for redirected windows. Part I

2012-02-15 Thread Mario Kleiner
Make sure that swaps for redirected windows under a
compositor are still scheduled via vblank events, to
avoid effects like 2900 fps swaps under a compositor.

See discussion with Francisco Jerez at:

http://lists.freedesktop.org/archives/nouveau/2011-September/009278.html
http://lists.freedesktop.org/archives/nouveau/2011-September/009292.html

This is part I of the agreed upon band-aid, in a separate patch.

It allows to use vblank related functions on redirected
windows and thereby fixes functions from sgi_sync_control
and oml_sync_control extension, e.g., glXWaitForMscOML(),
glXGetSyncValuesOML(), glXWaitVideoSyncSGI, ...

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/nouveau_dri2.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 719b3bb..6a0800c 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -204,10 +204,8 @@ can_sync_to_vblank(DrawablePtr draw)
 {
ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum];
NVPtr pNv = NVPTR(scrn);
-   PixmapPtr pix = NVGetDrawablePixmap(draw);
 
return pNv-glx_vblank 
-   nouveau_exa_pixmap_is_onscreen(pix) 
nv_window_belongs_to_crtc(scrn, draw-x, draw-y,
  draw-width, draw-height);
 }
-- 
1.7.5.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH 7/9] dri2: Allow vblank controlled swaps for redirected windows. Part II

2012-02-15 Thread Mario Kleiner
This part implements proper throttling for clients. For
vblank synchronized blits, it defers DRI2SwapComplete()
until 1 vblank after the framebuffer blit is submitted to
the gpu.

Rationale:

For unredirected windows, this is the earliest time the
blit swap can complete, as blits are submitted one vblank
before the target vblank and synchronized with vblank in the
gpu. This makes swap completion timestamps at least reasonable.

For redirected windows, the compositor will probably pick
up the blit swapped frontbuffer pixmap of the window quickly,
but defer its own recomposition to the next vblank, at least
if sync to vblank for the compositor is on.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/nouveau_dri2.c |   32 +---
 1 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 6a0800c..fdc5148 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -135,6 +135,7 @@ nouveau_dri2_copy_region(DrawablePtr pDraw, RegionPtr 
pRegion,
 struct nouveau_dri2_vblank_state {
enum {
SWAP,
+   BLIT,
WAIT
} action;
 
@@ -342,6 +343,22 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
 
REGION_TRANSLATE(0, reg, -draw-x, -draw-y);
nouveau_dri2_copy_region(draw, reg, s-dst, s-src);
+
+   if (can_sync_to_vblank(draw)) {
+   /* Request a vblank event one vblank from now, the most
+* likely (optimistic?) time a direct framebuffer blit
+* will complete or a desktop compositor will update its
+* screen. This defers DRI2SwapComplete() to the 
earliest
+* likely time of real swap completion.
+*/
+   s-action = BLIT;
+   ret = nouveau_wait_vblank(draw, DRM_VBLANK_EVENT |
+ DRM_VBLANK_RELATIVE, 1,
+ NULL, NULL, s);
+   /* Done, if success. Otherwise use fallback below. */
+   if (!ret)
+   return;
+   }
}
 
/*
@@ -351,8 +368,9 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
 *
 * XXX - The DRI2SwapLimit() API allowed us to move this to
 *   the flip handler with no FPS hit for page flipped swaps.
-*   It is still needed for copy swaps as we lack a method
-*   to detect true swap completion for DRI2_BLIT_COMPLETE.
+*   It is still needed as a fallback for some copy swaps as
+*   we lack a method to detect true swap completion for
+*   DRI2_BLIT_COMPLETE.
 */
DRI2SwapComplete(s-client, draw, frame, tv_sec, tv_usec,
 type, s-func, s-data);
@@ -505,8 +523,10 @@ nouveau_dri2_vblank_handler(int fd, unsigned int frame,
 
ret = dixLookupDrawable(draw, s-draw, serverClient,
M_ANY, DixWriteAccess);
-   if (ret)
+   if (ret) {
+   free(s);
return;
+   }
 
switch (s-action) {
case SWAP:
@@ -517,6 +537,12 @@ nouveau_dri2_vblank_handler(int fd, unsigned int frame,
DRI2WaitMSCComplete(s-client, draw, frame, tv_sec, tv_usec);
free(s);
break;
+
+   case BLIT:
+   DRI2SwapComplete(s-client, draw, frame, tv_sec, tv_usec,
+DRI2_BLIT_COMPLETE, s-func, s-data);
+   free(s);
+   break;
}
 }
 
-- 
1.7.5.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH 8/9] dri2: Reimplement hack for triple-buffering on old X-Servers.

2012-02-15 Thread Mario Kleiner
X-Servers before 1.12.0 don't have the DRI2SwapLimit()
API. On these, default to a swaplimit of 1 - double-buffering.

This patch implements support for swap limit of 2,
triple-buffering, on old x-servers via Francisco Jerez
previous hack:

Return DRI2SwapComplete() before the swap has completed,
so clients don't get blocked on the pending swap. This
allows for a triple-buffering look-alike behaviour, but
breaks the swap scheduling and timestamping defined
in the OML_sync_control spec, so applications which
rely on conformant behaviour will break with a swap
limit of 2 on pre 1.12.0 x-servers.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 man/nouveau.man|6 +-
 src/nouveau_dri2.c |   32 +---
 src/nv_driver.c|   11 ++-
 3 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/man/nouveau.man b/man/nouveau.man
index 59f6c1a..7c72907 100644
--- a/man/nouveau.man
+++ b/man/nouveau.man
@@ -101,7 +101,11 @@ a drawable before a client is blocked.
 A value of 1 corresponds to double-buffering. A value of 2 corresponds
 to triple-buffering. Higher values may allow higher framerate, but also
 increase lag for interactive applications, e.g., games. Nouveau currently
-supports a maximum value of 2 on XOrg 1.12+ and a maximum of 1 on older 
servers.
+reliably supports a maximum value of 2 on XOrg 1.12+. A maximum setting of 2
+on older x-servers is allowed, but it will break conformance with the
+OpenML OML_sync_control specification and will cause failure of software
+that relies on correct presentation timing behaviour as defined in that
+specification.
 .br
 Default: 2 for XOrg 1.12+, 1 for older servers.
 .SH SEE ALSO
diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index fdc5148..f0c7fec 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -255,6 +255,18 @@ nouveau_dri2_swap_limit_validate(DrawablePtr draw, int 
swap_limit)
 }
 #endif
 
+/* Shall we intentionally violate the OML_sync_control spec to
+ * get some sort of triple-buffering behaviour on a pre 1.12.0
+ * x-server?
+ */
+static Bool violate_oml(DrawablePtr draw)
+{
+   ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum];
+   NVPtr pNv = NVPTR(scrn);
+
+   return (DRI2INFOREC_VERSION  6)  (pNv-swap_limit  1);
+}
+
 static void
 nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int frame,
 unsigned int tv_sec, unsigned int tv_usec,
@@ -319,7 +331,9 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
 
if (DRI2CanFlip(draw)) {
type = DRI2_FLIP_COMPLETE;
-   ret = drmmode_page_flip(draw, src_pix, s, 
ref_crtc_hw_id);
+   ret = drmmode_page_flip(draw, src_pix,
+   violate_oml(draw) ? NULL : s,
+   ref_crtc_hw_id);
if (!ret)
goto out;
}
@@ -330,7 +344,7 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
DamageRegionProcessPending(draw);
 
/* If it is a page flip, finish it in the flip event handler. */
-   if (type == DRI2_FLIP_COMPLETE)
+   if ((type == DRI2_FLIP_COMPLETE)  !violate_oml(draw))
return;
} else {
type = DRI2_BLIT_COMPLETE;
@@ -344,7 +358,7 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
REGION_TRANSLATE(0, reg, -draw-x, -draw-y);
nouveau_dri2_copy_region(draw, reg, s-dst, s-src);
 
-   if (can_sync_to_vblank(draw)) {
+   if (can_sync_to_vblank(draw)  !violate_oml(draw)) {
/* Request a vblank event one vblank from now, the most
 * likely (optimistic?) time a direct framebuffer blit
 * will complete or a desktop compositor will update its
@@ -361,6 +375,14 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
}
}
 
+   /* Special triple-buffering hack for old pre 1.12.0 x-servers used? */
+   if (violate_oml(draw)) {
+   /* Signal to client that swap completion timestamps and counts
+* are invalid - they violate the specification.
+*/
+   frame = tv_sec = tv_usec = 0;
+   }
+
/*
 * Tell the X server buffers are already swapped even if they're
 * not, to prevent it from blocking the client on the next
@@ -371,6 +393,10 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
 *   It is still needed as a fallback for some copy swaps as
 *   we lack a method to detect true swap completion for
 *   DRI2_BLIT_COMPLETE.
+*
+*   It is also used if triple-buffering is requested on
+*   old x-servers which don't

[PATCH 9/9] dri2: Fix corner case crash for swaplimit 1

2012-02-15 Thread Mario Kleiner
If a swaplimit  1 is set on a server which
supports the swaplimit api (XOrg 1.12.0+),
the following can happen:

1. Client calls glXSwapBuffersMscOML() with a
   swap target  1 vblank in the future, or a
   client calls glXSwapbuffers() while the swap
   interval is set to  1 (unusual but possible).

2. nouveau_dri2_finish_swap() is therefore called
   only at the target vblank, instead of immediately.

3. Because of the deferred execution of
   nouveu_dri2_finish_swap(), the OpenGL client
   can call x-servers DRI2GetBuffersWithFormat()
   before nouveau_dri2_finish_swap() executes and
   it deletes pixmaps that would be needed by
   nouveau_dri2_finish_swap() -- Segfault -- Crash.

Prevent this: When a swap is scheduled into the
future, we temporarily reduce the swaplimit to 1
until nouveau_dri2_finish_swap() is done, then
restore it to its original value. This throttles
the client inside the server in DRI2ThrottleClient()
before it can call the evil DRI2GetbuffersWithFormat().

The client will still be released one video refresh
interval before swap completion, so there is still
some potential win.

This doesn't affect the common case of swapping at
the next vblank, where this throttling is not needed
or done.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/nouveau_dri2.c |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index f0c7fec..7878a5a 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -445,6 +445,26 @@ nouveau_dri2_schedule_swap(ClientPtr client, DrawablePtr 
draw,
if (*target_msc == 0)
*target_msc = 1;
 
+#if DRI2INFOREC_VERSION = 6
+   /* Is this a swap in the future, ie. the vblank event will
+* not be immediately dispatched, but only at a future vblank?
+* If so, we need to temporarily lower the swaplimit to 1, so
+* that DRI2GetBuffersWithFormat() requests from the client get
+* deferred in the x-server until the vblank event has been
+* dispatched to us and nouveau_dri2_finish_swap() is done. If
+* we wouldn't do this, DRI2GetBuffersWithFormat() would operate
+* on wrong (pre-swap) buffers, and cause a segfault later on in
+* nouveau_dri2_finish_swap(). Our vblank event handler restores
+* the old swaplimit immediately after 
nouveau_dri2_finish_swap()
+* is done, so we still get 1 video refresh cycle worth of
+* triple-buffering. For a swap at next vblank, dispatch of the
+* vblank event happens immediately, so there isn't any need
+* for this lowered swaplimit.
+*/
+   if (current_msc  *target_msc - 1)
+   DRI2SwapLimit(draw, 1);
+#endif
+
/* Request a vblank event one frame before the target */
ret = nouveau_wait_vblank(draw, DRM_VBLANK_ABSOLUTE |
  DRM_VBLANK_EVENT,
@@ -557,6 +577,12 @@ nouveau_dri2_vblank_handler(int fd, unsigned int frame,
switch (s-action) {
case SWAP:
nouveau_dri2_finish_swap(draw, frame, tv_sec, tv_usec, s);
+#if DRI2INFOREC_VERSION = 6
+   /* Restore real swap limit on drawable, now that it is safe. */
+   ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum];
+   DRI2SwapLimit(draw, NVPTR(scrn)-swap_limit);
+#endif
+
break;
 
case WAIT:
-- 
1.7.5.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 1/9] dri2: Fix can_exchange() to allow page-flipping on new servers.

2012-02-19 Thread Mario Kleiner

On 02/16/2012 11:04 AM, Michel Dänzer wrote:

On Don, 2012-02-16 at 00:45 +0100, Mario Kleiner wrote:

can_exchange() fails on at least Xorg 1.12+. This fixes
it in the same way it was fixed in the ati  intel ddx.

Signed-off-by: Mario Kleinermario.klei...@tuebingen.mpg.de
---
  src/nouveau_dri2.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 3aa5ec5..5b62425 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -160,7 +160,7 @@ can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr 
src_pix)
return ((DRI2CanFlip(draw)  pNv-has_pageflip))
dst_pix-drawable.width == src_pix-drawable.width
dst_pix-drawable.height == src_pix-drawable.height
-   dst_pix-drawable.depth == src_pix-drawable.depth
+   dst_pix-drawable.bitsPerPixel == 
src_pix-drawable.bitsPerPixel
dst_pix-devKind == src_pix-devKind;
  }



Actually, it seems like the pixmap depths really should match, otherwise
one could end up with the front pixmap depth not matching the window
depth. Not sure that's a real problem right now, but it seems wonky at
least...

Have you investigated why the depths don't match?




Depends on the meaning of investigated: One of the pixmaps has depth 
24 bits (the pixmap of the root window) the other 32 bits (as requested 
from the client via DRI2GetBuffersWithFormat for RGBA8 visuals). Both 
have 32 bpp. I checked what the intel and ati ddx do. The ati ddx always 
checked for matching drawable.bitsPerPixel since kms pageflip support 
was implemented. The intel ddx does the same, but the code and comments 
suggests they tried both and checking for matching depths probably 
didn't work:


cd xorg/drivers/xf86-video-intel/
git log -p e2615cdeef078dbd2e834b68c437f098a92b941d

So everybody does it like this currently, and it seems to work.

I also just tried setting DefaultDepth to 30 bits on a NV50 card which 
supports this. xdpyinfo confirms a screen depth of 30 bit == 10 bits per 
red/green/blue channel. glxinfo shows that only RGBA8 visuals are 
supported. Here we have a mismatch of depths 30 vs. 24 and the driver 
reverts correctly from page flipping to copy swaps, although i can't see 
immediately what causes it to correctly switch back to copy swaps.


-mario
___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 9/9] dri2: Fix corner case crash for swaplimit 1

2012-02-19 Thread Mario Kleiner

On 02/16/2012 10:46 AM, Michel Dänzer wrote:

On Don, 2012-02-16 at 00:45 +0100, Mario Kleiner wrote:

If a swaplimit  1 is set on a server which
supports the swaplimit api (XOrg 1.12.0+),
the following can happen:

1. Client calls glXSwapBuffersMscOML() with a
swap target  1 vblank in the future, or a
client calls glXSwapbuffers() while the swap
interval is set to  1 (unusual but possible).

2. nouveau_dri2_finish_swap() is therefore called
only at the target vblank, instead of immediately.

3. Because of the deferred execution of
nouveu_dri2_finish_swap(), the OpenGL client
can call x-servers DRI2GetBuffersWithFormat()
before nouveau_dri2_finish_swap() executes and
it deletes pixmaps that would be needed by
nouveau_dri2_finish_swap() --  Segfault --  Crash.


Pixmaps are reference counted, so it should be possible to fix this via
proper reference counting.




Thanks for the tip. I'll try that.

-mario
___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH] dri2: Pass AsyncSwap [swap_interval=0] requests to the drivers

2012-02-21 Thread Mario Kleiner

On Feb 21, 2012, at 10:55 AM, Chris Wilson wrote:


Currently, the midlayer dri2 code intercepts swap_interval=0 (ala
vblank_mode=0) SwapBuffers and converts it to a CopyRegion request.
This prevents the backend from doing anything meaningful in this case
and typically ends up being vsync'ed since the drivers cannot  
distinguish

it from a regular CopyRegion request.

v2: Only invalidate the drawable on the behest of the backend, so that
existing behaviour of not invalidating for async blit copies is
preserved, suggested by Simon Farnsworh. And rebase for the  
intervening

12 months since v1.



...snip...


 int
 DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64  
target_msc,

CARD64 divisor, CARD64 remainder, CARD64 *swap_target,
@@ -876,20 +900,11 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr  
pDraw, CARD64 target_msc,


 /* Old DDX or no swap interval, just blit */
 if (!ds-ScheduleSwap || !pPriv-swap_interval) {
-   BoxRec box;
-   RegionRec region;
-
-   box.x1 = 0;
-   box.y1 = 0;
-   box.x2 = pDraw-width;
-   box.y2 = pDraw-height;
-   RegionInit(region, box, 0);
-
pPriv-swapsPending++;
-
-   (*ds-CopyRegion)(pDraw, region, pDestBuffer, pSrcBuffer);
-	DRI2SwapComplete(client, pDraw, target_msc, 0, 0,  
DRI2_BLIT_COMPLETE,

-func, data);
+   if ((*ds-AsyncSwap)(client, pDraw,
+pDestBuffer, pSrcBuffer,
+func, data))
+   DRI2InvalidateDrawable(pDraw);
return Success;
 }



One thing i'm not sure about, maybe just a false alert, maybe a tiny  
race: The ddx could call DRI2SwapComplete() in *ds-AsyncSwap()  
before the server calls DRI2InvalidateDrawable() afterwards.


Normally DRI2InvalidateDrawable() - As soon as client tries to  
render again - invalidated buffers - Client calls  
DRI2GetBuffersWithFormat() - DRI2ThrottleClient() - Server usually  
waits for DRI2SwapComplete().


With the current implementation the server could only process  
DRI2SwapComplete() [as triggered by a received vblank event or  
pageflip completion event from the kernel], after  
DRI2InvalidateDrawable() at the end of DRI2SwapBuffers().


a) A single threaded client can't start rendering before return from  
glXSwapBuffers[MscOML](), so no problem.


b) A multi-threaded gl client could wait/poll for swap completion via  
glXWaitForSbcOML(), glXGetSyncValuesOML() or via the  
INTEL_swap_events extension, so it could be that a 2nd rendering  
thread would get released by DRI2SwapComplete() and start rendering  
again before receiving the buffer invalidate events.


I'm not sure if this is a real problem, or if i'm missing something,  
but it could be prevented if the ds-AsyncSwap code itself would call  
DRI2InvalidateDrawable() at the proper time instead of the server  
calling it afterwards.


... snip ...


+
+/* added in version 7 */
+
+/* Used when the client requests swap_interval=0, i.e. swap  
immediately
+ * with no throttling. Whether to tear or not is left up to  
the driver.


I wonder if there is any case when the option of not tearing would  
make sense for a swaplimit of 0? The only when i set a swaplimit of  
zero in my apps is if i want to benchmark how fast the whole  
rendering loop could go in the best case, so i want it to tear. Same  
for benchmarks.


Other than that

Reviewed-by: Mario Kleiner mario.klei...@tuebingen.mpg.de

thanks,
-mario

*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 1/9] dri2: Fix can_exchange() to allow page-flipping on new servers.

2012-02-21 Thread Mario Kleiner

On 02/20/2012 11:27 AM, Michel Dänzer wrote:

On Mon, 2012-02-20 at 05:59 +0100, Mario Kleiner wrote:

On 02/16/2012 11:04 AM, Michel Dänzer wrote:

On Don, 2012-02-16 at 00:45 +0100, Mario Kleiner wrote:

can_exchange() fails on at least Xorg 1.12+. This fixes
it in the same way it was fixed in the ati   intel ddx.

Signed-off-by: Mario Kleinermario.klei...@tuebingen.mpg.de
---
   src/nouveau_dri2.c |2 +-
   1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 3aa5ec5..5b62425 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -160,7 +160,7 @@ can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr 
src_pix)
return ((DRI2CanFlip(draw)   pNv-has_pageflip))
dst_pix-drawable.width == src_pix-drawable.width
dst_pix-drawable.height == src_pix-drawable.height
-   dst_pix-drawable.depth == src_pix-drawable.depth
+   dst_pix-drawable.bitsPerPixel == 
src_pix-drawable.bitsPerPixel
dst_pix-devKind == src_pix-devKind;
   }



Actually, it seems like the pixmap depths really should match, otherwise
one could end up with the front pixmap depth not matching the window
depth. Not sure that's a real problem right now, but it seems wonky at
least...

Have you investigated why the depths don't match?


Depends on the meaning of investigated: One of the pixmaps has depth
24 bits (the pixmap of the root window) the other 32 bits (as requested
from the client via DRI2GetBuffersWithFormat for RGBA8 visuals). Both
have 32 bpp. I checked what the intel and ati ddx do. The ati ddx always
checked for matching drawable.bitsPerPixel since kms pageflip support
was implemented. The intel ddx does the same, but the code and comments
suggests they tried both and checking for matching depths probably
didn't work:

cd xorg/drivers/xf86-video-intel/
git log -p e2615cdeef078dbd2e834b68c437f098a92b941d

So everybody does it like this currently, and it seems to work.


Right. I must have been incorrectly thinking of flipping as changing the
window pixmap.

I still have some doubts — e.g. why is an RGBA visual chosen for a
fullscreen window, and does this really have anything to do with changes
in xserver 1.12, or rather in Mesa — but I think the change itself is
okay.



You are right, it is not the xserver, but more likely Mesa. I tried the 
patch again with some debug output against xserver 1.10.4 (Ubuntu 11.10 
standard x-server) and the same happens. The FRONT_LEFT buffer is taken 
from the windows pixmap, which is depth 24, the BACK_LEFT buffer is 
allocated as a pixmap of 32 bits. So now i'm surprised it ever worked 
before...


Original testing of this nouveau patchset was around August 2011, since 
then i've also updated mesa to 7.11 as part of a distribution upgrade 
from Ubuntu 11.04 - 11.10.


Mesa requests 32 bit buffers as part of the BACK_LEFT attachment in 
DRI2GetBuffersWithFormat(). As far as i could trace it back, the code in 
src/gallium/state_trackers/dri/drm/dri2.c, dri2_drawable_get_buffers() 
mapped the PIPE_FORMAT_B8G8R8A8_UNORM (as requested by my toolkit) and 
also PIPE_FORMAT_B8G8R8X8_UNORM (from glxgears) to a DRI2 buffer format 
of 32 bits and it seems that that is used as .depth for the pixmaps 
drawable as well.


There's this related commit in mesa:

git log -p 576161289df68eedade591fbca4013329c9e5ded

which changed the logic a bit after Mesa 7.12. Also this one:

git log -p c72d7df16879e3210946ba92a7edc823815b6f16

So to support page flipping on different mesa versions i guess 
can_exchange needs to be lenient and check for the matching bitsPerPixel 
instead.


I will change the commit message and title of this patch to be less 
misleading about the cause of the problem.


I wonder if the depth field is still a good way to describe the buffer 
format, especially with things like depth 30 RGB10 scanout.


-mario
___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 01/10] dri2: Fix can_exchange() to allow page-flipping on more mesa versions.

2012-03-01 Thread Mario Kleiner
can_exchange() returns false and thereby prevents page
flipping on some drawables where page flipping would
work fine. This due to non-matching drawable depths values
between front buffer pixmap and back buffer pixmap, because
front buffer pixmaps inherit the depth of the screen, typically
24 bits, whereas the depth value of back buffer pixmaps for a
given RGB8 or RGBA8 visual depends on the mesa version in use,
either 24 bits or 32 bits.

Use bitsPerPixel instead of depth to decide if drawable
is flippable. This will still catch really incompatible
formats like 32 bpp vs. 16 bpp buffers.

Tested for screen DefaultDepth 24 and also 30 bits
(for RGB10 framebuffers) on NV-50.

The problem was fixed in the same way in the ati  intel ddx.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/nouveau_dri2.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 3aa5ec5..5b62425 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -160,7 +160,7 @@ can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr 
src_pix)
return ((DRI2CanFlip(draw)  pNv-has_pageflip)) 
dst_pix-drawable.width == src_pix-drawable.width 
dst_pix-drawable.height == src_pix-drawable.height 
-   dst_pix-drawable.depth == src_pix-drawable.depth 
+   dst_pix-drawable.bitsPerPixel == 
src_pix-drawable.bitsPerPixel 
dst_pix-devKind == src_pix-devKind;
 }
 
-- 
1.7.5.4

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH 10/10] dri2: Testpatch: Fix corner case crash but not the problem.

2012-03-01 Thread Mario Kleiner
Do not apply - Just for illustration:

Introduce reference counting for nouveau_dri2_buffers and
temporarily increase refcount on nouveau_dri2_buffer
frontbuffer to protect it against premature deletion by
x-server as part of the DRI2GetBuffers[WithFormat]() path
while the vblank event for a scheduled swap is still pending
and nouveau_dri2_finish_swap() hasn't completed for this
swap.

This prevents server/ddx crash on deferred swaps with
swaplimit  1 but doesn't fix the problem: We try to treat
a setup which is only double-buffered as if it is truly
triple-buffered. This causes flickering, so not useful in
practice.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 src/nouveau_dri2.c |   57 +++-
 1 files changed, 34 insertions(+), 23 deletions(-)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index 7878a5a..eb008c9 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -14,6 +14,7 @@
 struct nouveau_dri2_buffer {
DRI2BufferRec base;
PixmapPtr ppix;
+   unsigned int refcnt;
 };
 
 static inline struct nouveau_dri2_buffer *
@@ -79,6 +80,7 @@ nouveau_dri2_create_buffer(DrawablePtr pDraw, unsigned int 
attachment,
nvbuf-base.format = format;
nvbuf-base.flags = 0;
nvbuf-ppix = ppix;
+   nvbuf-refcnt++;
 
nvpix = nouveau_pixmap(ppix);
if (!nvpix || !nvpix-bo ||
@@ -100,6 +102,10 @@ nouveau_dri2_destroy_buffer(DrawablePtr pDraw, 
DRI2BufferPtr buf)
if (!nvbuf)
return;
 
+   nvbuf-refcnt--;
+   if (nvbuf-refcnt)
+   return;
+
pDraw-pScreen-DestroyPixmap(nvbuf-ppix);
free(nvbuf);
 }
@@ -178,6 +184,17 @@ update_front(DrawablePtr draw, DRI2BufferPtr front)
return TRUE;
 }
 
+static void
+ref_front(DrawablePtr draw, DRI2BufferPtr front, Bool doref)
+{
+   struct nouveau_dri2_buffer *nvbuf = nouveau_dri2_buffer(front);
+
+   if (doref)
+   nvbuf-refcnt++;
+   else
+   nouveau_dri2_destroy_buffer(draw, front);
+}
+
 static Bool
 can_exchange(DrawablePtr draw, PixmapPtr dst_pix, PixmapPtr src_pix)
 {
@@ -445,25 +462,21 @@ nouveau_dri2_schedule_swap(ClientPtr client, DrawablePtr 
draw,
if (*target_msc == 0)
*target_msc = 1;
 
-#if DRI2INFOREC_VERSION = 6
-   /* Is this a swap in the future, ie. the vblank event will
-* not be immediately dispatched, but only at a future vblank?
-* If so, we need to temporarily lower the swaplimit to 1, so
-* that DRI2GetBuffersWithFormat() requests from the client get
-* deferred in the x-server until the vblank event has been
-* dispatched to us and nouveau_dri2_finish_swap() is done. If
-* we wouldn't do this, DRI2GetBuffersWithFormat() would operate
-* on wrong (pre-swap) buffers, and cause a segfault later on in
-* nouveau_dri2_finish_swap(). Our vblank event handler restores
-* the old swaplimit immediately after 
nouveau_dri2_finish_swap()
-* is done, so we still get 1 video refresh cycle worth of
-* triple-buffering. For a swap at next vblank, dispatch of the
-* vblank event happens immediately, so there isn't any need
-* for this lowered swaplimit.
+   /* Increase refcount on frontbuffer to protect it against
+* premature deletion by x-server as part of the
+* DRI2GetBuffersWithFormat() path while the vblank event
+* is still pending and nouveau_dri2_finish_swap() hasn't
+* completed for this swap.
+*
+* XXX: This prevents server/ddx crash on deferred swaps,
+* but doesn't fix the problem: We try to treat a setup
+* which is only double-buffered as if it is truly triple-
+* buffered. This causes flickering, so not useful in practice.
+*
+* Tinkering with the DRI2SwapLimit() is ugly but provides
+* proper behaviour.
 */
-   if (current_msc  *target_msc - 1)
-   DRI2SwapLimit(draw, 1);
-#endif
+   ref_front(draw, s-dst, true);
 
/* Request a vblank event one frame before the target */
ret = nouveau_wait_vblank(draw, DRM_VBLANK_ABSOLUTE |
@@ -577,12 +590,10 @@ nouveau_dri2_vblank_handler(int fd, unsigned int frame,
switch (s-action) {
case SWAP:
nouveau_dri2_finish_swap(draw, frame, tv_sec, tv_usec, s);
-#if DRI2INFOREC_VERSION = 6
-   /* Restore real swap limit on drawable, now that it is safe. */
-   ScrnInfoPtr scrn = xf86Screens[draw-pScreen-myNum];
-   DRI2SwapLimit(draw, NVPTR(scrn)-swap_limit);
-#endif

Re: [Nouveau] [Patches][nouveau/ddx]: Improvements to bufferswap implementation and timestamping

2012-03-01 Thread Mario Kleiner

On 02/29/2012 08:17 AM, Ben Skeggs wrote:

On Thu, 2012-02-16 at 00:45 +0100, Mario Kleiner wrote:

Hi,

Hey Mario,

What's your plan with this patchset?  Do you intend on taking Michel's
comments into account?

CC'ing Francisco as he had some comments on IRC.  I'd like to get this
all sorted and merged so I can rebase the ported-to-new-libdrm ddx on
top of it all and get that ready.



Hi Ben,

just sent out two updated patches wrt. to Michels feedback.

[1/10] Replaces the original [1/9] patch. The code is the same, but the 
commit message is changed to be less misleading about the cause of the 
problem. See the e-mail i think i sent about a week ago.


[10/10] Just for demonstration what i tried, not for application. The 
old [9/9] patch is still the best i can do at the moment. Following
Michels proposal, with 10/10 i added refcounting to nouveau_dri2_buffer 
objects and keep a reference while the swap is still pending. This 
prevents deletion of the buffer by the x-server and thereby prevents the 
server crash.


Problem: Now we have a display that doesn't crash, but flickers horribly 
while pageflipping with deferred swaps. I think it is because we still 
only have two buffers for double-buffering but pretend we're doing 
triple-buffering, but i could be totally wrong about the cause of the 
problem. The original patch [9/9] throttles the client in the special 
case where the pseudo-triplebuffering would result in flicker or crash, 
so it fixes the problem although in a rather hackish way.


Not sure if this can be done cleanly without a real n-buffering 
implementation in the x-server?


I think other than that there were no further comments and Francisco's 
original feedback from last year is already implemented in the patchset.


More feedback etc. welcome, but i usually can only work on this part 
time on weekends if it needs further improvements before merging.


Can the kms kernel patches be already merged? I think they are in a good 
shape now.


Thanks,
-mario



Ben.



here a set of patches against the nouveau-ddx. This is an extended and
revised set, based on Francisco Jerez feedback from autumn last year.

[1/9] Makes pageflipping work again on X-Server 1.12rc. It apparently stopped
working somewhere around Xorg 1.11+.

[2/9] Implements handling of pageflip completion events from the kernel.
Francisco Jerez argument against including it was that the x-server didn't
have a swaplimit api, so this couldn't be applied at the same time as the
pseudo triple-buffering hack which is in place in the current ddx.
Now we have the swaplimit api in 1.12, so this problem should be solved.

[3/9] Makes use of the swaplimit api. A new xorg.conf option SwapLimit
allows to select a swaplimit of 1 or 2 for double-buffering or
triple-buffering. It defaults to 2 for Xorg 1.12+ and 1 for older servers.
This way, on 1.12+ nouveau retains the kind of triple buffering behaviour it
currently has, but swap completion timestamping (OML_sync_control,
INTEL_swap_events, and client swap throttling) works conforming to
the specs. On older servers it removes triple-buffering but makes
nouveau conform to the specs. This is important for apps that need
precise and reliable swap scheduling and timestamping.

[4/9] A bug fix against corrupted desktop when switching from redirected
to non-redirected fullscreen windows under a desktop compositor.
Fixes FDO bug #35452.

[5/9] Some fixes to swap scheduling, revised according to Francisco's
review.

[6/9] Fixes swap throttling for non-fullscreen windows under a desktop
compositor. Split into a separate patch according to Francisco's
feedback.

[7/9] An attempt to provide more sane swap completion events for non-
fullscreen windows. This is a bit of cheating, as it delivers the
events for the earliest point in time one would expect the swap
to complete, assuming only a lightly loaded gpu. Real completion
could be later. I think this is an improvement because the current
implementation delivers swap complete events with timestamps and
counts that signal swap completion before the client even requested
the swap.

[8/9] This one adds Francisco's original triple-buffering hack back
for Xorg 1.11 and earlier servers if the xorg.conf option requests
a swaplimit  1. Users can choose between triple-buffering but
broken timestamping or double-buffering with sane timestamping.

[9/9] Fixes a corner case which could cause the ddx to segfault
with its current triple-buffering implementation.

I've tested these on single-display and dual-display setups,
with/without compositor, with/without redirection and checked
the robustness and precision of the timestamps with special
measurement equipment, on XOrg 1.12-rc2 and 1.10. They work
pretty well for me and finally make nouveau very useable for
the kind of scientific applications that require precise
swap scheduling and timestamping, so i'd love to see them
reviewed and hopefully included into the ddx soon.

A couple of things are a bit

Re: [PATCH 01/10] dri2: Fix can_exchange() to allow page-flipping on more mesa versions.

2012-03-02 Thread Mario Kleiner

On Mar 1, 2012, at 9:37 PM, Daniel Stone wrote:


Hi,

On 1 March 2012 18:11, Mario Kleiner  
mario.klei...@tuebingen.mpg.de wrote:

can_exchange() returns false and thereby prevents page
flipping on some drawables where page flipping would
work fine. This due to non-matching drawable depths values
between front buffer pixmap and back buffer pixmap, because
front buffer pixmaps inherit the depth of the screen, typically
24 bits, whereas the depth value of back buffer pixmaps for a
given RGB8 or RGBA8 visual depends on the mesa version in use,
either 24 bits or 32 bits.

Use bitsPerPixel instead of depth to decide if drawable
is flippable. This will still catch really incompatible
formats like 32 bpp vs. 16 bpp buffers.


In theory, shouldn't this be a format-compatibility check, so you
don't (unlikely though it is) attempt to flip to a BGR drawable on an
RGB configuration, or a  drawable on 0565?

Cheers,
Daniel



Yes, i think that would be the best solution, in theory. But i don't  
think the info is easily available to the ddx? However, i didn't  
search very hard.

-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH] DRI2: Don't return junk reply instead of blocking in glXWaitForSbcOML()

2010-06-05 Thread Mario Kleiner
DRI2WaitSBC() didn't block if requested targetSBC wasn't yet reached.

Instead it returned a xreply with uninitialized junk return values, then
blocked the connection until targetSBC was reached.

Therefore the client didn't block, but continued with bogus return
values from glXWaitForSbcOML.

This patch fixes the problem by implementing DRI2WaitSBC similar
to the clean and proven DRI2WaitMSC implementation.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 hw/xfree86/dri2/dri2.c|   10 --
 hw/xfree86/dri2/dri2.h|3 +--
 hw/xfree86/dri2/dri2ext.c |   16 +++-
 3 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 9ec4caa..23ed3e5 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -909,8 +909,7 @@ DRI2WaitMSC(ClientPtr client, DrawablePtr pDraw, CARD64 
target_msc,
 }
 
 int
-DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, CARD64 target_sbc,
-   CARD64 *ust, CARD64 *msc, CARD64 *sbc)
+DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, CARD64 target_sbc)
 {
 DRI2DrawablePtr pPriv;
 
@@ -924,14 +923,13 @@ DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw, CARD64 
target_sbc,
 if (target_sbc == 0)
 target_sbc = pPriv-swap_count + pPriv-swapsPending;
 
-/* If current swap count already = target_sbc,
+/* If current swap count already = target_sbc, reply and
  * return immediately with (ust, msc, sbc) triplet of
  * most recent completed swap.
  */
 if (pPriv-swap_count = target_sbc) {
-*sbc = pPriv-swap_count;
-*msc = pPriv-last_swap_msc;
-*ust = pPriv-last_swap_ust;
+ProcDRI2WaitMSCReply(client, pPriv-last_swap_ust,
+ pPriv-last_swap_msc, pPriv-swap_count);
 return Success;
 }
 
diff --git a/hw/xfree86/dri2/dri2.h b/hw/xfree86/dri2/dri2.h
index f08ca29..fe0bf6c 100644
--- a/hw/xfree86/dri2/dri2.h
+++ b/hw/xfree86/dri2/dri2.h
@@ -265,8 +265,7 @@ extern _X_EXPORT int DRI2WaitMSC(ClientPtr client, 
DrawablePtr pDrawable,
 extern _X_EXPORT int ProcDRI2WaitMSCReply(ClientPtr client, CARD64 ust,
  CARD64 msc, CARD64 sbc);
 extern _X_EXPORT int DRI2WaitSBC(ClientPtr client, DrawablePtr pDraw,
-CARD64 target_sbc, CARD64 *ust, CARD64 *msc,
-CARD64 *sbc);
+CARD64 target_sbc);
 extern _X_EXPORT Bool DRI2ThrottleClient(ClientPtr client, DrawablePtr pDraw);
 
 extern _X_EXPORT Bool DRI2CanFlip(DrawablePtr pDraw);
diff --git a/hw/xfree86/dri2/dri2ext.c b/hw/xfree86/dri2/dri2ext.c
index db04268..8016edb 100644
--- a/hw/xfree86/dri2/dri2ext.c
+++ b/hw/xfree86/dri2/dri2ext.c
@@ -520,9 +520,8 @@ static int
 ProcDRI2WaitSBC(ClientPtr client)
 {
 REQUEST(xDRI2WaitSBCReq);
-xDRI2MSCReply rep;
 DrawablePtr pDrawable;
-CARD64 target, ust, msc, sbc;
+CARD64 target;
 int status;
 
 REQUEST_SIZE_MATCH(xDRI2WaitSBCReq);
@@ -532,18 +531,9 @@ ProcDRI2WaitSBC(ClientPtr client)
return status;
 
 target = vals_to_card64(stuff-target_sbc_lo, stuff-target_sbc_hi);
-status = DRI2WaitSBC(client, pDrawable, target, ust, msc, sbc);
-if (status != Success)
-   return status;
+status = DRI2WaitSBC(client, pDrawable, target);
 
-rep.type = X_Reply;
-rep.length = 0;
-rep.sequenceNumber = client-sequence;
-load_msc_reply(rep, ust, msc, sbc);
-
-WriteToClient(client, sizeof(xDRI2MSCReply), rep);
-
-return Success;
+return status;
 }
 
 static int
-- 
1.6.6

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH] DRI2/xserver: Don't hang in glXSwapBuffers if drawable moves between crtc's

2010-06-13 Thread Mario Kleiner

Hi

This patch fixes a hang in glXSwapBuffers if a user moves
a drawable from a fast running crtc, e.g., 60 Hz to a slower
running crtc, e.g., 50 Hz, when using the new DRI2 sync  swap bits.

It should fix Bugzilla bug #28383.

https://bugs.freedesktop.org/show_bug.cgi?id=28383

I've tested this on a R600 against master. I assume the poster
of that bug report will test it as well. He has tested a previous
version of the patch.

Jesse could you review it? Would be good if this could make
it into master and 1.8.2 as well.

thanks,
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


[PATCH] DRI2/xserver: Don't hang in glXSwapBuffers if drawable moves between crtc's

2010-06-13 Thread Mario Kleiner
Detect if a drawable has been moved from an original crtc to a new crtc
with a lower current vblank count than the original crtc inbetween
glXSwapBuffers() calls. Reinitialize drawable's last_swap_target
before scheduling next swap if such a move has taken place.

last_swap_target defines the baseline for scheduling the next swap.
If a movement between crtc's is not taken into account, the swap may
schedule for a vblank count on the new crtc far in the future, resulting
in a apparent hang of the drawable for a long time.

Fixes Bugzilla bug #28383.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 hw/xfree86/dri2/dri2.c |   15 +++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index d33b0d1..1f80022 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -756,6 +756,7 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, CARD64 
target_msc,
 DRI2DrawablePtr pPriv;
 DRI2BufferPtr   pDestBuffer = NULL, pSrcBuffer = NULL;
 int ret, i;
+CARD64  ust, current_msc;
 
 pPriv = DRI2GetDrawable(pDraw);
 if (pPriv == NULL) {
@@ -800,12 +801,26 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr pDraw, 
CARD64 target_msc,
  * need to schedule a swap for the last swap target + the swap interval.
  */
 if (target_msc == 0  divisor == 0  remainder == 0) {
+   /* If the current vblank count of the drawable's crtc is lower
+* than the count stored in last_swap_target from a previous swap
+* then reinitialize last_swap_target to the current crtc's msc,
+* otherwise the swap will hang. This will happen if the drawable
+* is moved to a crtc with a lower refresh rate, or a crtc that just
+* got enabled.
+*/
+   if (!(*ds-GetMSC)(pDraw, ust, current_msc))
+   pPriv-last_swap_target = 0;
+
+   if (current_msc  pPriv-last_swap_target)
+   pPriv-last_swap_target = current_msc;
+
/*
 * Swap target for this swap is last swap target + swap interval since
 * we have to account for the current swap count, interval, and the
 * number of pending swaps.
 */
*swap_target = pPriv-last_swap_target + pPriv-swap_interval;
+
 } else {
/* glXSwapBuffersMscOML could have a 0 target_msc, honor it */
*swap_target = target_msc;
-- 
1.6.6

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] DRI2/xserver: Don't hang in glXSwapBuffers if drawable moves between crtc's

2010-06-17 Thread Mario Kleiner

Hi Keith and Peter,

this patch has been reviewed by Jesse Barnes and now successfully  
tested by the guy who submitted the bug report and by me. Fixes  
Bugzilla bug #28383, a quite ugly hang of application windows  
bufferswapping, including crashes when moving them between different  
crtc's of different refresh rate.


It would be good to apply it to 1.9 and 1.8.2 if possible, for a more  
enjoyable multi-display OpenGL experience.


thanks,
-mario

On Jun 14, 2010, at 6:13 PM, Jesse Barnes wrote:


On Sun, 13 Jun 2010 18:05:26 +0200
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:

Detect if a drawable has been moved from an original crtc to a new  
crtc

with a lower current vblank count than the original crtc inbetween
glXSwapBuffers() calls. Reinitialize drawable's last_swap_target
before scheduling next swap if such a move has taken place.

last_swap_target defines the baseline for scheduling the next swap.
If a movement between crtc's is not taken into account, the swap may
schedule for a vblank count on the new crtc far in the future,  
resulting

in a apparent hang of the drawable for a long time.

Fixes Bugzilla bug #28383.

Signed-off-by: Mario Kleiner mario.klei...@tuebingen.mpg.de
---
 hw/xfree86/dri2/dri2.c |   15 +++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index d33b0d1..1f80022 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -756,6 +756,7 @@ DRI2SwapBuffers(ClientPtr client, DrawablePtr  
pDraw, CARD64 target_msc,

 DRI2DrawablePtr pPriv;
 DRI2BufferPtr   pDestBuffer = NULL, pSrcBuffer = NULL;
 int ret, i;
+CARD64  ust, current_msc;

 pPriv = DRI2GetDrawable(pDraw);
 if (pPriv == NULL) {
@@ -800,12 +801,26 @@ DRI2SwapBuffers(ClientPtr client,  
DrawablePtr pDraw, CARD64 target_msc,
  * need to schedule a swap for the last swap target + the  
swap interval.

  */
 if (target_msc == 0  divisor == 0  remainder == 0) {
+   /* If the current vblank count of the drawable's crtc is lower
+* than the count stored in last_swap_target from a previous swap
+* then reinitialize last_swap_target to the current crtc's msc,
+* otherwise the swap will hang. This will happen if the drawable
+	 * is moved to a crtc with a lower refresh rate, or a crtc that  
just

+* got enabled.
+*/
+   if (!(*ds-GetMSC)(pDraw, ust, current_msc))
+   pPriv-last_swap_target = 0;
+
+   if (current_msc  pPriv-last_swap_target)
+   pPriv-last_swap_target = current_msc;
+
/*
 	 * Swap target for this swap is last swap target + swap interval  
since

 * we have to account for the current swap count, interval, and the
 * number of pending swaps.
 */
*swap_target = pPriv-last_swap_target + pPriv-swap_interval;
+
 } else {
/* glXSwapBuffersMscOML could have a 0 target_msc, honor it */
*swap_target = target_msc;


Yeah, this looks ok.  Really we should write up GLX extension that
clarifies behavior in multi-head and display on/off situations  
too.  Is

that something you could do?

Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org

--
Jesse Barnes, Intel Open Source Technology Center



___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: IgnoreClient question

2010-06-28 Thread Mario Kleiner

On Jun 29, 2010, at 12:27 AM, Jesse Barnes wrote:


In GL code, we have two subsystems using IgnoreClient these days: GLX
and DRI2.

GLX uses it to suspend clients while the server is VT switched away
(not sure why, maybe some drivers can't handle it?).

DRI2 uses it to implement swap throttling and various GLX extensions
and DRI2 protocol requests.



Jesse,

do we really need IgnoreClient() for throttling in DRI2?

If i remember correctly, the DRI2 requests i've seen are all like this:

1. Some request (DRI2GetBuffersWithFormat() / DRI2SwapBuffers() /  
glXWaitForMscOML() / glXWaitForSBCOML().
2. Request needs to wait for some event, either a scheduled vblank  
event to be delivered from the kernel, or a swap to complete.
3. Server responds to the client which was waiting for a xreply in  
response to completion of the request.


The client's calling thread always has to wait for a reply from the  
server before it can continue doing anything, so that thread would  
block anyway in xlib, waiting for a xreply from the server at step 3.


Because we only have one wait slot per drawable, we block the whole  
xdisplay connection via IgnoreClient() to prevent other threads in  
the same client from issuing such blocking requests to the same  
drawable - it would overflow our single wait slot.


This implementation isn't safe if multiple display connections  
address a single drawable, e.g., 1 connection per client thread. It  
also doesn't allow to handle multipe or parallel outstanding  
requests, something that would be useful with some of the new DRI2  
protocol requests.


If we could replace that single wait slot with one wait list per  
drawable and protocol request (1 list for glXWaitForMscOML, 1 list  
for glXWaitForSbcOML and maybe 1 list for swap related throttling),  
then we probably wouldn't need to block the connection via  
IgnoreClient at all and could have multiple outstanding / parallel  
requests.


What i assume but didn't check is that xlib doesn't have a problem  
with getting replies out of sequence, e.g., getting a reply to a  
request with seq nr 15 before another reply with nr 12 on the same  
connection? Just neccessary for the sequence number in the reply  
matching the corresponding sequence number from the request? xlib  
doesn't time out if a request doesn't quickly get a reply?


If we could implement such wait lists then we could ignore  
IgnoreClient and solve the multi-threading / multiple drawable issues  
i believe exist.


-mario

*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: IgnoreClient question

2010-06-28 Thread Mario Kleiner

On Jun 29, 2010, at 2:24 AM, Keith Packard wrote:

On Tue, 29 Jun 2010 02:18:31 +0200, Mario Kleiner  
mario.klei...@tuebingen.mpg.de wrote:



What i assume but didn't check is that xlib doesn't have a problem
with getting replies out of sequence


The X protocol is purely sequential -- you can't execute requests  
out of

order (and queuing replies is part of the request execution). In
particular, no X library would ever deal with replies being  
returned out
of order; they count on the sequence number from the X server being  
monotonic.




Ok, thanks. That makes live for multi-threaded clients a bit more  
difficult if they want to use the new extensions, because they would  
need to maintain multiple x connections, one for each thread, to  
circumvent this. I guess then xcb doesn't handle this either?


Then i think the idea of multiple wait queues doesn't allow to get  
rid of IgnoreClient(), but we will still need those wait queues in  
addition to IgnoreClient() to handle multiple blocking requests for  
one drawable from different clients / x-connections correctly.


The current code assumes exactly one client / x-connection per  
drawable. I think if we have multiple threads accessing the drawable  
over separate connections then the current IgnoreClient() method will  
not work and we will end up with undefined behaviour for some of the  
new dri2 protocol requests.


E.g., what should work according to the spec is that i have multiple  
threads, each of them using its own dedicated x-connection to perform  
a glXWaitForSbcOML() or glXWaitForMscOML() or glXSwapBuffers()/ 
glXSwapBuffersMscOML() call for the same drawable and getting its  
request retired when the corresponding event happens. Currently  
random stuff would happen if i tried this. If i remember correctly  
the first thread making a request would be correctly handled and all  
other threads would hang forever - their requests would disappear  
into nirvana.


Anyway, this is a separate issue from IgnoreClient(), so i'll discuss  
it separately with Jesse.


Another question: If i have separate x-connections, e.g., one for  
each drawable and each connection has an opengl rendering context  
attached to the drawable, then i assume these context still can share  
state (display lists, shaders, texture objects, fbo's etc.), right?


thanks,
-mario


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: IgnoreClient question

2010-07-07 Thread Mario Kleiner

On Jul 2, 2010, at 11:43 PM, Jesse Barnes wrote:


On Tue, 29 Jun 2010 04:07:48 +0200
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:

...
Then i think the idea of multiple wait queues doesn't allow to get
rid of IgnoreClient(), but we will still need those wait queues in
addition to IgnoreClient() to handle multiple blocking requests for
one drawable from different clients / x-connections correctly.


Right, we have lists in the server now too, so it shouldn't be too  
hard.


Another question: If i have separate x-connections, e.g., one for
each drawable and each connection has an opengl rendering context
attached to the drawable, then i assume these context still can share
state (display lists, shaders, texture objects, fbo's etc.), right?


Anything associated with the drawable would be shared, but context
specific information would still be separate per thread.  Display  
lists

can be shared explicitly, likewise for some of the other objects, so
you should be able to get the sharing you want.



Good. I will switch our client toolkit to a 1 x-connection per thread  
model. If a wait queue implementation in the dri2 sync  swap bits  
hasn't magically shown up by the time i hit the limits of the current  
implementation, i'll give implementing it myself a try.


-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: glXSwapBuffers fix for moving between crtcs is not following the OML_sync_control specification

2010-07-09 Thread Mario Kleiner

On Jul 9, 2010, at 3:40 PM, Michel Dänzer wrote:


On Fre, 2010-07-09 at 16:20 +0300, Pauli Nieminen wrote:


This commit assumes that MSC maybe different depending on where the
drawable is. While specification says For a multi-monitor system,  
the

monitor used to determine MSC is screen 0 of display. which IMO
means that there should be single MSC everywhere were target drawable
can move. So correct fix would be making MSC value in-depend of CRTC
where drawable is.


That commit only fixes the glXSwapBuffers() case, which is not  
defined in the OML_sync_control extension. I couldn't find any  
specification of how glXSwapBuffers should behave wrt. drawable on  
different crtc's, e.g., SGI_swap_control doesn't talk about such  
multi-monitor issues. In that sense the patch doesn't violate a  
specification, because the behaviour is unspecified.


Currently we have proper behaviour while the drawable is on a given  
crtc - it obeys swap_interval. For the swap that happens between a  
move of the drawable from one crtc to the other, we have a small  
timing glitch: The swap_interval setting is ignored for that swap -  
it swaps on the next retrace, as if swap_interval is 1. But then a  
swap_interval of 1 is the default setting and probably what almost  
all standard client apps use anyway.


We could do a bit better and avoid that single glitch by mapping msc  
count from the crtc where the drawable was to the crtc where the  
drawable is now. But that would need us to keep track of the previous  
crtc of a drawable and we'd need new DDX entry points to query the  
current msc of any given crtc and the current crtc of a given  
drawable from within the server. Currently this info is only  
available in the DDX itself. Something like (pseudo-code):


if (current_crtc != previous_crtc)
	last_swap_target = last_swap_target + (msc(current_crtc) - msc 
(previous_crtc));


Not sure if avoiding this small glitch is worth the hassle of two new  
entry points?



If you use the glXSwapBuffersMscOML() command or other  
OML_sync_control commands then the commit doesn't apply. Currently it  
is up to the client application to find out if a drawable moved  
between crtc's inbetween swaps and handle this in a app specific manner.


From the perspective of a toolkit developer i like this do it  
yourself behaviour for glXSwapBuffersMscOML(). OML_sync_control  
provides all the neccessary tools for a client app to detect and  
handle such situations in a way that leaves the application in  
control. Apps that use OML_sync_control will likely care very much  
about very precise visual timing. Some automagic corrections like the  
above for glXSwapBuffers() could spoil the day of someone (e.g.,  
me ;) ) that needs very precise timing control and uses  
OML_sync_control for that purpose.




I agree there *should* be such a single MSC, the question is how it
should be calculated when the CRTCs don't all have the same vertical
refresh rate.


I guess OML_sync_control was specified at a time when systems with  
multiple-displays for one screen didn't exist, at least not ones with  
different refresh rates for the same screen. If you think about how  
applications would use the extension to precisely schedule swaps, i  
think the assumption is that the drawable will usually not move  
between crtc's or that all crtc's work perfectly synchronized with  
the same refresh rate, e.g., for multi-display virtual reality  
applications.




For the old intel DRI1 swap scheduling hack, I solved this by  
making the
MSC not correspond to any specific CRTC counter directly but making  
it a

'virtual' counter which increases at the same rate as the CRTC the
window is currently being synchronized to. Looking at the
GLX_OML_sync_control spec again, I think this solution should still be
feasible in general, as all the extension calls take a drawable
parameter.


Hmm. I'm not sure if i as a toolkit developer would like that kind of  
virtualization. What i do is i have a given target system time at  
which the swap should happen - as close to that time as possible. I  
use the msc count and the associated ust timestamp together with the  
known video refresh rate and my given target time to translate the  
target time into a target_msc. I use these counts and timestamps to  
synchronize with other modalities like sound, various i/o etc. and  
often need sub-millisecond accurate timing. Sometimes i need to  
synchronize swaps across multiple displays. If such a virtualized  
counter wouldn't be very accurate or if the associated ust timestamps  
wouldn't be very accurate, i'd be in trouble and the OML_sync_control  
extension would lose most of its value to me.


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www

Re: glXSwapBuffers fix for moving between crtcs is not following the OML_sync_control specification

2010-07-09 Thread Mario Kleiner

On Jul 9, 2010, at 3:20 PM, Pauli Nieminen wrote:

But actual patch has two problems. First, GetMSC is not NULL checked
before calling it. This causes crash with driver that is using
flipping without OML_sync_control.  Second, same freeze would happen


I avoided a ds-GetMSC() NULL check because that code-path wouldn't  
be taken if OML_sync_control support in the DDX is missing. If you  
look at the top of DRI2SwapBuffers() there is a fallback path that is  
taken if those entry points are null.



for application that is using OML_sync_control if it was moved between
areas where is different MSC. In my option actual patch should be
reverted because it is not correct fix for the bug.


See my other response.

thanks,
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: glXSwapBuffers fix for moving between crtcs is not following the OML_sync_control specification

2010-07-09 Thread Mario Kleiner

On Jul 9, 2010, at 6:15 PM, Michel Dänzer wrote:


[...]


For the old intel DRI1 swap scheduling hack, I solved this by
making the
MSC not correspond to any specific CRTC counter directly but making
it a
'virtual' counter which increases at the same rate as the CRTC the
window is currently being synchronized to. Looking at the
GLX_OML_sync_control spec again, I think this solution should  
still be

feasible in general, as all the extension calls take a drawable
parameter.


Hmm. I'm not sure if i as a toolkit developer would like that kind of
virtualization. What i do is i have a given target system time at
which the swap should happen - as close to that time as possible. I
use the msc count and the associated ust timestamp together with the
known video refresh rate and my given target time to translate the
target time into a target_msc. I use these counts and timestamps to
synchronize with other modalities like sound, various i/o etc. and
often need sub-millisecond accurate timing. Sometimes i need to
synchronize swaps across multiple displays. If such a virtualized
counter wouldn't be very accurate or if the associated ust timestamps
wouldn't be very accurate, i'd be in trouble and the OML_sync_control
extension would lose most of its value to me.


The accuracy should be the same, and sophisticated apps / toolkits  
such

as yours can still do the same things when the window moves between
CRTCs. However, I suspect most apps / toolkits won't be as
sophisticated, and would be more likely to work sensibly.



I can see that maintaining a per-drawable virtual msc counter could  
work, although you'd need to do very careful forward and backward  
mappings from virtual - real crtc in various places, e.g.,  
translate to true vblank counts for scheduling vblank events in the  
kernel and then translating back at swap completion for  
glXWaitForSbcOML and the new INTEL_swap_events extension, taking  
possible movements of the drawable between scheduling the swap and  
completing the swap into account. I think the asynchronous nature of  
the new dri2 sync  swap bits can cause a couple of new headaches here.


One problem i can see is if you have multiple drawables synced to/on  
the same crtc and you want to schedule them to swap in sync or at  
specific msc offsets to each other. As long as all msc values refer  
to the same real msc of the crtc, everything is fine. If each  
drawable has its own virtualized msc, e.g., initialized to zero at  
drawable creation time and then monotonically incrementing, you have  
different reference points - msc counts are no longer comparable and  
meaningful between drawables.


-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Bug: Deadlock for multi-threaded glx apps inside __glXInitialize()

2010-07-26 Thread Mario Kleiner

Hi Kristian

Testing with current mesa master, my toolkit deadlocks on the first  
call to a glX function (glXChooseVisual()).


The deadlock was probably introduced by your recent commit:

glx: Use _Xglobal_lock for protecting extension display list

ab434f6b7641a64d30725a9ac24929240362d466

The problem is that the _Xglobal_lock is locked twice inside the  
__glXInitialize() function of mesa/src/glx/glxext.c, once inside  
__glXInitialize(), and then as part of dri2CreateDisplay() - ... -  
XextFindDisplay. The 2nd locking call on the already held lock  
deadlocks.


Attached a backtrace with the problem and a patch/hack that fixes  
it for me, but introduces a race-condition itself, so this is  
obviously not the correct solution. The race condition would trigger  
if two threads would simultaneously do their first call to a glX  
function for the same Display* dpy handle, rather unlikely to happen  
in practice? If so, then the patch might be  an acceptable fix until  
a better solution is found?


Other applications (glxgears, games etc.) work correctly. The  
difference is that my toolkit calls  XInitThreads() at startup to use  
xlib multi-threaded and afaik the locking calls only get enabled if  
xlib is switched to thread-safe mode, otherwise they are no-ops?


thanks,
-mario



0001-mesa-glx-Work-around-deadlock-on-_Xglobal_lock-in-__.patch
Description: Binary data
#0  0x0012d422 in __kernel_vsyscall ()
#1  0x031ccaf9 in __lll_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2  0x031c813b in _L_lock_748 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0x031c7f61 in pthread_mutex_lock () from /lib/tls/i686/cmov/libpthread.so.0
#4  0x0310fba6 in pthread_mutex_lock () from /lib/tls/i686/cmov/libc.so.6
#5  0x0292f41f in _XLockMutex (lip=0x2a25e34) at ../../src/locking.c:107
#6  0x034729af in XextFindDisplay (extinfo=0x0, dpy=0x84debb0) at 
../../src/extutil.c:232
#7  0x0344faff in DRI2FindDisplay (dpy=0x84debf0) at dri2.c:81
#8  0x0344fd6e in DRI2QueryExtension (dpy=0x84debf0, eventBase=0xbfffd9cc, 
errorBase=0xbfffd9c8) at dri2.c:184
#9  0x0344e653 in dri2CreateDisplay (dpy=0x84debf0) at dri2_glx.c:900
#10 0x0342a6ce in __glXInitialize (dpy=0x84debf0) at glxext.c:848
#11 0x0342764a in GetGLXPrivScreenConfig (dpy=0x84debf0, scrn=0, 
ppriv=0xbfffdbdc, ppsc=0xbfffdbd8) at glxcmds.c:183
#12 0x03427d46 in glXChooseVisual (dpy=0x84debf0, screen=0, 
attribList=0xbfffdcd8) at glxcmds.c:1495
#13 0x032dae3f in PsychOSOpenOnscreenWindow (screenSettings=0xbfffe114, 
windowRecord=0x8501010, numBuffers=2, stereomode=0, 
conserveVRAM=0) at Linux/Screen/PsychWindowGlue.c:288
#14 0x032ee883 in PsychOpenOnscreenWindow (screenSettings=0xbfffe114, 
windowRecord=0xbfffe110, numBuffers=2, stereomode=0, 
rect=0xbfffe1b0, multiSample=0, sharedContextWindow=0x0) at 
Common/Screen/PsychWindowSupport.c:295
#15 0x0330e57c in SCREENOpenWindow () at Common/Screen/SCREENOpenWindow.c:301
#16 0x0334b79b in mexFunction (nlhs=1, plhs=0xb5ee7018, nrhs=4, 
prhs=0xb5ee7008) at Common/Base/PsychScriptingGlue.cc:959
#17 0x004e0f25 in call_mex(bool, void*, octave_value_list const, int, 
octave_mex_function*) ()
   from /usr/lib/octave-3.2.3/liboctinterp.so
#18 0x006f8868 in octave_mex_function::do_multi_index_op(int, octave_value_list 
const) ()
   from /usr/lib/octave-3.2.3/liboctinterp.so



*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [RFC] DRI2 synchronization and swap bits

2009-11-01 Thread Mario Kleiner
Hello everybody

My name is Mario Kleiner and i'm new to this list, so i apologize  
beforehand should i violate some rules of netiquette, state the  
totally obvious, or if this post is somehow considered off-topic or  
way too long. Please tell me if so, and how to do better next time.  
First some background to why i am posting, then some proposals more  
to the point of this RFC.

I read this RFC and i'm very excited about the prospect of having  
well working support for the OML_sync_control extension in DRI2 on  
Linux/X11. I was hoping for this to happen since years, so a big  
thank you in advance! This is why i hope to provide some input from  
the perspective of future power-users of functions like  
glXGetSyncValuesOML(), glXSwapBuffersMscOML(), glXWaitForSbcOML. I'm  
the co-developer of a popular free-software toolkit (Psychtoolbox)  
that is used mostly in the neuroscience / cognitive science community  
by scientist to find out how the different senses (visual, auditory,  
haptic, ...) work and how they work together. Our requirements to  
graphics are often much more demanding than what a videogame, typical  
vr-environment or a mediaplayer has.

Our users often have very strict requirements for scheduling frame- 
accurate and tear-free visual stimulus display, synchronizing  
bufferswaps across display-heads, and low-latency returns from swap- 
completion. Often they need swap-completion timestamps which are  
available with the shortest possible delay after a successfull swap  
and accurately tied to the vblank at which scanout of a swapped frame  
started. The need for timestamps with sub-millisecond accuracy is not  
uncommon. Therefore, well working OML_sync_control support would be  
basically a dream come true and a very compelling feature for Linux  
as a platform for cognitive science.

I spent the last 12 hours reading the CompositeSwap page at the DRI- 
Wiki and through Jesse Barnes git-tree and the drivers/gpu/drm/ 
drm_irq.c file in the linux-next git-tree at kernel org, which i  
assume (correctly?) is the current state of art wrt. to the DRM, and  
have some thoughts or wishes.

1. Wrt to 2) DRI2WaitMSC/SBC a) Concern about blocking the client on  
the server side as opposed to a client side wait.

I'm not sure about the extra latency involved by blocking the client  
on the server side, instead of a client side wait, but i can assure  
you that for our applications, 1 millisecond extra delay between swap- 
completion and unblocking can make a significant difference. Quite  
often certain actions need to be triggered in sync with swap  
completion. Examples are starting recording equipment for brain  
activity (fMRI, EEG, MEG, eye-trackers) or other physiological  
responses, starting sound playback or recording, sending trigger  
packets over a network, driving special digital/analog I/O boards,  
driving motion simulators etc. So low-latency unblocking would be  
much appreciated from our side.

2. On the CompositePage in the DRM Wiki, there is this comment:  
...It seems that composited apps should never need to know about  
real world screen vblank issues, ... When dealing with a  
redirected window it seems it would be acceptable to come up with an  
entirely fake number for all existing extensions that care about  
vblanks..

I don't like this idea about entirely fake numbers and like to vote  
for a solution that is as close as possible to the non-redirected  
case. Most of our applications run in non-redirected, full-screen,  
undecorated, page-flipped windows, ie., without a compositor being  
involved. I can think of a couple future usage cases though where  
reasonably well working redirected/composited windows would be very  
useful for us, but only if we get meaningful timestamps and vblank  
counts that are tied to the actual display onset.

3. The Wiki also mentions The direct rendered cases outlined in the  
implementation notes above are complete, but there's a bug in the  
async glXSwapBuffers that sometimes causes clients to hang after  
swapping rather than continue. Looking through the code of http:// 
cgit.freedesktop.org/~jbarnes/xf86-video-intel/tree/src/i830_dri.c? 
id=a0e2e624c47516273fa3d260b86d8c293e2519e4 i can see that in  
I830DRI2SetupSwap() and I830DRI2SetupWaitMSC(), in the if (divisor  
== 0) { ...} path, the functions return after DRM_VBLANK_EVENT  
submission without assigning *event_frame = vbl.reply.sequence;  This  
looks problematic to me, as the xserver is later submitting  
event_frame in the call to DRI2AddFrameEvent() inside DRI2SwapBuffers 
() as a cookie to find the right events for clients to wait on? Could  
this be a reason for clients hanging after swap? I found a few other  
spots where i other misunderstood something or there are small bugs.  
What is the appropriate way to report these?

4. According to spec, the different OML_sync_control functions do  
return a UST timestamp which is supposed to reflect the exact

Re: [RFC] DRI2 synchronization and swap bits

2009-11-07 Thread Mario Kleiner
 this beamposition timestamping in userspace to get rid  
of most of the scheduling/wakeup delay and jitter after swap  
completion on Windows and MacOS/X. Both systems provide a function to  
query the current scanout position. It is so far the only method that  
allows us to achieve sub-millisecond precision for our timestamps. It  
also allows us to provide a timestamp for vbl onset and a timestamp  
for start of scanout of a new frame. If Linux could do this within  
its timestamping code by default, it would be even more accurate. Our  
userspace code can get preempted at the wrong moment, but kernel code  
inside the irq handler should be more robust. Also it would allow a  
more precise implementation of the UST timestamp according to the  
OML_sync_control spec, as that spec asks for UST being the time of  
start of scanout of the first scanline of a new video frame, instead  
of start of vblank.

Let me know what you think about this,
-mario



*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled.
(Richard Feynman)

___
xorg-devel mailing list
xorg-devel@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-devel


DRI3/Present fixes for XServers 1.16 and 1.17rc

2014-11-24 Thread Mario Kleiner
Hi,

i finally figured out why dri3/present failed so miserably with my
neuro-science application, which relies on OML_sync_control and
INTEL_swap_events extensively and worked nicely under dri2 for years.

I found so far two bugs in the x-server and 3 bugs in mesas
dri/present backend. I'll send out the mesa patches with fixes
separately.

The following two patches fix the x-server part for me. First one
only caused pain during debugging - a bug in the DebugPresent macros,
causing server crash on logout, closing of unredirected fullscreen
windows, or toggling composition on/off whenever the debug macros
were compiled in.

The second patch fixes nouveau's massive tearing and various
OML_sync_control related failures caused by lack of vsynced pageflips.
The problem doesn't happen on intel, at least with the tested ddx
versions, because intel-ddx (as of 2.99.914) never reports
AsyncFlipCapability on my Intel HD Ironlake and Intel HD 4000 IvyBridge,
so possibly a bug in the intel driver which cancels out with the server
bug?

These patches were tested to apply cleanly on master and on top of the
1.16.2 branch. They were tested on Intel HD, Intel HD-4000 and with
nouveau on a NVidia gpu on top of 1.16.2.

Could these be applied to 1.16 and later? 

thanks,
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 1/2] present: Avoid crashes in DebugPresent(), a bit more info.

2014-11-24 Thread Mario Kleiner
DebugPresent() crashed the server when a dri3 drawable
was closed while a pageflipped present was still pending,
due to vblank-window- Null-Ptr deref, so debug builds
caused new problems to debug.

E.g.,

glXSwapBuffers(...);
glXDestroyWindow(...);
- Pageflip for non-existent window completes - boom.

Also often happens when switching desktop compositor on/off
due to Present unflips, or when logging out of session.

Also add info if a Present is queued for copyswap or pageflip,
if the present is vsynced, and the serial no of the Present
request, to aid debugging of pageflip and vsync issues. The
serial number is useful as Mesa's dri3/present backend encodes
its sendSBC in the serial number, so one can easily correlate
server debug output with Mesa and with the SBC values returned
to actual OpenGL client applications via OML_sync_control and
INTEL_swap_events extension, makes debugging quite a bit more
easy.

Please also cherry-pick this for a 1.16.x stable update.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/present/present.c b/present/present.c
index cf283f4..d84bdef 100644
--- a/present/present.c
+++ b/present/present.c
@@ -435,7 +435,7 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t 
ust, uint64_t crtc_msc)
 DebugPresent((\tn %lld %p %8lld: %08lx - %08lx\n,
   vblank-event_id, vblank, vblank-target_msc,
   vblank-pixmap ? vblank-pixmap-drawable.id : 0,
-  vblank-window-drawable.id));
+  vblank-window ? vblank-window-drawable.id : 0));
 
 assert (vblank == screen_priv-flip_pending);
 
@@ -851,10 +851,10 @@ present_pixmap(WindowPtr window,
 }
 
 if (pixmap)
-DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p)\n,
+DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p) flip %d vsync 
%d serial %d\n,
   vblank-event_id, vblank, target_msc,
   vblank-pixmap-drawable.id, vblank-window-drawable.id,
-  target_crtc));
+  target_crtc, vblank-flip, vblank-sync_flip, 
vblank-serial));
 
 xorg_list_add(vblank-event_queue, present_exec_queue);
 vblank-queued = TRUE;
@@ -946,7 +946,7 @@ present_vblank_destroy(present_vblank_ptr vblank)
 DebugPresent((\td %lld %p %8lld: %08lx - %08lx\n,
   vblank-event_id, vblank, vblank-target_msc,
   vblank-pixmap ? vblank-pixmap-drawable.id : 0,
-  vblank-window-drawable.id));
+  vblank-window ? vblank-window-drawable.id : 0));
 
 /* Drop pixmap reference */
 if (vblank-pixmap)
-- 
2.1.0

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync.

2014-11-24 Thread Mario Kleiner
Pageflips for Pixmap presents were not synchronized to vblank
by default, as they should be, due to some missing init for
vblank-sync_flips. The PresentOptionAsync flag was completely
ignored for controlling this.

Vsynced flips only worked by accident on the intel-ddx, as that
driver doesn't report PresentCapabilityAsync support (tested
on Intel HD Ironlake and Intel HD 4000 IvyBridge + Linux 3.17).
This lack might be a bug in the intel-ddx itself?

On nouveau-ddx, which properly reports PresentCapabilityAsync,
this always caused non-vsynced pageflips and pretty ugly tearing.

This patch fixes the problem, as tested on top of XOrg 1.16.2
on nouveau and intel.

Please also apply to XOrg 1.16 stable.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/present/present.c b/present/present.c
index d84bdef..f24a218 100644
--- a/present/present.c
+++ b/present/present.c
@@ -737,6 +737,10 @@ present_pixmap(WindowPtr window,
  */
 window_priv-msc = crtc_msc;
 
+/* Clear async request if driver can't support it for pixmap present */
+if (pixmap  (!screen_priv-info || !(screen_priv-info-capabilities  
PresentCapabilityAsync)))
+options = ~PresentOptionAsync;
+
 /* Adjust target_msc to match modulus
  */
 if (crtc_msc = target_msc) {
@@ -828,9 +832,10 @@ present_pixmap(WindowPtr window,
 vblank-msc_offset = window_priv-msc_offset;
 vblank-notifies = notifies;
 vblank-num_notifies = num_notifies;
+vblank-sync_flip = TRUE;
 
-if (!screen_priv-info || !(screen_priv-info-capabilities  
PresentCapabilityAsync))
-vblank-sync_flip = TRUE;
+if (options  PresentOptionAsync)
+vblank-sync_flip = FALSE;
 
 if (pixmap  present_check_flip (target_crtc, window, pixmap, 
vblank-sync_flip, valid, x_off, y_off)) {
 vblank-flip = TRUE;
-- 
2.1.0

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH] dri2: add initial prime support. (v1.2)

2012-07-03 Thread Mario Kleiner

On 03.07.2012, at 16:22, Dave Airlie wrote:

 From: Dave Airlie airl...@redhat.com
 
 This adds the initial prime support for dri2 offload. The main thing is
 when we get a connection from a prime client, we stored the information
 and mark all drawables from that client as prime. We then create all
 buffers for that drawable on the prime device dri2screen.
 
 Then DRI2UpdatePrime is provided which drivers can call to get a shared
 pixmap which they can use as the front buffer. The driver is then
 responsible for doing the back-front copy to the shared buffer.
 
 prime requires a compositing manager be run, but it handles the case where
 a window get un-redirected by allocating a new pixmap and pointing the crtc
 at it while the client is in that state.
 
 Currently prime can't handle pageflipping, so always does straight copy swap,
 
 v1.1: renumber on top of master.
 v1.2: fix auth on top of master.
 
 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
 glx/glxdri2.c |2 +-
 hw/xfree86/dri2/dri2.c|  265 +
 hw/xfree86/dri2/dri2.h|   25 -
 hw/xfree86/dri2/dri2ext.c |4 +-
 4 files changed, 267 insertions(+), 29 deletions(-)
 
 diff --git a/glx/glxdri2.c b/glx/glxdri2.c
 index 7b76c3a..984f115 100644
 --- a/glx/glxdri2.c
 +++ b/glx/glxdri2.c
 @@ -840,7 +840,7 @@ __glXDRIscreenProbe(ScreenPtr pScreen)
 return NULL;
 
 if (!xf86LoaderCheckSymbol(DRI2Connect) ||
 -!DRI2Connect(pScreen, DRI2DriverDRI,
 +!DRI2Connect(serverClient, pScreen, DRI2DriverDRI,
  screen-fd, driverName, deviceName)) {
 LogMessage(X_INFO,
AIGLX: Screen %d is not DRI2 capable\n, pScreen-myNum);
 diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
 index d3b3c73..0f87820 100644
 --- a/hw/xfree86/dri2/dri2.c
 +++ b/hw/xfree86/dri2/dri2.c
 @@ -45,7 +45,7 @@
 #include dixstruct.h
 #include dri2.h
 #include xf86VGAarbiter.h
 -
 +#include damage.h
 #include xf86.h
 
 CARD8 dri2_major;   /* version of DRI2 supported by DDX */
 @@ -63,6 +63,17 @@ static DevPrivateKeyRec dri2PixmapPrivateKeyRec;
 
 #define dri2PixmapPrivateKey (dri2PixmapPrivateKeyRec)
 
 +static DevPrivateKeyRec dri2ClientPrivateKeyRec;
 +
 +#define dri2ClientPrivateKey (dri2ClientPrivateKeyRec)
 +
 +#define dri2ClientPrivate(_pClient) 
 (dixLookupPrivate((_pClient)-devPrivates, \
 +  dri2ClientPrivateKey))
 +
 +typedef struct _DRI2Client {
 +int prime_id;
 +} DRI2ClientRec, *DRI2ClientPtr;
 +
 static RESTYPE dri2DrawableRes;
 
 typedef struct _DRI2Screen *DRI2ScreenPtr;
 @@ -87,6 +98,9 @@ typedef struct _DRI2Drawable {
 int swap_limit; /* for N-buffering */
 unsigned long serialNumber;
 Bool needInvalidate;
 +int prime_id;
 +PixmapPtr prime_slave_pixmap;
 +PixmapPtr redirectpixmap;
 } DRI2DrawableRec, *DRI2DrawablePtr;
 
 typedef struct _DRI2Screen {
 @@ -113,14 +127,47 @@ typedef struct _DRI2Screen {
 HandleExposuresProcPtr HandleExposures;
 
 ConfigNotifyProcPtr ConfigNotify;
 +DRI2CreateBuffer2ProcPtr CreateBuffer2;
 +DRI2DestroyBuffer2ProcPtr DestroyBuffer2;
 +DRI2CopyRegion2ProcPtr CopyRegion2;
 } DRI2ScreenRec;
 
 +static void
 +destroy_buffer(DrawablePtr pDraw, DRI2BufferPtr buffer, int prime_id);
 +
 static DRI2ScreenPtr
 DRI2GetScreen(ScreenPtr pScreen)
 {
 return dixLookupPrivate(pScreen-devPrivates, dri2ScreenPrivateKey);
 }
 
 +static ScreenPtr
 +GetScreenPrime(ScreenPtr master, int prime_id)
 +{
 +ScreenPtr slave;
 +int i;
 +
 +if (prime_id == 0 || xorg_list_is_empty(master-offload_slave_list)) {
 +return master;
 +}
 +i = 0;
 +xorg_list_for_each_entry(slave, master-offload_slave_list, 
 offload_head) {
 +if (i == (prime_id - 1))
 +break;
 +i++;
 +}
 +if (!slave)
 +return master;
 +return slave;
 +}
 +
 +static DRI2ScreenPtr
 +DRI2GetScreenPrime(ScreenPtr master, int prime_id)
 +{
 +ScreenPtr slave = GetScreenPrime(master, prime_id);
 +return DRI2GetScreen(slave);
 +}
 +  
 static DRI2DrawablePtr
 DRI2GetDrawable(DrawablePtr pDraw)
 {
 @@ -187,7 +234,8 @@ DRI2AllocateDrawable(DrawablePtr pDraw)
 xorg_list_init(pPriv-reference_list);
 pPriv-serialNumber = DRI2DrawableSerial(pDraw);
 pPriv-needInvalidate = FALSE;
 -
 +pPriv-redirectpixmap = NULL;
 +pPriv-prime_slave_pixmap = NULL;
 if (pDraw-type == DRAWABLE_WINDOW) {
 pWin = (WindowPtr) pDraw;
 dixSetPrivate(pWin-devPrivates, dri2WindowPrivateKey, pPriv);
 @@ -286,6 +334,7 @@ DRI2CreateDrawable(ClientPtr client, DrawablePtr pDraw, 
 XID id,
DRI2InvalidateProcPtr invalidate, void *priv)
 {
 DRI2DrawablePtr pPriv;
 +DRI2ClientPtr dri2_client = dri2ClientPrivate(client);
 XID dri2_id;
 int rc;
 
 @@ -295,6 +344,8 @@ DRI2CreateDrawable(ClientPtr client, 

Re: [PATCH] dri2: add initial prime support. (v1.2)

2012-07-05 Thread Mario Kleiner
On 03.07.2012, at 20:47, Dave Airlie wrote:

 On Tue, Jul 3, 2012 at 7:39 PM, Mario Kleiner
 mario.klei...@tuebingen.mpg.de wrote:
 
 On 03.07.2012, at 16:22, Dave Airlie wrote:
 
 From: Dave Airlie airl...@redhat.com
 
 This adds the initial prime support for dri2 offload. The main thing is
 when we get a connection from a prime client, we stored the information
 and mark all drawables from that client as prime. We then create all
 buffers for that drawable on the prime device dri2screen.
 

...

 
 +DrawablePtr DRI2UpdatePrime(DrawablePtr pDraw, DRI2BufferPtr pDest)
 +{
 +DRI2DrawablePtr pPriv = DRI2GetDrawable(pDraw);
 +PixmapPtr spix;
 +PixmapPtr mpix = GetDrawablePixmap(pDraw);
 +ScreenPtr master, slave;
 +Bool ret;
 +
 +master = mpix-drawable.pScreen;
 +
 +if (pDraw-type == DRAWABLE_WINDOW) {
 + WindowPtr pWin = (WindowPtr)pDraw;
 + PixmapPtr pPixmap = pDraw-pScreen-GetWindowPixmap(pWin);
 +
 + if (pDraw-pScreen-GetScreenPixmap(pDraw-pScreen) == pPixmap) {
 + if (pPriv-redirectpixmap 
 + pPriv-redirectpixmap-drawable.width == pDraw-width 
 + pPriv-redirectpixmap-drawable.height == pDraw-height 
 + pPriv-redirectpixmap-drawable.depth == pDraw-depth) {
 + mpix = pPriv-redirectpixmap;
 + } else {
 +if (master-ReplaceScanoutPixmap) {
 +mpix = (*master-CreatePixmap)(master, pDraw-width, 
 pDraw-height,
 +   pDraw-depth, 
 CREATE_PIXMAP_USAGE_SHARED);
 +if (!mpix)
 +return NULL;
 +
 +ret = (*master-ReplaceScanoutPixmap)(pDraw, mpix, 
 TRUE);
 +if (ret == FALSE) {
 +(*master-DestroyPixmap)(mpix);
 +return NULL;
 +}
 

The following bit ok?

 I probably just don't understand the code well enough, but is a ...
 
 if (pPriv-redirectpixmap) (*master-DestroyPixmap)(pPriv-redirectpixmap);
 
 … missing here before the assignment of the new mpix?
 
 +pPriv-redirectpixmap = mpix;
 +} else
 +return NULL;
 + }
 + } else if (pPriv-redirectpixmap) {
 +(*master-ReplaceScanoutPixmap)(pDraw, pPriv-redirectpixmap, 
 FALSE);
 + (*master-DestroyPixmap)(pPriv-redirectpixmap);
 + pPriv-redirectpixmap = NULL;
 + }
 +}
 +

…

 
 If you define a new CopyRegion2 hook, it would be good to now also pass 
 along the pPriv-
 swap_interval parameter, or a bool vsync = pPriv-swap_interval ? true : 
 false flag. That would allow the ddx CopyRegion2() to skip sync to vblank 
 during copy if swap_interval is zero. Other OS'es and the NVidia/AMD binary 
 blobs on Linux treat a zero swap interval as don't sync to retrace. 
 DRI2SwapBuffers() currently skips all swap scheduling on zero swap interval 
 and uses CopyRegion(), but CopyRegion doesn't know it is a zero swap 
 interval/no-vsync swap, so it syncs anyway. All ddx'es have a 
 SwapBuffersWait xorg.conf option to disable this behaviour, but i think 
 that was added as a workaround, not really the amount of control that 
 applications expect.
 
 I could do that but I really would prefer we do things in iterations
 and not confuse the interfaces, so add a CopyRegion3 with this sort of
 thing.

Ok. Also Chris Wilson just posted an updated patch for an AsyncSwap hook, which 
would solve the same problem, depending on what gets in, in what order.

 
 The thing is I've not really considered timing because its really
 messy with two GPUs, current offload design is offload GPU calls
 CopyRegion2 which uses the hw copy engine to copy from the VRAM to
 GTT, the GTT object is shared with the other GPU and is the backing
 for the redirected window pixmap. So when the first driver copies to
 the second, it causes damage on the backing pixmap which gets sent to
 compositor which copies from it. Doing sync and flipping is a lot
 messier, since you probably want to sync everything on the displaying
 GPU.

Thanks for the explanation. Posting damage at the current place could also 
cause composition of partially copied frames, right? Just a question to see if 
i understand the damage tracking correctly.

 
 
 If so, how does a compositor in the redirected case know when the offload 
 slave has finished its blit to the redirectPixmap? How does the offload 
 slave sync to vblank of the masters crtc in the unredirected case when its 
 pPriv-redirectpixmap is set as a crtc's scanout pixmap?
 
 Again sync to vblank is definitely part 2, we've been looking at
 inter-gpu semaphores (by we I mean mlankhorst), but I'd appreciate any
 input you can give since you know this stuff fairly well ;-).

Not so sure how well i know this stuff, but i totally agreed to the definitely 
part 2 thing :). I'm happy if your current patches get in soonish. I think i 
will just shut up and wait

Re: Initial DRI3000 protocol specs available

2013-02-20 Thread Mario Kleiner

On 02/20/2013 09:27 PM, Keith Packard wrote:

Chris Wilson ch...@chris-wilson.co.uk writes:


What I don't see here is how the client instructs the server to
handle a missed swap.

Right, this first pass was just trying to replicate the DRI2 semantics;
figuring out how to improve those seems like a good idea.

 From what game developers have told us, a missed swap should just tear
instead of dropping a frame. It might be nice to inform the client that
they're not keeping up with the target frame rate and let them scale
stuff back; I'd suggest the SwapComplete event could contain enough
information to let them know what actually happened.


Please make this configurable. Tearing makes sense for a game, but for 
the kind of scientific apps i do, we don't want it to tear ever, or bad 
things would happen for us. We need it to just flip the frame delayed 
but vsync'ed and then the app can figure out via the INTEL_swap_events 
or glXWaitForSbcOML() that a deadline was missed and what to do to catch up.


There's 
http://www.opengl.org/registry/specs/EXT/glx_swap_control_tear.txt that 
allows apps to define if they want to tear or vsync on a missed swap 
deadline.



For example, with the typical use of
   swap-interval  0, divisor = 0, target = current
we can choose to either emit this SwapRegion synchronously, or
asynchronously (to risk tearing but allow the client catch up to its
target framerate).

I haven't heard anyone asking for us to skip a frame in this case to
avoid tearing.


See above :-).

...

In the reply, swap_hi/swap_lo form a 64-bit swap count value
when the swap will actually occur (e.g.  the last queued swap
count + (pending swap count * swap interval)).


t
I'm not sure exactly what SBC is meant to be. Is it a simple seqno of
the SwapRegion in this Drawable's swap queue (why then does
swap_interval matter), or is it meant to correlate with the vblank
counter (in which case it is merely a predicted value)?

Just like DRI2, it's the planned swap time. Don't be late!


SBC in DRI2 is the running count of completed swaps for a drawable, ie., 
current swap count + pending swap count, not the planned swap time. 
Essentially a reference to a just queued swap via sbc = 
glXSwapBuffersMscOML(...), so you can use sbc as a unique id for that 
swap in glXWaitForSBCOML() or to match it up with the sbc in a returned 
INTEL_swap_event. Very useful, so i guess this should stay 
backwards-compatible.


thanks,
-mario




___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-11-24 Thread Mario Kleiner
Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org
Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 src/glx/dri3_glx.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5796491..c53be1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
xcb_connection_t *c = XGetXCBConnection(dpy);
struct dri3_buffer *back;
int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
 
unsigned flags = __DRI2_FLUSH_DRAWABLE;
if (flush)
@@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   if (target_msc == 0)
  target_msc = priv-msc + priv-swap_interval * (priv-send_sbc - 
priv-recv_sbc);
 
+  if (priv-swap_interval == 0)
+  options |= XCB_PRESENT_OPTION_ASYNC;
+
   back-busy = 1;
   back-last_swap = priv-send_sbc;
   xcb_present_pixmap(c,
@@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
  None, /* target_crtc 
*/
  None,
  back-sync_fence,
- XCB_PRESENT_OPTION_NONE,
+ options,
  target_msc,
  divisor,
  remainder, 0, NULL);
-- 
1.9.1

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 2/3] glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc

2014-11-24 Thread Mario Kleiner
Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML()
from overwriting the (ust,msc) values of the last successfull
swapbuffers call (PresentPixmapCompleteNotify event), as
glXWaitForSbcOML() relies on those values corresponding to
the most recent completed swap, not to whatever was last
returned from the server.

Problematic call sequence without this patch would have been, e.g.,

glXSwapBuffers()
... wait ...
swap completes - PresentPixmapComplete event - (ust,msc)
updated to reflect swap completion time and count.
... wait for at least 1 video refresh cycle/vblank increment.

glXGetSyncValuesOML()
- PresentNotifyMsc event overwrites (ust,msc) of swap
completion with (ust,msc) of most recent vblank

glXWaitForSbcOML()
- Returns sbc of last completed swap but (ust,msc) of last
completed vblank, not of last completed swap.
- Client is confused.

Do this by tracking a separate set of (ust, msc) for the
dri3_wait_for_msc() call than for the dri3_wait_for_sbc()
call.

This makes the glXWaitForSbcOML() call robust again and restores
consistent behaviour with the DRI2 implementation.

Fixes applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org
Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 src/glx/dri3_glx.c  | 11 +++
 src/glx/dri3_priv.h |  5 -
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index b4ac278..5796491 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -420,11 +420,14 @@ dri3_handle_present_event(struct dri3_drawable *priv, 
xcb_present_generic_event_
 
  if (psc-show_fps_interval)
 show_fps(priv, ce-ust);
+
+ priv-ust = ce-ust;
+ priv-msc = ce-msc;
   } else {
  priv-recv_msc_serial = ce-serial;
+ priv-vblank_ust = ce-ust;
+ priv-vblank_msc = ce-msc;
   }
-  priv-ust = ce-ust;
-  priv-msc = ce-msc;
   break;
}
case XCB_PRESENT_EVENT_IDLE_NOTIFY: {
@@ -498,8 +501,8 @@ dri3_wait_for_msc(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   }
}
 
-   *ust = priv-ust;
-   *msc = priv-msc;
+   *ust = priv-vblank_ust;
+   *msc = priv-vblank_msc;
*sbc = priv-recv_sbc;
 
return 1;
diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
index 8e46640..222deb0 100644
--- a/src/glx/dri3_priv.h
+++ b/src/glx/dri3_priv.h
@@ -182,9 +182,12 @@ struct dri3_drawable {
uint64_t send_sbc;
uint64_t recv_sbc;
 
-   /* Last received UST/MSC values */
+   /* Last received UST/MSC values for pixmap present complete */
uint64_t ust, msc;
 
+   /* Last received UST/MSC values for vblank */
+   uint64_t vblank_ust, vblank_msc;
+
/* Serial numbers for tracking wait_for_msc events */
uint32_t send_msc_serial;
uint32_t recv_msc_serial;
-- 
1.9.1

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-11-30 Thread Mario Kleiner

On 11/25/2014 07:52 AM, Axel Davy wrote:

Hi,



Sorry for the late reply, seems my e-mail filters hate me - i was 
already worried about the lack of replies.


This patch removes the tripple buffering behaviour that the GLX 
implementation

has with DRI3. I understand your concern for Medical softwares,
but perhaps this would be better handled with an user option.

Axel Davy



I don't think it does that?

For swapinterval  0 all the triple-buffering etc. is left active as 
usual, and swaps are supposed to be vsynced and tear-free. swapinterval 
is 1 by default, so this is the case by default.


For a client or user selected swapinterval = 0, the code in Mesa 
schedules glXSwapBuffers calls to occur immediately, and the Present 
implementation in the server schedules them in a way so they are 
supposed to occur immediately/asap - assuming no vsync, otherwise the 
logic there wouldn't make much sense.


Due to what i'm pretty sure is a bug (see my other 2 patches on top of 
the xserver dri3/present implementation) the PresentOptionAsync isn't 
handled correctly by the server for pagelfips as it should be, and as 
result vsync is always on on intel-ddx, and always off on nouveau-ddx 
(or any future driver which exposes PresentCapabilityAsync as nouveau 
does), which causes massive tearing on nouveau all the time under 
dri3/present.


On any os/graphics-stack i'm familiar with (Windows and OSX all versions 
and drivers, Linux proprietary drivers, DRI2 with intel-ddx and 
nouveau-ddx) a swapinterval zero meant to swap immediately without 
vsync, so this just restores what probably any OpenGL client expects to 
happen. The MESA_swap_control spec even states that If interval is 
set to a value of 0, buffer swaps are not synchronized to a video frame.


For medical or neuro-science applications like mine, or multi-display 
stereo/VR applications it is crucial to have this for users being able 
to easily diagnose multi-display synchronization issues without the need 
to learn how to use git and compilers and how graphics stacks work. For 
gamers and gaming benchmarks it's the ego-shooter mode and squeeze 
out max fps they like so much.


Btw., digging deeper into the ddx'es i realized that my posted patch for 
the x-server bug is both incomplete and too complex. I've written a new 
patch which is only a simple one-liner but solves more problems than the 
old one. I'll send it out after giving it some testing.


thanks,
-mario


On 25/11/2014 04:00, Mario Kleiner wrote :

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org
Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
  src/glx/dri3_glx.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5796491..c53be1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

 xcb_connection_t *c = XGetXCBConnection(dpy);
 struct dri3_buffer *back;
 int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
   unsigned flags = __DRI2_FLUSH_DRAWABLE;
 if (flush)
@@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

if (target_msc == 0)
   target_msc = priv-msc + priv-swap_interval * 
(priv-send_sbc - priv-recv_sbc);

  +  if (priv-swap_interval == 0)
+  options |= XCB_PRESENT_OPTION_ASYNC;
+
back-busy = 1;
back-last_swap = priv-send_sbc;
xcb_present_pixmap(c,
@@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

   None, /* target_crtc */
   None,
   back-sync_fence,
- XCB_PRESENT_OPTION_NONE,
+ options,
   target_msc,
   divisor,
   remainder, 0, NULL);




___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-11-30 Thread Mario Kleiner

On 11/25/2014 09:31 AM, Frank Binns wrote:

Hi,

I sent exactly the same patch back in April. Despite getting a review-by
it was never merged. See:

http://lists.freedesktop.org/archives/mesa-dev/2014-April/058347.html
http://lists.freedesktop.org/archives/mesa-dev/2014-May/060432.html

Thanks
Frank



Indeed, exactly the same patch - minus white-space differences - already 
reviewed. So i guess i can add your additional signed-off-by and chris 
wilsons reviewed-by to this one?


I have a few more patches coming for this series, which fix two more 
mesa bugs related to dri3/present...


thanks,
-mario


On 25/11/14 03:00, Mario Kleiner wrote:

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org
Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
  src/glx/dri3_glx.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5796491..c53be1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

 xcb_connection_t *c = XGetXCBConnection(dpy);
 struct dri3_buffer *back;
 int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
   unsigned flags = __DRI2_FLUSH_DRAWABLE;
 if (flush)
@@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

if (target_msc == 0)
   target_msc = priv-msc + priv-swap_interval * 
(priv-send_sbc - priv-recv_sbc);

  +  if (priv-swap_interval == 0)
+  options |= XCB_PRESENT_OPTION_ASYNC;
+
back-busy = 1;
back-last_swap = priv-send_sbc;
xcb_present_pixmap(c,
@@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

   None, /* target_crtc */
   None,
   back-sync_fence,
- XCB_PRESENT_OPTION_NONE,
+ options,
   target_msc,
   divisor,
   remainder, 0, NULL);




___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)

2014-12-03 Thread Mario Kleiner
Pageflips for Pixmap presents were not synchronized to vblank on
drivers with support for PresentCapabilityAsync, due to some
missing init for vblank-sync_flips. The PresentOptionAsync
flag was completely ignored for pageflipped presents.

Vsynced flips only worked by accident on the intel-ddx, as that
driver doesn't have PresentCapabilityAsync support.

On nouveau-ddx, which supports PresentCapabilityAsync, this
always caused non-vsynced pageflips with pretty ugly tearing.

This patch fixes the problem, as tested on top of XOrg 1.16.2
on nouveau and intel.

Please also apply to XOrg 1.17 and XOrg 1.16.2 stable.

Applying on top of XOrg 1.16.2 may require cherry-picking
commit 2051514652481a83bd7cf22e57cb0fcd40333f33
which trivially fixes lack of support for protocol option
PresentOptionCopy - get two bug fixes for the price of one!

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/present/present.c b/present/present.c
index e5d3fd5..be1c9f1 100644
--- a/present/present.c
+++ b/present/present.c
@@ -834,7 +834,7 @@ present_pixmap(WindowPtr window,
 vblank-notifies = notifies;
 vblank-num_notifies = num_notifies;
 
-if (!screen_priv-info || !(screen_priv-info-capabilities  
PresentCapabilityAsync))
+if (!(options  PresentOptionAsync))
 vblank-sync_flip = TRUE;
 
 if (!(options  PresentOptionCopy) 
-- 
1.9.1

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 1/2] present: Avoid crashes in DebugPresent(), a bit more info.

2014-12-03 Thread Mario Kleiner
DebugPresent() crashed the server when a dri3 drawable
was closed while a pageflipped present was still pending,
due to vblank-window- Null-Ptr deref, so debug builds
caused new problems to debug.

E.g.,

glXSwapBuffers(...);
glXDestroyWindow(...);
- Pageflip for non-existent window completes - boom.

Also often happens when switching desktop compositor on/off
due to Present unflips, or when logging out of session.

Also add info if a Present is queued for copyswap or pageflip,
if the present is vsynced, and the serial no of the Present
request, to aid debugging of pageflip and vsync issues. The
serial number is useful as Mesa's dri3/present backend encodes
its sendSBC in the serial number, so one can easily correlate
server debug output with Mesa and with the SBC values returned
to actual OpenGL client applications via OML_sync_control and
INTEL_swap_events extension, makes debugging quite a bit more
easy.

Please also cherry-pick this for a 1.16.x stable update.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/present/present.c b/present/present.c
index ac9047e..e5d3fd5 100644
--- a/present/present.c
+++ b/present/present.c
@@ -440,7 +440,7 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t 
ust, uint64_t crtc_msc)
 DebugPresent((\tn %lld %p %8lld: %08lx - %08lx\n,
   vblank-event_id, vblank, vblank-target_msc,
   vblank-pixmap ? vblank-pixmap-drawable.id : 0,
-  vblank-window-drawable.id));
+  vblank-window ? vblank-window-drawable.id : 0));
 
 assert (vblank == screen_priv-flip_pending);
 
@@ -859,10 +859,10 @@ present_pixmap(WindowPtr window,
 }
 
 if (pixmap)
-DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p)\n,
+DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p) flip %d vsync 
%d serial %d\n,
   vblank-event_id, vblank, target_msc,
   vblank-pixmap-drawable.id, vblank-window-drawable.id,
-  target_crtc));
+  target_crtc, vblank-flip, vblank-sync_flip, 
vblank-serial));
 
 xorg_list_add(vblank-event_queue, present_exec_queue);
 vblank-queued = TRUE;
@@ -955,7 +955,7 @@ present_vblank_destroy(present_vblank_ptr vblank)
 DebugPresent((\td %lld %p %8lld: %08lx - %08lx\n,
   vblank-event_id, vblank, vblank-target_msc,
   vblank-pixmap ? vblank-pixmap-drawable.id : 0,
-  vblank-window-drawable.id));
+  vblank-window ? vblank-window-drawable.id : 0));
 
 /* Drop pixmap reference */
 if (vblank-pixmap)
-- 
1.9.1

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

DRI3/Present fixes for XServers 1.16 and 1.17rc - (v2)

2014-12-03 Thread Mario Kleiner
Hi,

an updated set of patches to fix the bugs i found in the
xserver dri3/present implementation and one bug in intel-ddx
uxa/dri3/present implementation. Axel Davys comments made me
rethink my original xserver patch and the new solution is
simple and better and afaics how this was actually intended
to work in the server, the server properly using the present_check_flip
ddx driver function.

Patch 1/2 fixes and slightly improves DebugPresent() macros for
the server to avoid crashes at logout, compositor en/disable or
closing windows while flips are pending when the server is compiled
with debug macros on.

Patch 2/2 fixes the use of PresentOptionAsync for page-flipped present,
and makes Present working on nouveau without horrible tearing.

These patches apply to master, 1.17rc and 1.16.2. They were tested
on top of 1.16.2 with the dri3/present backends of nouveau master
(glamor and exa) and intel master (sna and fixed uxa) on single-display
and dual-display, also ran through my hardware timing test equipment.

Patch uxa/present is a required fix for intel-ddx uxa backend, so
intel_present_check_flip no longer lies to the server about its
capabilities.

Can the x-server patches please also be included into the 1.16 series?

thanks,
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH] uxa/present: Handle sync_flip flag in intel_present_check_flip()

2014-12-03 Thread Mario Kleiner
Make sure we reject async flips if we don't support
async flips.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 src/uxa/intel_present.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/uxa/intel_present.c b/src/uxa/intel_present.c
index d20043f..d2aa9ee 100644
--- a/src/uxa/intel_present.c
+++ b/src/uxa/intel_present.c
@@ -58,6 +58,8 @@ struct intel_present_vblank_event {
uint64_tevent_id;
 };
 
+static Bool intel_present_has_async_flip(ScreenPtr screen);
+
 static uint32_t pipe_select(int pipe)
 {
if (pipe  1)
@@ -266,6 +268,9 @@ intel_present_check_flip(RRCrtcPtr  crtc,
 if (!bo)
 return FALSE;
 
+if (!sync_flip  !intel_present_has_async_flip(screen))
+return FALSE;
+
return TRUE;
 }
 
-- 
1.9.1

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)

2014-12-04 Thread Mario Kleiner
?

thanks,
-mario


Axel Davy

On 02/12/2014 20:08, Mario Kleiner wrote :

Pageflips for Pixmap presents were not synchronized to vblank on
drivers with support for PresentCapabilityAsync, due to some
missing init for vblank-sync_flips. The PresentOptionAsync
flag was completely ignored for pageflipped presents.

Vsynced flips only worked by accident on the intel-ddx, as that
driver doesn't have PresentCapabilityAsync support.

On nouveau-ddx, which supports PresentCapabilityAsync, this
always caused non-vsynced pageflips with pretty ugly tearing.

This patch fixes the problem, as tested on top of XOrg 1.16.2
on nouveau and intel.

Please also apply to XOrg 1.17 and XOrg 1.16.2 stable.

Applying on top of XOrg 1.16.2 may require cherry-picking
commit 2051514652481a83bd7cf22e57cb0fcd40333f33
which trivially fixes lack of support for protocol option
PresentOptionCopy - get two bug fixes for the price of one!

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
  present/present.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/present/present.c b/present/present.c
index e5d3fd5..be1c9f1 100644
--- a/present/present.c
+++ b/present/present.c
@@ -834,7 +834,7 @@ present_pixmap(WindowPtr window,
  vblank-notifies = notifies;
  vblank-num_notifies = num_notifies;
  -if (!screen_priv-info || !(screen_priv-info-capabilities  
PresentCapabilityAsync))

+if (!(options  PresentOptionAsync))
  vblank-sync_flip = TRUE;
if (!(options  PresentOptionCopy) 




___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 2/2] present: Fix use of vsynced pageflips and honor PresentOptionAsync. (v3)

2014-12-05 Thread Mario Kleiner

On 12/05/2014 12:56 AM, Eric Anholt wrote:

Mario Kleiner mario.kleiner...@gmail.com writes:


Pageflips for Pixmap presents were not synchronized to vblank on
drivers with support for PresentCapabilityAsync, due to some
missing init for vblank-sync_flips. The PresentOptionAsync
flag was completely ignored for pageflipped presents.

Vsynced flips only worked by accident on the intel-ddx, as that
driver doesn't have PresentCapabilityAsync support.

On nouveau-ddx, which supports PresentCapabilityAsync, this
always caused non-vsynced pageflips with pretty ugly tearing.

This patch fixes the problem, as tested on top of XOrg 1.16.2
on nouveau and intel.

Please also apply to XOrg 1.17 and XOrg 1.16.2 stable.

Applying on top of XOrg 1.16.2 may require cherry-picking
commit 2051514652481a83bd7cf22e57cb0fcd40333f33
which trivially fixes lack of support for protocol option
PresentOptionCopy - get two bug fixes for the price of one!

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
  present/present.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/present/present.c b/present/present.c
index e5d3fd5..be1c9f1 100644
--- a/present/present.c
+++ b/present/present.c
@@ -834,7 +834,7 @@ present_pixmap(WindowPtr window,
  vblank-notifies = notifies;
  vblank-num_notifies = num_notifies;
  
-if (!screen_priv-info || !(screen_priv-info-capabilities  PresentCapabilityAsync))

+if (!(options  PresentOptionAsync))
  vblank-sync_flip = TRUE;

I think I'd like to see a hunk like this in with this patch, so that
each driver doesn't need to have the cap check:

diff --git a/present/present.c b/present/present.c
index a9f2214..ed0d734 100644
--- a/present/present.c
+++ b/present/present.c
@@ -838,6 +838,9 @@ present_pixmap(WindowPtr window,
  vblank-sync_flip = TRUE;
  
  if (!(options  PresentOptionCopy) 

+!((options  PresentOptionAsync) 
+  (!screen_priv-info ||
+   !(screen_priv-info-capabilities  PresentCapabilityAsync))) 
  pixmap != NULL 
  present_check_flip (target_crtc, window, pixmap, vblank-sync_flip, 
valid, x_off, y_off))
  {

Seem reasonable?  If you wanted to squash this in, then this is:


I'm not sure if drivers will really avoid the cap check, as i assume the 
definition of the check_flip() function requires them to implement it 
anyway? Does some spec somewhere require them to do it? Do driver 
writers check all server implementations to see if they can get away 
with less?


But then having this hunk in doesn't hurt either, and it would keep the 
current intel-ddx uxa backends working, so i'll integrate it - after 
some urgently needed sleep.


Thanks for the review. These server patches are actually the critical 
ones for me. Without them in XOrg 1.16+, all the mesa fixes would be 
utterly useless for my kind of applications.


-mario


Reviewed-by: Eric Anholt e...@anholt.net

(So's patch 1/2, regardless).


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 1/2] present: Avoid crashes in DebugPresent(), a bit more info.

2014-12-05 Thread Mario Kleiner
DebugPresent() crashed the server when a dri3 drawable
was closed while a pageflipped present was still pending,
due to vblank-window- Null-Ptr deref, so debug builds
caused new problems to debug.

E.g.,

glXSwapBuffers(...);
glXDestroyWindow(...);
- Pageflip for non-existent window completes - boom.

Also often happens when switching desktop compositor on/off
due to Present unflips, or when logging out of session.

Also add info if a Present is queued for copyswap or pageflip,
if the present is vsynced, and the serial no of the Present
request, to aid debugging of pageflip and vsync issues. The
serial number is useful as Mesa's dri3/present backend encodes
its sendSBC in the serial number, so one can easily correlate
server debug output with Mesa and with the SBC values returned
to actual OpenGL client applications via OML_sync_control and
INTEL_swap_events extension, makes debugging quite a bit more
easy.

Please also cherry-pick this for a 1.16.x stable update.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
Reviewed-by: Eric Anholt e...@anholt.net
---
 present/present.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/present/present.c b/present/present.c
index ac9047e..e5d3fd5 100644
--- a/present/present.c
+++ b/present/present.c
@@ -440,7 +440,7 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t 
ust, uint64_t crtc_msc)
 DebugPresent((\tn %lld %p %8lld: %08lx - %08lx\n,
   vblank-event_id, vblank, vblank-target_msc,
   vblank-pixmap ? vblank-pixmap-drawable.id : 0,
-  vblank-window-drawable.id));
+  vblank-window ? vblank-window-drawable.id : 0));
 
 assert (vblank == screen_priv-flip_pending);
 
@@ -859,10 +859,10 @@ present_pixmap(WindowPtr window,
 }
 
 if (pixmap)
-DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p)\n,
+DebugPresent((q %lld %p %8lld: %08lx - %08lx (crtc %p) flip %d vsync 
%d serial %d\n,
   vblank-event_id, vblank, target_msc,
   vblank-pixmap-drawable.id, vblank-window-drawable.id,
-  target_crtc));
+  target_crtc, vblank-flip, vblank-sync_flip, 
vblank-serial));
 
 xorg_list_add(vblank-event_queue, present_exec_queue);
 vblank-queued = TRUE;
@@ -955,7 +955,7 @@ present_vblank_destroy(present_vblank_ptr vblank)
 DebugPresent((\td %lld %p %8lld: %08lx - %08lx\n,
   vblank-event_id, vblank, vblank-target_msc,
   vblank-pixmap ? vblank-pixmap-drawable.id : 0,
-  vblank-window-drawable.id));
+  vblank-window ? vblank-window-drawable.id : 0));
 
 /* Drop pixmap reference */
 if (vblank-pixmap)
-- 
2.1.0

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

DRI3/Present fixes for XServers 1.16 and 1.17rc - reviewed

2014-12-05 Thread Mario Kleiner
Ok, the final patchset. I made the changes Eric Anholt proposed,
retested, and got them reviewed by him. With those on top of
XServer 1.16.2, DRI3/Present works for me on all drivers and
backends (intel sna, uxa and nouveau exa, glamor), single and
dual-display fullscreen and windowed, with/without compositor,
vsynced or non-vsynced etc.

The patches apply cleanly to master, 1.17-rc and 1.16.2.

Thanks,
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH] sna: Also fix ZaphodHeads on Linux kernels older than 3.19

2014-12-06 Thread Mario Kleiner
Only 3.19 will have O_NONBLOCK for drm_read(), so the
current ddx will still stutter in ZaphodHead mode on
current kernels. Fix the problem by adding a poll() on
the drm fd before potentially blocking on read().

The logic is directly transplanted from the uxa backend
intel_mode_read_drm_events() function.

Fixes fdo bug #84744 on older kernels.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 src/sna/sna_display.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
index 163..a7ad6cc 100644
--- a/src/sna/sna_display.c
+++ b/src/sna/sna_display.c
@@ -7390,7 +7390,16 @@ fixup_flip:
 void sna_mode_wakeup(struct sna *sna)
 {
char buffer[1024];
-   int len, i;
+   int len, i, r;
+   struct pollfd p = { .fd = sna-kgem.fd, .events = POLLIN };
+
+   /* DRM read is blocking on old kernels, so poll first to avoid it. */
+   do {
+   r = poll(p, 1, 0);
+   } while (r == -1  (errno == EINTR || errno == EAGAIN));
+
+   if (r = 0)
+   return;
 
/* The DRM read semantics guarantees that we always get only
 * complete events.
-- 
1.9.1

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-15 Thread Mario Kleiner

On 12/15/2014 06:46 AM, Keith Packard wrote:

Mario Kleiner mario.kleiner...@gmail.com writes:


Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Hrm. I'd love for this to be controlled by the GLX_EXT_swap_control_tear
extension, but that one uses negative interval values to indicate
tearing, instead of providing a new API, and we can't tell the
difference between 0 and -0.

Are you sure you don't want GLX_EXT_swap_control_tear and an interval of
-1 instead of an interval of 0 with tearing?



Yes. GLX_EXT_swap_control_tear It's a different use case. Useful for 
games which want to avoid tearing if possible, but don't want to get in 
a tremor of switching frequently between say 60 fps and 30 fps if they 
only almost manage to do 60 fps, but not with reliable headroom. Would 
be also nice to have support for that, but only as an addition for 
swapinterval -1, not a replacement.


The 0 case is good for benchmarking.

In my specific case i always want vsync'ed swap for actual visual 
stimulation in neuroscience/medical settings, with no frame skipped 
ever. The bonus use for me, except for benchmarking how fast the system 
can go, is if one has a multi-display setup, e.g., dual-display for 
stereoscopic stimulation - one display per eye, or some CAVE like setup 
for VR with more than 2 displays. You want display updates and scanout 
on all of them synchronized, so the scene stays coherent. One simple way 
for visually testing multi-display sync is to intentionally swap all of 
them without vsync, e.g., timed to swap in the middle of the scanout. If 
the tear-lines on all displays are roughly at the same vertical position 
and stay there then that's a good visual test if stuff works. There are 
other ways to do it, but this is the one method that seems to work 
cross-platform, without lots of mental context switching depending on 
what os/gpu/server/driver combo with what settings one uses, and much 
more easy to grasp for scientists with no graphics background. You can 
see at a glance if stuff is roughly correct or not.


-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-16 Thread Mario Kleiner

On 12/16/2014 09:23 AM, Keith Packard wrote:

Mario Kleiner mario.kleiner...@gmail.com writes:


The 0 case is good for benchmarking.

Sure, but the current code does benchmarking just fine. In fact, because
it doesn't copy queued frames that aren't the most recent before the
vblank, benchmarks tend to run *faster* as a result, and people
generally like that aspect of it...



Hmm. For benchmarking i think i'd consider that a mild form of cheating. 
You get higher fps because you skip processing like the whole gpu blit 
overhead and host processing overhead for queuing / validating / 
processing the copy command in the command stream, so the benchmark 
numbers don't translate very well anymore in how the system would behave 
in a non-benchmark situation?


... but read on below ...


In my specific case i always want vsync'ed swap for actual visual
stimulation in neuroscience/medical settings, with no frame skipped
ever. The bonus use for me, except for benchmarking how fast the system
can go, is if one has a multi-display setup, e.g., dual-display for
stereoscopic stimulation - one display per eye, or some CAVE like setup
for VR with more than 2 displays. You want display updates and scanout
on all of them synchronized, so the scene stays coherent. One simple way
for visually testing multi-display sync is to intentionally swap all of
them without vsync, e.g., timed to swap in the middle of the scanout. If
the tear-lines on all displays are roughly at the same vertical position
and stay there then that's a good visual test if stuff works. There are
other ways to do it, but this is the one method that seems to work
cross-platform, without lots of mental context switching depending on
what os/gpu/server/driver combo with what settings one uses, and much
more easy to grasp for scientists with no graphics background. You can
see at a glance if stuff is roughly correct or not.

It seems like you want something that the GL API doesn't express
precisely; my reading of the  GL spec definitely lets Present work the
way it does today, and as you avoid tearing *and* improve performance in
the vblank_mode=0 case, I'm very reluctant to change it.


From GLX_EXT_swap_control and MESA_swap_control:

If interval is set to a value of 0, buffer swaps are not
synchronized to a video frame.


It depends on how you interpret the not synchronized to a video frame? 
Can you explain your interpretation?


I don't think the spec says anywhere that dropping old not most recent 
at vblank time frames is allowed, like Present does atm.? And the 
current Present implementation does synchronize the buffer swap of its 
most recent received Pixmap to the [onset of] a video frame, so while 
preventing tearing it goes a bit slower than it could go without vsync.


Every past/other implementation than DRI3/Present that i have experience 
with interpreted the spec the way that patch tries to restore, at least 
everything i know of from OSX/Windows/Linux proprietary/DRI2, so my 
interpretation is certainly a valid interpretation, and it is the one 
that provides consistency and therefore the least surprise to 
implementers of GL clients and end users.



Present could trivially offer a new bit to force tearing; I'm not sure
how you'd get at that from GL though.



It does already with PresentOptionAsync? It just needs to be used in 
accordance with the mainstream interpretation of the _swap_control spec, 
like this patch suggests.


I'm not trying to claim here that the current behaviour of Mesa+Present 
isn't useful for some types of applications like games. I'm just saying 
it shouldn't be the default behaviour for swapinterval 0 or  0. As far 
as i understand the meaning, intention and origin of the 
EXT_swap_control_tear extension, the current Present implementation 
would implement a useful approximation of EXT_swap_control_tear for a 
swapinterval of  0. Not an exact implementation, but at least following 
the spirit of that extension.


So i'm arguing for restoring the default behaviour any other 
implementation has with that patch, but providing the current behaviour 
via sync_control_tear? Or maybe even some new sync_control_tear2 to 
cover the difference between the current method and sync_control_tear.


When we are at the topic, i can also send you my christmas wish list 
with proposals for future mesa/server releases:


1 - Another thing i'd love to have, which would require a new option 
PresentOptionDontSkip is the ability to not skip present requests 
which are late. That would allow to take advantage of mesas triple/quad 
buffering to queue frames for animations ahead of time for playing 
animations or videos and be still certain that every queued frame was 
shown at least for one video refresh cycle. I'd love to take advantage 
of the new triple-buffering behaviour, or maybe even use Present 
directly somehow for deeper n-buffering, but for some of my types of 
application i'd need to be certain that frames

Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-16 Thread Mario Kleiner

On 12/17/2014 04:17 AM, Keith Packard wrote:

Mario Kleiner mario.kleiner...@gmail.com writes:


Hmm. For benchmarking i think i'd consider that a mild form of cheating.
You get higher fps because you skip processing like the whole gpu blit
overhead and host processing overhead for queuing / validating /
processing the copy command in the command stream, so the benchmark
numbers don't translate very well anymore in how the system would behave
in a non-benchmark situation?

It's still a very useful mode -- imagine wanting the lowest possible
latency between the user and the display; normally, you process input
and generate a frame just after vblank, then (if the rendering is really
quick), end up waiting most of the frame time not doing anything before
finally updating it at the next vblank.

With vblank_mode=0 and DRI3, you have the ability to try and generate
another frame before vblank comes; if you manage, then you get that data
on the screen, rather than the older version.

So, it offers latency closer to the tearing vblank_mode=0 but without
any tearing.

That's why I did it; the fact that it offers a small performance benefit
for benchmarks is an unintentional bonus feature.



It is a useful mode for some applications, no disagreement here. I can 
think of use cases where i would exactly want to have that, e.g., VR 
stuff with head-tracking - rendering - Occulus Rift style display. I 
actually want to tinker with something like that early next year and see 
if i can make use of it.


It's just that i need access to both, the old behaviour i described, and 
the new drop frame behaviour, and i need a way to select what i want 
at runtime via api without the need for easily overwhelmed and confused 
users to change config files or environment variables. I also always 
need meaningful and trustworthy feedback, at least for page-flipped 
presents, about what really happened for a presented frame - was it 
flipped, copied, exchanged, skipped, or did some error happen?


That's why i'd like to have an extension to INTEL_swap_events to also 
report some new completion type skipped and error and that one patch 
5/5 of mine for mesa reviewed and included, to make sure the swap_events 
don't fall apart so easily.


That's why i think we could (ab)use the swap_control_tear extension to 
implement this behaviour:


swapinterval  0 - sync to vblank, swap only at most every 
swapinterval'th refresh.
swapinterval = 0 - Don't sync to vblank (PresentOptionAsync), swap 
immediately.


swapinterval  0 - Do what your current implementation does for 
swapinterval = abs(swapinterval);


The first two would restore consistent behaviour with all other 
implementations.


The last one would implement something at least close to the spirit of 
of swap_control_tear. Although it would be even better to think of some 
extension to the extension so one can select if one really wants 
tearing if late as swap_control_tear defines, to squeeze out a few 
more msecs, or rather your tear free skip old frames method.


Essentially i'd love to have more control over how this stuff is done, 
and a default behaviour close to what all other implementations do, to 
reduce confusion.


As some kind of stop gap measure one could also think about defining a 
new vblank_mode to enable the new behaviour instead of the old one.


-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-16 Thread Mario Kleiner

On 12/17/2014 05:49 AM, Keith Packard wrote:

Mario Kleiner mario.kleiner...@gmail.com writes:


It's just that i need access to both, the old behaviour i described, and
the new drop frame behaviour, and i need a way to select what i want
at runtime via api without the need for easily overwhelmed and confused
users to change config files or environment variables. I also always
need meaningful and trustworthy feedback, at least for page-flipped
presents, about what really happened for a presented frame - was it
flipped, copied, exchanged, skipped, or did some error happen?

Present reports precisely what it did with each frame; flipped, copied,
or skipped.


That's why i'd like to have an extension to INTEL_swap_events to also
report some new completion type skipped and error and that one patch
5/5 of mine for mesa reviewed and included, to make sure the swap_events
don't fall apart so easily.

You can use Present events on the target drawable; they're generated to
whoever requests them, so you don't need to rely on the intel swap
events alone.


Never thought about that. Could you show me some short example snippet 
of XLib/GLX code how i reliably detect at runtime if Present is present, 
and then enable this? That would probably do for the moment and at the 
same time solve the problem that i don't know how to reliably detect at 
runtime if i'm on DRI2 or DRI3/Present. Making good use of this will 
require separate code-path and a way to select the right one.


It's still important to fix that wraparound handling bug from my patch 
5/5 for INTEL_swap_events.



As some kind of stop gap measure one could also think about defining a
new vblank_mode to enable the new behaviour instead of the old one.

I really don't have a good suggestion here, given that we have such a
limited API available.



I thought i just made a suggestion how we could wiggle through it within 
existing api?

And we can define new one for future extensions?

Need to sleep now, thanks
-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [RFC 5/5] modesetting: Implement page flipping support for Present.

2015-02-08 Thread Mario Kleiner
)
 +return FALSE;
 +ms-flip_vblank_event-event_id = event_id;
 +
 +new_front_bo.gbm = glamor_gbm_bo_from_pixmap(screen, new_front);
 +new_front_bo.dumb = NULL;
 +if (!new_front_bo.gbm) {
 +xf86DrvMsg(scrn-scrnIndex, X_ERROR,
 +   Failed to get GBM bo for flip to new front.\n);
 +free(ms-flip_vblank_event);
 +ms-flip_vblank_event = NULL;
 +return FALSE;
 +}
 +
 +/* Create a new handle for the back buffer */
 +if (drmModeAddFB(ms-fd, scrn-virtualX, scrn-virtualY,
 + scrn-depth, scrn-bitsPerPixel,
 + drmmode_bo_get_pitch(new_front_bo),
 + drmmode_bo_get_handle(new_front_bo), new_fb_id))
 +goto error_out;

- error_out doesn't free(ms-flip_vblank_event); and
ms-flip_vblank_event = NULL; Memory leak?


 +
 +drmmode_bo_destroy(ms-drmmode, new_front_bo);
 +
 +flags = DRM_MODE_PAGE_FLIP_EVENT;
 +if (async)
 +flags |= DRM_MODE_PAGE_FLIP_ASYNC;
 +
 +/* Queue flips on all enabled CRTCs.
 + *
 + * Note that if/when we get per-CRTC buffers, we'll have to update
 this.
 + * Right now it assumes a single shared fb across all CRTCs, with the
 + * kernel fixing up the offset of each CRTC as necessary.
 + *
 + * Also, flips queued on disabled or incorrectly configured displays
 + * may never complete; this is a configuration error.
 + */
 +ms-fe_msc = 0;
 +ms-fe_usec = 0;
 +
 +for (i = 0; i  config-num_crtc; i++) {
 +xf86CrtcPtr crtc = config-crtc[i];
 +
 +if (!ms_crtc_on(crtc))
 +continue;
 +
 +if (!queue_flip_on_crtc(screen, crtc, ref_crtc_vblank_pipe,
 new_fb_id,
 +flags)) {
 +goto error_undo;
 +}
 +}

- queue_flip_on_crtc() has at least one error path where it doesn't
free(ms-flip_vblank_event). Maybe that was partially the purpose of
the currently uncommented intel_crtc_apply() in error_undo?



 +
 +ms-drmmode.old_fb_id = ms-drmmode.fb_id;
 +ms-drmmode.fb_id = new_fb_id;
 +
 +assert(ms-flip_count = 0);
 +if (ms-flip_count == 0) {
 +ms_pageflip_complete(screen);
 +}
 +
 +return TRUE;
 +
 +error_undo:
 +drmModeRmFB(ms-fd, new_fb_id);

-- drmModeRmFB() could get called here even if a few
queue_flip_on_crtc() calls succeeded for a multi-crtc flip, so the FB
would be in use? Not sure though...

Feel free to add a

Reviewed-and-Tested-by: Mario Kleiner mario.kleiner...@gmail.com

for the bits i tested and reviewed if that helps. The other patches in
the series seem to be fine,

thanks,
-mario


 +for (i = 0; i  config-num_crtc; i++) {
 +if (config-crtc[i]-enabled) {
 +ErrorF(XXX: crtc apply\n);
 +/* intel_crtc_apply(config-crtc[i]); */
 +}
 +}
 +
 +error_out:
 +xf86DrvMsg(scrn-scrnIndex, X_WARNING, Page flip failed: %s\n,
 +   strerror(errno));
 +
 +assert(ms-flip_count = 0);
 +ms-flip_count = 0;
 +return FALSE;
 +}
 +
 +/*
 + * Test to see if page flipping is possible on the target crtc
 + */
 +static Bool
 +ms_present_check_flip(RRCrtcPtr crtc,
 +  WindowPtr window,
 +  PixmapPtr pixmap,
 +  Bool sync_flip)
 +{
 +ScreenPtr screen = window-drawable.pScreen;
 +ScrnInfoPtr scrn = xf86ScreenToScrn(screen);
 +modesettingPtr ms = modesettingPTR(scrn);
 +
 +/* We only support page flipping with Glamor, for now. */
 +if (!ms-drmmode.glamor)
 +return FALSE;
 +
 +if (!scrn-vtSema)
 +return FALSE;
 +
 +if (ms-drmmode.shadow_enable)
 +return FALSE;
 +
 +if (crtc  !ms_crtc_on(crtc-devPrivate))
 +return FALSE;
 +
 +/* Check stride, can't change that on flip */
 +if (pixmap-devKind != drmmode_bo_get_pitch(ms-drmmode.front_bo))
 +return FALSE;
 +
 +/* Make sure there's a bo we can get to */
 +/* XXX: actually do this.  also...is it sufficient?
 + * if (!glamor_get_pixmap_private(pixmap))
 + * return FALSE;
 + */
 +
 +return TRUE;
 +}
 +
 +/*
 + * Queue a flip on 'crtc' to 'pixmap' at 'target_msc'. If 'sync_flip' is
 true,
 + * then wait for vblank. Otherwise, flip immediately
 + */
 +static Bool
 +ms_present_flip(RRCrtcPtr crtc,
 +uint64_t event_id,
 +uint64_t target_msc,
 +PixmapPtr pixmap,
 +Bool sync_flip)
 +{
 +ScreenPtr screen = crtc-pScreen;
 +ScrnInfoPtr scrn = xf86ScreenToScrn(screen);
 +xf86CrtcPtr xf86_crtc = crtc-devPrivate;
 +drmmode_crtc_private_ptr drmmode_crtc = xf86_crtc-driver_private;
 +Bool ret;
 +
 +if (!ms_present_check_flip(crtc, screen-root, pixmap, sync_flip))
 +return FALSE;
 +
 +ret = ms_do_pageflip(screen, pixmap, drmmode_crtc-vblank_pipe,
 !sync_flip,
 + event_id);
 +if (!ret)
 +xf86DrvMsg(scrn-scrnIndex

Re: [PATCH 0/3] present: Improve interactions with compositing manager

2015-02-08 Thread Mario Kleiner
Hi Keith and Owen,

it's a bit too late for XOrg 1.17, but i reviewed and tested
Keith's patchset, after rebasing the 3rd patch on top of 1.17-rc2.
The series has my

Reviewed-and-Tested-by: Mario Kleiner mario.kleiner...@gmail.com

My patch 1/3 is Keith's patch 3/3, just rebased and with my
review/test tags for convenience. Patch 2/3 is an extension of
Keith's idea to target vblanks more than 1 frame into the future,
on top of Keith's patch. There is no patch 3/3, the numbering is a
credit to my never ending struggles with git send-email ;)

The commit message of patch 2/3 contains some test results on both
KDE/KWin and GNOME-3, the versions from KUbuntu 14.10. All done with
hardware timing measurement equipment to check when frames really show
up at the video output and if lying about timestamps and lag is reduced.

As you can see from those results, the patchset sort of helps on
an otherwise idle desktop under KDE/KWin, but not at all even with
one instance of glxgears running in parallel to the test app.

On GNOME-3 with my patch applied on top of Keith's it helps for vblanks  1
frame in the future, so it maybe could be useful for applications which
don't need fps == full video refresh rate, e.g., video players running
at 30/25/20/15 fps or similar. It doesn't work well for full 60 fps, as
Owen suspected. Strangely running a glxgears in the background on GNOME
does help all cases on GNOME, maybe by putting the gpu into a higher
performance state? This was on Intel HD-4000. I also did some testing on
nouveau with variable results.

Given the variability, there's probably not much of a point in applying those,
maybe at most after hiding them behind some xorg.conf option by default,
so people can experiment with it if they want. I don't care one way or
the other.

Maybe merging the first two patches as a cleanup would make sense.

-mario

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[PATCH 2/3] present: Extend compositor optimization to presents 1 vblank into future.

2015-02-08 Thread Mario Kleiner
Same principle as keithp's original patch 3/3, extended to handle
target_msc  crtc_msc + 1

1. If non-fullscreen window is redirected, schedule its vblank
   trigger event for 1 frame before target vblank, so offscreen
   copy and posting of damage happens in the frame before, through
   the regular present_execute() path, giving the compositor a
   chance to do its thing ahead of the target vblank.

2. After vblank_copy() posted the damage, it queues a new vblank
   event for 1 vblank into the future, which will call present_execute
   again, this time just for sending out the PresentCompleteNotify
   at the estimated vblank of true present completion.

   Unless something goes wrong with queueing, in which case it
   triggers immediate completion to avoid hangs, just as it does
   for unredirected copy presents.

Tested on Intel HD-4000 IvyBridge mobile with hardware timing
measurement equipment under KDE's KWin compositor and GNOME-3
for non-composited window copy swaps (= compositor off classic
mode), non-composited or unredirected fullscreen windows
(= kms-pageflip in use), and verified to do no harm to timing in
those cases. Testing the interesting case of redirected windows
under compositor, results are mixed:

1. KDE/KWin: Mostly correct timestamps (about 99% marking the correct
   vblank of true swap completion) under idle desktop. Things go
   downhill quickly if some additional load is present, e.g., a
   single instance of glxgears will cause at least 1 frame lag and
   all timestamps being 1 - 2 frames too early wrt. reality.

2. GNOME-3: Mostly correct timestamps for target_msc at least 2
   vblanks into the future (= 30 fps animation on 60 Hz panel),
   but almost always wrong - 1 frame too early - and somewhat
   jittery for 60 fps on 60 Hz panel. This on an otherwise idle
   desktop.

   When an instance of glxgears is running, timing *improves* to
   almost always correct timestamps for 60 fps, 30 fps, 20 fps,
   so somehow the extra load helps (gpu powermanagement upclocking,
   or somehow glxgears kicking the compositor in the right moment?)

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
Tested-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 90 +--
 1 file changed, 68 insertions(+), 22 deletions(-)

diff --git a/present/present.c b/present/present.c
index 2e7662a..b75017b 100644
--- a/present/present.c
+++ b/present/present.c
@@ -587,13 +587,16 @@ present_vblank_idle(present_vblank_ptr vblank)
 
 /*
  * Execute the present copy operation if it hasn't been done yet, then mark the
- * related objects as completed
+ * related objects as completed. Schedule a deferred completion for composited
+ * windows.
  */
-static void
+static Bool
 present_copy(present_vblank_ptr vblank)
 {
 if (!vblank-copied) {
-WindowPtr   window = vblank-window;
+WindowPtr   window = vblank-window;
+ScreenPtr   screen = vblank-screen;
+present_window_priv_ptr window_priv = 
present_get_window_priv(window, TRUE);
 
 vblank-copied = TRUE;
 
@@ -606,7 +609,46 @@ present_copy(present_vblank_ptr vblank)
 present_flush(window);
 
 present_vblank_idle(vblank);
+
+/* Was this a copy into a composited window? If so, queue a present
+ * completion 1 vblank into the future, as our best guess as to when
+ * the compositor will actually complete the presentation.
+ *
+ * Otherwise fallthrough to immediate completion. That's the right
+ * thing to do for a non-composited copy, and the best we can do for
+ * now in case this is called as a unredirected copy from a fallback
+ * caused by a failed PresentCompleteModeFlip.
+ */
+if (screen-GetWindowPixmap(window) != 
screen-GetScreenPixmap(screen)) {
+int ret;
+uint64_tust = 0, crtc_msc = 0;
+
+/* Re-add vblank to exec queue under same event_id */
+xorg_list_add(vblank-event_queue, present_exec_queue);
+xorg_list_append(vblank-window_list, window_priv-vblank);
+
+/* Queue one vblank from now into the future */
+ret = present_get_ust_msc(screen, vblank-crtc, ust, crtc_msc);
+
+if (ret == Success)
+ret = present_queue_vblank(screen, vblank-crtc,
+   vblank-event_id, crtc_msc + 1);
+
+/* On success, tell caller to not complete the present. Our 
deferred
+ * vblank event will do that by triggering present_execute() again.
+ */
+if (ret == Success)
+return TRUE;
+
+xorg_list_del(vblank-event_queue);
+xorg_list_del(vblank-window_list);
+
+DebugPresent((present_queue_vblank for deferred completion 
failed\n));
+}
 }
+
+/* Signal need for an immediate

[PATCH 1/3] present: When composited, get the bits into the window pixmap immediately (v2)

2015-02-08 Thread Mario Kleiner
From: Keith Packard kei...@keithp.com

When a window is composited, we want to notify the compositing manager
of the new contents *before* the next frame, so that it can prepare
them for display at the next frame, instead of doing the copy at that
frame time, and notifying the compositor so that it would always be a
frame behind.

This change catches the common case of a operation destined for the
next frame and immediately performs the copy, leaving the queue entry
around so that the PresentComplete event can be delivered at the
target frame time. This should give applications the right answer if
the Compositor is able to get the frame constructed before that frame
happens, which is at least better than the current situation where the
client receives the PresentComplete notify at the target frame, but
the contents will not be displayed until a subsequent frame.

Signed-off-by: Keith Packard kei...@keithp.com
Reviewed-by: Mario Kleiner mario.kleiner...@gmail.com

v2 (mario): Trivial rebase on top of 1.17.0-rc2
Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
Tested-by: Mario Kleiner mario.kleiner...@gmail.com
---
 present/present.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/present/present.c b/present/present.c
index bb7f33c..2e7662a 100644
--- a/present/present.c
+++ b/present/present.c
@@ -890,8 +890,25 @@ present_pixmap(WindowPtr window,
 vblank-queued = TRUE;
 if ((pixmap  target_msc = crtc_msc) || (!pixmap  target_msc  
crtc_msc)) {
 ret = present_queue_vblank(screen, target_crtc, vblank-event_id, 
target_msc);
-if (ret == Success)
+if (ret == Success) {
+/* If the window is composited, and the contents are
+ * destined for the next frame, just do the copy, sending
+ * damage along to the compositor.
+ *
+ * Leave the vblank around to send the completion event at
+ * vblank time
+ */
+if (pixmap  window  vblank-mode == PresentCompleteModeCopy 
+(target_msc - crtc_msc) = 1 
+screen-GetWindowPixmap(window) != 
screen-GetScreenPixmap(screen))
+{
+DebugPresent((\tC %p %8lld: %08lx - %08lx\n, vblank, 
crtc_msc,
+ vblank-pixmap-drawable.id, 
vblank-window-drawable.id));
+present_copy(vblank);
+}
+
 return Success;
+}
 
 DebugPresent((present_queue_vblank failed\n));
 }
-- 
2.1.0

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 5/5] modesetting: Implement page flipping support for Present.

2015-05-04 Thread Mario Kleiner

On 04/28/2015 06:31 AM, Kenneth Graunke wrote:

On Tuesday, April 21, 2015 05:58:44 PM Kenneth Graunke wrote:

Based on code by Keith Packard, Eric Anholt, and Jason Ekstrand.

v2:
- Fix double free and flip_count underrun (caught by Mario Kleiner).
- Don't leak flip_vblank_event on the error_out path (Mario).
- Use the updated ms_flush_drm_events API (Mario, Ken).

v3: Hack around DPMS shenanigans.  If all monitors are DPMS off, then
 there is no active framebuffer; attempting to pageflip will hit the
 error_undo paths, causing us to drmModeRmFB with no framebuffer,
 which confuses the kernel into doing full modesets and generally
 breaks things.  To avoid this, make ms_present_check_flip check that
 some CRTCs are enabled and DPMS on.  This is an ugly hack that would
 get better with atomic modesetting, or some core Present work.

v4:
- Don't do pageflipping if CRTCs are rotated (caught by Jason Ekstrand).
- Make pageflipping optional (Option PageFlip in xorg.conf.d), but
   enabled by default.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org


Upon further testing, there are still nasty DPMS bugs affecting this
patch.  We no longer crash the X server, but clients appear to
occasionally get stuck.

I guess we just need to bite the bullet and unflip on DPMS off.
Or switch to atomic modesetting...



I retested v4 + the attached patch with some small improvements applied. 
The patch adds async page flip cap support, fixes that missing zero init 
which Michel also reported and fixes some small inconsistency in the 
ms_present_unflip routine.


Everything worked nicely, except for the not quite resolved DPMS off 
bug. No more crashes, but page-flipped clients reliably hang after DPMS 
off - on. KDE with composition on and page flipping on (Setting 
Tearing prevention set to Full scene repaints) hangs after each 
off-on cycle, my own fullscreen app hangs reliably as well if 
page-flipped swaps are used.


Looking at stack traces it seems to me as if somehow after a dpms 
off/on, a present_event_notify() somehow doesn't get delivered to the 
server / gets lost somewhere, so clients like mine which wait for swap 
completion or vblank events hang, and KDE hangs in Mesa when Mesa tries 
to get a new backbuffer for rendering, runs out of buffers and then 
waits infinitely for a present notify event from the server in order to 
reuse one of its busy buffers.


-mario
From 3d1e2383d7bcd35cf0838f32f26b82b8118a17d5 Mon Sep 17 00:00:00 2001
From: Mario Kleiner mario.kleiner...@gmail.com
Date: Fri, 1 May 2015 09:01:41 +0200
Subject: [PATCH 8/8] modesetting: Add async flip support, fix small bugs.

Detect if kms driver is async flip capable.
Add missing zero init for number of crtc's on counting.
Fix small inconsistency.

Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
 hw/xfree86/drivers/modesetting/present.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/xfree86/drivers/modesetting/present.c b/hw/xfree86/drivers/modesetting/present.c
index 0ec0f40..693d4f3 100644
--- a/hw/xfree86/drivers/modesetting/present.c
+++ b/hw/xfree86/drivers/modesetting/present.c
@@ -504,7 +504,7 @@ ms_present_check_flip(RRCrtcPtr crtc,
 ScrnInfoPtr scrn = xf86ScreenToScrn(screen);
 modesettingPtr ms = modesettingPTR(scrn);
 xf86CrtcConfigPtr config = XF86_CRTC_CONFIG_PTR(scrn);
-int num_crtcs_on;
+int num_crtcs_on = 0;
 int i;
 
 if (!ms-drmmode.pageflip)
@@ -579,7 +579,7 @@ ms_present_unflip(ScreenPtr screen, uint64_t event_id)
 PixmapPtr pixmap = screen-GetScreenPixmap(screen);
 Bool ret;
 
-if (!ms_present_check_flip(NULL, screen-root, pixmap, FALSE))
+if (!ms_present_check_flip(NULL, screen-root, pixmap, TRUE))
 return;
 
 ret = ms_do_pageflip(screen, pixmap, -1, FALSE, event_id);
@@ -609,5 +609,14 @@ static present_screen_info_rec ms_present_screen_info = {
 Bool
 ms_present_screen_init(ScreenPtr screen)
 {
+ScrnInfoPtr scrn = xf86ScreenToScrn(screen);
+modesettingPtr ms = modesettingPTR(scrn);
+uint64_t value;
+int ret;
+
+ret = drmGetCap(ms-fd, DRM_CAP_ASYNC_PAGE_FLIP, value);
+if (ret == 0  value == 1)
+ms_present_screen_info.capabilities |= PresentCapabilityAsync;
+
 return present_screen_init(screen, ms_present_screen_info);
 }
-- 
1.9.1

___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH] present: Fix missed notify MSC computation

2015-05-08 Thread Mario Kleiner

On 02/16/2015 10:49 AM, Chris Wilson wrote:

Only treat divisor==0 as async to immediately report the actual vblank.
If the user species a non-zero divisor, we should compute the missed
vblank properly or else we report too early.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
---
  present/present.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/present/present.c b/present/present.c
index 1dddcc0..9a283d4 100644
--- a/present/present.c
+++ b/present/present.c
@@ -961,7 +961,7 @@ present_notify_msc(WindowPtr window,
0, 0,
NULL,
NULL, NULL,
-  PresentOptionAsync,
+  divisor == 0 ? PresentOptionAsync : 0,
target_msc, divisor, remainder, NULL, 0);
  }




Reviewed-by: Mario Kleiner mario.kleiner...@gmail.com

-mario
___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 1/2] present: Improve scaling of vblank handler

2015-05-08 Thread Mario Kleiner

On 02/14/2015 10:58 AM, Chris Wilson wrote:

With large numbers of queued vblank, the list iteration on every
interupt dominates processing time. If we reorder the list to be in
ascending event order, then not only is also likely to be in order for
notification queries (i.e. the notification will be near the start of
the list), we can also stop iterating when past the target event_id.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
---
  present/present.c | 24 +++-
  1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/present/present.c b/present/present.c
index ba88f79..b132f4d 100644
--- a/present/present.c
+++ b/present/present.c
@@ -482,19 +482,22 @@ present_flip_notify(present_vblank_ptr vblank, uint64_t 
ust, uint64_t crtc_msc)
  void
  present_event_notify(uint64_t event_id, uint64_t ust, uint64_t msc)
  {
-present_vblank_ptr  vblank, tmp;
+present_vblank_ptr  vblank;
  int s;

  if (!event_id)
  return;
  DebugPresent((\te %lld ust %lld msc %lld\n, event_id, ust, msc));
-xorg_list_for_each_entry_safe(vblank, tmp, present_exec_queue, 
event_queue) {
-if (vblank-event_id == event_id) {
+xorg_list_for_each_entry(vblank, present_exec_queue, event_queue) {
+int64_t match = event_id - vblank-event_id;
+if (match == 0) {
  present_execute(vblank, ust, msc);
  return;
  }
+if (match  0)
+break;
  }
-xorg_list_for_each_entry_safe(vblank, tmp, present_flip_queue, 
event_queue) {
+xorg_list_for_each_entry(vblank, present_flip_queue, event_queue) {
  if (vblank-event_id == event_id) {
  present_flip_notify(vblank, ust, msc);
  return;
@@ -887,7 +890,7 @@ present_pixmap(WindowPtr window,
vblank-pixmap-drawable.id, 
vblank-window-drawable.id,
target_crtc, vblank-flip, vblank-sync_flip, 
vblank-serial));

-xorg_list_add(vblank-event_queue, present_exec_queue);
+xorg_list_append(vblank-event_queue, present_exec_queue);
  vblank-queued = TRUE;
  if ((pixmap  target_msc = crtc_msc) || (!pixmap  target_msc  
crtc_msc)) {
  ret = present_queue_vblank(screen, target_crtc, vblank-event_id, 
target_msc);
@@ -911,7 +914,7 @@ no_mem:
  void
  present_abort_vblank(ScreenPtr screen, RRCrtcPtr crtc, uint64_t event_id, 
uint64_t msc)
  {
-present_vblank_ptr  vblank, tmp;
+present_vblank_ptr  vblank;

  if (crtc == NULL)
  present_fake_abort_vblank(screen, event_id, msc);
@@ -922,14 +925,17 @@ present_abort_vblank(ScreenPtr screen, RRCrtcPtr crtc, 
uint64_t event_id, uint64
  (*screen_priv-info-abort_vblank) (crtc, event_id, msc);
  }

-xorg_list_for_each_entry_safe(vblank, tmp, present_exec_queue, 
event_queue) {
-if (vblank-event_id == event_id) {
+xorg_list_for_each_entry(vblank, present_exec_queue, event_queue) {
+int64_t match = event_id - vblank-event_id;
+if (match == 0) {
  xorg_list_del(vblank-event_queue);
  vblank-queued = FALSE;
  return;
  }
+if (match  0)
+break;
  }
-xorg_list_for_each_entry_safe(vblank, tmp, present_flip_queue, 
event_queue) {
+xorg_list_for_each_entry(vblank, present_flip_queue, event_queue) {
  if (vblank-event_id == event_id) {
  xorg_list_del(vblank-event_queue);
  return;



Looks good to me, also no problems during testing on top of current 
master, so


Reviewed-and-tested-by: Mario Kleiner mario.kleiner...@gmail.com

-mario


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH 2/2] present: Fix presentation of flips out of order

2015-05-08 Thread Mario Kleiner

On 02/14/2015 10:58 AM, Chris Wilson wrote:

The flip queue currently only holds events submitted to the driver for
flipping, awaiting the completion notifier. It is short. We therefore
can speed up interrupt processing by keeping the small number of events
ready to be flipped on the end of the flip queue. By appending the
events to the flip_queue in the order that they become ready, we also
resolve one issue causing Present to display frames out of order.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
---
  present/present.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/present/present.c b/present/present.c
index b132f4d..bce41f4 100644
--- a/present/present.c
+++ b/present/present.c
@@ -327,10 +327,10 @@ present_re_execute(present_vblank_ptr vblank)
  static void
  present_flip_try_ready(ScreenPtr screen)
  {
-present_vblank_ptr  vblank, tmp;
+present_vblank_ptr  vblank;

-xorg_list_for_each_entry_safe(vblank, tmp, present_exec_queue, 
event_queue) {
-if (vblank-flip_ready) {
+xorg_list_for_each_entry(vblank, present_flip_queue, event_queue) {
+if (vblank-queued) {
  present_re_execute(vblank);
  return;
  }
@@ -623,6 +623,8 @@ present_execute(present_vblank_ptr vblank, uint64_t ust, 
uint64_t crtc_msc)
  DebugPresent((\tr %lld %p (pending %p unflip %lld)\n,
vblank-event_id, vblank,
screen_priv-flip_pending, 
screen_priv-unflip_event_id));
+xorg_list_del(vblank-event_queue);
+xorg_list_append(vblank-event_queue, present_flip_queue);
  vblank-flip_ready = TRUE;
  return;
  }
@@ -938,6 +940,7 @@ present_abort_vblank(ScreenPtr screen, RRCrtcPtr crtc, 
uint64_t event_id, uint64
  xorg_list_for_each_entry(vblank, present_flip_queue, event_queue) {
  if (vblank-event_id == event_id) {
  xorg_list_del(vblank-event_queue);
+vblank-queued = FALSE;
  return;
  }
  }



Ditto here. Looks good to me, also no problems during testing on top of 
current master, so


Reviewed-and-tested-by: Mario Kleiner mario.kleiner...@gmail.com

-mario
___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

  1   2   >