Re: [RFC xserver 0/3] Allow XWM to control Xwayland commits

2016-11-30 Thread Owen Taylor
Hi Pekka,

I don't have a lot of commentary to add here. Certainly getting the
frame-sync protocols right does require integration between Xwayland and
the compositing manager. I don't think there's that much virtue in spending
time on the extended version of the sync protocol and
_NET_WM_FRAME_TIMINGS, because, to my knowledge, those are implemented only
by GTK+ 3, and most GTK+ 3 apps will be running as native Wayland apps. On
the other hand, gtk2 and qt4 X clients will be around to exercise the basic
version of the protocol for the foreseeable future.
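
For reference, the basic version of the protocol is small on the client
side - a rough sketch (the atoms are assumed to be interned elsewhere and
the counter advertised via _NET_WM_SYNC_REQUEST_COUNTER; names here are
only illustrative):

    #include <X11/Xlib.h>
    #include <X11/extensions/sync.h>

    static Atom wm_protocols_atom, net_wm_sync_request_atom; /* interned elsewhere */
    static XSyncValue pending_value;

    /* WM asks for a frame: remember the 64-bit serial it sent. */
    void handle_sync_request (XClientMessageEvent *ev)
    {
        if (ev->message_type == wm_protocols_atom &&
            (Atom) ev->data.l[0] == net_wm_sync_request_atom)
            XSyncIntsToValue (&pending_value, ev->data.l[2], ev->data.l[3]);
    }

    /* After handling the matching ConfigureNotify and redrawing,
       tell the WM the frame is complete by updating the counter. */
    void frame_done (Display *dpy, XSyncCounter counter)
    {
        XSyncSetCounter (dpy, counter, pending_value);
    }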

Regards,
Owen

On Mon, Nov 28, 2016 at 8:47 AM, Pekka Paalanen  wrote:

> On Fri, 25 Nov 2016 12:30:18 +0200
> Pekka Paalanen  wrote:
>
> > On Fri, 25 Nov 2016 03:47:56 -0500 (EST)
> > Olivier Fourdan  wrote:
> >
> > > Hi Pekka,
> > >
> > > > this is probably the first time I'm sending patches for the xserver, so
> > > > pointing out API misuse, coding style issues etc. would be appreciated.
> > > > The last patch also has some XXX comments with questions.
> > > >
> > > > The first patch, refactoring, is not absolutely necessary (anymore - I
> > > > used to have a use for it while brewing this), but I think it makes
> > > > things look nicer.
> > > >
> > > > The fundamental problem is that Xwayland and the Wayland compositor
> > > > (window manager) are racing, particularly when windows are being mapped.
> > > > This can lead to glitches. The specific glitch I bumped into is described
> > > > at https://phabricator.freedesktop.org/T7622 .
> > > >
> > > > The proposed solution allows the WM to temporarily prohibit commits.
> > > >
> > > > It would be awkward for Weston to postpone mapping certain windows since
> > > > it would leak quite much Xwayland into the heart of the window manager.
> > > > A Wayland compositor also cannot ignore commits, because Xwayland will
> > > > wait for frame callbacks until it commits again. To not get stuck, one
> > > > would need to fire frame callbacks outside of their normal path. It seems
> > > > much simpler to just control Xwayland's commits, which also ensures that
> > > > the first commit will carry fully drawn window decorations too.
> > > >
> > > > This series is the Xwayland side of the fix to T7622. The Weston side is
> > > > still brewing, but I was able to test that the Xwayland code works.
> > >
> > > I've taken a look at https://phabricator.freedesktop.org/T7622 but I
> > > am not sure how that proposal would play with apps that implement the
> > > _NET_WM_SYNC protocol, using _NET_WM_SYNC_REQUEST and sync counters
> > > as described in EWMH [1]
> >
> > Hi,
> >
> > I have no idea. I don't know how that protocol works and Weston does
> > not use it. Yet.
> >
> > > GNOME (mutter,gtk3) go a bit farther even and use an extended version
> > > of this protocol described by Owen [2].
> > >
> > > I suspect these make the de-synchronization with the Xwayland toplevel
> > > decorated by the X window manager even more visible under Wayland, as
> > > the repaint is scheduled according to the app content and not sync'ed
> > > with Xwayland and repaint of the shadows by the XWM.
> > >
> > > Would your proposal help there?
> >
> > I would hope so, but given that I don't know how the sync protocol
> > works, I can't say. So far I'm just following the plan Daniel made.
> >
> > From what Daniel wrote about the sync protocol in T7622, it sounds like
> > "my" proposal is the first step toward making the sync protocol really
> > successful.
> >
> > I've been wondering though whether the client content sync
> > protocol should be implemented in the WM or could/should it be
> > implemented in Xwayland? Somehow it would feel natural to me, being a
> > foreigner to X11, that Xwayland would throttle its wl_surface.commits
> > based on both the client content sync protocol and the WM sync protocol
> > (this series). I don't think we would want to route every single commit
> > through the WM.
> >
> >
> > Thanks,
> > pq
> >
> > > [1] https://specifications.freedesktop.org/wm-spec/wm-spec-latest.html#idm140200472538288
> > > [2] http://fishsoup.net/misc/wm-spec-synchronization.html
>
> Hi,
>
> having read the above two specs, it is very much not obvious how to
> connect all the dots. It needs experimenting.
>
> Would it be acceptable to build the knowledge of all these sync
> protocols into Xwayland server?
>
> The thing there is that Xwayland is issuing the wl_surface.commits, and
> having XWM explicitly trigger all commits always seems like
> unnecessary overhead. However, Xwayland does not know when to commit if
> it doesn't understand the X11 sync protocols. Either every commit
> for every window has to roundtrip through XWM, or Xwayland needs to
> understand stuff. Or the third option, which I like the least, is to
> have the Wayland compositor special-case all surfaces from Xwayland and
> put them through completely different and new code paths not shared

Re: [RFC xserver 0/3] Allow XWM to control Xwayland commits

2016-12-01 Thread Owen Taylor
Probably a combination of factors:
 * It never got into the official spec (I didn't want to push it in without
some review and hopefully a second implementation)
 * It requires coordinated window manager and toolkit changes to get any
benefit
 * The original spec fixed a huge problem with opaque resizing where the
application could lag arbitrarily behind the window manager - the extended
spec just makes things look better when you look closely.
 * People who really care about these things were already beginning to put
their efforts into Wayland

- Owen

On Thu, Dec 1, 2016 at 4:53 AM, Pekka Paalanen  wrote:

> On Wed, 30 Nov 2016 15:12:54 -0500
> Owen Taylor  wrote:
>
> > Hi Pekka,
> >
> > I don't have a lot of commentary to add here. Certainly getting the
> > frame-sync protocols right does require integration between Xwayland and
> > the compositing manager. I don't think there's that much virtue in spending
> > time on the extended version of the sync protocol and
> > _NET_WM_FRAME_TIMINGS, because, to my knowledge, those are implemented only
> > by GTK+ 3, and most GTK+ 3 apps will be running as native Wayland apps. On
> > the other hand, gtk2 and qt4 X clients will be around to exercise the basic
> > version of the protocol for the foreseeable future.
>
> Hi Owen,
>
> that's valuable, saves me the effort of designing for and maybe
> implementing the extended version. :-)
>
> I'm kind of curious though why no other toolkit saw the benefit of the
> extended sync protocol.
>
>
> Thanks,
> pq
>

Re: [PATCH] Fix XNextRequest() after direct usage of XCB

2014-05-09 Thread Owen Taylor
Uli Schlachter wrote:

> >  unsigned long XNextRequest(Display *dpy)
> >  {
> > +unsigned long next_request;
> > +LockDisplay(dpy);
> > +next_request = _XNextRequest(dpy);
> > +UnlockDisplay(dpy);
> > +
> >  return (NextRequest(dpy));
> 
> Unused variable "next_request". Did you really mean to write the function
> like this instead of also changing the return statement?

Oops - it does actually work, because _XNextRequest() updates
display->request as a side effect, so I wasn't dreaming that my testing
worked, but still embarrassing ;-) Sent a new version.
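
For reference, the corrected function implied by Uli's comment would
presumably look like this (just a sketch - the actual follow-up patch may
differ):

    unsigned long XNextRequest(Display *dpy)
    {
        unsigned long next_request;

        LockDisplay(dpy);
        /* _XNextRequest() flushes pending XCB requests so that
           display->request is up to date. */
        next_request = _XNextRequest(dpy);
        UnlockDisplay(dpy);

        return next_request;
    }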

- Owen


Re: [PATCH] Fix locking bugs with XIAllowTouchEvents() and XIUngrabTouchBegin()

2014-07-11 Thread Owen Taylor
Jasper St. Pierre wrote:
> int touchid,
> LockDisplay(dpy);
> if (_XiCheckExtInit(dpy, XInput_2_2, extinfo) == -1)
> return (NoSuchExtension);
> -
> - status = _XIAllowEvents(dpy, deviceid, event_mode, CurrentTime, touchid, grab_window);
> -
> UnlockDisplay(dpy);
> - SyncHandle();
> 
> - return status;
> + return _XIAllowEvents(dpy, deviceid, event_mode, CurrentTime, touchid, grab_window);
> 
> Shouldn't you remove the declaration for "status" here?

Yes, thanks - new version sent.

- Owen


Re: [PATCH 0/3] present: Improve interactions with compositing manager

2015-01-26 Thread Owen Taylor
On Sat, 2014-12-13 at 00:32 -0800, Keith Packard wrote:
> Here's a short patch series which reduces lag when using PresentPixmap
> with a compositing manager. I wrote up a short description on my blog:
> 
>   http://keithp.com/blogs/present-compositor/
> 
> The basic idea is to perform the CopyArea immediately upon receipt of
> the PresentPixmap request, but to delay the PresentComplete event
> until the actual vblank occurs, allowing the compositing manager a
> chance to actually get the contents onto the screen at the desired
> time.

Hi Keith -

Sorry for getting to this so slowly. At least for the algorithms that
GNOME currently uses, this isn't going to help avoid the extra frame of
latency.

It would potentially work if the compositor drew the frame to the screen
immediately upon receiving damage, but this algorithm tends to result in
uneven jerky updates if there is more than one application drawing - the
first application to draw "wins" and the next app gets delayed a frame.

What we do instead is start drawing the frame at a fixed point in the frame
cycle. The point isn't chosen "as late as possible", since that's hard
to predict reliably - after all, we can get damage events for drawing
that isn't even completed on the GPU (at least we could with DRI2) - but
rather is fixed to an arbitrary point 2 ms after VBlank. The idea here
is to accept the extra frame of latency and within that constraint make
things as predictable and as smooth as possible.
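
For illustration, that scheduling amounts to something like the following
(a sketch only; Compositor, get_monotonic_time_us(), schedule_timer() and
paint_frame() are made-up names standing in for Mutter's real machinery):

    #include <stdint.h>

    typedef struct Compositor Compositor;          /* stand-in for compositor state */
    extern int64_t get_monotonic_time_us (void);   /* assumed helpers */
    extern void    schedule_timer (Compositor *c, int64_t delay_us,
                                   void (*cb) (Compositor *c));
    extern void    paint_frame (Compositor *c);

    #define DRAW_OFFSET_US 2000   /* the fixed point: 2 ms after vblank */

    static void
    on_vblank (Compositor *c, int64_t vblank_time_us)
    {
        int64_t now  = get_monotonic_time_us ();
        int64_t when = vblank_time_us + DRAW_OFFSET_US;

        /* Accept one frame of latency: whatever clients have drawn by
           'when' goes into the frame shown at the *next* vblank. */
        schedule_timer (c, when > now ? when - now : 0, paint_frame);
    }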

I think to get rid of the extra frame of latency, you need to be able to
"edit" a queued presentation in some fashion. If the compositor could
put the active window into a separate hardware overlay then it would be
easy. But it might be possible to just present another buffer:

 - Copy queued buffer to a new buffer
 - Overwrite with the updated window image
 - Present that for the same timestamp as the last frame

It doesn't scale to *arbitrarily* many updating apps, but it could fix
the case where one app is trying to draw smoothly, and another app jumps
in and blinks the cursor. It wouldn't even (I think) require any new API
beyond the current Present extension, but it would require more
consistency in the X driver world than currently exists to make it
worthwhile as more than just an experiment.

Does it *hurt* to make the change you are proposing? Not that I can
think of - there's no downside to getting the damage early, even if we
don't use it, and the PresentComplete timestamps would only be as
inaccurate as they already are.

- Owen



Re: [PATCH 0/3] present: Improve interactions with compositing manager

2015-01-26 Thread Owen Taylor
On Mon, 2015-01-26 at 16:22 -0800, Keith Packard wrote:
> Owen Taylor  writes:
> 
> > Sorry for getting to this so slowly. At least for the algorithms that
> > GNOME currently uses, this isn't going to help avoid the extra frame of
> > latency.
> 
> Maybe you don't quite understand this change?

I think I do

> > What we do instead is start drawing the frame at a fixed point in the frame
> > cycle. The point isn't chosen "as late as possible", since that's hard
> > to predict reliably - after all, we can get damage events for drawing
> > that isn't even completed on the GPU (at least we could with DRI2) - but
> > rather is fixed to an arbitrary point 2 ms after VBlank. The idea here
> > is to accept the extra frame of latency and within that constraint make
> > things as predictable and as smooth as possible.
> 
> Let me explain what I've done and see if that will help your current
> situation.
> 
> Right now, a redirected application that calls PresentPixmap will have
> that presentation delayed until the scheduled frame. So, an application
> which is 'keeping up' (presenting contents very soon after top-of-frame)
> will have their new contents waiting around until the next frame begins,
> at which point they will be copied to the redirected pixmap and damage
> delivered to the compositor.
> 
> The change is to copy the contents immediately, send the damage to the
> compositor, and delay sending the PresentNotify event to the client
> until the start of the next frame.
> 
> So, if an application manages to present a new frame in the 2ms after
> vblank, the change will get that up on the screen a full frame earlier.

I think 2ms is generally unrealistic for rendering for an application
that is doing something non-trivial - i.e. for something other than glxgears.
Especially when you consider that in order to improve things the app has
to consistently and reliably hit that 2ms - if the rendering sometimes
takes 1.9ms and sometimes 2.1ms, it's going to look
awful.

[ It has to be said that the 2ms is 2ms of *CPU* time unless there's a
glFinish() or equivalent somewhere - that helps a bit ]

The only way that it's going to be reliable is if the application adds
buffering app-side and renders the frame that it presents in the 2ms
window ahead of time. And then we've just gone back to having the extra
frame of latency.

> > I think to get rid of the extra frame of latency, you need to be able to
> > "edit" a queued presentation in some fashion. If the compositor could
> > put the active window into a separate hardware overlay then it would be
> > easy. But it might be possible to just present another buffer:
> 
> You can "edit" the queued operation with the Present API. Current kernel
> drivers can't do this with page flipping, but if you force a copy, then
> you can replace that pending operation with another copy at any point.
> 
> If you're doing sub-region updates by copying portions of windows to the
> current scan-out buffer, then you can queue as many as you want for a
> particular frame, and all of them will get executed at vblank time. If
> one of the queued requests subsumes an existing request, the previous
> request will be skipped.

OK - there's definitely stuff there that could be experimented with.

> > Does it *hurt* to make the change you are proposing? Not that I can
> > think of - there's no downside to getting the damage early, even if we
> > don't use it, and the PresentComplete timestamps would only be as
> > inaccurate as they already are.
> 
> If the damage appears before you draw the next frame, I think it'll help.

Thinking some more, maybe it *does* hurt - we've gone from the
application getting consistent latency as long as it renders in less
than 16ms (because the X server was internally making sure that the
damage was delivered in the 2ms window), to adding an arbitrary
threshold where the latency changes.

But then again, I'm not going to claim that the current GNOME/Mutter
algorithm is ideal - it's just something I came up with that kept my
test cases reasonably smooth rather than jerking all over the place.

- Owen



Re: [Xlib] Ignoring BadAccess errors

2021-04-24 Thread Owen Taylor
GTK+ is already using Xlib at a pretty low level - see gdk/x11/gdkasync.c.

So just, say, copy what
https://cgit.freedesktop.org/xorg/lib/libX11/tree/src/GetIFocus.c does
and check the return value of _XReply() - if it's 0, then you got a
BadAccess. (Or some other call if GetInputFocus doesn't do what you
want.)
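
Concretely, the GetIFocus.c pattern boils down to something like this (a
sketch using Xlib internals, so it needs Xlibint.h/Xproto.h; adjust the
request to whatever actually provokes the BadAccess you care about):

    #include <X11/Xlibint.h>
    #include <X11/Xproto.h>

    /* Returns True if the request got a real reply, False if the server
       answered with an error (e.g. BadAccess for an untrusted client). */
    Bool request_succeeded (Display *dpy)
    {
        xGetInputFocusReply rep;
        xReq *req;
        Bool ok;

        LockDisplay (dpy);
        GetEmptyReq (GetInputFocus, req);
        ok = _XReply (dpy, (xReply *) &rep, 0, xTrue);
        UnlockDisplay (dpy);
        SyncHandle ();
        return ok;
    }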

Going to XCB would be the clean solution, but I don't see introducing
XCB usage into GTK+ as a good idea at this point - there are almost
certainly subtle traps in mixing together Xlib and XCB usage.

Owen

On Tue, Apr 20, 2021 at 7:30 AM Nekun  wrote:
>
> Hi all.
>
> Can somebody explain to me why BadAlloc and BadAccess errors are ignored in
> Xlib reply processing, but only under certain conditions? As far as I can see
> (maybe I misunderstood something, please correct me), in _XReply:
> https://cgit.freedesktop.org/xorg/lib/libX11/tree/src/xcb_io.c#n655 ,
> after sending a request, we first process pending responses in the input
> buffer and handle them without any exceptions, and then we process the last
> response and ignore some errors: BadName and BadFont under the conditions
> described in the Xlib spec:
> https://www.x.org/releases/current/doc/libX11/libX11/libX11.html#Using_the_Default_Error_Handlers
> , and, surprisingly, BadAccess and BadAlloc completely. I haven't found
> any references for that behavior. As a result, I can't trap these errors
> with my error handler if an Xlib request wrapper calls _XReply
> explicitly. Some wrappers, such as XBell, write a request to the output
> queue and call the sync handler, and then errors are handled without
> exception because XSync adds a GetInputFocus request to the queue, so the
> target request is processed with the no-exception logic described above.
> This issue arose when I was working on fixing the broken untrusted-context
> check in GTK (GDK) at display opening. I need a request which robustly
> returns an error when we are in an untrusted context - one which fails
> exactly when the client is untrusted and not for any other reason - and
> which can be caught in the gdk_x_error trap. I tried XListHosts and failed
> for the reasons described above. Maybe I can use the return value of some
> function instead of error handling? It seems like the X11 Security extension
> more often returns fake values for untrusted clients rather than errors, but
> I'm not sure it is possible to distinguish between a fake value and a real
> one in any response. I'm stuck here.



Re: Fence Sync patches

2010-12-05 Thread Owen Taylor
On Fri, 2010-12-03 at 13:08 -0800, James Jones wrote:
> On Friday 03 December 2010 11:16:43 am Owen Taylor wrote:
> > On Fri, 2010-12-03 at 10:13 -0800, James Jones wrote:
> > > I wrote a slide deck on synchronization and presentation ideas for X a
> > > year ago or so before starting this work:
> > > 
> > > http://people.freedesktop.org/~aplattner/x-presentation-and-
> > > synchronization.pdf
> > > 
> > > Aaron presented it at XDevConf last year.  However, that doesn't really
> > > cover the immediately useful function for GL composite managers:
> > > XDamageSubtractAndTrigger().  I plan on putting some patches to compiz
> > > together demonstrating the usage, but basically the flow is:
> > > 
> > > -Create a bunch of sync objects at startup time in your GL/GLES-based
> > > compositor, and import them into your GL context as GL sync objects. 
> > > I'll call those syncObjectsX[] and syncObjectsGL[] respectively.
> > > 
> > > -Rather than calling XDamageSubtract(), call
> > > XDamageSubtractAndTrigger(syncObjectsX[current]).
> > 
> > So the basic flow here is:
> > 
> >  Client => X server [render this]
> >  X server => GPU[render this]
> >  X server => compositor [something was rendered]
> >  compositor => xserver  [trigger the fence]
> >  compositor => GPU  [render this after the fence]
> >  xserver => GPU [trigger the fence]
> 
> Roughly, but I think that ordering implies a very worst-case scenario.  In
> reality the last two steps will most likely occur simultaneously, and the last
> step might even be a no-op: If the X server already knows the rendering has
> completed it can simply mark the fence triggered immediately without going out
> to the GPU.

Obviously exactly how fences work is going to be very hardware dependent.
But it doesn't seem to me that marking the fence triggered is ever a "no-op" - 
until it is done, the GPU won't progress past the wait on the fence. So in
order to get the frame to the screen, the X server is going to have to be
scheduled again and process the compositor's request. If the X server 
is busy doing something else, there might be measurable latency.

>   This is often the case in our driver, though I haven't 
> implemented that particular optimization yet.
> 
> > In the normal case where there is a single damage event per frame, the
> > fact that we have this round trip where the compositor has to go back to
> > the X server, and the X server has to go back to the GPU bothers me.
> 
> I like to point out that it's not really a round trip, but rather two trips to
> the same destination in parallel.  A round trip would add more latency.

It's actually possible that a round-trip would add *less* latency, because in
the round trip case the compositor will yield its timeslice to the X server,
while simply writing stuff to the X socket won't do that. In the lightly-loaded
multicore case, you will, of course, get the parallelism you mentioned... the X
server will wake up and handle the async request as soon as it receives it.

> > It's perhaps especially problematic in the case of the open source
> > drivers where the synchronization is already handled correctly without
> > this extra work and the extra work would just be a complete waste of
> > time. [*]
> 
> The default implementation assumes the Open Source driver behavior and marks 
> the fence triggered as soon as the server receives the request, so the only 
> added time will be a single asynchronous X request if the open source OpenGL-
> side implementation is done efficiently.

Well, the default implementation is at least going to have to flush buffers,
since the Open Source driver behavior only applies to buffers submitted
to the kernel. 

I'm also thinking that it's making stronger assumptions about the guarantees
that the open source drivers provide than are being assumed by the current
flush-after-damage. It's assuming equivalence to full serialization of
submitted buffers - also for things like:

 process A: write to B1, fence F1
 process B: wait for F1, write to B1

which is considerably stronger than an implicit fence between writing to a 
buffer
and turning around and reading from it. (Maybe all the open source drivers 
actually
fit the stronger criteria ... haven't checked.)

> > But it doesn't seem like a particularly efficient or low-latency way of
> > handling things even in the case of a driver with no built in
> > synchronization.
> > 
> > Can you go into the reasoning for this approach?
> 
> As I said, this definitely is

Re: [PATCH] Fixes v5: Cursor barriers

2010-12-06 Thread Owen Taylor
Reviewed-by: Owen Taylor 

This looks good to me - it looks like it is sufficient to meet the need
of making easy-to-hit "hot corners" on multi-head displays with a good
amount of flexibility but without adding a lot of unmotivated
complexity.

Creating barriers relative to any window rather than to a root window
isn't needed for the hot-corner case, but since the Window argument is
needed to say *which* root window in the multi-screen case, it seems
natural to allow any window.

There's a slight naming inconsistency in CreateCursorBarrier vs. the
Barrier resource, but that's OK - resources are almost all one word
but CreateCursorBarrier is more self-documenting than CreateBarrier.

Though I do wonder, is this a Cursor barrier or a Pointer barrier? I
always thought of the cursor being the image displayed at the mouse
pointer.

> +* XFIXES VERSION 5 OR BETTER ***
> +
> +12. Cursor Barriers
> +
> +Compositing managers and desktop environments may have UI elements in
> +particular screen locations such that for a single-headed display they
> +correspond to easy targets according to Fitts' Law, for example, the top
> +left corner.  For a multiheaded environment these corners should still be

single-headed versus multiheaded

> +semi-impermeable.  Cursor barriers allow the application to define
> +additional constraint on cursor motion so that these regions behave as
> +expected even in the face of multiple displays.
> +
> +WarpPointer and similar requests do not obey these barriers.
> +
> +12.1 Types
> +
> + BARRIER:XID
> +
> + BarrierDirections
> +
> + BarrierPositiveX:   1 << 0
> + BarrierPositiveY:   1 << 1
> + BarrierNegativeX:   1 << 2
> + BarrierNegativeY:   1 << 3
> +
> +12.2 Errors
> +
> + Barrier
> +
> +12.3 Requests
> +
> +CreateCursorBarrier
> +
> + barrier:BARRIER
> + window: WINDOW
> + x1, y1, x2, y2: CARD16
> + directions: CARD32
> +
> + Creates a cursor barrier along the line specified by the given
> + coordinates on the screen specified by the given Window.  The
> + barrier has no spatial extent; it is simply a line along the left
> + or top edge of the specified pixels.
> +
> + The coordinates must be axis aligned, either x1 == x2, or
> + y1 == y2.  If not, BadValue is generated.

Should specify what the meaning is if x2 < x1 or y2 < y1. Is it
BadValue, or empty, or the same as the unreversed case? If x2 < x1 is
BadValue then it should specify whether x2 == x1 is BadValue or not, but
that doesn't seem necessary if all x1,x2 pairs are legal.

> +
> + Motion is allowed through the barrier in the directions specified.
> + Nonsensical values (forbidding Y axis travel through a vertical
> + barrier, for example) and excess set bits are ignored.

Might be good to actually say what "in the direction specified" means. The
only documentation is the constant names and while you can guess that
BarrierPositiveY means crossing the barrier in the direction of Y
coordinate increase (down), it's not entirely obvious.
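
As a concrete illustration of a hot-corner barrier (written against the
client API this eventually shipped as, XFixesCreatePointerBarrier, rather
than the draft's CreateCursorBarrier; the dual-head layout and coordinates
are made up):

    #include <X11/extensions/Xfixes.h>

    /* Dual head, left monitor 1920 px wide.  Put a vertical barrier on the
       seam covering the top 32 px so the pointer doesn't overshoot the left
       monitor's top-right hot corner; crossing back leftwards through the
       barrier is still allowed. */
    PointerBarrier
    make_hot_corner_barrier (Display *dpy, Window root)
    {
        return XFixesCreatePointerBarrier (dpy, root,
                                           1920, 0,   /* x1, y1 */
                                           1920, 32,  /* x2, y2 */
                                           BarrierNegativeX,
                                           0, NULL);  /* all devices */
    }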




Re: [PATCH] Fixes v5: Cursor barriers

2010-12-06 Thread Owen Taylor
On Mon, 2010-12-06 at 09:41 -0500, Adam Jackson wrote:
> On Mon, 2010-12-06 at 06:27 -0500, Owen Taylor wrote:
> > Reviewed-by: Owen Taylor 
> > 
> > This looks good to me - it looks like it is sufficient to meet the need
> > of making easy-to-hit "hot corners" on multi-head displays with a good
> > amount of flexibility but without adding a lot of unmotivated
> > complexity.
> > 
> > Creating barriers relative to any window rather than to a root window
> > isn't needed for the hot-corner case, but since the Window argument is
> > needed to say *which* root window in the multi-screen case, it seems
> > natural to allow any window.
> 
> The spec text could probably be a little clearer that the barrier
> coordinates are in screen space not window space, I think.

OK, then that's very much not clear from the text.

> > There's a slight naming inconsistency in CreateCursorBarrier vs. the
> > Barrier resource, but that's OK - resources are almost all one word
> > but CreateCursorBarrier is more self-documenting than CreateBarrier.
> 
> I was also trying to be unambiguous with GLX swap barriers.
> 
> > Though I do wonder, is this a Cursor barrier or a Pointer barrier? I
> > always thought of the cursor being the image displayed at the mouse
> > pointer.
> 
> I suppose a pointer barrier is a stronger concept, since you may not
> have a cursor.  I hadn't really made any mental distinction between the
> two for barriers, since barriering an invisible cursor seems like very
> strange use case.

You always have a cursor, right? It can just be a 1x1 invisible cursor.
My thought process is more that it's XWarpPointer(), and that the confine_to
argument of XGrabPointer is documented as:

 'Specifies the window to confine the pointer in or None'

Another thought I had about the spec was that it probably should
explicitly mention that barriers only affect relative pointer devices and
not absolute pointer devices.

- Owen




Clip lists for redirected windows (was Re: Fence Sync patches)

2010-12-08 Thread Owen Taylor
On Tue, 2010-12-07 at 16:54 -0800, James Jones wrote:

> > [ In terms of GNOME 3 and NVIDIA: If it's *that* slow to update clip lists
> > for a GLX window, then just save the last one you got, and during
> > ValidateTree memcmp() and if nothing  changed, don't do anything. Can't be
> > more than 20 lines of code. Would make thousands of GNOME 3 users happy ]
> 
> Can you point to a more specific use case (start app A, drag it over app B,
> etc.)? We've got a huge backlog of work to do in this area, but specific
> worst-case examples are always good.

OK, I should explain that off-hand remark some more - it's been
discussed at various times on IRC but probably never written down.

The situation where we trigger problems with GNOME Shell is when you
have a GL window with manually redirected child windows. (We use this in
a couple of places to embed legacy X rendered windows within a GL-drawn
canvas.)

When you manually redirect windows with the Composite extension, they no
longer clip the parent window. So moving them around actually does
absolutely nothing to the parent clip. However, ValidateTree still ends
up getting called because things have to be updated inside the
redirected window.

The observed behavior is that whenever we move such a window, we take a
huge performance hit with the NVIDIA driver. If we aren't moving windows
around we do 60fps no problem, if we move a window once per frame we
drop to a few frames per second. It's so slow that my assumption is that
we're triggering some sort of timeout or bug in the driver, but it might
just be normal overhead from changing clip-lists and synchronizing
clients.

My understanding is that because pScreen->ValidateTree is where the
driver hooks in, it's very hard to optimize this out at the xserver level
- to disguise the fact that anything happened from the driver.

But for a driver where changing clip lists is potentially expensive for
whatever reason, it seems to me that saving the last clip list and
detecting that the clip list didn't change is a simple and very
worthwhile optimization. (And this isn't the only case where
ValidateTree can be called without the clip list for a window changing.)
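
A minimal sketch of what that could look like inside a driver's
ValidateTree wrapper (xserver-side code; the per-window "last" region is
assumed to be stashed in the driver's window private):

    #include "windowstr.h"

    /* Skip expensive clip-list updates when nothing actually changed. */
    static Bool
    clip_list_changed (WindowPtr pWin, RegionPtr last)
    {
        if (RegionEqual (last, &pWin->clipList))
            return FALSE;                 /* same clip as last time */
        RegionCopy (last, &pWin->clipList);
        return TRUE;
    }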

- Owen




Re: Event for override-redirect change

2010-11-30 Thread Owen Taylor
On Mon, 2010-11-29 at 17:31 -0500, Devin J. Pohly wrote:
> On 11/29/2010 02:06 PM, Adam Jackson wrote:
> > On Fri, 2010-11-26 at 23:33 -0500, Devin J. Pohly wrote:
> >> The override-redirect flag is important for a window manager to know
> >> about, but there's no notification made when a window changes it.  It
> >> doesn't appear that the protocol specifies an action to take when a
> >> ChangeWindowAttributes request changes override-redirect.
> >
> > I'm not entirely sure what you're going to do about managing an
> > unmanageable window.  Which sort of determines how we'd go about getting
> > an event for this generated.  What did you have in mind here?
> 
> In my case, the motivating example is a tiling window manager.  If a 
> window is already mapped and part of the tiled layout, I'd want it to 
> pop out of the tiled arrangement if it switches to unmanaged.  As it 
> stands, the WM has to recheck each window every time it arranges them.

If the window is override redirect, the window manager doesn't have a
say in where it is positioned. The application is completely responsible
for positioning and stacking override redirect windows. You should not
be touching override redirect windows or you can cause a "fight" where
the window bounces around between multiple positions.

You might also want to note section 4.2.9 of the ICCCM which suggests
that applications in some cases can temporarily set override-redirect on
their toplevels. To my knowledge no toolkit has ever implemented this
and I doubt anybody ever will, but the implication there is that a change to
override-redirect should *not* cause a window to pop out; the window
manager should continue as normal assuming it will shortly be changed
back.

If an application reliably wants to change a window from acting as a
toplevel to acting as an override-redirect popup, the recommended way to
do it would surely be to withdraw the window first. Nothing else is
going to work with the bulk of existing window managers.

> The more basic issue is that X provides a way for Client A to change a 
> property which is of direct interest to Client B (the WM in this case) 
> without providing a way for Client B to find out.  It feels... 
> inconsistent... since X is generally really good about this.

In general, yes, X does have a lot of notification. But lack of
notification in one area isn't in itself a reason for protocol
additions. Motivation has to be something that clients need to do that
will make things better for the user.

- Owen




Re: Fence Sync patches

2010-12-03 Thread Owen Taylor
On Thu, 2010-12-02 at 09:40 -0800, James Jones wrote:
> As I mentioned early on, I really want to get the fence sync work in server 
> 1.10.  The server code was reviewed by Adam Jackson (thanks for sifting 
> through all that) and various nvidians, but I still haven't received any  
> external official reviewed-by for the proto updates it relies on, or for the 
> lib 
> code to exercise it.  I've CC'd the suggested reviewers on the latest 
> versions 
> of the patches and here:
> 
> -Alan and Adam, because you provided some early feedback on the proto specs 
> but never responded to my updates based on said feedback.
> 
> -Keith, because you're the maintainer of the damage subsystem, and there are 
> some minor changes to the damage proto and lib.
> 
> If these remaining pieces get reviewed, I can send out pull requests for 
> everything immediately.  I've had this code out for review in some form for 
> about 3 months now, so it'd be pretty disappointing if it had to sit around 
> waiting for another release cycle.

Hey, is there a good big picture overview of this somewhere?

As a compositing manager maintainer (Mutter, the GNOME 3 compositing
manager), I'm wondering what it means for me.

There's already a lot of magic voodoo dances around both Damage and
Texture-From-Pixmap, what extra incantations does this add to the
picture?

- Owen

(I can understand each individual step of the magic voodoo dance, but
when I go away from the individual problems and come back 6 months
later, I have to work it all out again. And there's a strong sense that
only particular code paths that actually are in use are tested and
anything else probably doesn't work, at least on some drivers.)




Re: Fence Sync patches

2010-12-03 Thread Owen Taylor
On Fri, 2010-12-03 at 10:13 -0800, James Jones wrote:

> I wrote a slide deck on synchronization and presentation ideas for X a year 
> ago or so before starting this work:
> 
> http://people.freedesktop.org/~aplattner/x-presentation-and-
> synchronization.pdf
> 
> Aaron presented it at XDevConf last year.  However, that doesn't really cover 
> the immediately useful function for GL composite managers: 
> XDamageSubtractAndTrigger().  I plan on putting some patches to compiz 
> together demonstrating the usage, but basically the flow is:
> 
> -Create a bunch of sync objects at startup time in your GL/GLES-based 
> compositor, and import them into your GL context as GL sync objects.  I'll 
> call those syncObjectsX[] and syncObjectsGL[] respectively.
> 
> -Rather than calling XDamageSubtract(), call 
> XDamageSubtractAndTrigger(syncObjectsX[current]).

So the basic flow here is:

 Client => X server [render this]
 X server => GPU[render this]
 X server => compositor [something was rendered]
 compositor => xserver  [trigger the fence]
 compositor => GPU  [render this after the fence]
 xserver => GPU [trigger the fence]

In the normal case where there is a single damage event per frame, the
fact that we have this round trip where the compositor has to go back to
the X server, and the X server has to go back to the GPU bothers me.

It's perhaps especially problematic in the case of the open source
drivers where the synchronization is already handled correctly without
this extra work and the extra work would just be a complete waste of
time. [*]

But it doesn't seem like a particularly efficient or low-latency way of
handling things even in the case of a driver with no built in
synchronization.

Can you go into the reasoning for this approach?  

> -Prefix all the GL rendering that repairs the damage subtracted with a sync 
> wait: glWaitSync(syncObjectsGL[current++])
> 
> The GL rendering will then wait (on the GPU.  It won't block the application 
> unless it gets really backed up) until all rendering that created the damage 
> has finished on the GPU.  Managing the ring-buffer of sync objects is a 
> little 
> more complicated than that in practice, but that's the basic idea. 

Can you be more specific about that? Do you need to do a
glClientWaitSync() when you wrap around and reuse the first sync object
pair?
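
To make the flow concrete, roughly (a sketch only: the
XDamageSubtractAndTrigger() signature is my guess since the request is only
proposed at this point, and the X fences are assumed to have been imported
as GL syncs via EXT_x11_sync_object's glImportSyncEXT at startup):

    #include <X11/extensions/Xdamage.h>
    #include <X11/extensions/sync.h>
    #include <GL/gl.h>
    #include <GL/glext.h>

    /* Proposed request; signature is a guess modeled on XDamageSubtract(). */
    extern void XDamageSubtractAndTrigger (Display *dpy, Damage damage,
                                           XserverRegion repair,
                                           XserverRegion parts,
                                           XSyncFence fence);

    #define N_SYNC 4
    static XSyncFence syncObjectsX[N_SYNC];  /* XSyncCreateFence() at startup */
    static GLsync     syncObjectsGL[N_SYNC]; /* glImportSyncEXT(GL_SYNC_X11_FENCE_EXT, ...) */
    static int        current;

    static void
    repair_window (Display *dpy, Damage damage)
    {
        /* Instead of plain XDamageSubtract(): also ask the server to trigger
           this fence once the damaged rendering has reached the GPU. */
        XDamageSubtractAndTrigger (dpy, damage, None, None, syncObjectsX[current]);

        /* GPU-side wait; the compositor's CPU thread is not blocked. */
        glWaitSync (syncObjectsGL[current], 0, GL_TIMEOUT_IGNORED);

        /* ... redraw the repaired area from the window's texture ... */

        current = (current + 1) % N_SYNC;
    }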

[...]

> I admit this isn't an ideal work-flow, and yes it is one more layer of hard-
> to-test voodoo needed to write a robust TFP/EGLimage based composite manager, 
> but it's the best we can do without modifying client applications.  However, 
> fence sync objects provide a basis for all kinds of cooler stuff once you 
> start defining new ways that client applications can use them to notify the 
> composite manager when they've finished rendering a frame explicitly.  Then 
> the extra step of telling X you want notification when some rendering you've 
> already been notified of has completed will go away.  The rendering 
> notification (damage event) and a pointer of some sort to the sync object 
> that 
> tracks it will arrive together.  That's what I'll be working on after the 
> initial object support is wrapped up.

It worries me to see a pretty complex, somewhat expensive band-aid going
in *without* knowing more about that long term picture. Obviously if the
fence objects are useful for other things, then that reduces the
complexity of the band-aid a bit.

- Owen

[*] If it was documented that a Damage event didn't imply the rendering
had hit the GPU, then the X server could be changed not to flush
rendering before sending damage events. In the normal case where the
rendering is just a single glXSwapBuffers() or XCopyArea() that doesn't
actually improve efficiency, but it does slightly reduce extra work from
the fence. On the other hand, that would change this exercise from
"fixing a corner case that misrenders on one driver" to "breaking every
non-updated compositing manager".



Re: [PULL] DRI2 XID confusion fixes

2010-05-03 Thread Owen Taylor
On Sun, 2010-05-02 at 17:57 -0700, Kristian Høgsberg wrote:
> Here's a pull request for the four patches I sent out earlier.  The
> dix resource patch has been updated to also update the element count
> for FreeClientNeverRetainResources() and FreeClientResources(), which
> also didn't keep the count up to date.  Also in the patch is a fix for
> LookupClientResourceComplex() and FreeClientNeverRetainResources()
> reentrancy.  These two functions didn't check the element count and
> restart if changed after the callback, which makes them non-reentrant.
> 
> Owen, I added a Tested-by from you for the DRI2 XID fix since we were
> working on that together and you were testing the patch in the fedora
> X server.  I hope that's OK.

Yes, it worked well in my testing, and I built it for Fedora and reports
from people trying that have been positive as well.

- Owen



Re: [PATCH xserver 2/7] EXA: Use round robin instead of rand() for choosing eviction position.

2011-05-17 Thread Owen Taylor
On Tue, 2011-05-17 at 15:03 +0200, Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> This should be just as good on average but is less expensive.

If we're not hitting the cache, isn't the cost of rand() pretty much
noise? On my system, rand() seems to take about 10ns.

The nice thing about random replacement is that it reliably sticks to
being about "as good" as the average, while predictable strategies tend
to have cases where they work well, and cases where they work badly.

That is, if you have a cache of size 10, does performance degrade smoothly
when you go from 9 glyphs to 11 glyphs, or do you drop off a cliff?

If libc rand() is too slow, then some inlined linear-congruential
generator could shave a few cycles.

- Owen

[ I haven't done any testing of cache replacement strategies for this
  code for actual texts, my preference for random replacement is more
  general - that it's not necessary to worry about whether the test
  cases are exercising some sweet spot of the algorithm or hitting
  some pathology. ]

> Signed-off-by: Michel Dänzer 
> ---
>  exa/exa_glyphs.c |4 ++--
>  exa/exa_priv.h   |2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/exa/exa_glyphs.c b/exa/exa_glyphs.c
> index 5c46ec9..28aa01f 100644
> --- a/exa/exa_glyphs.c
> +++ b/exa/exa_glyphs.c
> @@ -219,7 +219,7 @@ exaRealizeGlyphCaches(ScreenPtr pScreen,
>   for (j = 0; j < cache->hashSize; j++)
>   cache->hashEntries[j] = -1;
>   
> - cache->evictionPosition = rand() % cache->size;
> + cache->evictionPosition = 0;
>  }
>  
>  /* Each cache references the picture individually */
> @@ -500,7 +500,7 @@ exaGlyphCacheBufferGlyph(ScreenPtr pScreen,
>   exaGlyphCacheHashInsert(cache, pGlyph, pos);
>  
>   /* And pick a new eviction position */
> - cache->evictionPosition = rand() % cache->size;
> + cache->evictionPosition = (pos + 1) % cache->size;
>   }
>  
>   exaGlyphCacheUploadGlyph(pScreen, cache, x, y, pGlyph);
> diff --git a/exa/exa_priv.h b/exa/exa_priv.h
> index 70de4bd..8c5ea15 100644
> --- a/exa/exa_priv.h
> +++ b/exa/exa_priv.h
> @@ -132,7 +132,7 @@ typedef struct {
>  PicturePtr picture;   /* Where the glyphs of the cache are stored */
>  int yOffset;  /* y location within the picture where the cache 
> starts */
>  int columns;  /* Number of columns the glyphs are layed out in */
> -int evictionPosition; /* Next random position to evict a glyph */
> +int evictionPosition; /* Next position to evict a glyph */
>  } ExaGlyphCacheRec, *ExaGlyphCachePtr;
>  
>  #define EXA_NUM_GLYPH_CACHES 4



Re: [PATCH xserver 2/7] EXA: Use round robin instead of rand() for choosing eviction position.

2011-05-17 Thread Owen Taylor
On Tue, 2011-05-17 at 18:32 +0200, Michel Dänzer wrote:
> On Die, 2011-05-17 at 11:53 -0400, Owen Taylor wrote: 
> > On Tue, 2011-05-17 at 15:03 +0200, Michel Dänzer wrote:
> > > From: Michel Dänzer 
> > > 
> > > This should be just as good on average but is less expensive.
> > 
> > If we're not hitting the cache, isn't the cost of rand() pretty much
> > noise? On my system, rand() seems to take about 10ns.
> 
> I wrote this patch because rand() showed up in profiles, and it
> increased x11perf -aa10text numbers.

If rand() shows up in profiles at all, then x11perf -aa10text is
exceeding the cache size, right? (this really surprises me, how many
different glyphs does aa10text show?) 

If x11perf -aa10text is exceeding the cache size then the exact details
of the cache replacement strategy will affect the numbers a lot more
than rand().

> > The nice thing about random replacement is that it reliable sticks to
> > being about "as good" as the average, while predictable strategies tend
> > to have cases where they work well, and cases which they work badly.
> > 
> > That is, if you have cache of size 10, does performance degrade smoothly
> > when you go from 9 glyphs to 11 glyphs, or do you drop off a cliff?
> 
> I haven't tested this specifically.
> 
> 
> > If libc rand() is too slow, then some inlined linear-congruential
> > generator could shave a few cycles.
> 
> I'm afraid I'm not really interested in working on that though.

Add an unsigned int evictionRand to ExaGlyphCacheRec, then when
updating:

 cache->evictionRand = cache->evictionRand * 1103515245 + 12345;
 cache->evictionPosition = cache->evictionRand % cache->size;

would be good enough. (Things could be done to improve the quality,
but just doesn't matter here)
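
Spelled out, it is only a few lines (a sketch against the EXA structures
quoted earlier; evictionRand would be the new field, seeded with anything
at cache-realize time):

    /* In ExaGlyphCacheRec: add "unsigned int evictionRand;". */
    static inline int
    exaGlyphCacheNextEviction (ExaGlyphCachePtr cache)
    {
        /* Cheap inline LCG in place of rand(). */
        cache->evictionRand = cache->evictionRand * 1103515245 + 12345;
        return cache->evictionRand % cache->size;
    }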

- Owen



Re: [PATCH xserver] composite: Inhibit window background paint with manual subwindow redirection

2011-07-19 Thread Owen Taylor
On Fri, 2011-05-06 at 18:19 +0300, ville.syrj...@nokia.com wrote:
> From: Ville Syrjälä 
> 
> The composite extension spec says that window background painting
> should be inhibited when the subwindow redirection mode is set to
> manual.
> 
> This eliminates the ugly flashing effect when compiz unredirects a
> fullscreen window.

Being able to do no-flash unredirection is something I'd definitely like
to be able to do for Mutter!

Hmm, OK, so what you are doing is:

 - Unmap or reshape COW, leaving root window contents, which you
   don't want to be repainted. (And you can't get this by changing
   the background to None because: "Changing the background of a root
   window to None or ParentRelative restores the  default background
   pixmap.")
 - Unredirect fullscreen window

it took me a moment to see why this isn't backwards ordering: if you
unredirected the fullscreen window first it wouldn't have preserved
contents, so the X server would have to expose it from scratch.
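
In other words, something along these lines (a sketch; "overlay" is the COW
from XCompositeGetOverlayWindow and "fullscreen" is the client window being
unredirected):

    #include <X11/extensions/Xcomposite.h>
    #include <X11/extensions/Xfixes.h>
    #include <X11/extensions/shape.h>

    static void
    unredirect_fullscreen (Display *dpy, Window overlay, Window fullscreen)
    {
        XserverRegion empty = XFixesCreateRegion (dpy, NULL, 0);

        /* 1. Shape the overlay window away first; the root window keeps
              showing its current contents, so nothing needs repainting. */
        XFixesSetWindowShapeRegion (dpy, overlay, ShapeBounding, 0, 0, empty);

        /* 2. Only now stop redirecting the fullscreen window - its contents
              are preserved, so no expose/background paint is triggered. */
        XCompositeUnredirectWindow (dpy, fullscreen, CompositeRedirectManual);

        XFixesDestroyRegion (dpy, empty);
    }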

This patch does look like it makes the behavior agree with the composite
protocol extension; it's not clear at this point whether there was
always a mismatch or whether it was changed at some point:

History in the compositeproto repo shows:

commit 35a9c80252b35720c1afc5dc53153228e2084b10
Author: Keith Packard 
Date:   Sun Nov 9 07:07:21 2003 +

Note that Manual Subwindows mode disables background painting.

but the X server code from that date wasn't easily discoverable for me. 

Patch looks correct - it makes windows with manual subwindow
redirection go through the same codepath as background none windows.

It seems pretty safe - very little application painting code *depends*
on clearing to the background color, and breakage from this change would
require that plus an application that was doing something pretty unusual
with the composite extension.

Would be definitely good to get Keith to check this as well if possible.

- Owen

Reviewed-by: Owen Taylor 



Re: [PATCH xserver] composite: Inhibit window background paint with manual subwindow redirection

2011-07-20 Thread Owen Taylor
On Wed, 2011-07-20 at 02:54 +0300, Ville Syrjälä wrote:

> BTW I noticed afterwards that this doesn't help all apps. Evince, for
> example, still blinks whenever I bring up the popup menu while in
> fullscreen mode, whereas Firefox doesn't blink. I did some quick
> protocol tracing and I saw some window background mode changes from
> the client. A quick look at gdk-x11 code revealed that it does some
> tricks with the window background. I never got around to testing if
> removing that code would make Evince blink-free (tm).

Interesting... there are certainly tricks being played there, but it's
hard to see why they would cause blinking.

When GTK+ maps an override-redirect window, it temporarily sets the
background of the override-redirect window to None. When it unmaps a
window, in some cases, it will temporarily unset the background of a
window that it thinks it might be underneath.

But the background mode of a client window should have no effect at all
on whether its contents are preserved in the composite pixmap. Changing
the background mode doesn't cause a clear.

The only thing I can think of is that evince might be changing its size
when popping up the menu, but that seems pretty strange. Hard to guess
at without a live situation to debug.

- Owen



Re: [PATCH xserver] composite: Inhibit window background paint with manual subwindow redirection

2011-07-20 Thread Owen Taylor
On Wed, 2011-07-20 at 10:44 -0400, Owen Taylor wrote:
> On Wed, 2011-07-20 at 02:54 +0300, Ville Syrjälä wrote:
> 
> > BTW I noticed afterwards that this doesn't help all apps. Evince, for
> > example, still blinks whenever I bring up the popup menu while in
> > fullscreen mode, whereas Firefox doesn't blink. I did some quick
> > protocol tracing and I saw some window background mode changes from
> > the client. A quick look at gdk-x11 code revealed that it does some
> > tricks with the window background. I never got around to testing if
> > removing that code would make Evince blink-free (tm).
> 
> Interesting... there are certainly tricks being played there, but it's
> hard to see why they would cause blinking.
[...]
> > The only thing I can think of is that evince might be changing its size
> when popping up the menu, but that seems pretty strange. Hard to guess
> at without a live situation to debug.

I did actually have to see it live to figure out what was going on ...
but once I saw it, and thought about it a bit, I was able to avoid
digging out the debugger.

What's going on is that for a window with a non-None background, the
copying between the parent pixmap and the composite pixmap that we do
isn't sufficient - all that copying is doing is making sure that things
_look_ right initially - it doesn't actually suppress expose event
generation and the background painting that happens during expose event
generation.

Fixing this - looks hard! I couldn't immediately see a way of doing it
without modifying ScreenRec, and I couldn't immediately see what
modifications to ScreenRec would make sense. I suppose the easiest, if
hacky thing would be to have some way of removing computed expose
areas between pScreen->ValidateTree and pScreen->HandleExposures.

(Or slight variant - having some way to completely suppress exposure
generation during a sequence of ValidateTree/HandleExposures and only
worry about the case where *no* exposures area necessary.)

- Owen



Re: [PATCH xserver] composite: Inhibit window background paint with manual subwindow redirection

2011-07-22 Thread Owen Taylor
On Thu, 2011-07-21 at 00:24 +0300, Ville Syrjälä wrote:
> > I did actually have to see it live to figure out what was going on ...
> > but once I saw it, and thought about it a bit, I was able to avoid
> > digging out the debugger.
> > 
> > What's going on is that for a window with a non-None background, the
> > copying between the parent pixmap and the composite pixmap that we do
> > isn't sufficient - all that copying is doing is making sure that things
> > _look_ right initially - it doesn't actually suppress expose event
> > generation and the background painting that happens during expose event
> > generation.
> 
> The pointless expose events were supposed to be eliminated by another
> patch set [1]. At least my test app [2] manages to redirect/unredirect
> non-None windows w/o generating unnecessary expose events.

Hmm, OK, I had those patches applied, but figured that the expose events
were being generated by some other path than Map/Unmap. It turns out
that they were being generated because my test Mutter code wasn't
ordering things correctly.

With that fixed, I can't reproduce any flickering with Evince. It
redirects and unredirects seamlessly when I pop up a menu. (This is with
GTK+-3, but there shouldn't be any difference from GTK+-2 in this
regard.)

So, I'm not sure what was happening in your testing but I'm pretty happy
with how things are working for me!

- Owen



Re: [PATCH xserver] composite: Inhibit window background paint with manual subwindow redirection

2011-07-27 Thread Owen Taylor
On Mon, 2011-07-25 at 20:36 +0200, Marko Macek wrote:

> Sometime in the previous decade I sent patches to Mozilla to always set 
> window background 
> to None (mozilla bug 28003). Hopefully none of it got lost since then.

Mozilla has used background none as long as I can remember. I guess
since your patches... Without a compositor, there are tradeoffs -
background none makes some operations look better, but causes:

 A) Trailing when another window is dragged over a window
 B) Very confusing behavior for the user when the app is
not responsive - using, for example, alt-tab to switch to an
application would leave the previous app and the alt-tab popup
visible.

As long as you can do an at all decent job of predicting the background
color that will be painted (and you can in most applications), I've
always thought that those downsides outweigh any advantages for
background None.

With a compositor, these considerations don't really come into play and
there's not much point in pre-painting the backgrounds.

> It makes sense as not only does it prevent blinking, it was also visibly
> faster at the time (probably still is). And opaque move/resize is much better.

Really would not expect any noticeable speed difference for a few solid
color fills. (Even a decade ago, but especially today.)

> IMO, this mechanism should be used by all X apps, because the only downside 
> is that background is not painted when the app is blocking. The GTK+ mechanism 
> of switching backgrounds seems of little use to me (?).

It might make sense for a toolkit to always use background None in the
composited case.

- Owen




Re: Initial DRI3000 protocol specs available

2013-02-26 Thread Owen Taylor
Sorry for joining the discussion late here. I went through the specs and
discussion and have various comments and questions.

- Owen

* It would be great if we could figure out a plan to get to the
  point where the exact same application code is going to work for
  proprietary and open source drivers. When you get down to the details
  of swap this isn't close to the case currently.

  - Because swap is handled client side in some drivers, INTEL_swap_event
is seen as awkward to implement.

  - There is divergence on some basic behaviors, e.g., whether
glXSwapBuffers() followed by glFinish() waits for the swap to complete or not.

  - When rendering with a compositor, the X server is innocent of
relevant information about timing and when the application should
draw additional new frames. I've been working on handing this
via client <=> compositor protocols

(https://mail.gnome.org/archives/wm-spec-list/2013-January/msg0.html)

But this adds a lot of complexity to the minimal client, especially
when a client wants to work both redirected and unredirected. 

  I think it would be great if we could sit down and figure out what
  the Linux-ecosystem API is for this in a way we could give to
  application authors.
  
* One significant problem that I have currently is that the default mode
  for the Intel drivers is to use triple buffering and send back swap
  events *when rendering the next frame would not block* - that is,
  immediately. This results in a frame of unnecessary latency.
  (The too-early events are also missing ust/msc/sbc information.)

  So I'd like to make sure that we know exactly what SwapComplete means
  and not have creative reinterpretations based on what works well
  for one client or another.

  The SwapComplete event is specified as - "This event is delivered
  when a SwapRegion operation completes" - but the specification
  of SwapRegion itself is fuzzy enough that I'm unclear exactly what
  that means.

  - The description of SwapRegion needs to define "swap" since the
operation has only a vague resemblance to the English-language
meaning of "swap".

  - My interpretation of SwapRegion is that the actual propagation of
source to destination is *asynchronous* to the X protocol stream.
This is implied by "Schedule a swap..." but probably should be
explicitly stated, since it is radically different from other
rendering in X.

  - Is the serial in the SwapComplete event synchronous to the protocol
stream? E.g., can you assume that any CopyArea from the destination
drawable before that serial will get the old contents, and a
CopyArea from the destination after that serial will get the new
contents? (A short sketch after this list illustrates the question.)

  - What happens when multiple SwapRegion requests are made with a
swap-interval of zero? Are previous ones discarded?

  - Is it an error to render to a non-idle pixmap? Is it an error to
pass a non-idle pixmap as the source to SwapRegion?

  - What's the interaction between swap-interval and target-msc, etc?

  - When a window is redirected, what's the interpretation of
swap-interval, target-msc, etc? Is it that the server performs the
operation at the selected blanking interval (as if the window
wasn't redirected), and then damage/other events are generated
and the server picks it up and renders to the real front buffer
at the next opportunity - usually a frame later?

* Do I understand correctly that the idle pixmaps returned from
  a SwapRegion request are the pixmaps that *will* be idle once the
  corresponding SwapComplete event is received?

  If this is correct, what happens if things change before the swap
  actually happens and what was scheduled as a swap ends up being
  a copy? Is it sufficient to assume that a ConfigureNotify on a
  destination window means that all pixmaps passed to previous
  SwapRegion requests are now idle?

* In the definition of SWAPIDLE you say:

If valid is TRUE, swap-hi/swap-lo form a 64-bit
swap count value from the SwapRegion request which matches the
data that the pixmap currently contains

 If I'm not misunderstanding things, this is a confusing statement
 because, leaving aside damage to the front buffer, pixmaps always
 contain the same contents (whatever the client rendered into them).

 Is using swap-hi/swap-lo to identify the SwapRegion problematic in
 the case where swaps aren't throttled? Would it be better to use the
 sequence number of the request? Or is the pixmap itself sufficient?

* What control, if any, will applications have over the number of
  buffers used - and what will the behavior be when an application
  starts rendering another frame, in terms of allocating a new buffer
  versus swapping?

* Do we need to deal with stereo as part of this?
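
To make the SwapComplete-serial question above concrete, here is a
hedged sketch of what a client might want to be able to rely on.
SwapRegion is only a draft request with no client-library binding, so
swap_region() below is a hypothetical stand-in; the XCopyArea calls are
ordinary Xlib:

#include <X11/Xlib.h>

/* Hypothetical stand-in for the proposed SwapRegion request; no such
 * Xlib call exists.  Imagine it returning the swap count / serial. */
static unsigned long swap_region(Display *dpy, Window win, Pixmap back)
{
    (void)dpy; (void)win; (void)back;
    return 0;
}

static void serial_ordering_question(Display *dpy, Window win, Pixmap back,
                                     Pixmap snapshot, GC gc,
                                     unsigned int w, unsigned int h)
{
    /* Issued before the swap: is this guaranteed to see the old
     * window contents? */
    XCopyArea(dpy, win, snapshot, gc, 0, 0, w, h, 0, 0);

    unsigned long swap_serial = swap_region(dpy, win, back);
    (void)swap_serial;

    /* Issued after the swap request: once SwapComplete reports a serial
     * at or beyond this request, may the client assume this copy saw
     * the new contents?  That is the ambiguity in the current draft. */
    XCopyArea(dpy, win, snapshot, gc, 0, 0, w, h, 0, 0);
}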




Re: Initial DRI3000 protocol specs available

2013-02-26 Thread Owen Taylor
A complex scheme where the compositor and the server collaborate on the
implementation of SwapRegion seems fragile to me, and still doesn't get
some details right - like the swap count returned from SwapRegion.

What if we made SwapRegion redirectable along the lines of
ResizeRedirectMask? Since it would be tricky to block the client calling
SwapRegion until the compositor responded, this would probably require
removing the reply to SwapRegion and sending everything needed back in
events. At the most granular, this would be:

 SwapScheduled - whatever information is available immediately on
 receipt of SwapRegion
 SwapIdle  - a buffer is returned to the application for rendering
 SwapComplete  - the swap actually happened and we know the
 msc/sbc/ust triple

But I don't know that you need that much granularity. I think SwapIdle
and SwapComplete are sufficient.
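
As a rough illustration of that two-event flow, here is a hypothetical
sketch of the redirecting compositor's side. None of these event types
or helpers exist; they simply mirror the proposal above:

#include <stdio.h>

/* Hypothetical event a swap-redirecting compositor would receive when
 * another client calls SwapRegion on a redirected window. */
typedef struct {
    unsigned long window;      /* the swap-redirected window          */
    unsigned long back_pixmap; /* buffer the client wants displayed   */
    unsigned long swap_count;  /* per-window SBC chosen by the server */
} SwapRegionRequestEvent;

/* Stubs for the two notifications the proposal needs; in reality they
 * would be requests to the X server, which forwards them as events. */
static void send_swap_idle(unsigned long window, unsigned long pixmap)
{
    printf("SwapIdle: window 0x%lx may reuse pixmap 0x%lx\n", window, pixmap);
}

static void send_swap_complete(unsigned long window, unsigned long sbc)
{
    printf("SwapComplete: window 0x%lx, sbc %lu\n", window, sbc);
}

static void handle_swap_region_request(const SwapRegionRequestEvent *ev)
{
    /* 1. Pick up the new contents for compositing (copy or keep a
     *    reference to the pixmap). */
    /* 2. Report the buffer idle as soon as the compositor no longer
     *    needs it - this is the latency-critical part. */
    send_swap_idle(ev->window, ev->back_pixmap);
    /* 3. When the composited frame actually reaches the screen, report
     *    completion so the client can pace its next frame. */
    send_swap_complete(ev->window, ev->swap_count);
}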

Tricky parts:

 * Not leaking buffers during redirection/unredirection could be tricky.
   What if the compositor exits while a client is waiting for a
   SwapIdle? An event when swap is redirected/unredirected is probably
   necessary.

 * To make this somewhat safe, the concept of "idle" has to be one of
   correct display, not system stability. It can't take down the system
   if the compositor sends SwapIdle at the wrong time.

 * Because the SBC is a drawable attribute it's a little complex to
   continue having the right value over swap redirection.

When a window is swap-redirected, we say that the SBC is
incremented by one every time the redirecting client calls
SwapRegion, and never otherwise. A query is provided for the
current value.

 * It doesn't make sense to have both the server and the compositor
   scheduling stuff. I think you'd specify that once you swap
   redirect a window, it gets simple:

   - SwapRegion called by another client - SwapRegionRequest is
 immediately generated.
   - SwapRegion called by redirecting client - action (either a
 swap or a copy) happens immediately.

   Actually, from the compositor's perspective, the window's front
   buffer doesn't matter, but you probably need to keep it current
   to make screenshot tools, etc, work correctly.

Is this better than a more collaborative approach where the server and
compositor together determine what pixmaps are idle? I think it might be
a little simpler and more flexible. It also potentially allows a
SwapRegion on a client window to be turned directly into a SwapRegion on
the root window for a full-screen client. But there probably are a host
of difficulties I'm not thinking of :-)

- Owen




Re: Initial DRI3000 protocol specs available

2013-03-07 Thread Owen Taylor
On Thu, 2013-02-28 at 16:55 -0800, Keith Packard wrote:

> > * It would be great if we could figure out a plan to get to the
> >   point where the exact same application code is going to work for
> >   proprietary and open source drivers. When you get down to the details
> >   of swap this isn't close to the case currently.
> 
> Agreed -- the problem here is that except for the nVidia closed drivers,
> everything else implicitly serializes device access through the kernel,
> providing a natural way to provide some defined order of
> operations. Failing that, I'd love to know what mechanisms *could* work
> with that design.

I don't think serialization is actually the big issue - although it's
annoying to deal with fences that are no-ops for the open source drivers,
it's pretty well defined where you have to insert them, and because they
are no-ops for those drivers, there's little overhead.
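
For context, a small sketch of where such a fence typically goes, using
standard GL sync objects (ARB_sync) as a stand-in for whatever fence
mechanism the extension ends up defining; the timeout is arbitrary and
the surrounding GL context setup is assumed:

#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>

/* Ensure GPU rendering into a shared buffer has finished before the
 * request that hands the buffer to the X server.  On drivers that
 * already serialize through the kernel this is effectively a no-op. */
static void fence_before_handoff(void)
{
    GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);

    /* Block (here, up to ~16 ms) until commands before the fence have
     * completed; a driver could also turn this into a GPU-side wait. */
    glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                     (GLuint64)16 * 1000 * 1000);
    glDeleteSync(fence);

    /* ...now it is safe to issue the request (SwapRegion, XCopyArea,
     * etc.) that makes the buffer visible to the server. */
}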

Notification is more of an issue.

> >   - Because swap is handled client side in some drivers, INTEL_swap_event
> > is seen as awkward to implement.
> 
> I'm not sure what could be done here, other than to have some way for
> the X server to get information about the swap and stuff it into the
> event stream, of course. It could be as simple as having the client
> stuff the event data to the X server itself.

It may be that a focus on redirection makes things easier - once the
compositor is involved, we can't get away from X server involvement. The
compositor itself is the main case where the X server can be completely
bypassed when swapping. And I'm less concerned about API divergence for
the compositor. (Not that I *invite* it...)

> >   - There is divergence on some basic behaviors, e.g., whether
> > glXSwapBuffers() followed by glFinish() waits for the swap to
> > complete or not.
> 
> glXSwapBuffers is pretty darn explicit in saying that it *does not* wait
> for the swap to complete, and glFinish only promises to synchronize the
> effects of rendering ("contents of the frame buffer"), not the actual
> swap operation itself. I'm not sure how we're supposed to respond when
> drivers ignore the spec and do their own thing?

I wish the GLX specification was clear enough so we actually knew who
was ignoring the spec and doing their own thing... ;-) The GLX
specification describes the swap operation as the contents of the back
buffer "become the contents of the front buffer" ... that seems like an
operation on the "contents of the frame buffer".

But getting into the details here is a bit of a distraction - my goal is
to try to get us to convergence so we have only one API with well
defined behaviors.

> >   - When rendering with a compositor, the X server is innocent of
> > relevant information about timing and when the application should
> > draw additional new frames. I've been working on handling this
> > via client <=> compositor protocols
> 
> With 'Swap', I think the X server should be involved as it is necessary
> to get be able to 'idle' buffers which aren't in use after the
> compositor is done with them. I tried to outline a sketch of how that
> would work before.
> 
> > (https://mail.gnome.org/archives/wm-spec-list/2013-January/msg0.html)
> >
> > But this adds a lot of complexity to the minimal client, especially
> > when a client wants to work both redirected and unredirected.
> 
> Right, which is why I think fixing the X server to help here would be better.

If the goal is really to obsolete the proposed WM spec changes, rather
than just make existing GLX apps work better, then there's quite a bit
of stuff to get right. For example, from my perspective, the
OML_sync_control defined UST timestamps are completely insufficient -
it's not even defined what the units are for these timestamps!
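
To illustrate why the undefined units hurt, here is a sketch of the
guesswork applications end up doing: compare the OML UST value against
CLOCK_MONOTONIC and hope the delta reveals the clock and units.
glXGetSyncValuesOML is real OML_sync_control API; the rest is
illustrative:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <GL/glx.h>
#include <GL/glxext.h>

/* Guess what clock and units the OML_sync_control UST is in by
 * comparing it against CLOCK_MONOTONIC in microseconds.  The spec
 * itself gives no portable answer. */
static void guess_ust_units(Display *dpy, GLXDrawable drawable)
{
    PFNGLXGETSYNCVALUESOMLPROC get_sync_values =
        (PFNGLXGETSYNCVALUESOMLPROC)
            glXGetProcAddress((const GLubyte *)"glXGetSyncValuesOML");
    int64_t ust, msc, sbc;
    struct timespec now;

    if (!get_sync_values || !get_sync_values(dpy, drawable, &ust, &msc, &sbc))
        return;

    clock_gettime(CLOCK_MONOTONIC, &now);
    int64_t mono_us = (int64_t)now.tv_sec * 1000000 + now.tv_nsec / 1000;

    /* A small delta suggests "CLOCK_MONOTONIC in microseconds";
     * otherwise the application is left guessing. */
    printf("ust=%" PRId64 " mono_us=%" PRId64 " delta=%" PRId64 "\n",
           ust, mono_us, ust - mono_us);
}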

> >   I think it would be great if we could sit down and figure out what
> >   the Linux-ecosystem API is for this in a way we could give to
> >   application authors.
> 
> Ideally, a GL application using simple GLX or EGL APIs would work
> 'perfectly', without the need to use additional X-specific APIs. My hope
> with splitting DRI3000 into separate DRI3 and Swap extensions is to
> provide those same semantics to simple double-buffered 2D applications
> using core X and Render drawing as well, without requiring that they be
> rewritten to use GL, and while providing all of the same functionality
> over the network as local direct rendering applications get today.

The GLX APIs have some significant holes and poorly defined aspects. And
they don't properly take compositing into account, which is the norm
today. So providing those capabilities to 2D apps seems of limited
utility.

[...]

> >   The SwapComplete event is specified as - "This event is delivered
> >   when a SwapRegion operation completes" - but the specification
> >   of SwapRegion itself is fuzzy enough that I'm unclear exactly what
> >   that means.
> >
> >   - The description SwapRegion needs to define "swap" since the
> > operation has only