Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-08-02 Thread Michel Dänzer
On 08.07.2016 19:01, Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> The unnecessary FlushCallback calls could cause significant performance
> degradation with compositing.
> 
> As an example, with the radeon driver using glamor, a gtkperf run using
> default parameters (100 iterations of every test) takes:
> (numbers without => with this patch)
> 
> * About 1.9 => 1.9 seconds with xfwm4 without compositing
> * About 2.6 => 2.2 seconds with compton compositing added
> * About 4.1 => 2.2 seconds with kwin compositing
> * About 10.2 => 2.4 seconds with xfwm4 compositing
> 
> Signed-off-by: Michel Dänzer 

I'm retracting this patch in favour of
https://patchwork.freedesktop.org/patch/102509/.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: https://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-18 Thread Keith Packard
Adam Jackson  writes:

> My epoll branch has been rebased atop that as well.

I've rebased a nearly identical branch atop the two poll emulation
patches I just sent out in my tree.

I think we're ready to merge that code to master now; it should 'just
work' on mingw.

-- 
-keith



Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-18 Thread Adam Jackson
On Tue, 2016-07-12 at 09:49 -0700, Keith Packard wrote:
> Adam Jackson  writes:
> 
> > Here's an epoll branch rebased to master, with poll emulation for
> > Windows, and a pair of proof-of-concept patches at the end to allow
> > (force) testing on pollful systems:
> 
> So you're just emulating poll with select, which should work on
> Windows. That seems like a good starting place; I'd encourage the
> Windows hackers to refine that to handle arbitrary numbers of FDs.
> 
> > I tested this with 32 parallel x11perf -noop's on Linux, and while it
> > does work, reliable numbers are a bit hard to come by. The x11perf
> > processes take (wildly) varying amounts of time to complete, such that
> > the last few runs to complete show inflated nop dispatch rates because
> > they're competing with fewer clients. I don't know yet how much of that
> > is X's scheduler and how much is the kernel's. 
> 
> I've been running many instances of 'plaid' in parallel as that doesn't
> scale its work based on the server performance.
> 
> Thanks for taking a stab at this; it looks good to me. I'd love for
> someone to give it a spin on Windows and make sure it actually works.

I've merged everything up to (but not including) the poll emulation
code:

remote: I: patch #59987 updated using rev 05a793f5b3c40747d5a92a076def7f4fb673c7e7.
remote: I: patch #89027 updated using rev 81135991a583b3b30a90e82ddc1d5c86d57bf00b.
remote: E: failed to find patch for rev aa6717ce213e79735c72afc5ec9cc1f9c0297e09.
remote: I: patch #89021 updated using rev 559aac2d71250e3137aaa582e2a59a918ddf21b7.
remote: E: failed to find patch for rev c3fea428aed919826130ef8ebdb2cceb445a845b.
remote: E: failed to find patch for rev 24e65bf0db57bf4ac70386c0a0e8275b73edd2fb.
remote: I: patch #87950 updated using rev 55c2e1a3aa587c58a74066724e11e30b3df267b8.
remote: I: patch #87953 updated using rev 6bf7b49f6711c7ed1837283dc04f93f4c1b77ecc.
remote: I: patch #87948 updated using rev 6299ef3d749d6f978d3d38d42f711ac56bf382eb.
remote: E: failed to find patch for rev 410bc047480a9f98df678dc850bc6b99c3cfb5bf.
remote: E: failed to find patch for rev fb0802113b4c57819cba15d64baf79bf4148607e.
remote: E: failed to find patch for rev 9d15912aa475b733bbb20efc367a67dacad63bf1.
remote: I: patch #89584 updated using rev be5a513fee6cbf29ef7570e57eb0436d70fbd88c.
remote: E: failed to find patch for rev 4af00242ef1e39499b932d12423fdf449296090a.
remote: I: patch #89581 updated using rev 7762a602c1dfdd8cfcf2b8c2281cf4d683d05216.
remote: I: patch #89592 updated using rev a414db021575accff64abad6f1047245e81c7476.
remote: I: patch #89593 updated using rev e6636b438322a9a2f2270ad9d60bf3dfc72be0b3.
remote: E: failed to find patch for rev 0d294462a5af08ada654c588fad921ed7a22749b.
remote: E: failed to find patch for rev ef7ddbe242ed4c461f816663fb88646e41f1c21b.
remote: E: failed to find patch for rev 8d3a368d8980e37e7e8c57065dc901ce809887c6.
remote: I: 10 patch(es) updated to state Accepted.
To ssh://git.freedesktop.org/git/xorg/xserver
   950ffb8..8d3a368  master -> master

My epoll branch has been rebased atop that as well.

- ajax

Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-12 Thread Keith Packard
Adam Jackson  writes:

> Here's an epoll branch rebased to master, with poll emulation for
> Windows, and a pair of proof-of-concept patches at the end to allow
> (force) testing on pollful systems:

So you're just emulating poll with select, which should work on
Windows. That seems like a good starting place; I'd encourage the
Windows hackers to refine that to handle arbitrary numbers of FDs.

> I tested this with 32 parallel x11perf -noop's on Linux, and while it
> does work, reliable numbers are a bit hard to come by. The x11perf
> processes take (wildly) varying amounts of time to complete, such that
> the last few runs to complete show inflated nop dispatch rates because
> they're competing with fewer clients. I don't know yet how much of that
> is X's scheduler and how much is the kernel's. 

I've been running many instances of 'plaid' in parallel as that doesn't
scale its work based on the server performance.

Thanks for taking a stab at this; it looks good to me. I'd love for
someone to give it a spin on Windows and make sure it actually works.

-- 
-keith



Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-12 Thread Adam Jackson
On Mon, 2016-07-11 at 12:32 -0400, Peter Harris wrote:
> On 2016-07-08 16:33, Keith Packard wrote:
> > Adam Jackson  writes:
> > 
> > > Still not cool with committing code that we _know_ will break a
> > > supported platform.
> > 
> > Oh, right, Windows. Hrm. Suggestions welcome, but I suspect the only
> > workable solution will be to write some Windows-specific code that does
> > the wait for multiple objects thing. We can't use select because the
> > fd_set data structure, while 'poll-like', has a fixed number of fds, and
> > we can't use poll because it's broken.
> 
> The number of fds in fd_set isn't really fixed. FD_SETSIZE is user
> controlled,

This does appear to be true of Windows, where fd_set is a
self-describing struct and not just an array of bits. Definitely not
the case for Linux, but that doesn't matter here.

Here's an epoll branch rebased to master, with poll emulation for
Windows, and a pair of proof-of-concept patches at the end to allow
(force) testing on pollful systems:

https://cgit.freedesktop.org/~ajax/xserver/log/?h=epoll

I tested this with 32 parallel x11perf -noop's on Linux, and while it
does work, reliable numbers are a bit hard to come by. The x11perf
processes take (wildly) varying amounts of time to complete, such that
the last few runs to complete show inflated nop dispatch rates because
they're competing with fewer clients. I don't know yet how much of that
is X's scheduler and how much is the kernel's. 

- ajax

Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-11 Thread Keith Packard
Peter Harris  writes:

> The number of fds in fd_set isn't really fixed. FD_SETSIZE is user
> controlled, so I can't imagine a problem with massaging fd_set into C99
> variable-length-array form:
>
> typedef struct xfd_set {
> u_int fd_count;
> SOCKET  fd_array[];
> } xfd_set;
>
> and reallocing to taste.

Sounds like it would be easy to duplicate the 'poll' code using this
alternate data structure then? Anyone up for the adventure?

-- 
-keith



Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-11 Thread Peter Harris
On 2016-07-08 16:33, Keith Packard wrote:
> Adam Jackson  writes:
> 
>> Still not cool with committing code that we _know_ will break a
>> supported platform.
> 
> Oh, right, Windows. Hrm. Suggestions welcome, but I suspect the only
> workable solution will be to write some Windows-specific code that does
> the wait for multiple objects thing. We can't use select because the
> fd_set data structure, while 'poll-like', has a fixed number of fds, and
> we can't use poll because it's broken.

The number of fds in fd_set isn't really fixed. FD_SETSIZE is user
controlled, so I can't imagine a problem with massaging fd_set into C99
variable-length-array form:

typedef struct xfd_set {
u_int fd_count;
SOCKET  fd_array[];
} xfd_set;

and reallocing to taste.
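A minimal compilable sketch of that idea (names like xfd_set_resize are
illustrative, not from any patch series, and SOCKET is typedef'd to int
here so it builds outside Winsock; the real Winsock type uses u_int):

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the Winsock SOCKET type so this compiles anywhere. */
typedef int SOCKET;

/* Variable-length variant of the Winsock fd_set layout. */
typedef struct xfd_set {
    unsigned int fd_count;   /* number of sockets stored in fd_array */
    SOCKET fd_array[];       /* C99 flexible array member */
} xfd_set;

/* Grow (or create, when set is NULL) a set able to hold 'capacity'
 * sockets.  Hypothetical helper, "reallocing to taste" as above. */
static xfd_set *
xfd_set_resize(xfd_set *set, unsigned int capacity)
{
    xfd_set *ns = realloc(set, sizeof(xfd_set) + capacity * sizeof(SOCKET));

    if (ns && !set)
        ns->fd_count = 0;    /* fresh allocation: start empty */
    return ns;
}

static void
xfd_set_add(xfd_set *set, SOCKET fd)
{
    set->fd_array[set->fd_count++] = fd;
}
```

The caller would track capacity separately and call xfd_set_resize
before FD_SET-style additions, lifting the FD_SETSIZE ceiling.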

Peter Harris
-- 
   Open Text Connectivity Solutions Group
Peter Harrishttp://connectivity.opentext.com/
Research and DevelopmentPhone: +1 905 762 6001
phar...@opentext.comToll Free: 1 877 359 4866

Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-08 Thread Keith Packard
Adam Jackson  writes:

> Still not cool with committing code that we _know_ will break a
> supported platform.

I wasn't aware of any platform this would break. Am I missing something?

-- 
-keith



Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-08 Thread Keith Packard
Adam Jackson  writes:

> Still not cool with committing code that we _know_ will break a
> supported platform.

Oh, right, Windows. Hrm. Suggestions welcome, but I suspect the only
workable solution will be to write some Windows-specific code that does
the wait for multiple objects thing. We can't use select because the
fd_set data structure, while 'poll-like', has a fixed number of fds, and
we can't use poll because it's broken.

-- 
-keith



Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-08 Thread Adam Jackson
On Fri, 2016-07-08 at 16:37 +0200, Keith Packard wrote:
> Keith Packard  writes:
> 
> > It should, however, be sufficient to simply move the callback in
> > FlushAllOutput after the check for 'newoutput'; newoutput should only
> > ever be set if there is output to deliver.
> 
> It'd also be awesome to get the epoll patches merged to test the effect
> of this in that context. There's a single patch in that list which
> hasn't been reviewed...

Still not cool with committing code that we _know_ will break a
supported platform.

- ajax

Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-08 Thread Keith Packard
Keith Packard  writes:

> It should, however, be sufficient to simply move the callback in
> FlushAllOutput after the check for 'newoutput'; newoutput should only
> ever be set if there is output to deliver.

It'd also be awesome to get the epoll patches merged to test the effect
of this in that context. There's a single patch in that list which
hasn't been reviewed...

-- 
-keith



Re: [PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-08 Thread Keith Packard
Michel Dänzer  writes:

> From: Michel Dänzer 
>
> The unnecessary FlushCallback calls could cause significant performance
> degradation with compositing.

Yeah, we do need a flush callback in drivers to ensure that damage
events aren't sent without the rendering being flushed. A block handler
used to be sufficient as WriteToClient "never" actually wrote anything,
and all output was delayed until FlushAllOutput, which happens after the
BlockHandler call in WaitForSomething.

It should, however, be sufficient to simply move the callback in
FlushAllOutput after the check for 'newoutput'; newoutput should only
ever be set if there is output to deliver.
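The reordering can be modeled in a few lines (a deliberately simplified
stand-in, not the server's actual FlushAllOutput; the stub callback and
flag names are illustrative):

```c
#include <stdbool.h>

static int flush_callback_calls;

/* Stand-in for CallCallbacks(&FlushCallback, NULL): in a driver this is
 * where queued rendering gets flushed before damage events go out. */
static void
flush_callback_stub(void)
{
    flush_callback_calls++;
}

/* Model of the suggestion: check NewOutputPending first, and only
 * invoke the flush callbacks when there is actually data to write. */
static void
flush_all_output(bool new_output_pending)
{
    if (!new_output_pending)
        return;              /* nothing queued: skip the callbacks */

    flush_callback_stub();   /* flush rendering exactly once */
    /* ... write queued client data here ... */
}
```

With the callback after the early return, idle wakeups no longer pay
the cost of a GPU flush, which is the degradation the patch measured.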

-- 
-keith



[PATCH xserver] FlushAllOutput: Only call FlushCallbacks when actually flushing data

2016-07-08 Thread Michel Dänzer
From: Michel Dänzer 

The unnecessary FlushCallback calls could cause significant performance
degradation with compositing.

As an example, with the radeon driver using glamor, a gtkperf run using
default parameters (100 iterations of every test) takes:
(numbers without => with this patch)

* About 1.9 => 1.9 seconds with xfwm4 without compositing
* About 2.6 => 2.2 seconds with compton compositing added
* About 4.1 => 2.2 seconds with kwin compositing
* About 10.2 => 2.4 seconds with xfwm4 compositing

Signed-off-by: Michel Dänzer 
---
 os/io.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/os/io.c b/os/io.c
index d04ebd8..82e22cc 100644
--- a/os/io.c
+++ b/os/io.c
@@ -633,14 +633,12 @@ FlushAllOutput(void)
 OsCommPtr oc;
 register ClientPtr client;
 Bool newoutput = NewOutputPending;
+Bool callbacks_called = FALSE;
 
 #if defined(WIN32)
 fd_set newOutputPending;
 #endif
 
-if (FlushCallback)
-CallCallbacks(&FlushCallback, NULL);
-
 if (!newoutput)
 return;
 
@@ -671,8 +669,14 @@ FlushAllOutput(void)
 FD_SET(oc->fd, &OutputPending); /* set the bit again */
 NewOutputPending = TRUE;
 }
-else
+else {
+if (FlushCallback && !callbacks_called) {
+CallCallbacks(&FlushCallback, NULL);
+callbacks_called = TRUE;
+}
+
 (void) FlushClient(client, oc, (char *) NULL, 0);
+}
 }
 }
 #else   /* WIN32 */
@@ -689,8 +693,14 @@ FlushAllOutput(void)
 FD_SET(oc->fd, &newOutputPending);  /* set the bit again */
 NewOutputPending = TRUE;
 }
-else
+else {
+if (FlushCallback && !callbacks_called) {
+CallCallbacks(&FlushCallback, NULL);
+callbacks_called = TRUE;
+}
+
 (void) FlushClient(client, oc, (char *) NULL, 0);
+}
 }
 XFD_COPYSET(&newOutputPending, &OutputPending);
 #endif  /* WIN32 */
-- 
2.8.1
