Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Kristian Høgsberg
On Sun, Oct 17, 2010 at 12:58 PM, Aaron Plattner aplatt...@nvidia.com wrote:
 This change does not solve the problem it claims to solve because it
 makes two assumptions about the driver that are not always true:

 1. that it sends rendering requests to the GPU immediately, which can
   kill performance on some GPUs, and

No.  We batch up rendering commands in a userspace command buffer and
the X server batches up outgoing events in a per client protocol
buffer.  We flush the command batch to the kernel just before the
server flushes its protocol buffers, using the flush callback chain.
This way we make sure that when we send out damage events, the render
has been submitted already, and still (typically) batch up all
rendering between one set of wake and block.

The reason for this patch is that when the client protocol buffer
overflows, the X server ends up flushing the damage events queued so
far *plus* the one it was trying to add to the protocol buffer (that's
how it works, it uses writev to send the protocol buffer *and* the
event that overflowed the buffer).  If we're using pre-op damage
reporting, that means that the rendering command corresponding to that
last damage event has not yet been added to the batch buffer and the
client will receive the damage event before we queue up the rendering.

Switching to post-op reporting fixes this race, and from a client
point of view, all we ever did was post-op reporting.  Pre-op
reporting just can't work when the damage listener is in a different
process - the X server would have to send the event immediately and
wait for the client to acknowledge it before it could submit the
rendering.  So in principle there should be no semantic difference from
the client point of view.  Unfortunately, there seems to be a problem
with the post-op reporting - it's not a code path that we've really
used before.

 2. that the GPU processes all requests from multiple clients in order,
   which is false for some GPUs.

Yes, that's an assumption I make.  I understand that it's not a fix
for hardware with multiple queues, but it allows single queue hardware
to work correctly.  The back story is that we used to have a roundtrip
in the glXBindTexImage path, which as a side effect made the intel
driver flush its rendering batch buffers.  When I eliminated the round
trip, we ended up with damage events and rendering commands no longer
in sync.  The post-op patch plus an intel driver patch to submit the
rendering from the flush callback chain (instead of just block
handler) fixed that regression.

So I just fixed a regression.  I'm not claiming to solve anything for
other drivers.

 All a Damage event tells you is that your next X request will be
 processed after the damage has landed, not that the damage has already
 landed or even that the rendering commands that cause the damage have
 been submitted to the GPU.

 In addition to the above, this commit also broke the Compiz
 Wallpaper plugin.

 This reverts commit 8d7b7a0d71e0b89321b3341b781bc8845386def6.

Sure, we can revert it.  I never saw the corner case it fixes trigger,
we usually never actually fill up a protocol buffer with damage
events.  The only case I saw that would flush a protocol buffer was
sending out xkb keymaps.

Acked-by: Kristian Høgsberg k...@bitplanet.net

 ---
 Fixing this correctly requires spec changes, which is the purpose of James
 Jones's damage sync stuff.  If you're reading this, you should go review
 that now.  :)

  damageext/damageext.c |    1 -
  1 files changed, 0 insertions(+), 1 deletions(-)

 diff --git a/damageext/damageext.c b/damageext/damageext.c
 index b4bb478..f5265dd 100644
 --- a/damageext/damageext.c
 +++ b/damageext/damageext.c
 @@ -217,7 +217,6 @@ ProcDamageCreate (ClientPtr client)
     if (!AddResource (stuff->damage, DamageExtType, (pointer) pDamageExt))
        return BadAlloc;

 -    DamageSetReportAfterOp (pDamageExt->pDamage, TRUE);
     DamageRegister (pDamageExt->pDrawable, pDamageExt->pDamage);

     if (pDrawable-type == DRAWABLE_WINDOW)
 --
 1.7.0.4

 ___
 xorg-devel@lists.x.org: X.Org development
 Archives: http://lists.x.org/archives/xorg-devel
 Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Keith Packard
On Sun, 17 Oct 2010 09:58:50 -0700, Aaron Plattner aplatt...@nvidia.com wrote:

 This change does not solve the problem it claims to solve because it
 makes two assumptions about the driver that are not always true:
 
 1. that it sends rendering requests to the GPU immediately, which can
kill performance on some GPUs, and
 2. that the GPU processes all requests from multiple clients in order,
which is false for some GPUs.

When the compositing manager receives the damage event, it must be able
to access content from all applications which resulted in the reported
damage. How else can this ever work?

 All a Damage event tells you is that your next X request will be
 processed after the damage has landed, not that the damage has already
 landed or even that the rendering commands that cause the damage have
 been submitted to the GPU.

It doesn't require anything about the current state of the GPU, only
that future rendering requests include the results of any rendering
which resulted in the reported damage.

In a multi-queue rendering engine, you clearly need to perform some
cross-queue synchronization so that the rendering performed by the
compositing manager sees the rendering performed by other applications.

The switch to reporting events after performing the rendering operation
ensured that the driver was notified about the event delivery before the
client received the event (through the 'flush' callback). It's easy to
believe that this notification is not what multi-queue hardware wants,
and that perhaps some other mechanism would be preferred.

The switch to post-op damage reporting was tested on hardware which has
only a single queue and for which the shuffle of requests hitting the
graphics hardware is synchronized with the shuffle of requests executed
by the X server at appropriate times.

 In addition to the above, this commit also broke the Compiz
 Wallpaper plugin.

That's just a bug in the damage code; it would be lovely to see the
sequence of damage events delivered in both cases so we could fix
it. There should be no difference in the generated damage events, only
the sequence within the X server of when those events are generated, and
in the absence of a full output buffer requiring an intermediate flush
operation, there shouldn't even be a change in when those events are
delivered to the application.

I don't see any way to construct correct Damage semantics without
delaying the damage event delivery until after the rendering operation
has been seen by the driver; the old code relied on the damage event
getting stuck in the client output buffer.

-- 
keith.pack...@intel.com



Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Aaron Plattner
Thanks for this discussion.  More precise semantics are always better.
Comments below:

On Wed, Oct 20, 2010 at 11:05:29AM -0700, Keith Packard wrote:
 On Sun, 17 Oct 2010 09:58:50 -0700, Aaron Plattner aplatt...@nvidia.com 
 wrote:
 
  This change does not solve the problem it claims to solve because it
  makes two assumptions about the driver that are not always true:
  
  1. that it sends rendering requests to the GPU immediately, which can
 kill performance on some GPUs, and
  2. that the GPU processes all requests from multiple clients in order,
 which is false for some GPUs.
 
 When the compositing manager receives the damage event, it must be able
 to access content from all applications which resulted in the reported
 damage. How else can this ever work?

By sending X protocol to explicitly fence the GL rendering with the X
rendering, which is the whole purpose of James's XDamageSubtractAndTrigger
request.  Otherwise, the driver has to implicitly fence all rendering (and
probably all CUDA kernels that interoperate with X) for damage events,
which will not perform well.

  All a Damage event tells you is that your next X request will be
  processed after the damage has landed, not that the damage has already
  landed or even that the rendering commands that cause the damage have
  been submitted to the GPU.
 
 It doesn't require anything about the current state of the GPU, only
 that future rendering requests include the results of any rendering
 which resulted in the reported damage.

In the context of the Damage extension, future rendering requests means
future *X* rendering requests -- this is how you told me it worked when I
asked you about it.  Implicitly fencing all other rendering through any API
when a Damage event is generated is a spec change that we could make, but
it would require discussion.

Since X rendering all happens in order, it doesn't matter whether the
client gets the event before the driver sees the rendering request,
because the driver will have seen it by the time it processes any later
requests.

If your driver model can use the ReportAfter semantics to get implicit
fencing across APIs, then that seems like something you could enable in the
driver via a DamageCreate or DamageRegister hook.  It doesn't seem like it
belongs in the server.

 In a multi-queue rendering engine, you clearly need to perform some
 cross-queue synchronization so that the rendering performed by the
 compositing manager sees the rendering performed by other applications.
 
 The switch to reporting events after performing the rendering operation
 ensured that the driver was notified about the event delivery before the
 client received the event (through the 'flush' callback). It's easy to
 believe that this notification is not what multi-queue hardware wants,
 and that perhaps some other mechanism would be preferred.

The flush callback sounds nifty, but it won't solve the problem for
multi-queue hardware.

 The switch to post-op damage reporting was tested on hardware which has
 only a single queue and for which the shuffle of requests hitting the
 graphics hardware is synchronized with the shuffle of requests executed
 by the X server at appropriate times.
 
  In addition to the above, this commit also broke the Compiz
  Wallpaper plugin.
 
 That's just a bug in the damage code; it would be lovely to see the
 sequence of damage events delivered in both cases so we could fix
 it. There should be no difference in the generated damage events, only
 the sequence within the X server of when those events are generated, and
 in the absence of a full output buffer requiring an intermediate flush
 operation, there shouldn't even be a change in when those events are
 delivered to the application.

Sure, I agree that it should have worked, but I don't agree that the change
is necessary in the core server code.

 I don't see any way to construct correct Damage semantics without
 delaying the damage event delivery until after the rendering operation
 has been seen by the driver; the old code relied on the damage event
 getting stuck in the client output buffer.

The old (and new, with this change) Damage semantics just plain don't work
on multi-queue hardware with direct-rendering clients; it's a limitation in
the specification that's addressed by James's sync extension.

 -- 
 keith.pack...@intel.com


Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Keith Packard
On Wed, 20 Oct 2010 11:35:46 -0700, Aaron Plattner aplatt...@nvidia.com wrote:

 By sending X protocol to explicitly fence the GL rendering with the X
 rendering, which is the whole purpose of James's XDamageSubtractAndTrigger
 request.  Otherwise, the driver has to implicitly fence all rendering (and
 probably all CUDA kernels that interoperate with X) for damage events,
 which will not perform well.

The rendering we're interested in is coming from an X pixmap -- the
backing pixmap for a redirected window. That you're using that with GL
is immaterial; you must talk about the X resource somewhere. Yes, it's
no longer on the X protocol connection, but as an X API, it seems
incumbent on the implementation to retain the basic X ordering
guarantees.

The Intel DRI driver has specific fencing operations that pend
operations from one client wrt another client to ensure correct ordering
in these cases.

 In the context of the Damage extension, future rendering requests means
 future *X* rendering requests -- this is how you told me it worked when I
 asked you about it.  Implicitly fencing all other rendering through any API
 when a Damage event is generated is a spec change that we could make, but
 it would require discussion.

I don't see it as a change in the specification, just a natural
consequence of the X ordering guarantees, and as we're dealing only with
X objects here, I'm not sure why the API used to reference those objects
is relevant.

 If your driver model can use the ReportAfter semantics to get implicit
 fencing across APIs, then that seems like something you could enable in the
 driver via a DamageCreate or DamageRegister hook.  It doesn't seem like it
 belongs in the server.

I guess the point of the change was that the change ensures that the
driver *can* serialize access to the hardware correctly, without the
need for changes in applications. If there is a more efficient
mechanism, I'd love to hear suggestions.

 The flush callback sounds nifty, but it won't solve the problem for
 multi-queue hardware.

Right, it seems like what you want is to know which drawables have
delivered damage events so that you can correctly serialize rendering
from them by other applications using direct rendering.

 The old (and new, with this change) Damage semantics just plain don't work
 on multi-queue hardware with direct-rendering clients; it's a limitation in
 the specification that's addressed by James's sync extension.

The patch in question should have *no effect* except in the rare case
when the events end up being delivered before the rendering is flushed
to the hardware. Given that, and given that there is no other existing
mechanism to prevent this race condition in the server code, I'm
convinced that we cannot remove the change until another solution is
proposed.

Fixing the problem with the compiz plugin should be our priority at this point.

-- 
keith.pack...@intel.com



Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Aaron Plattner
On Wed, Oct 20, 2010 at 01:54:35PM -0700, Keith Packard wrote:
 On Wed, 20 Oct 2010 11:35:46 -0700, Aaron Plattner aplatt...@nvidia.com 
 wrote:
 
  By sending X protocol to explicitly fence the GL rendering with the X
  rendering, which is the whole purpose of James's XDamageSubtractAndTrigger
  request.  Otherwise, the driver has to implicitly fence all rendering (and
  probably all CUDA kernels that interoperate with X) for damage events,
  which will not perform well.
 
 The rendering we're interested in is coming from an X pixmap -- the
 backing pixmap for a redirected window. That you're using that with GL
 is immaterial; you must talk about the X resource somewhere. Yes, it's
 no longer on the X protocol connection, but as an X API, it seems
 incumbent on the implementation to retain the basic X ordering
 guarantees.

It's GLX_EXT_texture_from_pixmap, which has no such ordering guarantee.  It
was intentionally left out of the specification so that we would have time
to work through the details of a proper synchronization extension.

GLX also makes no such guarantees for X drawables that are made current.
This is what glXWaitX is for, except that that's a very heavy hammer.

 The Intel DRI driver has specific fencing operations that pend
 operations from one client wrt another client to ensure correct ordering
 in these cases.

Then you're implementing what sounds like a nice vendor-specific feature.

  In the context of the Damage extension, future rendering requests means
  future *X* rendering requests -- this is how you told me it worked when I
  asked you about it.  Implicitly fencing all other rendering through any API
  when a Damage event is generated is a spec change that we could make, but
  it would require discussion.
 
 I don't see it as a change in the specification, just a natural
 consequence of the X ordering guarantees, and as we're dealing only with
 X objects here, I'm not sure why the API used to reference those objects
 is relevant.
 
  If your driver model can use the ReportAfter semantics to get implicit
  fencing across APIs, then that seems like something you could enable in the
  driver via a DamageCreate or DamageRegister hook.  It doesn't seem like it
  belongs in the server.
 
 I guess the point of the change was that the change ensures that the
 driver *can* serialize access to the hardware correctly, without the
 need for changes in applications. If there is a more efficient
 mechanism, I'd love to hear suggestions.

The driver can do that by enabling ReportAfter from a damage create hook.

The reason I care is that with a proper synchronization extension, clients
that use it actually benefit from receiving Damage events as early as
possible because they can get started processing their rendering without
having to wait even for X to submit the rendering to the GPU, secure in the
knowledge that the GPU rendering won't actually occur until the X rendering
has landed.

Forcing the compositor to wait until the X driver has processed the
request, and especially (at least on our hardware) until the commands have
been flushed to the GPU reduces the amount of parallelism between the CPU
and the GPU.

  The flush callback sounds nifty, but it won't solve the problem for
  multi-queue hardware.
 
 Right, it seems like what you want is to know which drawables have
 delivered damage events so that you can correctly serialize rendering
 from them by other applications using direct rendering.
 
  The old (and new, with this change) Damage semantics just plain don't work
  on multi-queue hardware with direct-rendering clients; it's a limitation in
  the specification that's addressed by James's sync extension.
 
 The patch in question should have *no effect* except in the rare case
 when the events end up being delivered before the rendering is flushed
 to the hardware. Given that, and given that there is no other existing
 mechanism to prevent this race condition in the server code, I'm
 convinced that we cannot remove the change until another solution is
 proposed.

Yes, I agree that there's some additional bug that needs to be fixed,
regardless.  What I'm saying is that this patch should be reverted on
principle anyway and moved into the appropriate drivers (or the DRI
module) instead, because it's a vendor-specific requirement that is not
applicable to all drivers and reduces CPU/GPU parallelism.

 Fixing the problem with the compiz plugin should be our priority at this 
 point.
 
 -- 
 keith.pack...@intel.com


Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Aaron Plattner
On Wed, Oct 20, 2010 at 02:57:27PM -0700, Keith Packard wrote:
 On Wed, 20 Oct 2010 14:29:33 -0700, Aaron Plattner aplatt...@nvidia.com 
 wrote:
 
  Then you're implementing what sounds like a nice vendor-specific
  feature.
 
 There isn't any other mechanism provided today to perform the necessary
 synchronization; the event must not be seen by the client before the
 hardware ever sees the rendering request.

The premise is correct, but the conclusion doesn't follow except for a
certain set of drivers.

  The driver can do that by enabling ReportAfter from a damage create
  hook.
 
 No, it must see the rendering request before the damage extension writes
 the damage event to the client, hence the damage extension itself must
 pend the delivery of this event until after the rendering operation has
 been delivered to the client.

... for your driver, and only until we have a sync extension.

  Forcing the compositor to wait until the X driver has processed the
  request, and especially (at least on our hardware) until the commands have
  been flushed to the GPU reduces the amount of parallelism between the CPU
  and the GPU.
 
 As Kristian has said, there is no effect 99.99% of the time - the X
 server will always hold the event until it decides to flush events,
 which can only happen before the rendering operation is complete when
 the output buffer is already full before the event is written. Hence, no
 performance difference for the two cases, just a simple change to ensure
 that synchronization is possible, even in the absence of proposed
 additions to the protocol to permit more complicated synchronization
 semantics.
 
  Yes, I agree that there's some additional bug that needs to be fixed,
  regardless.  What I'm saying is that this patch should be reverted on
  principle anyway and moved into the appropriate drivers (or the DRI
  module) instead, because it's a vendor-specific requirement that is not
  applicable to all drivers and reduces CPU/GPU parallelism.
 
 I've got two competing bugs -- on one hand, we get occasionally
 incorrect rendering with *any* compositing manager, on the other hand we
 get consistent incorrect rendering with a specific compiz plugin.
 
 I don't see any way to win here.

Fine, but will you be willing to move this call to the drivers that need it
when we have a real sync extension?

 -- 
 keith.pack...@intel.com


Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Aaron Plattner
On Wed, Oct 20, 2010 at 03:03:59PM -0700, Aaron Plattner wrote:
 On Wed, Oct 20, 2010 at 02:57:27PM -0700, Keith Packard wrote:
  On Wed, 20 Oct 2010 14:29:33 -0700, Aaron Plattner aplatt...@nvidia.com 
  wrote:
   The driver can do that by enabling ReportAfter from a damage create
   hook.
  
  No, it must see the rendering request before the damage extension writes
  the damage event to the client, hence the damage extension itself must
  pend the delivery of this event until after the rendering operation has
  been delivered to the client.
 
 ... for your driver, and only until we have a sync extension.

Oh wait, I'm confused... I was talking about making the drivers that need
this wrap DamageCreate or DamageRegister and enable ReportAfterOp there so
that they get the same semantics without baking that behavior into the
server for all drivers.  The drivers that opt into that behavior will still
see the rendering before the event is generated... I'm not talking about
disabling this behavior permanently for everybody, just reverting it out of
the server proper.


Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Keith Packard
On Wed, 20 Oct 2010 15:03:59 -0700, Aaron Plattner aplatt...@nvidia.com wrote:

 Fine, but will you be willing to move this call to the drivers that need it
 when we have a real sync extension?

I don't see the point -- it doesn't change anything visible to clients
or drivers, other than to eliminate the tiny window where some damage
events might be delivered in a slightly different order, aside from the
bug present within the Damage code.

Also, there's no way, given the current server API, to have drivers
make this call -- they can't know which of the many damage objects
within the server are created by the damage extension.

I guess I'm curious how these other drivers manage to use the composite
extension today; are they just forcing cross-queue synchronization at
every flush call in the X server?

In any case, here's what I think: we should ensure that any release on
either the 1.9 or 1.10 branch doesn't contain the compiz bug.

For the 1.9 branch, unless the fix for the Damage code is trivial, I'd
recommend reverting this patch there.

For the 1.10 master branch, I'd strongly prefer to see the Damage bug
get fixed before the 1.10 release. It's likely to be causing other
problems within the server in places which use the post-op path. If no
such fix is forthcoming before the 1.10 release, we should revert this
patch as the known bugs it causes are worse than the known bugs it
fixes.

-- 
keith.pack...@intel.com



Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Francisco Jerez
Keith Packard kei...@keithp.com writes:

 On Wed, 20 Oct 2010 15:03:59 -0700, Aaron Plattner aplatt...@nvidia.com 
 wrote:

 Fine, but will you be willing to move this call to the drivers that need it
 when we have a real sync extension?

 I don't see the point -- it doesn't change anything visible to clients
 or drivers, other than to eliminate the tiny window where some damage
 events might be delivered in a slightly different order, aside from the
 bug present within the Damage code.

 Also, there's no way, given the current server API, to have drivers
 make this call -- they can't know which of the many damage objects
 within the server are created by the damage extension.

 I guess I'm curious how these other drivers manage to use the composite
 extension today; are they just forcing cross-queue synchronization at
 every flush call in the X server?

Nouveau uses the hardware interchannel synchronization primitives on
demand (it kicks in when two channels have references to the same BO):
the nouveau DRM guarantees that two command buffers rendering to/from
the same BO will get executed in order as seen by the DRM.

So this commit *does* solve the problem for us because the kernel will
see X's command buffers before the damage event gets to the GL client.

In short, the mentioned synchronization primitive (it's called a
semaphore) stalls channel A until the contents of a given memory
address become equal to X, then we make channel B write X to the memory
address after it's done with the disputed buffer.

 In any case, here's what I think: we should ensure that any release on
 either the 1.9 or 1.10 branch doesn't contain the compiz bug.

 For the 1.9 branch, unless the fix for the Damage code is trivial, I'd
 recommend reverting this patch there.

 For the 1.10 master branch, I'd strongly prefer to see the Damage bug
 get fixed before the 1.10 release. It's likely to be causing other
 problems within the server in places which use the post-op path. If no
 such fix is forthcoming before the 1.10 release, we should revert this
 patch as the known bugs it causes are worse than the known bugs it
 fixes.



Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Keith Packard
On Thu, 21 Oct 2010 01:38:58 +0200, Francisco Jerez curroje...@riseup.net 
wrote:

 Nouveau uses the hardware interchannel synchronization primitives on
 demand (it kicks in when two channels have references to the same BO):
 the nouveau DRM guarantees that two command buffers rendering to/from
 the same BO will get executed in order as seen by the DRM.

Would it make sense to have this only kick in when damage is reported to
the compositing manager? That would limit the number of semaphores you
stick in the command queue.

-- 
keith.pack...@intel.com



Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Keith Packard
On Sun, 17 Oct 2010 09:58:50 -0700, Aaron Plattner aplatt...@nvidia.com wrote:
 
 This commit broke the Compiz Wallpaper plugin.

Applied.

   d738175..1a0d932  master -> master

-- 
keith.pack...@intel.com



Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Francisco Jerez
Keith Packard kei...@keithp.com writes:

 On Thu, 21 Oct 2010 01:38:58 +0200, Francisco Jerez curroje...@riseup.net 
 wrote:

 Nouveau uses the hardware interchannel synchronization primitives on
 demand (it kicks in when two channels have references to the same BO):
 the nouveau DRM guarantees that two command buffers rendering to/from
 the same BO will get executed in order as seen by the DRM.

 Would it make sense to have this only kick in when damage is reported to
 the compositing manager? That would limit the number of semaphores you
 stick in the command queue.

Not really; the decision is made by the kernel when it sees a reference
to an in-flight buffer from the wrong channel.  If that happens, we can
be sure we have to resort to some kind of synchronization before
submitting the next batch, except on rare occasions.



Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Kristian Høgsberg
On Wed, Oct 20, 2010 at 7:45 PM, Keith Packard kei...@keithp.com wrote:
 On Sun, 17 Oct 2010 09:58:50 -0700, Aaron Plattner aplatt...@nvidia.com 
 wrote:

 In addition to the above, this commit also broke the Compiz
 Wallpaper plugin.

 Ok, as usual, Dave Airlie is right here -- independent of whether this
 patch is a good idea (and I think it is), it should be reverted until it
 doesn't cause a regression.

Yup, and that's why I acked it in the first place.  It doesn't reduce
GPU/CPU parallelism and it's not new semantics for the damage
extension.  If the post-op reporting worked correctly, the only
difference would be in the case where a damage event overflows the
client's protocol buffer.  And in that case, the only difference would
be that the driver gets to see the rendering request before X flushes
the buffer.  There's really nothing to discuss.

Kristian


Re: [PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-20 Thread Aaron Plattner
On Wed, Oct 20, 2010 at 03:49:47PM -0700, Keith Packard wrote:
 On Wed, 20 Oct 2010 15:03:59 -0700, Aaron Plattner aplatt...@nvidia.com 
 wrote:
 
  Fine, but will you be willing to move this call to the drivers that need it
  when we have a real sync extension?
 
 I don't see the point -- it doesn't change anything visible to clients
 or drivers, other than to eliminate the tiny window where some damage
 events might be delivered in a slightly different order, aside from the
 bug present within the Damage code.

I meant that with proper synchronization, clients would benefit from
receiving the events as early as possible to give them more time to
process the events.  I realize that the events get buffered most of the
time currently so it's not a big change right now.

 Also, there's no way, given the current server API, to have drivers
 make this call -- they can't know which of the many damage objects
 within the server are created by the damage extension.

That's a good point.  That seems like it might be worth adding.

 I guess I'm curious how these other drivers manage to use the composite
 extension today; are they just forcing cross-queue synchronization at
 every flush call in the X server?

No, at least for our driver.  Like I said, GLX does not specify any
synchronization and adding implicit synchronization would kill performance.
Compositors just kick off their rendering and hope for the best, and
sometimes it doesn't work.  Compiz added some sort of force
synchronization option recently, but I haven't looked at how it works.

 In any case, here's what I think: we should ensure that any release on
 either the 1.9 or 1.10 branch doesn't contain the compiz bug.
 
 For the 1.9 branch, unless the fix for the Damage code is trivial, I'd
 recommend reverting this patch there.
 
 For the 1.10 master branch, I'd strongly prefer to see the Damage bug
 get fixed before the 1.10 release. It's likely to be causing other
 problems within the server in places which use the post-op path. If no
 such fix is forthcoming before the 1.10 release, we should revert this
 patch as the known bugs it causes are worse than the known bugs it
 fixes.

Fair enough, and this change isn't that huge of a deal; I'm just worried
that papering over it by adding an inferred ordering between queues will
mean that nobody will review James's sync extension stuff, and the
Damage sync problems will go unfixed for a third of your users for
another few years
( http://www.phoronix.com/data/img/results/lgs_2010_results/3.png ).


[PATCH] Revert Set DamageSetReportAfterOp to true for the damage extension (#30260)

2010-10-17 Thread Aaron Plattner
This change does not solve the problem it claims to solve because it
makes two assumptions about the driver that are not always true:

1. that it sends rendering requests to the GPU immediately, which can
   kill performance on some GPUs, and
2. that the GPU processes all requests from multiple clients in order,
   which is false for some GPUs.

All a Damage event tells you is that your next X request will be
processed after the damage has landed, not that the damage has already
landed or even that the rendering commands that cause the damage have
been submitted to the GPU.
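The guarantee above amounts to per-client in-order request processing,
which can be sketched as follows (an illustrative model, not server
code, with hypothetical request names):

```c
#include <string.h>

/* Toy model of X's per-client request stream: requests are processed
 * strictly in arrival order.  The only thing a Damage event promises
 * is that a request sent *after* seeing the event is processed after
 * the damage-causing request -- nothing about GPU completion. */
enum { MAX_OPS = 8 };
struct server {
    const char *log[MAX_OPS];
    int n;
};

static void process(struct server *s, const char *req)
{
    if (s->n < MAX_OPS)
        s->log[s->n++] = req;  /* executes in arrival order */
}

/* Index at which a request was processed, or -1 if never seen. */
static int processed_at(const struct server *s, const char *req)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->log[i], req) == 0)
            return i;
    return -1;
}
```

A compositor that needs the damaged pixels to actually be on the GPU
before it textures from them gets no help from this ordering alone,
which is the gap the sync extension work is meant to fill.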

In addition to the above, this commit also broke the Compiz
Wallpaper plugin.

This reverts commit 8d7b7a0d71e0b89321b3341b781bc8845386def6.
---
Fixing this correctly requires spec changes, which is the purpose of James
Jones's damage sync stuff.  If you're reading this, you should go review
that now.  :)

 damageext/damageext.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/damageext/damageext.c b/damageext/damageext.c
index b4bb478..f5265dd 100644
--- a/damageext/damageext.c
+++ b/damageext/damageext.c
@@ -217,7 +217,6 @@ ProcDamageCreate (ClientPtr client)
 if (!AddResource (stuff->damage, DamageExtType, (pointer) pDamageExt))
	return BadAlloc;
 
-DamageSetReportAfterOp (pDamageExt->pDamage, TRUE);
 DamageRegister (pDamageExt->pDrawable, pDamageExt->pDamage);
 
 if (pDrawable-type == DRAWABLE_WINDOW)
-- 
1.7.0.4
