Re: [RFC] DRI2 synchronization and swap bits

2009-11-18 Thread Jesse Barnes
On Sun, 8 Nov 2009 08:16:51 +0100
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:
 My proposal to use a spinlock was probably rather stupid. Because
 of glXGetSyncValuesOML() -> I830DRI2GetMSC -> drmWaitVBlank ->
 drm_wait_vblank -> drm_vblank_count(), if multiple clients call
 glXGetSyncValuesOML() frequently, e.g., in a polling loop, i assume  
 this could cause quite a bit of contention on a spinlock that must
 be acquired with minimal delay from the vblank irq handler. According
 to http://rt.wiki.kernel.org/index.php/RT_PREEMPT_HOWTO, critical  
 sections protected by spinlock_t are preemptible if one uses a  
 realtime kernel with preempt-rt patches applied, something i'd
 expect my users to do frequently. Maybe i overlook something, but
 this sounds unhealthy if it happens while drm_vblank_count() holds
 the lock?

It's not ideal, but I don't think it would cause many problems in
practice.  The lock hold times will be very small, so contention
shouldn't be a big problem, but as you say if we find issues we can use
a lockless method.

 Btw., when looking at the code in drm_irq.c in the current
 linux-next tree, i saw that drm_handle_vblank_events() does a
 e->event.sequence = seq; assignment with the current seq vblank
 number when retiring an event, but the special shortcut path in
 drm_queue_vblank_event(), which retires events immediately without
 queuing them if the requested vblank number has been reached or
 exceeded already, does not do an update e->event.sequence = seq; with
 the most recent seq vblank number that triggered this early
 retirement. This looks inconsistent to me, could this be a bug?

It uses vblwait->request.sequence which should be the right number.
The timestamp is definitely off though...

 The simple seqlock implementation might be too simple though and a  
 ringbuffer that holds multiple hundred recent vblank timestamp  
 samples might be better.

I wonder if it would be sufficient to just track the last timestamp.
Any callers interested in the last event would get a precise
timestamp.  Callers looking at past events could be returned a
calculated value based on the time difference between the last two
events?

 The problem is the accuracy of glXGetMscRateOML(). This value -  
 basically the duration of a video refresh interval - gets calculated  
 from the current video mode timing, i.e., dotclock, HTotal and  
 VTotal. This value is only useful for userspace applications like my  
 toolkit under the assumption that both the dotclock of the GPU and  
 the current system clock (TSC / HPET / APIC timer / ...) are  
 perfectly accurate and drift-free. In reality, both clocks are  
 imperfect and drift against each other, therefore the returned  
 nominal value of glXGetMscRateOML() is basically always a bit wrong/ 
 inaccurate wrt. system time as used by userspace applications. Our  
 app therefore determines the real refresh duration by a
 calibration loop of multiple seconds duration at startup. This works
 ok, but it increases startup time, can't take slow clock drift over
 the course of a session into account, because i can't recalibrate
 during a session, and the calibration is also not perfect due to the
 timing noise (preemption, scheduling jitter, wakeup latency after  
 swapbuffers etc.) that affects a measurement loop in userspace.
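For reference, the nominal refresh duration described above follows directly from the mode timing. This is a sketch with a hypothetical function name; it assumes the dotclock is given in kHz and ignores interlaced or doublescan modes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nominal duration of one video refresh interval in microseconds:
 * (pixels per frame) / (pixels per second), with the dotclock in kHz.
 */
double nominal_refresh_usec(uint32_t dotclock_khz, uint32_t htotal,
                            uint32_t vtotal)
{
    assert(dotclock_khz > 0);
    /* htotal * vtotal pixels per frame; dotclock_khz pixels per ms. */
    return (double)htotal * (double)vtotal * 1000.0 / (double)dotclock_khz;
}
```

With a 148500 kHz dotclock, HTotal 2200 and VTotal 1125 this gives roughly 16666.67 microseconds, i.e. a nominal 60 Hz. The value says nothing about how the GPU clock drifts against the system clock, which is the problem raised here.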
 
 A better approach would be for Linux to measure the current video  
 refresh interval over a certain time window, e.g., computing a
 moving average over a few seconds. This could be done if the vblank  
 timestamps are logged into a ringbuffer. The ringbuffer would allow  
 for lock-free readout of the most recent vblank timestamp from  
 drm_vblank_count(). At the same time the system could look at all  
 samples in the ringbuffer to compute the real duration of a video  
 refresh interval as an average over the deltas between samples in the  
 ringbuffer and provide an accurate and current estimate of  
 glXGetMscRateOML() that would be better than anything we can do in  
 userspace.

That would be even better than just using the last difference, and
shouldn't add too much more code.  On some configurations the refresh
rate will change too (for power saving reasons it may be reduced and
then increased again when activity occurs) so that would have to be
accounted for.  Presumably we wouldn't care about the reduced rate
since it implies clients aren't actively using the display.
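The ringbuffer idea could look roughly like the sketch below. The names and the buffer size are hypothetical, and locking or lock-free readout is left out. The mean of the deltas between consecutive samples equals (newest - oldest) / (samples - 1), so no per-delta loop is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VBL_RING_SIZE 8 /* hypothetical; a real buffer would hold more */

struct vbl_ring {
    int64_t usec[VBL_RING_SIZE]; /* vblank timestamps in microseconds */
    size_t  head;                /* slot the next sample is written to */
    size_t  count;               /* number of valid samples */
};

/* Record one vblank timestamp, overwriting the oldest sample when full. */
void vbl_ring_push(struct vbl_ring *r, int64_t usec)
{
    assert(r != NULL);
    r->usec[r->head] = usec;
    r->head = (r->head + 1) % VBL_RING_SIZE;
    if (r->count < VBL_RING_SIZE)
        r->count++;
}

/* Mean refresh interval over the window, or 0.0 with fewer than 2 samples. */
double vbl_ring_mean_interval(const struct vbl_ring *r)
{
    assert(r != NULL);
    if (r->count < 2)
        return 0.0;
    size_t newest = (r->head + VBL_RING_SIZE - 1) % VBL_RING_SIZE;
    size_t oldest = (r->head + VBL_RING_SIZE - r->count) % VBL_RING_SIZE;
    return (double)(r->usec[newest] - r->usec[oldest]) /
           (double)(r->count - 1);
}
```

To account for refresh rate changes, a caller would presumably zero the structure whenever the mode or the refresh rate changes, so that samples from the old rate do not pollute the average.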

 The second problem is how to reinitialize the current vblank  
 timestamp in drm_update_vblank_count() when vblank interrupts get  
 reenabled after they've been disabled for a long period of time?
 
 One generic way to reinitialize would be to calculate elapsed time  
 since last known vblank timestamp from the computed vblank count  
 diff by multiplying the count with the known duration of the video  
 refresh interval. In that case, an accurate estimate of  
 glXGetMscRateOML would be important, so a ringbuffer with samples  
 would probably help.
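The reinitialization described above amounts to one multiplication. A sketch, with a hypothetical function name and an interval estimate supplied by the caller (for example from a ringbuffer average):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reconstruct the timestamp of the current vblank after interrupts were
 * off: last known timestamp plus missed vblanks times refresh duration.
 */
int64_t reinit_vblank_usec(int64_t last_usec, uint32_t last_count,
                           uint32_t cur_count, double interval_usec)
{
    assert(interval_usec > 0.0);
    /* Unsigned subtraction stays correct across counter wraparound. */
    uint32_t diff = cur_count - last_count;
    /* Round to the nearest microsecond. */
    return last_usec + (int64_t)((double)diff * interval_usec + 0.5);
}
```

Any error in interval_usec is multiplied by the number of missed vblanks, which is why an accurate interval estimate matters after a long period with interrupts disabled.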


Re: [RFC] DRI2 synchronization and swap bits

2009-11-07 Thread Mario Kleiner
On Nov 2, 2009, at 5:35 PM, Jesse Barnes wrote:

 Thanks a lot for taking time to go through this stuff, it's exactly the
 kind of feedback I was hoping for.

Hello again

I'm relieved that i didn't screw up and annoy you already with my  
first post, so i'll continue to test my boundaries ;-)

 Doing the wakeups within a millisecond should definitely be possible,
 I don't expect the context switch between display server and client
 would be *that* high of a cost (but as I said I'll benchmark).

I don't expect that either. My comment was just to reinforce that  
very low latency matters for at least our class of applications. I'm  
currently benchmarking our toolkit wrt. timing precision and latency  
on a few different machine/gpu/os combinations. In case such numbers  
from other implementations (Linux with the proprietary drivers, OS/X,  
Windows) are interesting to you, let me know.

 I don't like this idea about entirely fake numbers and like to vote
 for a solution that is as close as possible to the non-redirected
 case.

 The raw numbers will always be exposed to the compositor and probably
 to applications via an opt-out mechanism (to be defined still, we don't
 even have the extra compositor protocol defined).

Happy to hear that.

 Unreliable UST timestamps would make the whole OML_sync_control
 extension almost useless for us and probably other applications that
 require good sync, e.g., between video and audio streams, so i'd ask you
 politely for improvements here.

 Definitely; these are just bugs, I certainly didn't design it to behave
 this way! :)

Assumed that :). Currently 1.5% of our users are on Linux and i'd  
love to persuade a few more to adopt Linux in the next year. I just  
realized that helping to improve the Linux graphics stack in areas  
that matter to us makes more sense than doing what i have done on all
operating systems for years - trying to cope with limitations and
driver bugs by use of weird hacks in our userspace application, the  
best i can do on OS/X and Windows.


 I guess one (simple from the viewpoint of a non-kernel hacker?) way
 would be to always timestamp the vblank in the drm_handle_vblank()
 routine, immediately after incrementing the vblank_count, probably
 protecting both the timestamp acquisition and vblank increment by
 one spinlock, so both get updated atomically? Then one could maybe
 extend drm_vblank_count() to read out and return vblank count and
 corresponding timestamp simultaneously under protection of the lock?
 Or any other way to provide the timestamp together with the vblank
 count in an atomic fashion to the calling code in
 drm_queue_vblank_event(), drm_queue_vblank_event() and
 drm_handle_vblank_events()?

 Yep, that would work and should be a fairly easy change.

I spent a bit more time thinking about this, i also read about the  
available synchronization primitives and started to code the  
following possible implementation. Again apologies if i'm stating the  
totally obvious, or stuff that's been done or planned already.

My proposal to use a spinlock was probably rather stupid. Because of
glXGetSyncValuesOML() -> I830DRI2GetMSC -> drmWaitVBlank ->
drm_wait_vblank -> drm_vblank_count(), if multiple clients call
glXGetSyncValuesOML() frequently, e.g., in a polling loop, i assume  
this could cause quite a bit of contention on a spinlock that must be  
acquired with minimal delay from the vblank irq handler. According to  
http://rt.wiki.kernel.org/index.php/RT_PREEMPT_HOWTO, critical  
sections protected by spinlock_t are preemptible if one uses a  
realtime kernel with preempt-rt patches applied, something i'd expect  
my users to do frequently. Maybe i overlook something, but this  
sounds unhealthy if it happens while drm_vblank_count() holds the lock?

A lockless method would be the better solution. Seqlocks seem to be a  
good fit? There's only one writer per crtc, drm_handle_vblank() (and  
occasionally drm_update_vblank_count() if vblank irqs get reenabled)  
which only writes infrequently (60 - 200 times per second on typical  
displays). There can be many readers which can read very frequently,  
and the datastructure to read is relatively simple and free of  
pointers, so this fits the model of seqlocks. Maybe one could even do  
with the versions that don't disable irqs, i.e., write_seqlock()  
instead of write_seqlock_irqsave()? Documentation says that one  
should use the irq-safe versions if the seqlock might be accessed  
from an interrupt handler. I looked at the implementation of seqlocks  
and as far as i can see, deadlock can only happen if a writer gets  
preempted by an irq handler that then tries to either write or read  
the seqlock itself? But a _vblank_seqlock would only get accessed for  
write access from the interrupt handler for a given crtc. The only  
other place of write access, drm_update_vblank_count(), gets called  
infrequently and within a spin_lock_irqsave(&dev->vbl_lock,  
irqflags); 

Re: [RFC] DRI2 synchronization and swap bits

2009-11-02 Thread Jesse Barnes
On Sun, 1 Nov 2009 21:46:45 +0100
Mario Kleiner mario.klei...@tuebingen.mpg.de wrote:
 I read this RFC and i'm very excited about the prospect of having  
 well working support for the OML_sync_control extension in DRI2 on  
 Linux/X11. I have been hoping for this to happen for years, so a big
 thank you in advance! This is why i hope to provide some input from
 the perspective of future power-users of functions like
 glXGetSyncValuesOML(), glXSwapBuffersMscOML(), glXWaitForSbcOML(). I'm
 the co-developer of a popular free-software toolkit (Psychtoolbox)
 that is used mostly in the neuroscience / cognitive science
 community by scientists to find out how the different senses (visual,
 auditory, haptic, ...) work and how they work together. Our
 requirements to graphics are often much more demanding than what a
 videogame, typical vr-environment or a mediaplayer has.

Thanks a lot for taking time to go through this stuff, it's exactly the
kind of feedback I was hoping for.

 Our users often have very strict requirements for scheduling
 frame-accurate and tear-free visual stimulus display, synchronizing
 bufferswaps across display-heads, and low-latency returns from
 swap-completion. Often they need swap-completion timestamps which are
 available with the shortest possible delay after a successful swap
 and accurately tied to the vblank at which scanout of a swapped
 frame started. The need for timestamps with sub-millisecond accuracy
 is not uncommon. Therefore, well working OML_sync_control support
 would be basically a dream come true and a very compelling feature
 for Linux as a platform for cognitive science.

Doing the wakeups within a millisecond should definitely be possible,
I don't expect the context switch between display server and client
would be *that* high of a cost (but as I said I'll benchmark).

 2. On the CompositePage in the DRM Wiki, there is this comment:
 "...It seems that composited apps should never need to know about
 real world screen vblank issues, ... When dealing with a
 redirected window it seems it would be acceptable to come up with an
 entirely fake number for all existing extensions that care about
 vblanks.."
 
 I don't like this idea about entirely fake numbers and like to vote  
 for a solution that is as close as possible to the non-redirected  
 case. Most of our applications run in non-redirected, full-screen,  
 undecorated, page-flipped windows, i.e., without a compositor being  
 involved. I can think of a couple future usage cases though where  
 reasonably well working redirected/composited windows would be very  
 useful for us, but only if we get meaningful timestamps and vblank  
 counts that are tied to the actual display onset.

The raw numbers will always be exposed to the compositor and probably
to applications via an opt-out mechanism (to be defined still, we don't
even have the extra compositor protocol defined).

 3. The Wiki also mentions "The direct rendered cases outlined in the
 implementation notes above are complete, but there's a bug in the
 async glXSwapBuffers that sometimes causes clients to hang after
 swapping rather than continue." Looking through the code of
 http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/tree/src/i830_dri.c?id=a0e2e624c47516273fa3d260b86d8c293e2519e4
 i can see that in I830DRI2SetupSwap() and I830DRI2SetupWaitMSC(), in
 the if (divisor == 0) { ...} path, the functions return after
 DRM_VBLANK_EVENT submission without assigning
 *event_frame = vbl.reply.sequence;
 This looks problematic to me, as the xserver is later submitting
 event_frame in the call to DRI2AddFrameEvent() inside
 DRI2SwapBuffers() as a cookie to find the right events for clients to
 wait on? Could this be a reason for clients hanging after swap? I
 found a few other spots where i either misunderstood something or
 there are small bugs. What is the appropriate way to report these?

This list is fine, thanks for checking it out.  I'll fix that up.

 4. According to spec, the different OML_sync_control functions do  
 return a UST timestamp which is supposed to reflect the exact time
 of when the MSC last incremented, i.e., at the start of scanout of a
 new video frame. SBC and MSC are supposed to increment atomically/ 
 simultaneously at swap completion, so the UST in the (UST,SBC,MSC)  
 triplet is supposed to mark the time of transition of either MSC or  
 MSC and SBC at swap completion. This makes a lot of sense to me, it  
 is exactly the type of timestamp that our toolkit critically depends
 on.
 
 Ideally the UST timestamp should be corrected to reflect start of  
 scanout, but a UST that is consistently taken at vblank interrupt  
 time would do as well. In the current implementation this is *not*  
 the semantic we'd get for UST timestamps.
 
 The I830DRI2GetMSC() call uses a call to drmWaitVBlank() and its  
 returned vbl.reply.tval_sec and vbl.reply.tval_usec values for  
 computing UST.
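 
A sketch of the conversion mentioned above, assuming UST is expressed in microseconds (an assumption, not something stated in this thread). The function name is hypothetical; tval_sec and tval_usec are the reply fields named in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the seconds and microseconds of a vblank reply into one value. */
int64_t ust_from_timeval(long tval_sec, long tval_usec)
{
    assert(tval_usec >= 0 && tval_usec < 1000000);
    /* Widen before multiplying so 32-bit long cannot overflow. */
    return (int64_t)tval_sec * 1000000 + (int64_t)tval_usec;
}
```

The conversion itself is trivial; the point raised in this message is about when the underlying timestamp is taken, not how it is packed.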
 

Re: [RFC] DRI2 synchronization and swap bits

2009-11-01 Thread Kristian Høgsberg
On Fri, Oct 30, 2009 at 1:42 PM, Eric Anholt e...@anholt.net wrote:
 On Fri, 2009-10-30 at 10:59 -0700, Jesse Barnes wrote:
 I've put up some trees (after learning my lesson about working in the
 main tree) with the latest DRI2 sync/swap bits:
   git://git.freedesktop.org/home/jbarnes/xserver master branch
   git://git.freedesktop.org/home/jbarnes/mesa master branch

 They include support for some new DRI2 requests (proto for which is in
 the dri2-swapbuffers branch of dri2proto), including:
   DRI2SwapBuffers
   DRI2GetMSC
   DRI2WaitMSC
 and
   DRI2WaitSBC

 These allow us to support GLX extensions like SGI_video_sync,
 OML_sync_control and SGI_swap_interval.

 There have been a few comments about the protocol so far:
   1) DRI2SwapBuffers
      a) Concern about doing another round trip to fetch new buffers
         following the swap.
         I think this is a valid concern, we could potentially respond
         from the swap with the new buffers, but this would make some
         memory saving optimizations more difficult (e.g. freeing
         buffers if no drawing comes in for a short time after the swap).

 You're doing one round-trip anyway, and if users are concerned about the
 second one, go use XCB already.  (We need to go fix Mesa to do that).

DRI2SwapBuffers is a one-way request, but it's required to follow up
with a DRI2GetBuffers.  So it's only one round trip whether we use XCB
or not.

cheers,
Kristian

--
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [RFC] DRI2 synchronization and swap bits

2009-11-01 Thread Mario Kleiner
Hello everybody

My name is Mario Kleiner and i'm new to this list, so i apologize  
beforehand should i violate some rules of netiquette, state the  
totally obvious, or if this post is somehow considered off-topic or  
way too long. Please tell me if so, and how to do better next time.  
First some background to why i am posting, then some proposals more  
to the point of this RFC.

I read this RFC and i'm very excited about the prospect of having  
well working support for the OML_sync_control extension in DRI2 on  
Linux/X11. I have been hoping for this to happen for years, so a big
thank you in advance! This is why i hope to provide some input from
the perspective of future power-users of functions like
glXGetSyncValuesOML(), glXSwapBuffersMscOML(), glXWaitForSbcOML(). I'm
the co-developer of a popular free-software toolkit (Psychtoolbox)
that is used mostly in the neuroscience / cognitive science community
by scientists to find out how the different senses (visual, auditory,
haptic, ...) work and how they work together. Our requirements to  
graphics are often much more demanding than what a videogame, typical  
vr-environment or a mediaplayer has.

Our users often have very strict requirements for scheduling
frame-accurate and tear-free visual stimulus display, synchronizing
bufferswaps across display-heads, and low-latency returns from
swap-completion. Often they need swap-completion timestamps which are
available with the shortest possible delay after a successful swap
and accurately tied to the vblank at which scanout of a swapped frame
started. The need for timestamps with sub-millisecond accuracy is not  
uncommon. Therefore, well working OML_sync_control support would be  
basically a dream come true and a very compelling feature for Linux  
as a platform for cognitive science.

I spent the last 12 hours reading the CompositeSwap page at the
DRI-Wiki and through Jesse Barnes' git-tree and the
drivers/gpu/drm/drm_irq.c file in the linux-next git-tree at
kernel.org, which i assume (correctly?) is the current state of the
art wrt. the DRM, and have some thoughts or wishes.

1. Wrt to 2) DRI2WaitMSC/SBC a) Concern about blocking the client on  
the server side as opposed to a client side wait.

I'm not sure about the extra latency involved by blocking the client  
on the server side, instead of a client side wait, but i can assure  
you that for our applications, 1 millisecond extra delay between
swap-completion and unblocking can make a significant difference. Quite
often certain actions need to be triggered in sync with swap  
completion. Examples are starting recording equipment for brain  
activity (fMRI, EEG, MEG, eye-trackers) or other physiological  
responses, starting sound playback or recording, sending trigger  
packets over a network, driving special digital/analog I/O boards,  
driving motion simulators etc. So low-latency unblocking would be  
much appreciated from our side.

2. On the CompositePage in the DRM Wiki, there is this comment:
"...It seems that composited apps should never need to know about
real world screen vblank issues, ... When dealing with a
redirected window it seems it would be acceptable to come up with an
entirely fake number for all existing extensions that care about
vblanks.."

I don't like this idea about entirely fake numbers and like to vote  
for a solution that is as close as possible to the non-redirected  
case. Most of our applications run in non-redirected, full-screen,  
undecorated, page-flipped windows, i.e., without a compositor being  
involved. I can think of a couple future usage cases though where  
reasonably well working redirected/composited windows would be very  
useful for us, but only if we get meaningful timestamps and vblank  
counts that are tied to the actual display onset.

3. The Wiki also mentions "The direct rendered cases outlined in the
implementation notes above are complete, but there's a bug in the
async glXSwapBuffers that sometimes causes clients to hang after
swapping rather than continue." Looking through the code of
http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/tree/src/i830_dri.c?id=a0e2e624c47516273fa3d260b86d8c293e2519e4
i can see that in I830DRI2SetupSwap() and I830DRI2SetupWaitMSC(), in
the if (divisor == 0) { ...} path, the functions return after
DRM_VBLANK_EVENT submission without assigning
*event_frame = vbl.reply.sequence;
This looks problematic to me, as the xserver is later submitting
event_frame in the call to DRI2AddFrameEvent() inside
DRI2SwapBuffers() as a cookie to find the right events for clients to
wait on? Could this be a reason for clients hanging after swap? I
found a few other spots where i either misunderstood something or
there are small bugs. What is the appropriate way to report these?
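
For context on the divisor == 0 path, this is a sketch of how a swap target MSC can be derived from the (target_msc, divisor, remainder) arguments of glXSwapBuffersMscOML(), following my reading of the OML_sync_control rules. It is not the code from i830_dri.c, and the function name is hypothetical. The returned value is the kind of frame number that would need to be handed back as the event cookie in every path.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the MSC at which a swap should occur, given the current MSC.
 * - current < target: wait for target.
 * - otherwise, divisor == 0: the condition is already met.
 * - otherwise: next MSC after current with msc % divisor == remainder.
 */
uint64_t swap_target_msc(uint64_t current, uint64_t target,
                         uint64_t divisor, uint64_t remainder)
{
    assert(divisor == 0 || remainder < divisor);
    if (current < target)
        return target;
    if (divisor == 0)
        return current;
    uint64_t next = current - (current % divisor) + remainder;
    if (current % divisor >= remainder)
        next += divisor; /* remainder already passed in this cycle */
    return next;
}
```

For example, with current = 100, target = 50, divisor = 4 and remainder = 1 the result is 101; with current = 101 it is 105.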

4. According to spec, the different OML_sync_control functions do  
return a UST timestamp which is supposed to reflect the exact 

Re: [RFC] DRI2 synchronization and swap bits

2009-10-31 Thread Jesse Barnes
On Fri, 30 Oct 2009 19:15:17 -0700
Keith Packard kei...@keithp.com wrote:

 Excerpts from Jesse Barnes's message of Fri Oct 30 10:59:08 -0700
 2009:
 
  These allow us to support GLX extensions like SGI_video_sync,
  OML_sync_control and SGI_swap_interval.
 
 Let's get the protocol nailed down before we go into detailed code
 review. Besides, you need to rebase -i to get rid of the broken
 versions.

Yeah, some merging/splitting of the commits is in order before merging
it upstream.

  There have been a few comments about the protocol so far:
    1) DRI2SwapBuffers
       a) Concern about doing another round trip to fetch new buffers
          following the swap.
 
 Do we want to deal with stereo here?

I think the protocol is sufficient for that; it just requests the swap,
so for stereo buffers both would be swapped.

  I think this is a valid concern, we could potentially
  respond from the swap with the new buffers, but this would make some
  memory saving optimizations more difficult (e.g. freeing
  buffers if no drawing comes in for a short time after the
  swap).
 
 Hrm. Ideally, we'd send back new buffer IDs but delay creation until
 someone accessed them. That would require kernel magic to create an
 un-realized buffer, but perhaps avoiding an explicit round trip per
 swap would be worth it?

I don't see how we can avoid the round trip entirely, without sharing
some state between the server and client (i.e. re-introducing the
SAREA).  I'll do some benchmarking of the proposed buffer freeing and
see how bad it is.

    2) DRI2WaitMSC/SBC
       a) Concern about blocking the client on the server side as
          opposed to a client side wait.
 
 So, some kind of cookie that you'd pass to the kernel for the wait
 instead of just blocking in the server? I can see a lot of uses for
 this kind of mechanism beyond X, which makes it somewhat more
 interesting to contemplate in this case.

Oh we never block the server.  The protocol here just tells the server
when the client should be awakened again by passing a cookie.  The open
question is whether the server should be putting the client to sleep
and waking it back up, or whether in the direct rendered case the
client gets a cookie from the server and sleeps itself (in the aiglx
case the server has to handle things regardless).

  The implementation tries to avoid blocking the clients at all for
  swap requests, only blocking them on wait requests that are
  specified to cause blocking.  This should allay the concerns raised
  in the page flipping thread about unnecessary blocking of clients
  (that's left as an implementation detail for the drivers supporting
  these new functions).
 
 Do we have a driver which does this the 'right' way yet?

The i915 page flipping code does this correctly, by marking the buffers
in question busy and blocking the client in the kernel if it tries to
access a busy buffer.  For the windowed swap case we may have to block
the client less nicely if we end up blitting between back and front.  (GL
fences could make this better.)

Jesse



Re: [RFC] DRI2 synchronization and swap bits

2009-10-30 Thread Eric Anholt
On Fri, 2009-10-30 at 10:59 -0700, Jesse Barnes wrote:
 I've put up some trees (after learning my lesson about working in the
 main tree) with the latest DRI2 sync/swap bits:
   git://git.freedesktop.org/home/jbarnes/xserver master branch
   git://git.freedesktop.org/home/jbarnes/mesa master branch
 
 They include support for some new DRI2 requests (proto for which is in
 the dri2-swapbuffers branch of dri2proto), including:
   DRI2SwapBuffers
   DRI2GetMSC
   DRI2WaitMSC
 and
   DRI2WaitSBC
 
 These allow us to support GLX extensions like SGI_video_sync,
 OML_sync_control and SGI_swap_interval.
 
 There have been a few comments about the protocol so far:
   1) DRI2SwapBuffers
      a) Concern about doing another round trip to fetch new buffers
         following the swap.
         I think this is a valid concern, we could potentially respond
         from the swap with the new buffers, but this would make some
         memory saving optimizations more difficult (e.g. freeing
         buffers if no drawing comes in for a short time after the swap).

You're doing one round-trip anyway, and if users are concerned about the
second one, go use XCB already.  (We need to go fix Mesa to do that).

-- 
Eric Anholt
e...@anholt.net eric.anh...@intel.com






Re: [RFC] DRI2 synchronization and swap bits

2009-10-30 Thread Jesse Barnes
On Fri, 30 Oct 2009 11:42:06 -0700
Eric Anholt e...@anholt.net wrote:

 On Fri, 2009-10-30 at 10:59 -0700, Jesse Barnes wrote:
  I've put up some trees (after learning my lesson about working in
  the main tree) with the latest DRI2 sync/swap bits:
    git://git.freedesktop.org/home/jbarnes/xserver master branch
    git://git.freedesktop.org/home/jbarnes/mesa master branch
  
  They include support for some new DRI2 requests (proto for which
  is in the dri2-swapbuffers branch of dri2proto), including:
    DRI2SwapBuffers
    DRI2GetMSC
    DRI2WaitMSC
  and
    DRI2WaitSBC
  
  These allow us to support GLX extensions like SGI_video_sync,
  OML_sync_control and SGI_swap_interval.
  
  There have been a few comments about the protocol so far:
    1) DRI2SwapBuffers
       a) Concern about doing another round trip to fetch new buffers
          following the swap.
          I think this is a valid concern, we could potentially
          respond from the swap with the new buffers, but this would
          make some memory saving optimizations more difficult (e.g.
          freeing buffers if no drawing comes in for a short time
          after the swap).
 
 You're doing one round-trip anyway, and if users are concerned about
 the second one, go use XCB already.  (We need to go fix Mesa to do
 that).

Yeah, I don't think it's a huge deal, but every context switch we add
is that much more overhead (especially for low end platforms).

-- 
Jesse Barnes, Intel Open Source Technology Center



Re: [RFC] DRI2 synchronization and swap bits

2009-10-30 Thread Keith Packard
Excerpts from Jesse Barnes's message of Fri Oct 30 10:59:08 -0700 2009:

 These allow us to support GLX extensions like SGI_video_sync,
 OML_sync_control and SGI_swap_interval.

Let's get the protocol nailed down before we go into detailed code
review. Besides, you need to rebase -i to get rid of the broken versions.

 There have been a few comments about the protocol so far:
   1) DRI2SwapBuffers
      a) Concern about doing another round trip to fetch new buffers
         following the swap.

Do we want to deal with stereo here?

 I think this is a valid concern, we could potentially respond
 from the swap with the new buffers, but this would make some
 memory saving optimizations more difficult (e.g. freeing
 buffers if no drawing comes in for a short time after the
 swap).

Hrm. Ideally, we'd send back new buffer IDs but delay creation until
someone accessed them. That would require kernel magic to create an
un-realized buffer, but perhaps avoiding an explicit round trip per
swap would be worth it?

We can even make the Xlib API asynchronous in this case; just requires
a bit of hackery to post an async reply handler and then a function to
collect the async reply data.

   2) DRI2WaitMSC/SBC
      a) Concern about blocking the client on the server side as opposed
         to a client side wait.

So, some kind of cookie that you'd pass to the kernel for the wait
instead of just blocking in the server? I can see a lot of uses for
this kind of mechanism beyond X, which makes it somewhat more
interesting to contemplate in this case.

 The implementation tries to avoid blocking the clients at all for swap
 requests, only blocking them on wait requests that are specified to
 cause blocking.  This should allay the concerns raised in the page
 flipping thread about unnecessary blocking of clients (that's left as
 an implementation detail for the drivers supporting these new
 functions).

Do we have a driver which does this the 'right' way yet?

-- 
keith.pack...@intel.com

