On 11/27/05, Nicolai Haehnle <[EMAIL PROTECTED]> wrote:
> On Sunday 27 November 2005 16:49, Philipp Klaus Krause wrote:
> > Nicolai Haehnle schrieb:
> >
> > > That's usually a bad idea because it breaks parallelism between CPU and
> GPU.
> >
> > It only breaks parallelism between the texture upload and the
> > application's rendering thread. Any other thread in any process can
> > still run at the same time.
>
> Working under the assumption that switching the GPU context is Horribly
> Slow(tm), we could switch CPU threads, but we wouldn't want to switch GPU
> threads, and then there's still a period of time where the GPU remains
> inactive because events would go basically like this:
>
> CPU Thread A calls glTexImage2d
> Driver emits upload command and blocks Thread A; no more commands are sent
> to the GPU
> CPU Thread B executes
> ...
> GPU finished the DMA, causes an IRQ and *becomes idle*

Or it picks the next command out of the ring buffer.  It executes
things in order, but you can always queue up more work behind the
current command.  This is perhaps why it's better to copy (or
page-lock) the texture, so that we can keep queueing commands instead
of blocking.

It's going to be a little funny, since an image upload is privileged.
That means it has to go into the ring buffer directly.  But basically,
your indirect buffer holds some commands, and via a driver call you
insert a pointer to it into the ring buffer.  Then you insert an image
upload into the ring buffer the same way.  Then a pointer to the next
packet from the indirect buffer, etc.  The management isn't really
THAT complex.
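To make the interleaving concrete, here is a minimal sketch of that ring-buffer management.  The packet types, the 4-bit type field, and the handle encoding are all made up for illustration; real hardware has its own packet formats.  The point is only that pointer packets to user indirect buffers and privileged upload packets go into the same ring, in order:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet types -- illustrative, not real hardware. */
#define PKT_INDIRECT 0x1u   /* "fetch and execute this indirect buffer" */
#define PKT_UPLOAD   0x2u   /* privileged image-upload command */

#define RING_SIZE 64

struct ring {
    uint32_t buf[RING_SIZE];
    unsigned head;          /* next free slot, advanced by the CPU */
};

/* Emit a privileged packet directly into the ring (kernel side).
 * Top 4 bits encode the type, the rest a payload/handle. */
static void ring_emit(struct ring *r, uint32_t type, uint32_t payload)
{
    r->buf[r->head % RING_SIZE] = (type << 28) | (payload & 0x0fffffffu);
    r->head++;
}

/* Emit a pointer packet to a user-filled indirect buffer; the GPU
 * executes the indirect buffer's contents when it reaches this slot. */
static void ring_emit_indirect(struct ring *r, uint32_t ib_handle)
{
    ring_emit(r, PKT_INDIRECT, ib_handle);
}
```

The GPU drains the ring front to back, so a sequence like "indirect buffer A, image upload, indirect buffer B" executes in exactly that order without the CPU ever having to block between submissions.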

> Driver wakes up Thread A
> ... (depending on the scheduler, CPU Thread B may still be executing here)
> CPU Thread A actually begins executing again
> CPU Thread A calls some other rendering command
> Driver emits rendering commands to GPU
> GPU starts working again
>
> Note that there's a rather large hole of activity on the GPU side.

We have discussed a solution to the context-switch problem before,
too.  When you send GPU commands, you affect registers.  All a thread
has to do is track what those changes are.  If there's a structure in
memory shared with the kernel, then the kernel can automatically
restore state for your thread (without having to stop things and
download the state of the previous process from the hardware).
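A minimal sketch of that shadow-state idea, with hypothetical names and an arbitrary register count: each context records its own register writes in a shared structure, and a switch replays only the dirty registers rather than reading anything back from the GPU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 16  /* illustrative size of the tracked register file */

/* Per-context shadow of GPU state, kept in memory shared with the
 * kernel.  All names here are hypothetical. */
struct gpu_context {
    uint32_t shadow[NUM_REGS];  /* last value this context wrote */
    uint32_t dirty;             /* bitmask of registers it touched */
};

/* The thread records every register write in its shadow as it emits
 * the command, so no hardware readback is ever needed. */
static void ctx_write_reg(struct gpu_context *ctx, unsigned reg, uint32_t val)
{
    ctx->shadow[reg] = val;
    ctx->dirty |= 1u << reg;
}

/* On a context switch, restore only the registers the incoming
 * context touched; returns how many writes were emitted. */
static unsigned ctx_restore(const struct gpu_context *ctx,
                            uint32_t hw_regs[NUM_REGS])
{
    unsigned emitted = 0;
    for (unsigned reg = 0; reg < NUM_REGS; reg++) {
        if (ctx->dirty & (1u << reg)) {
            hw_regs[reg] = ctx->shadow[reg]; /* stands in for a real
                                                register write */
            emitted++;
        }
    }
    return emitted;
}
```

The cost of a switch is then proportional to the number of registers the incoming context actually uses, which is the "medium-speed" software mechanism being argued for below.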

> Yes, in theory this hole could be filled out by doing work for a different
> OpenGL context, but
> a) despite all the interactivity in modern desktops, it's unlikely that many
> applications are really rendering at the same time, and

Don't forget X11.  It's just another client of the kernel driver and GPU.

> b) it would be painfully slow *anyway* unless the hardware magically
> supports fast rendering context switches (which I highly doubt it will)

No.  Just medium-speed ones via software mechanisms that aren't too
horribly complex.

> The "mark pages read-only" trick would work, but I have no idea if that's
> feasible (especially when the source data for the texture image is in
> shared memory).

It also depends on the OS.  A BSD may have an easy way to do this,
while Linux doesn't.  I don't know.  But perhaps we can piggyback on
the CoW mechanism used by fork().

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
