On 11/27/05, Nicolai Haehnle <[EMAIL PROTECTED]> wrote:
> On Sunday 27 November 2005 16:49, Philipp Klaus Krause wrote:
> > Nicolai Haehnle schrieb:
> > >
> > > That's usually a bad idea because it breaks parallelism between
> > > CPU and GPU.
> >
> > It only breaks parallelism between the texture upload and the
> > application's rendering thread. Any other thread in any process can
> > still run at the same time.
>
> Working under the assumption that switching the GPU context is Horribly
> Slow(tm), we could switch CPU threads, but we wouldn't want to switch GPU
> threads, and then there's still a period of time where the GPU remains
> inactive because events would go basically like this:
>
> CPU Thread A calls glTexImage2d
> Driver emits upload command and blocks Thread A; no more commands are
> sent to the GPU
> CPU Thread B executes
> ...
> GPU finished the DMA, causes an IRQ and *becomes idle*
Or it picks the next command out of the ring buffer. It's going to do things
in order, but you can always queue up more work. This is perhaps why it's
better to copy (or page-lock) the texture, so that we can queue up more
commands. It gets a little tricky because an image upload is privileged,
which means it has to go into the ring buffer. But basically, your indirect
buffer holds some commands, and via a driver call you insert a pointer to it
into the ring buffer. Then you insert an image upload into the ring buffer
the same way, then another packet from the indirect buffer, and so on. The
management isn't really THAT complex.

> Driver wakes up Thread A
> ... (depending on the scheduler, CPU Thread B may still be executing here)
> CPU Thread A actually begins executing again
> CPU Thread A calls some other rendering command
> Driver emits rendering commands to GPU
> GPU starts working again
>
> Note that there's a rather large hole of inactivity on the GPU side.

We have discussed a solution to the context-switch problem before, too. When
you send GPU commands, you affect registers. All a thread has to do is track
what those changes are. If there's a structure in memory shared with the
kernel, then it can automatically restore state for your thread (without
having to stop things and download the state of the previous process).

> Yes, in theory this hole could be filled out by doing work for a different
> OpenGL context, but
> a) despite all the interactivity in modern desktops, it's unlikely that
> many applications are really rendering at the same time, and

Don't forget X11. It's just another client of the kernel driver and GPU.

> b) it would be painfully slow *anyway* unless the hardware magically
> supports fast rendering context switches (which I highly doubt it will)

No. Just medium-speed ones via software mechanisms that aren't too horribly
complex.
> The "mark pages read-only" trick would work, but I have no idea if that's
> feasible (especially when the source data for the texture image is in
> shared memory).

It also depends on the OS. A BSD may have an easy way to do this while
Linux doesn't; I don't know. But perhaps we can piggyback on the CoW
mechanism used by fork().

_______________________________________________
Open-graphics mailing list
[EMAIL PROTECTED]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
