On Fri, Nov 29, 2002 at 01:13:22PM +0100, Felix Kühling wrote:
> On Fri, 29 Nov 2002 11:15:19 +0000
> José Fonseca <[EMAIL PROTECTED]> wrote:
>
> > On Fri, Nov 29, 2002 at 10:19:52AM +0100, Felix Kühling wrote:
> >
> > > First some high level considerations. Having each pipeline stage in a
> > > separate thread would lead to a large number of threads (e.g. 9 in the
> > > case of the radeon driver). Most of the pipeline stages take rather
> > > short time so that the extra overhead of multi threading and
> > > synchronization could have a significant impact. Alternatively one could
> > > use a fixed number N of threads and schedule pipeline stages on them,
> > > the main thread and N-1 "free" threads. If a "free" thread is available
> > > the next pipeline stage would be executed on that thread and the OpenGL
> > > client could continue on the main thread without waiting for all
> > > pipeline stages to complete. Note that on a non-SMP system there would
> > > be only the main thread which is equivalent to how the pipeline is
> > > currently executed.
> >
> > I think that one thing that must be thought is whether the parallelism
> > should be in the pipeline stages or in the pipeline data, i.e., if we
>
> I am not sure I understand the difference. The idea of a pipeline is
> that you split the tasks performed on data into several stages. Mesa
> does this part already. Then while one package is in stage 1 another one
> can be processed in stage 2 at the same time. So I think I have
> parallelism both in pipeline data and the stages.
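[For illustration, the "main thread plus N-1 free threads" scheduling that
Felix describes above might look roughly like the sketch below. This is a
minimal pthreads sketch, not actual Mesa/DRI code; all of the names
(struct worker, stage_fn, run_stage, NUM_WORKERS) are invented, and a real
implementation would additionally need ordering/completion synchronization
between stages and the per-context data copies discussed later in this
mail.]

/* Hypothetical sketch, not Mesa/DRI code: N-1 "free" worker threads
 * plus the caller's (main) thread. */
#include <pthread.h>

#define NUM_WORKERS 3                  /* the N-1 "free" threads */

typedef void (*stage_fn)(void *data);  /* one pipeline stage */

struct worker {
    pthread_t       thread;
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    stage_fn        fn;                /* NULL while the worker is idle */
    void           *data;
};

static struct worker workers[NUM_WORKERS];

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (w->fn == NULL)          /* sleep until a stage arrives */
            pthread_cond_wait(&w->wake, &w->lock);
        w->fn(w->data);                /* execute the pipeline stage */
        w->fn = NULL;                  /* mark the worker idle again */
    }
}

static void start_workers(void)
{
    int i;
    for (i = 0; i < NUM_WORKERS; i++) {
        pthread_mutex_init(&workers[i].lock, NULL);
        pthread_cond_init(&workers[i].wake, NULL);
        pthread_create(&workers[i].thread, NULL, worker_main, &workers[i]);
    }
}

/* Hand a stage to a free worker so the GL client can continue; if every
 * worker is busy, run the stage synchronously on the calling thread. */
static void run_stage(stage_fn fn, void *data)
{
    int i;
    for (i = 0; i < NUM_WORKERS; i++) {
        struct worker *w = &workers[i];
        if (pthread_mutex_trylock(&w->lock) == 0) {
            if (w->fn == NULL) {
                w->fn = fn;
                w->data = data;
                pthread_cond_signal(&w->wake);
                pthread_mutex_unlock(&w->lock);
                return;
            }
            pthread_mutex_unlock(&w->lock);
        }
    }
    fn(data);                          /* no free thread: do it here */
}

[Note that with no free worker, run_stage degenerates to running the stage
on the main thread, which matches Felix's non-SMP case above.]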
The problem is two-fold in this case. First, most of the time not all of
the stages are executed (e.g., the software rasterizer stage is rarely
executed). Second, most of the stages are very short. You'll spend most
of your execution time synchronizing between the stages. I seem to recall
that Carmack had a .plan update about that when he was adding SMP support
to Quake3. I'll see if I can find it.

Most research in parallelizing code points to doing whatever is possible
to minimize synchronization costs. You might search through previous
years' SIGGRAPH papers to see what other people have done in this area.
It's not a new field. I know that there are patents in this area (sigh)
that go back at least 5 or 10 years.

> > All assumptions have to be very well verified against all existing Mesa
> > drivers, otherwise a discrete hack can cause havoc...
>
> All the hardware specific stages are drawing stages. So only one of them
> will be executed at a time. I don't see any problem here. One tricky
> part could be to find out, how much of the context actually has to be
> copied. Obviously, all data that is modified by the pipeline stages
> needs to be copied. Everything that is read only can be shared by all
> context copies. What about TCL stages?

I think one problem that you'll run into is that, as more and more of the
OpenGL pipeline gets moved into hardware, you'll see less and less
benefit in doing this. :(

What might be worth looking into is using "left over" CPU time to
optimize data that is being sent to the card. That is, if the card is the
rendering bottleneck, use some CPU cycles to optimize triangle strips
that are being submitted, optimize out redundant state changes from the
command stream, etc. The trick is in deciding when to enable the
optimizer pass.
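[To make the state-change half of that concrete, here is a rough sketch of
what such a filter could look like. The command encoding (struct
state_cmd, a shadow copy of the last-emitted register values) is invented
for illustration and does not correspond to any real driver's command
stream format.]

#define NUM_STATE_REGS 64              /* invented register count */

struct state_cmd {
    unsigned reg;                      /* state register to write */
    unsigned value;                    /* value to write into it */
};

static unsigned shadow[NUM_STATE_REGS];       /* last emitted values */
static int      shadow_valid[NUM_STATE_REGS]; /* 1 once reg was seen */

/* Copy commands from in[] to out[], dropping writes that would not
 * change the current state.  Returns the number of commands kept. */
static int filter_redundant(const struct state_cmd *in, int n,
                            struct state_cmd *out)
{
    int i, kept = 0;
    for (i = 0; i < n; i++) {
        unsigned reg = in[i].reg;
        if (shadow_valid[reg] && shadow[reg] == in[i].value)
            continue;                  /* redundant write: skip it */
        shadow[reg]       = in[i].value;
        shadow_valid[reg] = 1;
        out[kept++] = in[i];
    }
    return kept;
}

[One possible, again hypothetical, heuristic for the enable/disable
decision: only run the pass while the driver is blocking on the hardware
anyway, e.g. while it waits for the card to drain its command queue, so
the extra CPU work costs nothing.]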