Hi Ferdi,

By default, osgViewer::Viewer opens a separate GraphicsWindow on each graphics card and assigns a separate GraphicsThread to each of these contexts. One thing I previously found when running two cards on a PC was that the drivers really didn't like driving the two contexts from two threads concurrently, even though the threads were targeting two separate GPUs. Paradoxically, I found that serializing the draw dispatch produced better performance than letting the threads run freely.
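To make "serializing the draw dispatch" concrete, here is a minimal standard-C++ sketch of the idea. It is not OSG's implementation: `runDrawThreads`, the shared mutex and the `yield()` placeholder are hypothetical stand-ins, and a real graphics thread would issue its OpenGL calls where the placeholder sits.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model: one thread per graphics context, each dispatching
// framesPerContext "draws". Returns the largest number of threads that
// were inside the draw section at the same time.
int runDrawThreads(int numContexts, int framesPerContext, bool serialize)
{
    std::mutex dispatchMutex;        // shared by all contexts when serializing
    std::atomic<int> inDraw{0};      // threads currently dispatching
    std::atomic<int> maxInDraw{0};   // high-water mark of inDraw

    auto drawLoop = [&]() {
        for (int frame = 0; frame < framesPerContext; ++frame) {
            // Take the shared lock only when serialization is requested.
            std::unique_lock<std::mutex> lock(dispatchMutex, std::defer_lock);
            if (serialize) lock.lock();

            const int now = ++inDraw;
            int prev = maxInDraw.load();
            while (now > prev && !maxInDraw.compare_exchange_weak(prev, now)) {
                // prev is refreshed by compare_exchange_weak on failure
            }

            std::this_thread::yield(); // placeholder for the OpenGL dispatch
            --inDraw;
        } // lock (if held) is released here, after the draw section
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < numContexts; ++i) threads.emplace_back(drawLoop);
    for (auto& t : threads) t.join();
    return maxInDraw.load();
}
```

With `serialize` set to true the function can only ever report 1, because at most one thread is inside the draw section at a time. With it set to false, two contexts may overlap. Whether the overlap helps or hurts real frame rates depends on the driver, which is the point of the observation above.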
I've previously called for feedback on whether this problem with drivers/hardware exists on other machines/OSes, but haven't yet got the results back that would help decide what the best default would be for the range of platforms that the OSG supports. In this vacuum of testing I've had to make draw dispatch serialization the default. You can override the default by setting the environment variable OSG_SERIALIZE_DRAW_DISPATCH to OFF.

With my new machine, with its dual 16x PCI-Express 2.0, on-chip memory controller, bags of memory bandwidth and later NVidia drivers, the situation may be different w.r.t. what works best. I haven't done the dual-GPU testing on this machine yet; I'll post the results when I have. Draw serialization is one area that I'm looking at closely, as well as thread affinity.

Robert.

On Thu, Dec 4, 2008 at 11:08 AM, Ferdi Smit <[EMAIL PROTECTED]> wrote:
> Robert,
>
> I believe this to be a driver issue as well. However, what it does show is
> that it is probably preferable to have different contexts on different
> cores, also to eliminate any possible OpenGL driver overhead. On 2 cores it
> runs perfectly as expected; same frame rates as stand-alone. On a single
> core they both drop back a little, probably because the Quadro OpenGL driver
> is eating up a lot of CPU time for some obscure reason. Even when the OpenGL
> implementation is set to multi-threaded, I never see it start using any
> additional core by itself. This only happens when you set the affinity
> manually. So fully relying on the OS scheduling may not be the best option.
>
> I don't remember the exact stats, and I can't check them now as I don't have
> the machine here. I think it was something like 0/0/x for the 8800 and 0/y/y
> for the 5600, with x close to y.
>
> Robert Osfield wrote:
>>
>> Hi Ferdi,
>>
>> It sounds like the OpenGL driver is not behaving well in your case
>> and twisting the results from what you would normally expect.
>> With a scene of 25M polygons you are almost certainly GPU/bandwidth limited,
>> and it's pretty likely that the OpenGL FIFO is backing up and playing
>> havoc. Multi-processing in this situation is not really going to make
>> a huge difference in terms of potential gain because the bottleneck is
>> elsewhere: you aren't CPU limited. It shouldn't make things worse,
>> and the fact that it does suggests that the driver is having issues.
>>
>> What are your cull/draw/gpu stats?
>>
>> Robert.
>>
>> On Thu, Dec 4, 2008 at 10:43 AM, Ferdi Smit <[EMAIL PROTECTED]> wrote:
>>>
>>> I've been looking into thread/process affinity with respect to a dual-GPU
>>> setup. This is what I found back then:
>>>
>>> 2 processes, CPU affinity on different cores, each renders the full scene
>>> on a different GPU
>>>
>>> OSG_THREADING=SingleThreaded
>>> (1 core shows heavy use, 2nd core shows moderate use, 2 cores idle)
>>>
>>>                                 Quadro 5600   8800GTX
>>> Multi-GPU / Multi-Processing    16            15        (fps)
>>>
>>> OSG_THREADING=ThreadPerContext
>>> (CPU affinity is set but appears to be ignored: 1 core shows heavy use,
>>> others idle)
>>>
>>> Multi-GPU / Multi-Processing    11            14        (fps)
>>>
>>> So, yes, it helps if the CPU load is balanced across cores, even for this
>>> simple 25M polygon scene that is 99% GPU-bound. Also, OSG seems to mess up
>>> manual settings when set to multi-threading :) Check CPU loads in the
>>> performance monitor. Don't let 3 cores stand idle all the time; it's a
>>> waste of available processing time, and this will only get worse in the
>>> future with more cores.
>>>
>>> Side note: I finally received my 4GB memory, so I can get this quad-core,
>>> dual GTX260, PCIe 2.0 machine running :) Let's see if it behaves
>>> differently from a Quadro+8800 on PCIe 1.1.
>>>
>>> Ulrich Hertlein wrote:
>>>
>>>>
>>>> Additionally: what is the community's experience with manually assigning
>>>> processes to cores? Does it work?
>>>>
>>>> It sounds like a good idea, but in my experience it's best to leave the
>>>> scheduling up to the operating system to handle core affinity (except
>>>> when you're running Windows, which has a completely fscked-up round-robin
>>>> scheduler).
>>>>
>>>> Cheers,
>>>> /ulrich
>>>> _______________________________________________
>>>> osg-users mailing list
>>>> [email protected]
>>>> http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
>>>
>>> --
>>> Regards,
>>>
>>> Ferdi Smit
>>> INS3 Visualization and 3D Interfaces
>>> CWI Amsterdam, The Netherlands
>
> --
> Regards,
>
> Ferdi Smit
> INS3 Visualization and 3D Interfaces
> CWI Amsterdam, The Netherlands
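
For the OSG_SERIALIZE_DRAW_DISPATCH override mentioned in the reply at the top of this message, the following is a small sketch of how an application of your own could interpret the same variable. The helper names are hypothetical and this is not OSG's parsing code. Only the OFF value comes from the message; treating an unset variable or any other value as "keep serializing" is an assumption based on serialization being described as the default.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: decide whether draw dispatch should be serialized
// from the raw value of the environment variable.
// Assumption: only the exact string "OFF" disables serialization; a missing
// variable or any other value leaves the (serialized) default in place.
bool serializeDrawDispatch(const char* value)
{
    if (value == nullptr) return true;       // variable not set: keep default
    return std::string(value) != "OFF";      // "OFF" turns serialization off
}

// Hypothetical convenience wrapper that reads the process environment.
bool serializeDrawDispatchFromEnvironment()
{
    return serializeDrawDispatch(std::getenv("OSG_SERIALIZE_DRAW_DISPATCH"));
}
```

From a POSIX shell, the variable would be set with `export OSG_SERIALIZE_DRAW_DISPATCH=OFF` before launching the viewer, so the same binary can be compared with and without serialization on a dual-GPU machine.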

