Hi Ferdi,

By default, osgViewer::Viewer opens a separate GraphicsWindow on each
graphics card and assigns a separate GraphicsThread to each of these
contexts.  One thing I previously found when running two cards on a PC
was that the drivers really didn't like having the two contexts driven
by two threads concurrently, even though the threads were targeting
two separate GPUs.  Paradoxically, I found that serializing the draw
dispatch produced better performance than letting the threads run
freely.

I've previously called for feedback about whether this problem with
drivers/hardware exists on other machines/OSes, but haven't yet got
the results back that would help decide what the best default would be
for the range of platforms that the OSG supports.  In this vacuum of
testing I've had to go with making draw dispatch serialization the
default.  You can override the default by setting the environment
variable OSG_SERIALIZE_DRAW_DISPATCH to OFF.
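
As a rough illustration of what serializing the draw dispatch means,
here is a minimal standard C++ sketch.  It is not the OSG
implementation: runDrawThreads and serializeDrawDispatchFromEnv are
hypothetical names, the "draw" is just a short sleep, and the ON/OFF
parsing is an assumption about how such a switch could be read.  Only
the variable name OSG_SERIALIZE_DRAW_DISPATCH comes from the text
above.

```cpp
#include <atomic>
#include <chrono>
#include <cstdlib>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: reads the ON/OFF switch from the environment.
// Anything other than "ON" or "OFF" falls back to defaultValue.
bool serializeDrawDispatchFromEnv(bool defaultValue)
{
    const char* value = std::getenv("OSG_SERIALIZE_DRAW_DISPATCH");
    if (!value) return defaultValue;
    const std::string s(value);
    if (s == "OFF") return false;
    if (s == "ON") return true;
    return defaultValue;
}

// Hypothetical stand-in for one graphics thread per context.
// Returns the peak number of threads seen inside "draw" at the same time.
int runDrawThreads(int numContexts, int framesPerContext, bool serialize)
{
    std::mutex dispatchMutex;        // shared by all simulated graphics threads
    std::atomic<int> inDraw{0};      // threads currently dispatching
    std::atomic<int> maxInDraw{0};   // peak concurrency observed

    auto drawLoop = [&]() {
        for (int frame = 0; frame < framesPerContext; ++frame)
        {
            // When serializing, only one context dispatches at a time.
            std::unique_lock<std::mutex> lock(dispatchMutex, std::defer_lock);
            if (serialize) lock.lock();

            const int now = ++inDraw;
            int seen = maxInDraw.load();
            while (now > seen && !maxInDraw.compare_exchange_weak(seen, now)) {}

            // Pretend to issue OpenGL calls for this context.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            --inDraw;
        }
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < numContexts; ++i) threads.emplace_back(drawLoop);
    for (auto& t : threads) t.join();
    return maxInDraw.load();
}
```

Calling runDrawThreads(2, 20, serializeDrawDispatchFromEnv(true))
reports a peak of 1 unless the variable is set to OFF, in which case
the two simulated contexts are free to overlap.  Whether overlapping
is faster on real hardware is exactly the open question above.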

With my new machine, with its dual 16x PCI-Express 2.0, on-chip memory
controller, bags of memory bandwidth and later NVidia drivers, the
situation may be different w.r.t. what works best.  I haven't done the
dual GPU testing on this machine yet; I'll post the results when I
have.  Draw serialization is one area that I'm looking at closely, as
is thread affinity.
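
For anyone wanting to experiment with affinity by hand, a minimal
Linux-only sketch follows.  It uses the glibc sched_getaffinity and
sched_setaffinity calls directly rather than anything from the OSG,
and pinCallingThreadToAllowedCpu is a hypothetical helper name.  It
assumes n >= 0 and only picks from the CPUs the thread is already
allowed to run on.

```cpp
#include <sched.h>   // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)

// Pins the calling thread to the n-th CPU in its current affinity mask
// (wrapping around if n exceeds the number of allowed CPUs).
// Returns the chosen CPU id on success, or -1 on failure.
int pinCallingThreadToAllowedCpu(int n)
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;

    const int count = CPU_COUNT(&allowed);
    if (count == 0) return -1;
    int remaining = n % count;   // how many allowed CPUs to skip

    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
    {
        if (!CPU_ISSET(cpu, &allowed)) continue;
        if (remaining-- > 0) continue;

        cpu_set_t target;
        CPU_ZERO(&target);
        CPU_SET(cpu, &target);
        // A pid of 0 applies the mask to the calling thread.
        return sched_setaffinity(0, sizeof(target), &target) == 0 ? cpu : -1;
    }
    return -1;
}
```

Each draw thread would call this once at startup with a different n,
for example 0 for the first context and 1 for the second, so that the
two contexts end up on different cores.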

Robert.

On Thu, Dec 4, 2008 at 11:08 AM, Ferdi Smit <[EMAIL PROTECTED]> wrote:
> Robert,
>
> I believe this to be a driver issue as well. However, what it does show is
> that it is probably preferable to have different contexts on different
> cores, also to eliminate any possible OpenGL driver overhead. On 2 cores it
> runs perfectly as expected; same frame rates as stand-alone. On a single
> core they both drop back a little, probably because the Quadro OpenGL driver
> is eating up a lot of CPU time for some obscure reason. Even when the OpenGL
> implementation is set to multi-threaded, I never see it start using any
> additional core by itself. This only happens when you set the affinity
> manually. So fully relying on the OS scheduling may not be the best option.
>
> I don't remember the exact stats, and I can't check them now as I don't have
> the machine here. I think it was something like 0/0/x for the 8800 and 0/y/y
> for the 5600, with x close to y.
>
> Robert Osfield wrote:
>>
>> Hi Ferdi,
>>
>> It sounds like the OpenGL driver is not behaving well in your case
>> and is skewing the results from what you would normally expect.  With
>> a scene of 25M polygons you are almost certainly GPU/bandwidth limited,
>> and it's pretty likely that the OpenGL FIFO is backing up and playing
>> havoc.  Multi-processing in this situation is not really going to make
>> a huge difference in terms of potential gain because the bottleneck is
>> elsewhere; you aren't CPU limited.  It shouldn't make things worse,
>> though, and the fact that it does suggests that the driver is having
>> issues.
>>
>> What are your cull/draw/gpu stats?
>>
>> Robert.
>>
>> On Thu, Dec 4, 2008 at 10:43 AM, Ferdi Smit <[EMAIL PROTECTED]> wrote:
>>
>>>
>>> I've been looking into thread/process affinity with respect to a dual-GPU
>>> setup. This is what I found back then:
>>>
>>> 2 processes, CPU affinity on different cores, each renders the full scene
>>> on a different GPU
>>>
>>> OSG_THREADING=SingleThreaded
>>> (1 core shows heavy use, 2nd core shows moderate use, 2 cores idle)
>>>
>>>                                     Quadro 5600    8800GTX
>>> Multi-GPU / Multi-Processing        16             15         (fps)
>>>
>>> OSG_THREADING=ThreadPerContext
>>> (CPU Affinity is set but appears to be ignored: 1 core shows heavy use,
>>> others idle)
>>>
>>> Multi-GPU / Multi-Processing        11             14         (fps)
>>>
>>> So, yes, it helps if the CPU load is balanced across cores, even for this
>>> simple 25M polygon scene that is 99% GPU-bound. Also, OSG seems to mess up
>>> manual affinity settings when set to multi-threading :) Check CPU loads in
>>> the performance monitor. Don't let 3 cores stand idle all the time; it's a
>>> waste of available processing time, and this will only get worse in the
>>> future with more cores.
>>>
>>> Side note, I finally received my 4GB memory, so I can get this quad core,
>>> dual GTX260, PCIe 2.0 machine running :) Let's see if it behaves
>>> differently from a Quadro+8800 on PCIe 1.1.
>>>
>>> Ulrich Hertlein wrote:
>>>
>>>>
>>>> Additionally: what is the community's experience with manually
>>>> assigning processes to cores?  Does it work?
>>>>
>>>> It sounds like a good idea, but in my experience it's best to leave
>>>> core affinity to the operating system's scheduler (except when you're
>>>> running Windows, which has a completely fscked-up round-robin
>>>> scheduler).
>>>>
>>>> Cheers,
>>>> /ulrich
>>>> _______________________________________________
>>>> osg-users mailing list
>>>> [email protected]
>>>>
>>>> http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
>>>>
>>>>
>>>
>>> --
>>> Regards,
>>>
>>> Ferdi Smit
>>> INS3 Visualization and 3D Interfaces
>>> CWI Amsterdam, The Netherlands
>
>
> --
> Regards,
>
> Ferdi Smit
> INS3 Visualization and 3D Interfaces
> CWI Amsterdam, The Netherlands