Re: [osg-users] Draw threads serialized by default?
Hi Alex,

I have been planning to make the processor affinity configurable, but just haven't had a chance to do this yet. This certainly isn't a feature for 2.8.1, as the minor point releases contain only bug fixes. This feature will have to wait until the 2.9.x dev series and the final 2.10. This does assume that I or others get around to tackling this task before 2.10...

Robert.

On Thu, Feb 19, 2009 at 2:16 AM, Pecoraro, Alexander N alexander.n.pecor...@lmco.com wrote:
> I think it would be nice if the processor chosen for the draw thread by the osgViewer was somehow configurable instead of it just defaulting to starting at processor number 1 and going up from there. I, like Todd, seem to have found that running the draw thread on my second processor (on any of the four cores on my second processor) produces better performance than running it on any of the cores of my first processor. I can't explain why I get better performance on my second processor, but the only way I was able to make the draw thread run on my second processor was by modifying the osgViewer::startThreading() function because I found that calling the draw thread's setProcessorAffinity() function had no effect after the thread started running.
>
> Perhaps something for 2.8.1?
>
> Alex

___
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
Re: [osg-users] Draw threads serialized by default?
I think it would be nice if the processor chosen for the draw thread by osgViewer were somehow configurable, instead of just defaulting to starting at processor number 1 and going up from there. I, like Todd, seem to have found that running the draw thread on my second processor (on any of its four cores) produces better performance than running it on any of the cores of my first processor. I can't explain why I get better performance on my second processor, but the only way I was able to make the draw thread run there was by modifying the osgViewer::startThreading() function, because I found that calling the draw thread's setProcessorAffinity() function had no effect once the thread had started running.

Perhaps something for 2.8.1?

Alex

-----Original Message-----
From: osg-users-boun...@lists.openscenegraph.org [mailto:osg-users-boun...@lists.openscenegraph.org] On Behalf Of Robert Osfield
Sent: Monday, September 01, 2008 8:34 AM
To: OpenSceneGraph Users
Subject: Re: [osg-users] Draw threads serialized by default?

Hi Todd.

osgViewer already sets the affinity, and indeed this makes a big difference to performance when running multi-threaded, multi-context/multi-GPU work. The draw dispatch serialization that osgViewer::Renderer does on top of this makes even more difference on the PCs I've tested. I would guess that a decent multi-processing architecture like the Onyx would scale much better; it might be that some very high-end PC hardware setups would also scale much better (the AMD 4x4 motherboards spring to mind as a potential candidate for this better scaling).

Robert.

On Mon, Sep 1, 2008 at 4:22 PM, Todd J. Furlong t...@inv3rsion.com wrote:
> Robert,
>
> The post of yours that Paul linked to sounds very similar to something we saw with VR Juggler and OSG a while back: terrible performance with OSG apps that had parallel draw threads. In our case, VR Juggler manages threading, but the same may apply to OSG with osgViewer.
>
> For us, it turned out that we had to set the threads' affinity to lock them to a particular CPU/core. The Linux scheduler moved the threads around and thrashed the cache, I believe. Setting the affinity boosted the parallel draw performance back up.
>
> The solution we ended up with is twofold:
> 1. Add a default behavior that sequentially locks draw threads to CPU cores (0, 1, 2, etc., then repeat).
> 2. Use an environment variable to override the default behavior (VJ_DRAW_THREAD_AFFINITY, a space-delimited list of integers).
>
> The default behavior is good for most users, but we can squeeze out a little more performance by tweaking the environment variable. For a system with two draw threads and two dual-core CPUs, the default behavior locks the draw threads to CPUs 0 and 1, but we get slightly better performance if we set VJ_DRAW_THREAD_AFFINITY=2 3.
>
> Regards,
> Todd
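Alex's observation above (affinity requested after the thread is already running had no effect for him) suggests fixing the affinity before the thread starts. The following is a minimal sketch of that idea using the GNU/Linux pthread extension pthread_attr_setaffinity_np, not OSG or OpenThreads code. The helper names firstAllowedCpu and runPinnedThread are hypothetical, and the sketch assumes Linux with glibc.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for the CPU_* macros and sched_getcpu on glibc
#endif
#include <cassert>
#include <pthread.h>
#include <sched.h>

// Thread body: store the CPU this thread is running on into *arg.
static void* reportCpu(void* arg)
{
    *static_cast<int*>(arg) = sched_getcpu();
    return nullptr;
}

// Hypothetical helper: first CPU the calling process may run on, or -1.
int firstAllowedCpu()
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    return -1;
}

// Hypothetical helper: start a thread whose affinity mask is part of its
// creation attributes, so it is pinned to `cpu` before it runs any code.
// Returns the CPU the thread observed, or -1 if the thread could not start.
int runPinnedThread(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);

    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0)
        return -1;

    int observed = -1;
    pthread_t thread;
    const bool started =
        pthread_attr_setaffinity_np(&attr, sizeof(mask), &mask) == 0 &&
        pthread_create(&thread, &attr, reportCpu, &observed) == 0;
    pthread_attr_destroy(&attr);

    if (!started)
        return -1;
    pthread_join(thread, nullptr);
    return observed;
}
```

Calling runPinnedThread(firstAllowedCpu()) should report the same CPU index that was requested. Whether osgViewer could accept such a choice from the application is exactly the configurability being asked for in this message.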
Re: [osg-users] Draw threads serialized by default?
Hi Paul,

On Sat, Aug 30, 2008 at 10:19 PM, Paul Martz [EMAIL PROTECTED] wrote:
> Hi Robert -- Prior to the 2.2 release, code was added to serialize the draw dispatch. Is there a reason that this behavior defaults to ON? (See DisplaySettings.cpp line 135.) I have somehow incorrectly documented this as defaulting to OFF in the ref man. Now that I see it's ON by default, I half wonder if this is a bug. Wanted to check with you: should I change the documentation, or the code? Which is right?

The setting has been ON since I introduced the option to serialize the draw dispatch. Just before the 2.6 release I did testing at my end and still found serializing the draw dispatch to be far more efficient on my Linux/NVidia drivers, so I left the option on.

In the original thread, when I introduced the optional draw mutex into the draw dispatch, I did call for testing of the performance impact, but I didn't get sufficient feedback to make a more informed decision than just basing it on my own testing.

I would still appreciate more testing, as I'd expect the best default setting to vary across different hardware and drivers. I for one would love to see better scalability in drivers and hardware.

Robert.
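For readers unfamiliar with the term, the "optional draw mutex" Robert describes can be pictured roughly as follows. This is a minimal sketch in standard C++ and not OSG's implementation: DispatchStats, drawDispatch and runDrawThreads are hypothetical names, and the yield() call only stands in for the time a real draw dispatch would spend issuing graphics calls.

```cpp
#include <algorithm>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical bookkeeping: how many threads are dispatching right now,
// and the highest number ever seen dispatching at the same time.
struct DispatchStats {
    int active = 0;
    int maxActive = 0;
    std::mutex statsMutex;  // protects the two counters above
};

// Hypothetical stand-in for one graphics context's draw dispatch.
void drawDispatch(DispatchStats& stats)
{
    {
        std::lock_guard<std::mutex> guard(stats.statsMutex);
        ++stats.active;
        stats.maxActive = std::max(stats.maxActive, stats.active);
    }
    std::this_thread::yield();  // placeholder for issuing the draw calls
    {
        std::lock_guard<std::mutex> guard(stats.statsMutex);
        --stats.active;
    }
}

// Run `numThreads` draw threads for `frames` frames each. When `serialize`
// is true, every dispatch is wrapped in one shared mutex, so at most one
// thread dispatches at a time. Returns the peak number of concurrent
// dispatches that was observed.
int runDrawThreads(int numThreads, int frames, bool serialize)
{
    DispatchStats stats;
    std::mutex drawMutex;  // the optional, shared draw mutex
    std::vector<std::thread> threads;

    for (int t = 0; t < numThreads; ++t) {
        threads.emplace_back([&] {
            for (int f = 0; f < frames; ++f) {
                std::unique_lock<std::mutex> lock(drawMutex, std::defer_lock);
                if (serialize)
                    lock.lock();  // only taken when serialization is ON
                drawDispatch(stats);
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return stats.maxActive;
}
```

With serialize set to true the returned peak is always 1. With it set to false the peak depends on scheduling, which mirrors the point made above: whether overlapping dispatches help or hurt has to be measured on the hardware and drivers in question.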
Re: [osg-users] Draw threads serialized by default?
Robert,

The post of yours that Paul linked to sounds very similar to something we saw with VR Juggler and OSG a while back: terrible performance with OSG apps that had parallel draw threads. In our case, VR Juggler manages threading, but the same may apply to OSG with osgViewer.

For us, it turned out that we had to set the threads' affinity to lock them to a particular CPU/core. The Linux scheduler moved the threads around and thrashed the cache, I believe. Setting the affinity boosted the parallel draw performance back up.

The solution we ended up with is twofold:
1. Add a default behavior that sequentially locks draw threads to CPU cores (0, 1, 2, etc., then repeat).
2. Use an environment variable to override the default behavior (VJ_DRAW_THREAD_AFFINITY, a space-delimited list of integers).

The default behavior is good for most users, but we can squeeze out a little more performance by tweaking the environment variable. For a system with two draw threads and two dual-core CPUs, the default behavior locks the draw threads to CPUs 0 and 1, but we get slightly better performance if we set VJ_DRAW_THREAD_AFFINITY=2 3.

Regards,
Todd
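The two-part scheme Todd describes (a sequential default plus an environment-variable override) could be sketched as below. This is an illustration only and not VR Juggler's code: the function names are hypothetical, and cycling through the override list when there are more threads than entries is an assumption the message does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Parse a space-delimited list of CPU indices such as "2 3".
// Parsing stops at the first token that is not a non-negative integer.
std::vector<int> parseAffinityList(const std::string& text)
{
    std::vector<int> cpus;
    std::istringstream in(text);
    int cpu = 0;
    while (in >> cpu && cpu >= 0)
        cpus.push_back(cpu);
    return cpus;
}

// Read the override list from VJ_DRAW_THREAD_AFFINITY; empty if unset.
std::vector<int> affinityFromEnvironment()
{
    const char* value = std::getenv("VJ_DRAW_THREAD_AFFINITY");
    return value ? parseAffinityList(value) : std::vector<int>{};
}

// Choose the CPU for draw thread number `threadIndex`.
// With an override list, cycle through its entries (assumed behavior);
// otherwise use the sequential default 0, 1, 2, ..., then repeat.
int cpuForDrawThread(std::size_t threadIndex,
                     const std::vector<int>& overrideList,
                     std::size_t numCpus)
{
    assert(numCpus > 0);
    if (!overrideList.empty())
        return overrideList[threadIndex % overrideList.size()];
    return static_cast<int>(threadIndex % numCpus);
}
```

For the machine in Todd's example (two draw threads, four cores), the default assigns CPUs 0 and 1, while an override list parsed from "2 3" assigns CPUs 2 and 3. Actually binding a thread to the chosen CPU is platform-specific and is left out of the sketch.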
Re: [osg-users] Draw threads serialized by default?
Ah! I just found the thread "Serialization of draw dispatch on multi GPU systems Options", which, I imagine, will explain things after I have a close read:
http://tinyurl.com/6hsvdv

-Paul

_____
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Paul Martz
Sent: Saturday, August 30, 2008 3:20 PM
To: 'OpenSceneGraph Users'
Subject: [osg-users] Draw threads serialized by default?

Hi Robert -- Prior to the 2.2 release, code was added to serialize the draw dispatch. Is there a reason that this behavior defaults to ON? (See DisplaySettings.cpp line 135.) I have somehow incorrectly documented this as defaulting to OFF in the ref man. Now that I see it's ON by default, I half wonder if this is a bug. Wanted to check with you: should I change the documentation, or the code? Which is right?

Paul Martz
Skew Matrix Software LLC
http://www.skew-matrix.com/
+1 303 859 9466