Hi all,

I was curious, so I tried the modifications with the trunk version of osg and
our viewer on a few of our biggest scenes, and there was no measurable
performance difference.
By chance I profiled our viewer with CodeXL a few days ago on a build in
RelWithDebInfo mode. This was done on an almost static scene without moving the
camera. Because I have an Intel CPU I could only do "time-based sampling"
profiling, and I don't know how accurate this is, but for your information, the
five hottest functions, accounting for about 25% of CPU time, were (this only
shows the time spent in the function itself and not in the functions that are
called (and not inlined) from it):

Code:

Function, Samples, % of Hotspot Samples, Module
osg::Group::traverse(class osg::NodeVisitor &), 3437, 6.19692%, osg112-osgrd.dll
OpenThreads::Atomic::operator--(void), 2847, 5.13315%, ot20-OpenThreadsrd.dll
OpenThreads::Atomic::operator++(void), 2837, 5.11512%, ot20-OpenThreadsrd.dll
osg::Plane::transformProvidingInverse(class osg::Matrixd const &), 2595, 4.67879%, osg112-osgrd.dll
osgUtil::StateGraph::find_or_insert(class osg::StateSet const *), 1918, 3.45816%, osg112-osgUtilrd.dll
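
As a quick sanity check on the "about 25%" figure, the percentages in the table can be summed, and the total number of hotspot samples can be estimated from any single row. The sketch below does just that arithmetic; the struct and function names are made up for illustration and the numbers are copied from the table above.

```cpp
#include <cassert>
#include <cmath>

// One row of the profiler table (names are hypothetical).
struct HotspotRow {
    int samples;      // "Samples" column
    double percent;   // "% of Hotspot Samples" column
};

// The five rows listed above, in the same order.
static const HotspotRow kRows[] = {
    {3437, 6.19692},  // osg::Group::traverse
    {2847, 5.13315},  // OpenThreads::Atomic::operator--
    {2837, 5.11512},  // OpenThreads::Atomic::operator++
    {2595, 4.67879},  // osg::Plane::transformProvidingInverse
    {1918, 3.45816},  // osgUtil::StateGraph::find_or_insert
};

// Combined share of the listed functions, in percent.
double combined_percent() {
    double sum = 0.0;
    for (const HotspotRow& r : kRows) sum += r.percent;
    return sum;
}

// Estimate of the total sample count implied by a single row.
long estimated_total_samples(const HotspotRow& r) {
    return std::lround(r.samples / (r.percent / 100.0));
}
```

The sum comes to roughly 24.6%, and the first row implies a total on the order of 55,000 samples for the whole run.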



The per-line breakdown for find_or_insert:

Code:

Line, Address, Source Code, Code Bytes, Hotspot Samples, % of Hotspot Samples, Timer
175, ,         inline StateGraph* find_or_insert(const osg::StateSet* stateset), , , , ,
176, 0x7feed81afb0,         {, , 214, 11.1575, 214,
177, ,             // search for the appropriate state group, return it if found., , , , ,
178, 0x7feed81afcb,             ChildList::iterator itr = _children.find(stateset);, , 1658, 86.4442, 1658,
179, 0x7feed81b016,             if (itr!=_children.end()) return itr->second.get();, , 42, 2.18978, 42,
180, , , , , , ,
181, ,             // create a state group and insert it into the children list, , , , ,
182, ,             // then return the state group., , , , ,
183, 0x7feed81b021,             StateGraph* sg = new StateGraph(this,stateset);, , , , ,
184, 0x7feed81b04c,             _children[stateset] = sg;, , , , ,
185, 0x7feed81b081,             return sg;, , , , ,
186, 0x7feed81b084,         }, , 4, 0.208551, 4,
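
For anyone who wants to reproduce this pattern outside of osg, here is a minimal standalone sketch of the same find-then-insert shape. It assumes the child list is an ordered map keyed by the StateSet pointer; `StateSet` and `Graph` below are hypothetical stand-ins, not the osg classes, and `std::unique_ptr` replaces `osg::ref_ptr` to keep the example self-contained.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-in for osg::StateSet; only its address is used as a key.
struct StateSet {};

// Hypothetical stand-in for osgUtil::StateGraph.
struct Graph {
    Graph* parent = nullptr;
    const StateSet* stateset = nullptr;
    std::map<const StateSet*, std::unique_ptr<Graph>> children;

    Graph() = default;
    Graph(Graph* p, const StateSet* s) : parent(p), stateset(s) {}

    // Same shape as the profiled function: look up first, create on a miss.
    Graph* find_or_insert(const StateSet* s) {
        // Tree search; this corresponds to line 178 in the listing above.
        auto itr = children.find(s);
        if (itr != children.end()) return itr->second.get();

        // Miss: create the child and insert it (a second tree search).
        auto child = std::make_unique<Graph>(this, s);
        Graph* raw = child.get();
        children[s] = std::move(child);
        return raw;
    }
};
```

In the listing above almost all samples land on the `find` call and none on the insert lines, which fits an almost static scene where nearly every call is a hit on an existing child.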




osgUtil::StateGraph::prune and moveStateGraph are all the way down in the list
with only 0.2% of the samples.
Note that in the osg trunk a Geode is a Group, and we have a lot of Geodes with
one drawable each in our scene (we need to be able to move all the objects in
real time). This is not yet optimized out because we use the stable version of
osg for our clients, which is why Group::traverse is at the top.
I think the ++ and -- atomic operators come from all the ref matrices that
are pushed/popped during the traversal, but I'm not sure. CodeXL does not
show the call stack for those.
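
To illustrate why pushing and popping reference-counted objects would show up as atomic ++ and --, here is a small sketch. It is not osg code: `Counted` and `Handle` are hypothetical stand-ins for a ref-counted object and a ref_ptr-like handle, and `std::atomic<int>` stands in for OpenThreads::Atomic. It only shows the mechanism I suspect, not where the calls actually come from.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical ref-counted object. 'ops' counts every atomic
// increment/decrement of 'refs' so the cost can be observed.
struct Counted {
    std::atomic<int> refs{0};
    std::atomic<int> ops{0};
};

// Hypothetical ref_ptr-like handle. It never deletes the object,
// because the demo keeps the object on the stack.
class Handle {
public:
    explicit Handle(Counted* p) : p_(p) { acquire(); }
    Handle(const Handle& o) : p_(o.p_) { acquire(); }
    Handle& operator=(const Handle& o) {
        if (this != &o) { release(); p_ = o.p_; acquire(); }
        return *this;
    }
    ~Handle() { release(); }
private:
    void acquire() { if (p_) { ++p_->refs; ++p_->ops; } }
    void release() { if (p_) { --p_->refs; ++p_->ops; } }
    Counted* p_;
};

// Push 'depth' copies of a handle onto a stack, pop them again, and
// return how many atomic operations that caused.
int count_ops_for_push_pop(int depth) {
    Counted matrix;
    Handle owner(&matrix);
    std::vector<Handle> stack;
    stack.reserve(depth);               // avoid reallocation copies
    const int before = matrix.ops;
    for (int i = 0; i < depth; ++i) stack.push_back(owner);  // one ++ each
    for (int i = 0; i < depth; ++i) stack.pop_back();        // one -- each
    return matrix.ops - before;
}
```

Every push costs one atomic increment and every pop one atomic decrement, so a traversal that pushes and pops a handle per node would produce roughly equal numbers of ++ and -- samples, which is at least consistent with the two nearly identical counts in the table.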
Cheers,
Pjotr

------------------
Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=60098#60098




