Hi Mikhail,

Actual speed improvements are what I care about, not the theoretical impact of the number of heap allocations.
As for making your proposed change optional, that only complicates the code further. Right now I have *no* evidence that the modification is actually required from a performance standpoint. The code adds complexity and will be harder to maintain and debug. For any extra complexity I set the bar higher for justifying its inclusion: it has to have a real and measurable benefit to justify the extra cost of managing the code.

If you care critically about heap allocations then you can always override new/delete and provide your own custom scheme. The other aspect you can look at to avoid all these allocations is to have a scene graph that is less fine grained. If you are CPU limited due to draw dispatch then there is a good chance that just building the scene graph in a different way will address it.

When you have exhausted all these options and proven that this bottleneck is a real issue, then we can come back and look at caching StateGraph objects.

Robert.

On 30 June 2014 15:36, Mikhail Izmestev <[email protected]> wrote:

> 30.06.2014 14:58, Robert Osfield wrote:
>
>> What performance profiling have you done and what type of models? What
>> scale of performance improvement are you seeing?
>
> OK, I just got fresh numbers for my case.
> 10 frames rendered, allocation counts in the render thread:
>
>            | total | related to StateGraph |
> original   | 32039 | 21440 (67%)           |
> with cache | 17677 |  3042 (17%)           |
>
> I'm getting these numbers from a real app, without any modifications (using
> xperf to account for the allocation counts and windbg to count 10 frames), so
> I'm not able to make the absolute numbers match (I mean 32039-17677 is not
> equal to 21440-3042), but the relative numbers can be compared.
>
> That is only the allocation count improved, but you may ask: what about
> speed improvements?
> When nothing else is doing a lot of allocations at the same time, there is
> no speed improvement; the heap is fast enough.
> But when other threads are doing a lot of heap allocations, we again depend
> on the heap implementation: if it can handle allocations from different
> threads fast enough, then there is no speed improvement.
> But as you can see we need 2k allocations per frame, so when the heap
> is slower than usual on each allocation, the overall slowdown can be
> significant.
>
>> I've just done a review and I feel the extra complexity and the
>> introduction of mutexes aren't something I'm happy with as a solution.
>
> Yep, the extra complexity is required because we are trying to handle part
> of the heap's job ourselves.
> I think it is possible to make this feature configurable at compile time
> or at runtime.
>
> Thanks,
> Mikhail.
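For reference, the "override new/delete" suggestion above can be sketched roughly as follows. This is only a minimal illustration, assuming C++11; PooledNode is a hypothetical stand-in for a small, frequently allocated object (it is not the real StateGraph class and none of this is OSG API). Freed blocks are kept on a per-thread free list, so no mutex is involved. Whether this actually beats the system heap would still have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical small object that is created and destroyed every frame.
class PooledNode
{
public:
    PooledNode(PooledNode* p, int d) : parent(p), depth(d) {}

    PooledNode* parent;
    int depth;

    // Class-specific allocation: take a block from this thread's free list
    // if one is available, otherwise fall back to the global heap.
    static void* operator new(std::size_t size)
    {
        FreeBlock*& head = freeListHead();
        if (size == sizeof(PooledNode) && head != nullptr)
        {
            FreeBlock* block = head;
            head = block->next;
            ++reuseCount();
            return block;
        }
        return ::operator new(size);
    }

    // Class-specific deallocation: keep the block for reuse instead of
    // returning it to the heap. Blocks of another size (derived classes)
    // go straight back to the global heap.
    static void operator delete(void* ptr, std::size_t size)
    {
        if (ptr == nullptr) return;
        if (size != sizeof(PooledNode)) { ::operator delete(ptr); return; }
        FreeBlock* block = ::new (ptr) FreeBlock{freeListHead()};
        freeListHead() = block;
    }

    // Number of allocations on this thread that were served from the free list.
    static std::size_t& reuseCount()
    {
        thread_local std::size_t count = 0;
        return count;
    }

private:
    // A freed block is reinterpreted as a link in a singly linked list.
    struct FreeBlock { FreeBlock* next; };

    // One list per thread, so no locking is needed. In this sketch the
    // blocks left on the list are never handed back to the heap.
    static FreeBlock*& freeListHead()
    {
        thread_local FreeBlock* head = nullptr;
        return head;
    }
};

static_assert(sizeof(PooledNode) >= sizeof(void*),
              "a freed block must be large enough to hold the list link");

// Usage: the second allocation reuses the block released by the first.
inline std::size_t demoReuse()
{
    PooledNode* first = new PooledNode(nullptr, 1);
    const std::size_t before = PooledNode::reuseCount();
    delete first;                                    // block goes onto the free list
    PooledNode* again = new PooledNode(nullptr, 2);  // block comes back from it
    assert(again->depth == 2);
    delete again;
    return PooledNode::reuseCount() - before;
}
```

A block freed on a different thread from the one that allocated it simply joins the freeing thread's list. That keeps the sketch lock-free, but it also means the lists can grow unevenly, which is one more thing to profile before relying on a scheme like this.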
_______________________________________________
osg-submissions mailing list
[email protected]
http://lists.openscenegraph.org/listinfo.cgi/osg-submissions-openscenegraph.org
