Hi Mikhail,

Actual speed improvements are what I care about, not the theoretical impact of
the number of heap allocations.

As for making your proposed change optional, this only complicates the
code further.

Right now I have *no* evidence that the modification is actually required
from a performance standpoint.  The code adds complexity and will be harder
to maintain and debug.  For any extra complexity I set the bar higher for
justifying its inclusion - it has to have a real and measurable benefit to
justify the extra cost of managing the code.

If you care critically about heap allocations then you can always override
new/delete and provide your own custom scheme.
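
A minimal sketch of what overriding new/delete can look like, assuming all
you want at first is to count allocations before deciding on a custom
scheme.  The names g_allocationCount and countAllocations are hypothetical
and not part of the OSG; the replacement operators simply forward to
malloc/free.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical counter (not part of the OSG): incremented on every call to
// the replaced global operator new, from any thread.
static std::atomic<unsigned long> g_allocationCount{0};

// Replacement for the global allocation function. The default operator new[]
// forwards here, so array allocations are counted as well.
void* operator new(std::size_t size)
{
    g_allocationCount.fetch_add(1, std::memory_order_relaxed);
    if (size == 0) size = 1;  // malloc(0) may return a null pointer
    if (void* ptr = std::malloc(size)) return ptr;
    throw std::bad_alloc();
}

// Matching deallocation functions; memory came from malloc, so use free.
void operator delete(void* ptr) noexcept
{
    std::free(ptr);
}

void operator delete(void* ptr, std::size_t) noexcept
{
    std::free(ptr);
}

// Hypothetical helper: returns how many allocations happened while running
// 'work'. Allocations made concurrently by other threads are included.
unsigned long countAllocations(void (*work)())
{
    assert(work != nullptr);
    const unsigned long before = g_allocationCount.load();
    work();
    return g_allocationCount.load() - before;
}
```

Wrapping a single frame's cull/draw in something like countAllocations
gives a per-frame figure, and the malloc/free calls are the place where a
pool or per-thread arena could be swapped in.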

The other aspect you can look at to avoid all these allocations is to have
a scene graph that is less fine grained - if you are CPU limited due to
draw dispatch then there is a good chance that just building the scene graph
in a different way will address it.

When you have exhausted all these options and proven the case that this
bottleneck is a real issue then we can come back and look at caching
StateGraph objects.

Robert.


On 30 June 2014 15:36, Mikhail Izmestev <[email protected]> wrote:

>
> 30.06.2014 14:58, Robert Osfield wrote:
>
>  What performance profiling have you done and what type of models? What
>> scale of performance improvement are you seeing?
>>
> Ok, just got fresh numbers for my case.
> 10 frames rendered, allocation counts in the render thread:
>
>            | total | related to StateGraph |
> original   | 32039 | 21440 (67%)           |
> with cache | 17677 |  3042 (17%)           |
>
> These numbers come from a real app, without any modifications (using
> xperf to account for allocation counts and windbg to count 10 frames), so
> I'm not able to make the absolute numbers match (I mean 32039-17677 is not
> equal to 21440-3042), but the relative numbers can be compared.
>
> That is only the allocation count improving, but you may ask: what about
> speed improvements?
> When nothing else is doing a lot of allocations at the same time, there is
> no speed improvement; the heap is fast enough. When other threads are doing
> a lot of heap allocations, we again depend on the heap implementation: if
> it can handle allocations from different threads fast enough, then there is
> no speed improvement.
> But as you can see we need 2k allocations per frame, so when the heap is
> slower than usual on each allocation, the overall slowdown can be
> significant.
>
>
>  I've just done a review and I feel the extra complexity and the
>> introduction of mutexes aren't something I'm happy with as a solution.
>>
> Yep, the extra complexity is required because we are trying to handle part
> of the heap's job ourselves.
> I think it is possible to make this feature configurable at compile time
> or runtime.
>
>
>
> Thanks,
> Mikhail.
> _______________________________________________
> osg-submissions mailing list
> [email protected]
> http://lists.openscenegraph.org/listinfo.cgi/osg-submissions-openscenegraph.org
>
