Hi,

The KdTrees in the OSG do not affect the cull traversal in any way whatsoever.

The KdTree support we have only affects intersection performance.

Robert.

2011/1/23 Полищук Сергей <[email protected]>:
> Hi,
>
> I think you can reduce cull time with the build-KdTrees option in the osgDB registry
> or the env var OSG_BUILD_KDTREES (if you are not already using it). As for draw, it's
> related to the large number of state changes, I believe, so you should try to
> merge StateSets. A large number of primitive sets is kind of bad, but with
> display lists (at least on NVIDIA hardware) it doesn't hurt that much, actually.
>
> 22.01.2011, 00:13, "Jean-Sébastien Guay" <[email protected]>:
>> Hi all,
>>
>> I thought I had a pretty firm grasp on what to optimize given a certain
>> set of scene stats, but I've optimized what I can and I'm still getting
>> little improvement in results. So I'll explain my situation here and
>> hope you guys have some good suggestions. Sorry if this is a long
>> message, but I prefer to give all the relevant data now rather than get
>> asked later.
>>
>> The whole scene is about a 200m x 200m square (apart from the ocean and
>> skydome but these are not significant, I have removed them and confirmed
>> that the situation is the same). The worst case viewpoint is a flying
>> view where the whole scene could be visible at once. So I need to
>> balance culling cost with draw cost, since in some views we will see
>> only part of the scene (so we should be able to cull away at least part
>> of what's not visible) and in the flying view everything is visible so
>> we shouldn't waste too much time doing cull tests which we know will not
>> cull anything.
>>
>> The other thing is that there are a lot of dynamic objects, so there are
>> a lot of transforms. But I can't change this, it's part of our simulation.
>>
>> So, after doing some optimization (removing redundant groups, building
>> texture atlases where possible, merging geodes and geometry, generating
>> triangle strips, most of which I did with the osgUtil::Optimizer), I get
>> the following stats, which I'll talk about a bit later:
>>
>> Scene stats:
>> StateSets     1345
>> Groups         392
>> Transforms     672
>> Geodes         992
>> Geometry       992
>> Vertices    139859
>> Primitives   87444
>>
>> Camera stats:
>> State graphs       1282
>> Drawables          2151
>> PrimitiveSets     73953
>> Triangles          3538
>> Tri. Strips      211091
>> Tri. Fans            16
>> Quads             11526
>> Quad Strips         534
>> Total primitives 226705
>>
>> And, both in our simulator and in osgViewer, for the same scene and same
>> viewpoint, I get:
>>
>> FPS: ~35
>> Cull: 5.4ms
>> Draw: 19ms
>> GPU: 19ms
>>
>> This is on a pretty good machine: Core i7 920, GeForce GTX 260.
>>
>> First of all, the stats above tell me that the "Primitives" part of the
>> scene stats refers to primitive sets, not just primitives... Since the
>> camera stats tell me there are over 226000 primitives in the current view.
>>
>> As you can see, the number of primitiveSets is very high. If I
>> understand correctly, each PrimitiveSet will result in an OpenGL draw
>> call, and since my draw time is what's high now, I would want to reduce
>> that (since I'm currently at about 3 primitives per primitiveSet on
>> average). If I remove triangle strip generation from the optimizer
>> options, the stats become:
>>
>> Scene stats:
>> StateSets     1345
>> Groups         392
>> Transforms     672
>> Geodes         992
>> Geometry       992
>> Vertices    190392
>> Primitives   51197
>>
>> Camera stats:
>> State graphs       1254
>> Drawables          2117
>> PrimitiveSets      4899
>> Triangles         17122
>> Tri. Strips         191
>> Tri. Fans          7212
>> Quads            106464
>> Quad Strips         534
>> Total primitives 131523
>>
>> This indicates to me that the tristrip visitor in the optimizer does a
>> pretty bad job. I looked at an .osg dump, and it seems to generate a
>> separate strip for each quad (so one strip for 4 vertices) which is
>> ridiculous... But that's a subject for another day.
>>
>> When I disabled the tristripper, you can see a massive decrease in the
>> number of primitiveSets (and even in the number of primitives), however
>> there was no significant change in the frame rate and timings. I don't
>> understand this. I would have expected, with more primitives per
>> primitiveSet (I'm now at about 26 prims per primSet on average, as
>> opposed to around 3 before) and far fewer draw calls, that the draw time
>> would have been much lower. That's not what happens in practice.
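
As a quick sanity check on the averages quoted above, the ratio of total primitives to PrimitiveSets can be computed directly from the two sets of camera stats. This is only an illustrative Python sketch: the helper name `prims_per_set` is hypothetical, and the numbers are copied from the stats in this message.

```python
def prims_per_set(total_primitives, primitive_sets):
    """Average number of primitives submitted per PrimitiveSet."""
    return total_primitives / primitive_sets

# Camera stats with triangle strip generation enabled.
with_strips = prims_per_set(226705, 73953)
# Camera stats with triangle strip generation disabled.
without_strips = prims_per_set(131523, 4899)

print(f"with tristrips:    {with_strips:.1f} prims/set")
print(f"without tristrips: {without_strips:.1f} prims/set")
```

The two averages come out at roughly 3 and roughly 27 primitives per PrimitiveSet, in line with the "about 3" and "about 26" figures mentioned in the message.
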
>>
>> My previous attempts at optimizing (using the osgUtil::Optimizer) were
>> also centered around lowering the number of primitives (by creating
>> texture atlases and sharing state so the merging of geodes and geometry
>> objects gave good results). And even though that also lowered the
>> numbers (I started at around 2215 Geodes and 2521 Geometry objects in
>> the same scene, compare that to 992 each now), it also had underwhelming
>> results in practice.
>>
>> Clearly there is more than one primitiveSet per Geometry in the above
>> stats. What I see in the dumped .osg file is that there are often things like:
>>
>>            PrimitiveSets 4
>>            {
>>              DrawArrays TRIANGLES 0 12
>>              DrawArrays QUADS 12 152
>>              DrawArrays TRIANGLES 164 12
>>              DrawArrays QUADS 176 152
>>            }
>>
>> I would expect, by reordering the vertex/color/normal/texCoord data, I
>> would be able to get only 2 primitiveSets there, one TRIANGLES and one
>> QUADS. Am I wrong? Why does the osgUtil::Optimizer not do this already
>> when merging Geometry objects? I expect because it's easier not to do
>> it, but still, it gives sub-optimal results...
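
The reordering idea described above can be sketched outside of OSG. The following Python is only an illustration under stated assumptions: it models each DrawArrays entry as a plain `(mode, first, count)` tuple, the function name `merge_draw_arrays` is hypothetical, and it is not how osgUtil::Optimizer works. It handles only modes whose primitives are independent (such as TRIANGLES and QUADS); strips and fans are rejected, since concatenating them would change the geometry.

```python
# Modes whose primitives do not depend on neighbouring vertices,
# so separate ranges can be concatenated without changing the result.
MERGEABLE = {"POINTS", "LINES", "TRIANGLES", "QUADS"}

def merge_draw_arrays(prim_sets):
    """Merge (mode, first, count) ranges that share a mode.

    Returns (merged_sets, vertex_order), where vertex_order[i] is the
    old index of the vertex that must be moved to new index i. The same
    permutation would have to be applied to every per-vertex array
    (vertices, colors, normals, texture coordinates).
    """
    by_mode = {}  # dict keeps insertion order, so modes stay in first-seen order
    for mode, first, count in prim_sets:
        if mode not in MERGEABLE:
            raise ValueError(f"cannot merge ranges of mode {mode}")
        by_mode.setdefault(mode, []).extend(range(first, first + count))

    merged, vertex_order = [], []
    for mode, old_indices in by_mode.items():
        merged.append((mode, len(vertex_order), len(old_indices)))
        vertex_order.extend(old_indices)
    return merged, vertex_order

# The four PrimitiveSets from the .osg excerpt above.
original = [
    ("TRIANGLES", 0, 12),
    ("QUADS", 12, 152),
    ("TRIANGLES", 164, 12),
    ("QUADS", 176, 152),
]
merged, vertex_order = merge_draw_arrays(original)
print(merged)
```

For the excerpt above this yields two ranges, one TRIANGLES range of 24 vertices followed by one QUADS range of 304 vertices, at the cost of permuting the per-vertex arrays once.
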
>>
>> Of course I can't do that for strips or fans, unless I insert new
>> vertices to restart the strip. Again this is something that could be
>> done, but might bring diminishing returns in my case given that my own
>> scene contains many more triangles and quads than strips and fans (when
>> I turn off tristripping).
>>
>> So, first of all, am I on the right track trying to reduce the number of
>> primitiveSets? Do you think on current hardware, disabling tristripping
>> is a good idea?
>>
>> Why, when disabling tristripping which reduced the number of
>> primitiveSets from 73953 to 4899, didn't I see an increase in performance?
>>
>> Is there some other way to find out what's going on and see what I
>> can improve to increase the performance? I've tried running our app in
>> gDEBugger, which tipped me off that I was batching poorly when using
>> triangle strips (about 3 prims per primitiveSet as I said above).
>> Turning off triangle strips improved the situation (as gDEBugger sees
>> it), but not by that much, which is probably consistent with what I'm
>> seeing in practice, but I'm no closer to finding out what to improve
>> next. What is not mergeable now is like that because of different
>> settings in StateSets (backface culling on vs off, can't use texture
>> atlas because the wrap mode is set to REPEAT, etc.), so I don't think
>> osgUtil::Optimizer can help me improve the situation further...
>>
>> I have looked at video memory usage by the way, and I'm fine in that
>> respect, so I don't think I'm getting any thrashing or paging between
>> video RAM and main RAM at runtime. Also, I'm using display lists for
>> most of the objects in the scene; I tried using Vertex Buffer Objects
>> and that actually slowed things down.
>>
>> I should also mention that these results are obtained using
>> osgShadow::LightSpacePerspectiveShadowMap. I can run the dumped .osg
>> file with
>>
>>    osgshadow --lispsm --noUpdate --mapres 2048 <dumped_file>.osg
>>
>> and I get the results above, which are pretty similar to our simulator.
>> If I run the same data file in plain osgViewer without shadows, it runs
>> at a solid 60Hz, with stats and timings:
>>
>> Scene stats:
>> StateSets     1345
>> Groups         392
>> Transforms     672
>> Geodes         992
>> Geometry       992
>> Vertices    190392
>> Primitives   51197
>>
>> Camera stats:
>> State graphs        321
>> Drawables           810
>> PrimitiveSets      1774
>> Triangles          7243
>> Tri. Strips          85
>> Tri. Fans          2508
>> Quads             39370
>> Quad Strips         178
>> Total primitives  49384
>>
>> FPS: 60
>> Cull: 1.7ms
>> Draw: 8ms
>> GPU: 6.8ms
>>
>> (that's the no tristrips version, so compare these stats to the second
>> set of stats from the top, not the first)
>>
>> I would have expected most numbers there to be half what they were with
>> shadows enabled, but as you can see they're consistently less than half,
>> so shadows added more than 100% overhead... Note that even if it added
>> exactly 100% overhead, I would still be at 16ms draw, which is too much,
>> but I'm just mentioning it in case it may prompt some other suggestions.
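
To put numbers on the "more than 100% overhead" observation, the with-shadows and without-shadows camera stats (both from the no-tristrips runs) can be compared as ratios. This is a rough Python sketch: the names `stats` and `shadow_ratio` are hypothetical, and only the counts are compared, because the message gives exact timings for the shadowed run only with tristripping enabled.

```python
# Camera stats without tristrips: (with shadows, without shadows).
stats = {
    "State graphs": (1254, 321),
    "Drawables": (2117, 810),
    "PrimitiveSets": (4899, 1774),
    "Total primitives": (131523, 49384),
}

def shadow_ratio(with_shadows, without_shadows):
    """How many times larger a count is when shadows are enabled."""
    return with_shadows / without_shadows

ratios = {name: shadow_ratio(*pair) for name, pair in stats.items()}
for name, ratio in ratios.items():
    print(f"{name:17s} x{ratio:.2f}")
```

Every ratio is above 2, which matches the statement that the shadow pass more than doubles the work submitted per frame.
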
>>
>> I'm not sure I could send my whole scene to everyone on the list, but I
>> might be able to send it to someone if they want to see firsthand. Just
>> the bare .osg file without any textures and without ocean and skydome
>> shows the problem adequately well.
>>
>> Thanks in advance for any suggestions you might have. I really need to
>> improve this, and I've been working for a while already with only a
>> small improvement to show for my time...
>>
>> J-S
>>
>> --
>> ______________________________________________________
>> Jean-Sebastien Guay    [email protected]
>>                                http://www.cm-labs.com/
>>                         http://whitestar02.webhop.org/
>> _______________________________________________
>> osg-users mailing list
>> [email protected]
>> http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org