Hi,

The KdTrees in the OSG do not affect the cull traversal in any way whatsoever.
The KdTrees we have only affect intersection performance.

Robert.

2011/1/23 Полищук Сергей <[email protected]>:
> Hi,
>
> I think you can reduce cull time with the build kdtrees option in the osgdb
> registry or the env var OSG_BUILD_KDTREES (if you are not already using it).
> As for draw, I believe it's related to the large number of state changes, so
> you should try to merge statesets. A large number of primitive sets is kind
> of bad, but with display lists (at least on nvidia hardware) it doesn't hurt
> that much actually.
>
> 22.01.2011, 00:13, "Jean-Sébastien Guay" <[email protected]>:
>> Hi all,
>>
>> I thought I had a pretty firm grasp on what to optimize given a certain
>> set of scene stats, but I've optimized what I can and I'm still getting
>> little improvement in results. So I'll explain my situation here and
>> hope you guys have some good suggestions. Sorry if this is a long
>> message, but I prefer to give all the relevant data now rather than get
>> asked later.
>>
>> The whole scene is about a 200m x 200m square (apart from the ocean and
>> skydome but these are not significant, I have removed them and confirmed
>> that the situation is the same). The worst case viewpoint is a flying
>> view where the whole scene could be visible at once. So I need to
>> balance culling cost with draw cost, since in some views we will see
>> only part of the scene (so we should be able to cull away at least part
>> of what's not visible) and in the flying view everything is visible so
>> we shouldn't waste too much time doing cull tests which we know will not
>> cull anything.
>>
>> The other thing is that there are a lot of dynamic objects, so there are
>> a lot of transforms. But I can't change this, it's part of our simulation.
>>
>> So, after doing some optimization (removing redundant groups, building
>> texture atlases where possible, merging geodes and geometry, generating
>> triangle strips, most of which I did with the osgUtil::Optimizer), I get
>> the following stats, which I'll talk about a bit later:
>>
>> Scene stats:
>>   StateSets            1345
>>   Groups                392
>>   Transforms            672
>>   Geodes                992
>>   Geometry              992
>>   Vertices           139859
>>   Primitives          87444
>>
>> Camera stats:
>>   State graphs         1282
>>   Drawables            2151
>>   PrimitiveSets       73953
>>   Triangles            3538
>>   Tri. Strips        211091
>>   Tri. Fans              16
>>   Quads               11526
>>   Quad Strips           534
>>   Total primitives   226705
>>
>> And, both in our simulator and in osgViewer, for the same scene and same
>> viewpoint, I get:
>>
>>   FPS:  ~35
>>   Cull: 5.4ms
>>   Draw: 19ms
>>   GPU:  19ms
>>
>> This is on a pretty good machine: Core i7 920, GeForce GTX 260.
>>
>> First of all, the stats above tell me that the "Primitives" part of the
>> scene stats refers to primitive sets, not just primitives... Since the
>> camera stats tell me there are over 226000 primitives in the current view.
>>
>> As you can see, the number of primitiveSets is very high. If I
>> understand correctly, each PrimitiveSet will result in an OpenGL draw
>> call, and since my draw time is what's high now, I would want to reduce
>> that (since I'm currently at about 3 primitives per primitiveSet on
>> average). If I remove triangle strip generation from the optimizer
>> options, the stats become:
>>
>> Scene stats:
>>   StateSets            1345
>>   Groups                392
>>   Transforms            672
>>   Geodes                992
>>   Geometry              992
>>   Vertices           190392
>>   Primitives          51197
>>
>> Camera stats:
>>   State graphs         1254
>>   Drawables            2117
>>   PrimitiveSets        4899
>>   Triangles           17122
>>   Tri. Strips           191
>>   Tri. Fans            7212
>>   Quads              106464
>>   Quad Strips           534
>>   Total primitives   131523
>>
>> This indicates to me that the tristrip visitor in the optimizer does a
>> pretty bad job.
>> I looked at an .osg dump, and it seems to generate a
>> separate strip for each quad (so one strip for 4 vertices) which is
>> ridiculous... But that's a subject for another day.
>>
>> When I disabled the tristripper, you can see a massive decrease in the
>> number of primitiveSets (and even in the number of primitives), however
>> there was no significant change in the frame rate and timings. I don't
>> understand this. I would have expected, with more primitives per
>> primitiveSet (I'm now at about 26 prims per primSet on average, as
>> opposed to around 3 before) and far fewer draw calls, that the draw time
>> would have been much lower. That's not what happens in practice.
>>
>> My previous attempts at optimizing (using the osgUtil::Optimizer) were
>> also centered around lowering the number of primitives (by creating
>> texture atlases and sharing state so the merging of geodes and geometry
>> objects gave good results). And even though that also lowered the
>> numbers (I started at around 2215 Geodes and 2521 Geometry objects in
>> the same scene, compare that to 992 each now), it also had underwhelming
>> results in practice.
>>
>> Clearly there is more than one primitiveSet per Geometry in the above
>> stats. What I see in the dumped .osg file is that there are often things
>> like:
>>
>>   PrimitiveSets 4
>>   {
>>     DrawArrays TRIANGLES 0 12
>>     DrawArrays QUADS 12 152
>>     DrawArrays TRIANGLES 164 12
>>     DrawArrays QUADS 176 152
>>   }
>>
>> I would expect, by reordering the vertex/color/normal/texCoord data, I
>> would be able to get only 2 primitiveSets there, one TRIANGLES and one
>> QUADS. Am I wrong? Why does the osgUtil::Optimizer not do this already
>> when merging Geometry objects? I expect because it's easier not to do
>> it, but still, it gives sub-optimal results...
>>
>> Of course I can't do that for strips or fans, unless I insert new
>> vertices to restart the strip.
>> Again this is something that could be
>> done, but might bring diminishing returns in my case given that my own
>> scene contains many more triangles and quads than strips and fans (when
>> I turn off tristripping).
>>
>> So, first of all, am I on the right track trying to reduce the number of
>> primitiveSets? Do you think on current hardware, disabling tristripping
>> is a good idea?
>>
>> Why, when disabling tristripping which reduced the number of
>> primitiveSets from 73953 to 4899, didn't I see an increase in performance?
>>
>> Is there some other way to find out what's going on and see what I
>> can improve to increase the performance? I've tried running our app in
>> gDEBugger, which tipped me off that I was batching poorly when using
>> triangle strips (about 3 prims per primitiveSet as I said above).
>> Turning off triangle strips improved the situation (as gDEBugger sees
>> it), but not by that much, which is probably consistent with what I'm
>> seeing in practice, but I'm no closer to finding out what to improve
>> next. What is not mergeable now is like that because of different
>> settings in StateSets (backface culling on vs off, can't use texture
>> atlas because the wrap mode is set to REPEAT, etc.), so I don't think
>> osgUtil::Optimizer can help me improve the situation further...
>>
>> I have looked at video memory usage by the way, and I'm fine in that
>> respect, so I don't think I'm getting any thrashing or paging between
>> video RAM and main RAM at runtime. Also, I'm using display lists for
>> most of the objects in the scene, I tried using Vertex Buffer Objects
>> and it actually slowed it down.
>>
>> I should also mention that these results are obtained using
>> osgShadow::LightSpacePerspectiveShadowMap. I can run the dumped .osg
>> file with
>>
>>   osgshadow --lispsm --noUpdate --mapres 2048 <dumped_file>.osg
>>
>> and I get the results above, which are pretty similar to our simulator.
>> If I run the same data file in plain osgViewer without shadows, it runs
>> at a solid 60Hz, with stats and timings:
>>
>> Scene stats:
>>   StateSets            1345
>>   Groups                392
>>   Transforms            672
>>   Geodes                992
>>   Geometry              992
>>   Vertices           190392
>>   Primitives          51197
>>
>> Camera stats:
>>   State graphs          321
>>   Drawables             810
>>   PrimitiveSets        1774
>>   Triangles            7243
>>   Tri. Strips            85
>>   Tri. Fans            2508
>>   Quads               39370
>>   Quad Strips           178
>>   Total primitives    49384
>>
>>   FPS:  60
>>   Cull: 1.7ms
>>   Draw: 8ms
>>   GPU:  6.8ms
>>
>> (that's the no tristrips version, so compare these stats to the second
>> set of stats from the top, not the first)
>>
>> I would have expected most numbers there to be half what they were with
>> shadows enabled, but as you can see they're consistently less than half,
>> so shadows added more than 100% overhead... Note that even if it added
>> exactly 100% overhead, I would still be at 16ms draw, which is too much,
>> but I'm just mentioning it in case it may prompt some other suggestions.
>>
>> I'm not sure I could send my whole scene to everyone on the list, but I
>> might be able to send it to someone if they want to see firsthand. Just
>> the bare .osg file without any textures and without ocean and skydome
>> shows the problem adequately well.
>>
>> Thanks in advance for any suggestions you might have. I really need to
>> improve this, and I've been working for a while already with only a
>> small improvement to show for my time...
>>
>> J-S
>>
>> --
>> ______________________________________________________
>> Jean-Sebastien Guay [email protected]
>> http://www.cm-labs.com/
>> http://whitestar02.webhop.org/
>> _______________________________________________
>> osg-users mailing list
>> [email protected]
>> http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
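
The reordering J-S asks about in the quoted message (collapsing interleaved TRIANGLES and QUADS DrawArrays into one primitive set per mode) can be sketched in standalone C++. This is only an illustration under stated assumptions: it is not OSG code and not what osgUtil::Optimizer does. DrawRange, Mesh, mergeByMode and exampleFromThread are hypothetical names, and the int vertex array stands in for the vertex/color/normal/texCoord arrays, which would all need the same permutation. It assumes non-indexed DrawArrays and independent primitive modes only (no strips or fans).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one DrawArrays entry: a primitive mode plus a
// contiguous range [first, first + count) into the vertex array.
struct DrawRange {
    std::string mode;   // "TRIANGLES" or "QUADS" (independent primitives only)
    std::size_t first;
    std::size_t count;
};

// Hypothetical stand-in for a geometry: one int per vertex replaces the real
// per-vertex attribute arrays, which would all be permuted the same way.
struct Mesh {
    std::vector<int> vertices;
    std::vector<DrawRange> ranges;
};

// Regroups vertices so that all ranges sharing a mode become contiguous, then
// emits a single range per mode. Modes keep their order of first appearance.
// Vertices not referenced by any range are dropped.
Mesh mergeByMode(const Mesh& in) {
    std::vector<std::string> order;
    std::map<std::string, std::vector<int>> buckets;
    for (const DrawRange& r : in.ranges) {
        assert(r.first + r.count <= in.vertices.size());
        if (buckets.find(r.mode) == buckets.end()) order.push_back(r.mode);
        std::vector<int>& bucket = buckets[r.mode];
        const auto begin = in.vertices.begin() + static_cast<std::ptrdiff_t>(r.first);
        bucket.insert(bucket.end(), begin, begin + static_cast<std::ptrdiff_t>(r.count));
    }
    Mesh out;
    for (const std::string& mode : order) {
        const std::vector<int>& bucket = buckets[mode];
        out.ranges.push_back({mode, out.vertices.size(), bucket.size()});
        out.vertices.insert(out.vertices.end(), bucket.begin(), bucket.end());
    }
    return out;
}

// The four primitive sets quoted from the dumped .osg file, over 328 vertices
// whose values are simply their original indices.
Mesh exampleFromThread() {
    Mesh m;
    m.vertices.resize(328);
    for (std::size_t i = 0; i < m.vertices.size(); ++i) {
        m.vertices[i] = static_cast<int>(i);
    }
    m.ranges = {{"TRIANGLES", 0, 12},
                {"QUADS", 12, 152},
                {"TRIANGLES", 164, 12},
                {"QUADS", 176, 152}};
    return m;
}
```

Calling mergeByMode(exampleFromThread()) turns the four ranges from the dump into two, TRIANGLES 0 24 and QUADS 24 304. Whether fewer primitive sets actually lowers draw time is a separate question; the message above reports little change when display lists are in use.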

