Re: [Flightgear-devel] Performance and compiler options - or maybe
However, the (so far to me unknown) C++ subroutine actually bringing clouds into the visibly rendered scenery is even way slower - I can read the message that the property writing is over after the expected 2.5 seconds, but continue to see clouds appear in the scenery for 30 seconds and more.

This effect of 'asynchronously', 'delayed' loading of 3D models sounds quite familiar to me and might reflect an intended feature in order to save the framerate in these moments when a densely modelled chunk of Scenery appears in the view.

I don't doubt that this is an intended feature, and I don't complain that it tries to save framerate. I just have the feeling that something very inefficient is causing the performance drop in the first place.

Consider: Stuart's 3d clouds and mine are based on very similar technology. There is a collection of textures for cloudlets, and these are rotated in the scenery towards the viewer by vertex shaders (I adapted Stuart's shaders for my purposes, so they are almost identical, and I checked that my modifications did not change the performance significantly). Yet a standard 3d cloud layer, taking into account the different number of cloudlets, different view ranges and different texture size, builds about 1000 times (!) faster than my clouds (once it is in the scenery, there's not so much difference any more - same technology...).

I know that doing things from Nasal is slower than doing it from C++, but a factor of 1000 seems a bit too much to be explained that way. So my speculation is that Stuart's way of loading clouds into the scenery 'knows' that they are just identical copies of the same texture set over and over, whereas the routine doing it for me doesn't, so it burns framerate loading the same textures over and over again. Just my speculation of course...
Anyway, I would *really* appreciate if anyone could take a look at the chunk of code loading models via the /models/ property node and see if that factor 1000 cannot be changed into a 100 or even a 10.

Cheers, * Thorsten

___ Flightgear-devel mailing list Flightgear-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/flightgear-devel
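The speculation in the message above, that a loader which knows about identical copies could avoid repeated work, can be illustrated with a minimal sketch. This is not FlightGear or OSG code: `ModelCache`, `Placement`, `load_from_disk`, the file names and the figure of 30 distinct models are all hypothetical and chosen only for illustration. The idea is simply that the expensive load happens once per distinct path, and every further copy only records its own position.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """One instance in the scenery: shared model data plus its own position."""
    model: dict
    position: tuple


class ModelCache:
    """Loads each distinct model path once and shares it between placements."""

    def __init__(self, loader):
        self._loader = loader   # expensive function: path -> model data
        self._models = {}       # path -> model data that is already loaded
        self.loads = 0          # how many expensive loads were performed

    def place(self, path, position):
        model = self._models.get(path)
        if model is None:
            model = self._loader(path)   # only on first use of this path
            self._models[path] = model
            self.loads += 1
        return Placement(model, position)


def load_from_disk(path):
    # Hypothetical stand-in for reading geometry and textures from disk.
    return {"path": path}


cache = ModelCache(load_from_disk)
# 1000 placements built from 30 distinct model files (illustrative numbers).
clouds = [cache.place(f"cloud_{i % 30}.xml", (i, 0.0, 1500.0)) for i in range(1000)]
print(len(clouds), "placements,", cache.loads, "expensive loads")
```

Whether the code behind the /models/ node already does something like this is exactly the open question of the message; the sketch only shows what "knowing about copies" would mean.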
Re: [Flightgear-devel] Performance and compiler options - or maybe something else
Not sure, maybe it is connected with another issue we recently discovered. There are indeed some OSG operations which don't scale well. For example, OSG keeps a simple list of references at each shared model - so each shared model knows all nodes it is shared to. Adding a new member to the list takes almost no time - no matter how large the list is. However, removing a shared model from a node can be very expensive - since it needs to search the entire list. The issue is negligible when a model isn't shared too often (say 5000 times). But it gets really, really ugly when a model has a lot more shares (10,000).

This has recently caused another really bad performance issue. Enabling random scenery objects resulted in about 60,000 cows and about 30,000 horses being created (no kidding) to populate the FlightGear scenery. Creating these friendly animals was extremely efficient (no delay to be noticed). But when a scenery tile had to be removed, it had to disconnect a few thousand shares from the shared model - and each instance had to be looked up in a list of about 60K elements... now guess what... this took 2-10 seconds per scenery tile. Removing several tiles could easily block the thread for a minute or two - meanwhile no new scenery tiles could be loaded... That's why we had to cull all the scenery animals for now.

So, the implementation for loading shared objects is really efficient - but unloading a shared model can be really terrible - heavily depending on the number of shares. When you load new clouds - does this also involve dropping older clouds (removing some shares)?

*g* There's indeed a cloud story to go along with the cow story - loading clouds is comparatively easy and done by appending objects to the existing array of objects in the scenery, but unloading involves searching the array for a particular subset, which takes much longer. I spent 5 months solving that.
Over time, I have tried a number of solutions - keeping an array of pointers to the objects indexed by tile so that I don't have to search the long array, for instance. The most efficient solution, which is in now, has been to mark each object by tile index and keep a record of currently active tiles. Then a housekeeping loop can crawl slowly through the large array, processing a few objects each frame, compare the object index with the list of active tiles and remove the object if no match is found. That means that clouds may still exist 20 seconds or so after their tile has formally been deleted - but then again, who cares? Unloading objects doesn't cause a peak load anywhere; instead the performance needs are spread out constantly across all frames.

But that's not what the present issue is. So, let me try to explain in detail. What I do to generate a cloud is:

* assemble a cloud object in Nasal space with position, altitude, tile index, texture types... as properties (and management methods)
* pass that to a routine which writes into the /models/ subdirectory of the property tree and appends a pointer to the subnode I create in /models/ to the Nasal object

Then for me the work from Nasal is over, some C++ subsystem picks up the info from the property tree and eventually the cloud appears in the scenery. This is, by the way, the same technique by which the AI tanker is created and by which objects can be placed into the scenery at runtime using the ufo.

Creating the object Nasal-internally is lightning-fast - I haven't tested the limit, but I sure can assemble 1000 clouds per frame without problem. Writing properties into the tree is somewhat slower - currently I write no more than 20 clouds per frame into the tree - so if I have 20 fps, writing the 1000 clouds takes the next 2.5 seconds.
However, the (so far to me unknown) C++ subroutine actually bringing clouds into the visibly rendered scenery is even way slower - I can read the message that the property writing is over after the expected 2.5 seconds, but continue to see clouds appear in the scenery for 30 seconds and more. This depends on texture quality - at one point I was testing 2048x2048 high resolution cloud textures, and it took 4 minutes (!) for all clouds to appear - simply not feasible. And one can observe that the framerate drops notably and that the load on the second CPU is high.

So, I guess my question is: I am usually not loading more than 30 distinct objects, the remaining ones (970 in the above example) are just copies - can this information not be used to speed up the process? I believe someone on this list must be able to identify the subroutine in question, given all I can tell about it...

Cheers, * Thorsten
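The tile-based housekeeping loop described earlier in this message can be sketched as follows. The real implementation is in Nasal and is not shown in the thread, so everything here is an assumption made for illustration: the class name `CloudHousekeeper`, the budget of 5 objects per frame, and the swap-with-last removal, which is one way to drop an entry without shifting the rest of the array.

```python
class CloudHousekeeper:
    """Removes clouds of inactive tiles, inspecting only a few per frame."""

    def __init__(self, clouds, per_frame=5):
        self.clouds = list(clouds)   # entries are (tile_index, name)
        self.active_tiles = set()    # record of currently active tiles
        self.per_frame = per_frame   # inspection budget per frame
        self._cursor = 0             # where the crawl continues next frame

    def step(self):
        """Call once per frame; returns the names removed in this frame."""
        removed = []
        for _ in range(self.per_frame):
            if not self.clouds:
                break
            if self._cursor >= len(self.clouds):
                self._cursor = 0     # wrap around and keep crawling
            tile, name = self.clouds[self._cursor]
            if tile in self.active_tiles:
                self._cursor += 1    # tile still active: keep the cloud
            else:
                # Overwrite with the last entry and shrink the array; the
                # moved entry is inspected on the next iteration.
                self.clouds[self._cursor] = self.clouds[-1]
                self.clouds.pop()
                removed.append(name)
        return removed


# 20 clouds alternating between tiles 1 and 2; tile 2 has been deleted.
keeper = CloudHousekeeper([(i % 2 + 1, f"cloud{i}") for i in range(20)])
keeper.active_tiles = {1}
first_frame = keeper.step()
for _ in range(9):
    keeper.step()
print(len(first_frame), "removed in first frame,", len(keeper.clouds), "clouds left")
```

The cost per frame is bounded by `per_frame` regardless of array size, which is the property the message describes: stale clouds linger for a while, but no single frame pays for the whole cleanup.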
Re: [Flightgear-devel] Performance and compiler options - or maybe
thorsten.i.r...@jyu.fi wrote: However, the (so far to me unknown) C++ subroutine actually bringing clouds into the visibly rendered scenery is even way slower - I can read the message that the property writing is over after the expected 2.5 seconds, but continue to see clouds appear in the scenery for 30 seconds and more.

This effect of 'asynchronously', 'delayed' loading of 3D models sounds quite familiar to me and might reflect an intended feature in order to save the framerate in these moments when a densely modelled chunk of Scenery appears in the view.

Cheers, Martin.
-- Unix _IS_ user friendly - it's just selective about who its friends are !
Re: [Flightgear-devel] Performance and compiler options - or maybe
On 15-11-10 11:19, Martin Spott wrote: This effect of 'asynchronously', 'delayed' loading of 3D models sounds quite familiar to me and might reflect an intended feature in order to save the framerate in these moments when a densely modelled chunk of Scenery appears in the view.

Do 3D models get unloaded once they are out of the current view, or are they only unloaded when they are not in the current tile? In other words, if the viewing angle is changed and 3D models are panned out of view, will they be unloaded and later reloaded when the viewing angle is turned back to the original? This would perhaps partly explain why I am having problems with clouds being loaded again and again in local weather. The behaviour seems to depend on cloud density, which reminds me of a caching mechanism or similar optimisation.

m
Re: [Flightgear-devel] Performance and compiler options - or maybe something else
With regard to the speed of loading models from Nasal into the scenery I was writing about a while ago, I made a discovery yesterday. I was testing a setup in 2.0.0 with some heavy numerics running on the second CPU, and this pushed the framerate behaviour towards what I was observing with GIT. I did some follow-up testing and discovered that while multithreading is apparently on by default in 2.0.0, it is off in GIT.

I subsequently observed that when I run GIT with multithreading on, the load on the second CPU is usually modest (5-10%), but when I load the initial configuration of clouds into the scenery, it increases to 80-90%. In addition, the experience of flying with heavy cloud configurations was much smoother in GIT, and the multithreading seems to take care of a good part of the difference I have seen between 2.0.0 and GIT (I don't know if all - that requires some systematic testing).

So this adds up to a suggestive picture - local weather actually benefits a lot from a second CPU on board, and although it can be flown with just a single CPU, it runs much smoother with a second one (which raises the question - does Heiko, who reported the framerate drop on loading models for the first time, usually fly with a single CPU machine?).

I still have no real understanding why this is so, or why loading a large number of *identical* models into the scenery should take a long time (I would think that one can make use of the fact that there are really multiple copies of the same model around to speed things up, and while I was told that OSG does that automatically, this isn't what I observe) - if anyone can aid my understanding, please do so, it would be rather important.

Cheers, * Thorsten

To follow up on my previous message: Not so with my GIT binary: Loading of the initial cloud configuration brings me down to 4 fps, and every time (!)
a cloud is loaded from the buffer my framerate drops from 34+ to something like 20+ for a moment - which makes the whole experience rather jerky. I have now made a series of tests to quantify the effect. The test situation is

--disable-fullscreen --geometry=1200x900 --aircraft=ufo --airport=KINS --timeofday=noon --disable-real-weather-fetch

2.0.0 prebuilt:
  empty sky: 190 fps
  with 3d clouds: 128 fps
  with static cold sector tile: 90 fps when loaded, 34 while loading
  with dynamical cold sector tile: 45 fps when loaded, 30 while loading

(note that this is *not* a fair comparison between standard 3d clouds and local weather clouds as the visibility and cloud view distance is rather different - not the point of the exercise)

GIT built against my self-compiled OSG 2.9.10:
  empty sky: 234 fps
  with 3d clouds: 145 fps
  with static cold sector tile: 95 fps when loaded, 6 (!) while loading
  with dynamical cold sector tile: 46 fps when loaded, 7 (!) while loading

GIT built against the prebuilt OSG 2.9.6 coming with my 2.0.0 binary:
  empty sky: 230 fps
  with 3d clouds: 128 fps
  cold sector, static: 90 fps when loaded, 6 while loading
  cold sector, dynamical: 48 fps when loaded, 8 while loading

From this I conclude that what I'm seeing is not associated with OSG or the way I compile OSG. I also conclude that it's not related to performance issues of GIT in general - I actually get a better framerate than in 2.0.0 with GIT once things are loaded.
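Frame rates are easier to compare when converted to time per frame. The short calculation below does that for the "while loading" figures of the static cold sector tile quoted above; the fps values are taken from the message, while the function name `frame_time_ms` and the build labels are just illustrative.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# fps while loading the static cold sector tile, from the test series above
while_loading = {
    "2.0.0 prebuilt": 34,
    "GIT + OSG 2.9.10": 6,
    "GIT + OSG 2.9.6": 6,
}

baseline = frame_time_ms(while_loading["2.0.0 prebuilt"])
for build, fps in while_loading.items():
    ms = frame_time_ms(fps)
    print(f"{build:18s} {fps:3d} fps = {ms:6.1f} ms/frame ({ms / baseline:.1f}x)")
```

In these terms the GIT builds spend roughly 167 ms per frame while loading against roughly 29 ms for 2.0.0, a factor of about 5.7, and the factor is the same for both OSG versions.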
Re: [Flightgear-devel] Performance and compiler options - or maybe something else
From: thorsten.r...@jy... - 2010-11-12 10:13

I still have no real understanding why this is so, or why loading a large number of *identical* models into the scenery should take a long time (I would think that one can make use of the fact that there are really multiple copies of the same model around to speed things up, and while I was told that OSG does that automatically, this isn't what I observe) - if anyone can aid my understanding, please do so, it would be rather important.

Not sure, maybe it is connected with another issue we recently discovered. There are indeed some OSG operations which don't scale well. For example, OSG keeps a simple list of references at each shared model - so each shared model knows all nodes it is shared to. Adding a new member to the list takes almost no time - no matter how large the list is. However, removing a shared model from a node can be very expensive - since it needs to search the entire list. The issue is negligible when a model isn't shared too often (say 5000 times). But it gets really, really ugly when a model has a lot more shares (10,000).

This has recently caused another really bad performance issue. Enabling random scenery objects resulted in about 60,000 cows and about 30,000 horses being created (no kidding) to populate the FlightGear scenery. Creating these friendly animals was extremely efficient (no delay to be noticed). But when a scenery tile had to be removed, it had to disconnect a few thousand shares from the shared model - and each instance had to be looked up in a list of about 60K elements... now guess what... this took 2-10 seconds per scenery tile. Removing several tiles could easily block the thread for a minute or two - meanwhile no new scenery tiles could be loaded... That's why we had to cull all the scenery animals for now.

So, the implementation for loading shared objects is really efficient - but unloading a shared model can be really terrible - heavily depending on the number of shares.
When you load new clouds - does this also involve dropping older clouds (removing some shares)?

cheers, Thorsten
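The scaling behaviour described in the message above (cheap append, expensive removal because the list must be searched) can be reproduced with a small sketch. It does not mirror OSG's actual data structures; `ListParents` and `IndexedParents` are hypothetical classes, and the position map in the second one is just one possible way to avoid the search. The sizes are scaled down from the 60K shares in the message so the example runs quickly.

```python
class ListParents:
    """Plain list of shares: O(1) append, O(n) removal by linear search."""

    def __init__(self):
        self.items = []
        self.comparisons = 0   # counts how many entries were inspected

    def add(self, parent):
        self.items.append(parent)

    def remove(self, parent):
        for i, candidate in enumerate(self.items):
            self.comparisons += 1
            if candidate == parent:
                del self.items[i]
                return
        raise ValueError(parent)


class IndexedParents:
    """List plus position map: O(1) append and O(1) swap-removal."""

    def __init__(self):
        self.items = []
        self.position = {}     # parent -> index in self.items

    def add(self, parent):
        self.position[parent] = len(self.items)
        self.items.append(parent)

    def remove(self, parent):
        i = self.position.pop(parent)
        last = self.items.pop()
        if i < len(self.items):      # removed entry was not the last one
            self.items[i] = last     # move the last entry into the gap
            self.position[last] = i


plain, indexed = ListParents(), IndexedParents()
for share in range(2000):            # 2000 shares of one model
    plain.add(share)
    indexed.add(share)
for share in range(1800, 2000):      # unload a tile holding the newest 200
    plain.remove(share)
    indexed.remove(share)
print("entries inspected by the plain list:", plain.comparisons)
```

With the plain list every removal walks most of the list, so the total work grows with (shares removed) x (shares present); that product is what turns a few thousand removals from a 60K list into seconds per tile.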
Re: [Flightgear-devel] Performance and compiler options
On 10/2/2010 5:40 AM, thorsten.i.r...@jyu.fi wrote: To follow up on my previous message: Not so with my GIT binary: Loading of the initial cloud configuration brings me down to 4 fps, and every time (!) a cloud is loaded from the buffer my framerate drops from 34+ to something like 20+ for a moment - which makes the whole experience rather jerky. I have now made a series of tests to quantify the effect. The test situation is

--disable-fullscreen --geometry=1200x900 --aircraft=ufo --airport=KINS --timeofday=noon --disable-real-weather-fetch

2.0.0 prebuilt:
  empty sky: 190 fps
  with 3d clouds: 128 fps
  with static cold sector tile: 90 fps when loaded, 34 while loading
  with dynamical cold sector tile: 45 fps when loaded, 30 while loading

(note that this is *not* a fair comparison between standard 3d clouds and local weather clouds as the visibility and cloud view distance is rather different - not the point of the exercise)

GIT built against my self-compiled OSG 2.9.10:
  empty sky: 234 fps
  with 3d clouds: 145 fps
  with static cold sector tile: 95 fps when loaded, 6 (!) while loading
  with dynamical cold sector tile: 46 fps when loaded, 7 (!) while loading

GIT built against the prebuilt OSG 2.9.6 coming with my 2.0.0 binary:
  empty sky: 230 fps
  with 3d clouds: 128 fps
  cold sector, static: 90 fps when loaded, 6 while loading
  cold sector, dynamical: 48 fps when loaded, 8 while loading

From this I conclude that what I'm seeing is not associated with OSG or the way I compile OSG. I also conclude that it's not related to performance issues of GIT in general - I actually get a better framerate than in 2.0.0 with GIT once things are loaded. But there is a dramatic difference in the impact on performance while new models are loaded (if you're flying, 30 fps vs. 6 fps is an issue...). That difference must be somewhere in the simgear or flightgear code.
I can only stress that finding that difference and making GIT as fast as 2.0.0 in loading models will decide if local weather runs smoothly or not. At this point, the system itself is now fairly optimized and runs reasonably fast. Any help would be most welcome. Cheers, * Thorsten

I've had some luck using the Intel compiler instead of gcc on processor heavy applications. It would be interesting to see what effect it may have on FG performance. http://software.intel.com/en-us/articles/non-commercial-software-download/
[Flightgear-devel] Performance and compiler options
Hello,

with significant help, I've recently succeeded in compiling my own GIT binary. Initially this was very slow; in the meantime I've been told some compiler flags, which I've used to recompile OpenSceneGraph, Simgear and Flightgear, and which improved the available framerate by a factor of two.

Previously I've been using a pre-built Linux 2.0.0 binary and OpenSceneGraph 2.9.6 by Jon Stockill (of packages linked from the website, FlightGear-2.0.0-i686-1_slack11.0.tgz) which has been working very well for me. I've been developing, testing and optimizing the local weather package mainly with this binary. My self-compiled GIT binary with OSG 2.9.10 is now *almost* as fast - most of the time I guess it has about 10-15% less framerate - with one important exception: Loading models into the scenery.

Which means that as long as I have a given cloud configuration in the scenery, I reach 34+ fps even with wind drift on in a test situation (noon Cumulus layer around Las Vegas seen from the F-14b). But this changes whenever a cloud model is loaded. Local weather loads clouds by writing into the /models/ node of the property tree (pretty much the way the tanker.nas script generates a custom tanker).

With my 2.0.0 binary, I have a noticeable drop down to 15+ fps when loading the initial configuration of ~1000 cloudlets. Once that is done, clouds are loaded from a buffer at a speed of 1 cloudlet per frame. This doesn't lead to any detectable drop in framerate. Not so with my GIT binary: Loading of the initial cloud configuration brings me down to 4 fps, and every time (!) a cloud is loaded from the buffer my framerate drops from 34+ to something like 20+ for a moment - which makes the whole experience rather jerky.
I am sure that this difference is not caused by differences in the speed of writing from Nasal into the property tree (cloud drift writes several hundred properties per frame, but doesn't slow me down below 30 fps, and loading a new model involves far fewer writes) but by whatever the Flightgear core does after the /models/ node has been written to bring the model into the scenery. This always has been a bottleneck for me even in 2.0.0, which I have addressed by buffering the clouds - but with the GIT binary, it basically is the one issue which determines the speed of the local weather system. I'm pretty puzzled as to why this would be so, since it is working fine with my prebuilt 2.0.0 binary and OSG.

Thus my question: Would Jon be so kind as to let me know with what set of options the prebuilt slackware binaries and OSG libs were compiled, so that I can check if what I see is related to the way I compile? Or does anyone know if the code responsible for loading models into the scenery has been changed since 2.0.0 and if that could account for the difference in performance? Or does anyone have a different theory as to what is happening? (The good thing is that apparently there is a solution, because it does work smoothly and well in 2.0.0)

* Thorsten

P.S.: I've also experienced that the Landmass effect shader is a complete show-stopper for my system - it brings me from 34+ fps down to 7 fps - is that normal?
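The buffering mentioned in the message above (queue the clouds, then hand a fixed number per frame to the scenery) can be sketched like this. The actual local weather code is Nasal and is not shown in the thread; `CloudBuffer`, `seconds_to_drain` and the `place` callback are hypothetical names, and the callback stands in for the write into the /models/ node.

```python
import math
from collections import deque


class CloudBuffer:
    """Queues clouds and releases at most `per_frame` of them each frame."""

    def __init__(self, place, per_frame=1):
        self._place = place        # callback doing the actual placement
        self._pending = deque()    # clouds waiting to be placed
        self.per_frame = per_frame

    def queue(self, cloud):
        self._pending.append(cloud)

    def update(self):
        """Call once per frame; returns how many clouds were placed."""
        count = min(self.per_frame, len(self._pending))
        for _ in range(count):
            self._place(self._pending.popleft())
        return count

    def __len__(self):
        return len(self._pending)


def seconds_to_drain(clouds, per_frame, fps):
    """Time needed to empty a buffer at a constant frame rate."""
    return math.ceil(clouds / per_frame) / fps


placed = []
buffer = CloudBuffer(placed.append, per_frame=1)
for n in range(10):
    buffer.queue(f"cloud{n}")

frames = 0
while len(buffer):
    buffer.update()
    frames += 1
print(frames, "frames to place", len(placed), "clouds")
print(seconds_to_drain(1000, 20, 20), "s for 1000 clouds at 20 per frame and 20 fps")
```

The second figure reproduces the estimate given elsewhere in this thread for writing 1000 clouds at 20 per frame. Buffering only bounds the work submitted per frame; it does not help if placing a single model is itself expensive, which is the behaviour reported for the GIT binary.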