After sleeping it over...

> Now set all Advanced Weather settime() to 0.0 and retest with METAR. Wow!!!
> Improvement, not as good as Basic Weather, but much better. Worst fps is
> stable at 24, av. is still unstable.
You had a worst of 27 in your list http://dl.dropbox.com/u/57645542/stagger-data.htm running everything and were unhappy. Now a stable worst of 24 makes you happy? It'd be good to test this not with METAR but with high pressure core.

Basically, if you clock all loops at 0.0, as far as Nasal is concerned you make the average framerate come down to the worst framerate. Which would indeed trade framerate for smoothness, except... on my system, this runs into the garbage problem, so I get a really bad worst frame delay when I run all stuff at all times. So the lesson I learned is: as long as garbage collection is a problem, avoid running stuff on a per-frame basis whenever you can. This looks to be very system specific, though.

> Right now, with Advanced Weather we have a weather simulator with a
> FlightSim attached. We're spending 10 (yes 10!) times as long in the Events
> Sub-module with Advanced Weather than in Basic, and 5 times as long as we
> spend in Flight.

> For loops are bad - in C++ as much as in Nasal. For loops hold up
> the execution of the main loop, even when they do very little.

I'm not quite sure you actually appreciate the task being done, so let me expand a bit.

First of all, in terms of raw floating point operations, the water shader beats flight (and weather) probably by a few orders of magnitude. So what we really have is a water reflection simulator with a few percent of the remaining performance dedicated to environment and flight. What saves our ass in performance lists is just that water reflection can run on the GPU and hence doesn't even show up in CPU load comparison - it slows you down nevertheless in practice, dramatically so.

Flight has the task of solving equations of motion for a single object, based on a series of coefficients (to be determined by functions or interpolation tables).
At the core is a set of differential equations. We know how to solve them by discretizing: replacing derivatives by differences, applying some error corrections and cleverness, and integrating forward in time. I know the problem reasonably well, I could write code doing it if needed.

Advanced Weather has to solve equations of motion for all clouds in the scene (on a nice summer afternoon, say we have about 5000 Cumulus clouds, 1000 of these have a significant thermal underneath). If the equations of motion were as complex as for an airplane, we'd have a framerate (if we start out having 60 fps without clouds) of 60/6000 = 0.01 fps. That wouldn't be good.

So we use very simple equations of motion which give plausible behaviour. Say these are a factor 100 faster, which gives us a framerate of 1 fps. Still not good. This is where the brute-force approach ends.

Next, we realize that not all objects need priority attention. Nearby objects like the thermal we're in are more relevant than distant objects (like the thermal we're not in). Visible objects like clouds need more attention than invisible objects like thermals - it's more acceptable for a thermal to have a 'jumpy' motion than for a cloud nearby.

Thus, we do not need to solve every equation of motion per graphical frame. It's sufficient to solve them at a slower rate, so we can visit every object in the scene only once in a while. But it shouldn't be too jumpy - a 10 m discontinuous motion can't be felt in a glider, but a 100 m jump can (I tested these and compared with my real-life experience). Clouds in the scene should best not move discontinuously at all (you'd be the first to point this out...). Thermals should not be decorrelated from cap clouds over time, rain should not leak out underneath clouds,... that limits the way we can distribute tasks. In other words, the weather system needs to know where the clouds are even if the clouds are actually moved in the scene by the shader.
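
To put rough numbers on the 'visit every object only once in a while' idea, here is a minimal Python sketch. It is not the Advanced Weather code (which is Nasal); the function names and the 10 m/s wind speed are assumptions made up for illustration. It shows how the per-frame budget sets the size of the jump an object makes each time it is visited.

```python
import math


def frames_per_pass(n_objects, per_frame):
    """Frames needed to visit every object once with a fixed per-frame budget."""
    return math.ceil(n_objects / per_frame)


def jump_per_visit_m(n_objects, per_frame, fps, wind_mps):
    """Distance an object drifts between two visits, i.e. its discontinuous jump."""
    return wind_mps * frames_per_pass(n_objects, per_frame) / fps


def round_robin(n_objects, per_frame):
    """Yield, frame after frame, the object indices to update in that frame."""
    start = 0
    while True:
        yield [(start + k) % n_objects for k in range(per_frame)]
        start = (start + per_frame) % n_objects


# 1000 thermals at 30 fps in an assumed 10 m/s wind:
# one thermal per frame gives jumps of roughly 333 m,
# thirty per frame gives jumps of roughly 11 m.
slow = jump_per_visit_m(1000, 1, 30, 10.0)
fast = jump_per_visit_m(1000, 30, 30, 10.0)
```

With these assumed numbers, one thermal per frame lands far beyond the 100 m jump that can be felt in a glider, while thirty per frame stays close to the 10 m that can't.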
The weather system still needs to correlate thermals and do the vertical motion of the clouds in response to terrain obstacles, and for that it doesn't just need the information where the cloud is but also what the terrain underneath is like.

Say in doing all that cleverly, we can get the framerate from 1 to 30 fps. It's clear that assigning priorities will cause some fluctuations in the framerate, because the system has to make guesses what is important at the precise moment, and sometimes it guesses wrong and the task it decided to do is easier than expected.

At the same time, we may also move through the scene. We have supersonic aircraft. The Concorde at cruise altitude, for instance, goes through 3 weather tiles every 50 seconds. That gives you a whopping 16 seconds to get to know the terrain, build 5000 clouds dependent on what you fly over, display them, and remove them again. If we have loops with one iteration per graphical frame and run at a stable 30 fps, we get to load and unload 480 clouds in the available timeslot. Not good. Not good at all, that's just 1/10 of what we need with no emergency reserve. So we would be able to out-fly the weather, clouds would remain behind and create an ever-increasing backlog of unremoved clouds, which would pile up in memory,... we don't want that.

So, we want to have a system which is robust against 'just being there and watching clouds drift' as in, say, flying a balloon, where you expect that the local conditions don't change because you move with the weather, and a system which is robust against racing through everything at Mach 3 (or even Mach 6 with the X-15). At the same time, it shouldn't create discontinuities as you slow down and land, so there can't be a separate supersonic and a slow mode.

This is where the internal for-loops come in. Doing one object per frame is way too slow, you need to do several. If you are capable of building and removing 30 clouds per frame, you're Concorde-safe.
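
The Concorde budget above is plain arithmetic, and a small Python sketch makes it easy to replay with other numbers. The function names are hypothetical; the figures (3 tiles per 50 seconds, 30 fps, 5000 clouds per tile) are the ones quoted in the text, with the time per tile rounded down to 16 seconds as above.

```python
import math


def clouds_per_window(fps, clouds_per_frame, window_s):
    """Clouds that can be built and removed while one tile is in the window."""
    return fps * clouds_per_frame * window_s


def min_clouds_per_frame(clouds_needed, fps, window_s):
    """Smallest per-frame batch that keeps up with the tile turnover."""
    return math.ceil(clouds_needed / (fps * window_s))


WINDOW_S = 50 // 3        # 3 tiles every 50 seconds, rounded down to 16 s
CLOUDS_PER_TILE = 5000

one_per_frame = clouds_per_window(30, 1, WINDOW_S)      # 480
thirty_per_frame = clouds_per_window(30, 30, WINDOW_S)  # 14400
needed = min_clouds_per_frame(CLOUDS_PER_TILE, 30, WINDOW_S)  # 11
```

Under these assumptions, anything below about 11 clouds per frame falls behind at Concorde speed, and 30 per frame leaves a reserve of roughly a factor three.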
Of course, if you are in a slow plane, that number is more than you need - but then, to distinguish at which rate you currently need to build clouds to fill all frames equally (which depends on your speed, if you're flying turns or not, on the current visibility, on the average number of clouds per weather tile in the current situation) requires a hideously complex heuristic - which also may guess wrong in the end.

In addition, we struggle with problems such as 'We want to pre-build a cloud configuration beyond the visibility range, so that it is already there when we need a new tile.' Unfortunately, that's a 'can't do' - there is no terrain loaded beyond visual range, and since we want to build a cloud pattern which is consistent with the terrain, we need terrain info before building clouds. That gives a relatively narrow window of opportunity in which clouds can be generated as we approach tile edges, we can't really do it all the time.

I could go on with that for a while, there's a whole host of other complications waiting... But - please just don't give me 'for loops are bad'. I don't get the impression that you understand well why the individual subsystems do their tasks the way they do. The reality is quite a nightmare of scheduling tasks and prioritizing resources. I've spent months learning how to schedule tasks so that things are robust and still reasonably fast.

You look in the end at a cloud-filled sky, think that this somehow looks like a somewhat prettier version of what Basic Weather does, and observe that it's just a bit slower and jerkier. But that's not what it is underneath at all. Basic Weather has no real understanding of the terrain, it can't auto-generate a plausible configuration of thermals, it can't move them in lockstep with cap clouds, it can't age thermals and decay them when they reach a water surface,... Advanced Weather can.

> "Optimising" code is bad. You might make it better for your system,
> but make it worse for everyone else.
> OSG/OpenGL do a pretty good job all by themselves

Here's my version: Optimizing code is badly needed. OSG/OpenGL don't do a pretty good job all by themselves for sufficiently complex tasks and that's a fact - been there, seen it - Advanced Weather without any optimization would easily drive you to single-digit framerates no matter what system you run. What we're talking about here is the tip of the iceberg - can we optimize task scheduling even more such that you get a smooth 50 fps out? The actual task length to completely unload a weather tile is about 4 seconds on my box, the length to build and load it is often ~8 seconds, and I'd say that's pretty well hidden these days given that these are by nature discontinuous operations.

Your strategy appears to be to just throw resources at the problem and not try any optimization (which makes it smooth all right), and in the tasks you have had to deal with, you apparently got away with it. It's all nice that you get enough framerate out of the water shader and can afford to compute cloud coverage in the shader per frame per pixel, but those of us not running high end machines would like to run it as well, and I am glad I figured out a way to make it 50% faster for me.

> For loops are bad - in C++ as much as in Nasal. For loops hold up
> the execution of the main loop, even when they do very little.

In fact, any computation holds up the execution of the main loop, that's trivial. Inside the main loop, certain tasks need to be done. That takes time (and reduces framerate). The framerate loss is proportional to the time of the task inside the main loop, independent of how the task is done technically. I can easily come up with a function that stalls the main loop a lot without going into a for loop.

Simple counter-example to your claim - if you want a smooth motion of 10 objects in the visual field, you can't move one per frame. That makes for jerky motion of everything.
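
The counter-example can be replayed in a few lines of Python. This is only an illustrative sketch: the function name, the 10 m/s speed and the 30 fps frame time are assumptions, not anything taken from the simulator. It measures the largest single displacement any object makes, which is what shows up as jerkiness.

```python
def max_step_m(n_objects, frames, dt, speed_mps, all_per_frame):
    """Largest single displacement of any object over the simulated frames."""
    last_update = [0] * n_objects   # frame number of each object's last move
    largest = 0.0
    for frame in range(1, frames + 1):
        if all_per_frame:
            targets = range(n_objects)          # for loop over every object
        else:
            targets = [(frame - 1) % n_objects]  # one object per frame
        for i in targets:
            # Each object catches up on all the time since it was last moved.
            step = speed_mps * (frame - last_update[i]) * dt
            last_update[i] = frame
            largest = max(largest, step)
    return largest


DT = 1.0 / 30.0  # assumed 30 fps
smooth = max_step_m(10, 300, DT, 10.0, all_per_frame=True)
jerky = max_step_m(10, 300, DT, 10.0, all_per_frame=False)
```

Moving one object per frame makes every object jump ten times as far per move as moving all of them each frame, for the same total distance travelled.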
You need to move all per frame, so you need the for loop, because the task is such that it demands it. On the other hand, in a fuel consumption calculation, you might get away by doing one tank per frame instead of looping over all tanks, since the precision of a per-frame fuel consumption isn't actually needed. The nature of the task dictates whether you need for loops or not inside the main loop.

So what needs to be optimized is the time spent per iteration of the main loop, given the constraints posed by the task to be done.

Cheers,

* Thorsten