Re: [Flightgear-devel] Performance and compiler options - or maybe

2010-11-17 Thread thorsten . i . renk
 However, the (so far to me unknown) C++ subroutine actually bringing
 clouds into the visibly rendered scenery is even way slower - I can read
 the message that the property writing is over after the expected 2.5
 seconds, but continue to see clouds appear in the scenery for 30 seconds
 and more.

 This effect of 'asynchronously', 'delayed' loading of 3D models sounds
 quite familiar to me and might reflect an intended feature in order to
 save the framerate in these moments when a densely modelled chunk of
 Scenery appears in the view.

I don't doubt that this is an intended feature, and I don't complain that
it tries to save framerate. I just have the feeling something very
inefficient is causing the performance drop in the first place.

Consider:

Stuart's 3d clouds and mine are based on very similar technology. There is
a collection of textures for cloudlets, and these are rotated in the
scenery towards the viewer by vertex shaders (I adapted Stuart's shaders
for my purposes, so they are almost identical and I checked that my
modifications did not change the performance significantly).

Yet a standard 3d cloud layer, even taking into account the different
number of cloudlets, different view ranges and different texture size,
builds about 1000 times (!) faster than my clouds (once it is in the
scenery, there's not so much difference any more - same technology...).

I know that doing things from Nasal is slower than doing it from C++, but
a factor 1000 seems a bit too much to be explained that way. So my
speculation is that Stuart's way of loading clouds into the scenery
'knows' that they are just identical copies of the same texture set over
and over, whereas the routine doing it for me doesn't, so it burns
framerate loading the same textures over and over again. Just my
speculation of course... Anyway, I would *really* appreciate it if anyone
could take a look at the chunk of code loading models via the /models/
property node and see if that factor 1000 cannot be changed into a 100 or
even a 10.

Cheers,

* Thorsten


--
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2  L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today
http://p.sf.net/sfu/msIE9-sfdev2dev
___
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel


Re: [Flightgear-devel] Performance and compiler options - or maybe something else

2010-11-15 Thread thorsten . i . renk

 Not sure, maybe it is connected with another issue we recently
 discovered. There are indeed some OSG operations which don't scale
 well.
 For example, OSG keeps a simple list of references at each shared
 model - so each shared model knows all nodes it is shared to. Adding a
 new member to the list takes almost no time - no matter how large
 the list is.
 However, removing a shared model from a node can be very expensive -
 since it needs to search the entire list. The issue is negligible when
 a model isn't shared too often (say fewer than 5000 times). But it gets
 really, really ugly when a model has a lot more shares (more than 10.000).
 This has recently caused another really bad performance issue.
 Enabling random scenery objects resulted in about 60.000 cows and
 about 30.000 horses being created (no kidding) to populate the
 FlightGear scenery. Creating these friendly animals was extremely
 efficient (no delay to be noticed). But when a scenery tile had to be
 removed, it had to disconnect a few thousand shares from the shared
 model - and each instance had to be looked-up in a list of about 60K
 elements... now guess what... this took 2-10 seconds per scenery tile.
 Removing several tiles could easily block the thread for a minute or
 two - meanwhile no new scenery tiles could be loaded... That's why we
 had to cull all the scenery animals for now.
 So, the implementation for loading shared objects is really
 efficient - but unloading a shared model can be really terrible -
 heavily depending on the number of shares.
 When you load new clouds - does this also involve dropping older
 clouds (removing some shares)?


*g*

There's indeed a cloud story to go along with the cow story - loading
clouds is comparatively easy and done by appending objects to the existing
array of objects in the scenery, but unloading involves searching the
array for a particular subset, which takes much longer. I spent 5 months
solving that.

Over time, I have tried a number of solutions - for instance, keeping an
array of pointers to the objects indexed by tile so that I don't have to
search the long array.

The most efficient solution, which is the one in use now, has been to mark
each object by tile index and keep a record of currently active tiles. Then
a housekeeping loop can crawl slowly through the large array, processing a
few objects each frame, compare each object's tile index with the list of
active tiles and remove the object if no match is found. That means that clouds may still
exist 20 seconds or so after their tile has formally been deleted - but
then again, who cares? Unloading objects doesn't cause a peak load
anywhere, instead the performance needs are spread out constantly across
all frames.
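
As a rough illustration of the housekeeping loop described above, here is a minimal sketch in Python rather than Nasal. The class name CloudHousekeeper, the (name, tile) tuples and the per_frame value are all assumptions made for the example; this is not the local weather code.

```python
class CloudHousekeeper:
    """Incrementally removes clouds whose tile is no longer active (illustrative only)."""

    def __init__(self, clouds, per_frame=5):
        self.clouds = clouds        # list of (cloud_name, tile_index) tuples
        self.per_frame = per_frame  # how many objects to inspect per frame
        self.cursor = 0             # current position of the slow crawl

    def step(self, active_tiles):
        """Inspect at most per_frame clouds; return how many were removed."""
        removed = 0
        for _ in range(self.per_frame):
            if not self.clouds:
                break
            if self.cursor >= len(self.clouds):
                self.cursor = 0     # wrap around and start a new pass
            _name, tile = self.clouds[self.cursor]
            if tile in active_tiles:
                self.cursor += 1    # tile still active: keep the cloud, move on
            else:
                # Swap-remove: overwrite with the last element and shrink the
                # array, so no search and no shifting of elements is needed.
                self.clouds[self.cursor] = self.clouds[-1]
                self.clouds.pop()
                removed += 1
        return removed


# Demo: 40 clouds spread over tiles 0..3; tiles 2 and 3 have been unloaded.
clouds = [("cloud%d" % i, i % 4) for i in range(40)]
keeper = CloudHousekeeper(clouds, per_frame=5)
active_tiles = {0, 1}
frames = 0
while any(tile not in active_tiles for _, tile in keeper.clouds):
    keeper.step(active_tiles)       # one call per rendered frame
    frames += 1
print(len(keeper.clouds), "clouds left after", frames, "frames")
```

The cost per frame is bounded by per_frame regardless of how long the array is, which is the point of the approach: stale clouds linger for a while, but there is no peak load.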

But that's not what the present issue is. So, let me try to explain in
detail. What I do to generate a cloud is:

* assemble a cloud object in Nasal space with position, altitude, tile
index, texture types... as properties (and management methods)

* pass that to a routine which writes into the /models/ subdirectory of
the property tree and append a pointer to the subnode I create in /models/
to the Nasal object

Then for me the work from Nasal is over, some C++ subsystem picks up the
info from the property tree and eventually the cloud appears in the
scenery. This is, by the way, the same technique by which the AI tanker is
created and by which objects can be placed into the scenery at runtime
using the ufo.

Creating the object Nasal-internally is lightning-fast - I haven't tested
the limit, but I sure can assemble 1000 clouds per frame without problem.
Writing properties into the tree is somewhat slower - currently I write no
more than 20 clouds per frame into the tree - so if I have 20 fps, writing
the 1000 clouds takes the next 2.5 seconds.
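
The arithmetic behind the 2.5 seconds can be checked with a small sketch. It is written in Python purely for illustration; BatchedWriter and the plain list standing in for the property tree are hypothetical and do not correspond to any FlightGear or Nasal API.

```python
from collections import deque


def frames_to_drain(n_items, per_frame):
    """Frames needed to write n_items at per_frame items per frame (rounded up)."""
    return -(-n_items // per_frame)


def drain_time_seconds(n_items, per_frame, fps):
    """Wall-clock time to write everything, assuming a constant framerate."""
    return frames_to_drain(n_items, per_frame) / fps


class BatchedWriter:
    """Queues items and writes at most per_frame of them on each frame."""

    def __init__(self, write, per_frame=20):
        self.queue = deque()
        self.write = write          # callback that performs one write
        self.per_frame = per_frame

    def submit(self, item):
        self.queue.append(item)

    def on_frame(self):
        count = min(self.per_frame, len(self.queue))
        for _ in range(count):
            self.write(self.queue.popleft())
        return count


# Demo: 1000 clouds, 20 written per frame, 20 frames per second.
tree = []                           # stand-in for the /models/ subtree
writer = BatchedWriter(tree.append, per_frame=20)
for i in range(1000):
    writer.submit("cloud%d" % i)
frames = 0
while writer.queue:
    writer.on_frame()
    frames += 1
print(frames, "frames,", drain_time_seconds(1000, 20, 20), "seconds")
```

So 1000 clouds at 20 per frame need 50 frames, which at 20 fps is the 2.5 seconds quoted above.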

However, the (so far to me unknown) C++ subroutine actually bringing
clouds into the visibly rendered scenery is even way slower - I can read
the message that the property writing is over after the expected 2.5
seconds, but continue to see clouds appear in the scenery for 30 seconds
and more. This depends on texture quality - at one point I was testing
2048x2048 high resolution cloud textures, and it took 4 minutes (!) for
all clouds to appear - simply not feasible. And one can observe that the
framerate drops notably and that the load on the second CPU is high.

So, I guess my question is: I am usually not loading more than 30 distinct
objects, the remaining (970 in the above example) are just copies - can
this information not be used to speed up the process? I believe someone on
this list must be able to identify the subroutine in question, given all I
can tell about it...
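
To make the question concrete: the idea of using the fact that most objects are copies could look like the sketch below. It is a Python illustration of a load-once cache keyed by model path, not a description of what FlightGear or OSG actually does; ModelCache, place_clouds and the file names are invented for the example.

```python
class ModelCache:
    """Loads each distinct model once and hands out the shared result afterwards."""

    def __init__(self, loader):
        self._loader = loader       # expensive function: path -> model data
        self._models = {}           # path -> already loaded model
        self.loads = 0              # number of expensive loads performed

    def get(self, path):
        if path not in self._models:
            self._models[path] = self._loader(path)
            self.loads += 1
        return self._models[path]


def place_clouds(cache, placements):
    """placements: iterable of (path, position); returns (model, position) instances."""
    return [(cache.get(path), position) for path, position in placements]


# Demo: 1000 placements drawn from 30 distinct (hypothetical) model files.
paths = ["cloud_%02d.xml" % i for i in range(30)]
placements = [(paths[i % 30], (float(i), 0.0, 0.0)) for i in range(1000)]
cache = ModelCache(lambda path: {"path": path})   # stand-in for a slow disk/texture load
instances = place_clouds(cache, placements)
print(len(instances), "instances from", cache.loads, "loads")
```

With such a cache the expensive step runs 30 times instead of 1000; the remaining 970 placements only add a reference to an existing model.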

Cheers,

* Thorsten





Re: [Flightgear-devel] Performance and compiler options - or maybe

2010-11-15 Thread Martin Spott
thorsten.i.r...@jyu.fi wrote:

 However, the (so far to me unknown) C++ subroutine actually bringing
 clouds into the visibly rendered scenery is even way slower - I can read
 the message that the property writing is over after the expected 2.5
 seconds, but continue to see clouds appear in the scenery for 30 seconds
 and more.

This effect of 'asynchronously', 'delayed' loading of 3D models sounds
quite familiar to me and might reflect an intended feature in order to
save the framerate in these moments when a densely modelled chunk of
Scenery appears in the view.

Cheers,
Martin.
-- 
 Unix _IS_ user friendly - it's just selective about who its friends are !
--



Re: [Flightgear-devel] Performance and compiler options - or maybe

2010-11-15 Thread fierst42
On 15-11-10 11:19, Martin Spott wrote:

 This effect of 'asynchronously', 'delayed' loading of 3D models sounds
 quite familiar to me and might reflect an intended feature in order to
 save the framerate in these moments when a densely modelled chunk of
 Scenery appears in the view.



Do 3D models get unloaded once they are out of the current view or are 
they only unloaded when they are not in the current tile?

In other words, if the viewing angle is changed and 3D models are panned 
out of view, will they be unloaded and later reloaded when the viewing 
angle is turned back to the original?

This would perhaps partly explain why I am having problems with clouds
being loaded again and again in local weather. The behaviour seems to
depend on cloud density, which reminds me of a caching mechanism or
similar optimisation.

m




Re: [Flightgear-devel] Performance and compiler options - or maybe something else

2010-11-12 Thread thorsten . i . renk

With regard to the speed of loading models from Nasal into the scenery I
was writing about a while ago, I have made some discovery yesterday.

I was testing a setup in 2.0.0 with some heavy numerics running on the
second CPU, and this pushed the framerate into the same behaviour I was
observing with GIT.

I did some follow-up testing and discovered that while multithreading is
apparently on by default in 2.0.0, it is off in GIT. I subsequently
observed that when I run GIT with multithreading on, the load on
the second CPU is usually modest (5-10%), but when I load the initial
configuration of clouds into the scenery, it increases to 80-90%. In
addition, the experience of flying with heavy cloud configurations was
much smoother in GIT and the multithreading seems to take care of a good
part of the difference I have seen between 2.0.0 and GIT (I don't know if
all - that requires some systematic testing).

So this adds up to a suggestive picture - local weather actually benefits a
lot from a second CPU on board, and although it can be flown with just a
single CPU, it runs much smoother with a second one (which raises the
question - does Heiko, who first reported the framerate drop on loading
models, usually fly with a single-CPU machine?).

I still have no real understanding why this is so, or why loading a large
number of *identical* models into the scenery should take a long time (I
would think that one can make use of the fact that there are really
multiple copies of the same model around to speed things up, and while I
was told that OSG does that automatically, this isn't what I observe) - if
anyone can aid my understanding, please do so, it would be rather
important.

Cheers,

* Thorsten


 To follow up on my previous message:

 Not so with my GIT binary: Loading of the initial cloud configuration
 brings me down to 4 fps, and every time (!) a cloud is loaded from the
 buffer my framerate drops from 34+ to something like 20+ for a moment -
 which makes the whole experience rather jerky.

 I have now made a series of tests to quantify the effect. The test
 situation is

 --disable-fullscreen --geometry=1200x900 --aircraft=ufo --airport=KINS
 --timeofday=noon --disable-real-weather-fetch

 2.0.0 prebuilt:

 empty sky: 190 fps
 with 3d clouds: 128 fps
 with static cold sector tile: 90 fps when loaded,  34 while loading
 with dynamical cold sector tile: 45 fps when loaded,  30 while loading

 (note that this is *not* a fair comparison between standard 3d clouds and
 local weather clouds as the visibility and cloud view distance is rather
 different - not the point of the exercise)

 GIT built against my self-compiled OSG 2.9.10:

 empty sky: 234 fps
 with 3d clouds: 145 fps
 with static cold sector tile: 95 fps when loaded,  6 (!) while loading
 with dynamical cold sector tile: 46 fps when loaded, 7 (!) while loading

 GIT build against the prebuilt OSG 2.9.6 coming with my 2.0.0 binary:

 empty sky: 230 fps
 with 3d clouds: 128 fps
 cold sector, static: 90 fps when loaded,  6 while loading
 cold sector dynamical: 48 fps when loaded,  8 while loading

 From this I conclude that what I'm seeing is not associated with OSG or
 the way I compile OSG. I also conclude that it's not related to
 performance issues of GIT in general - I get actually a better framerate
 than in 2.0.0 with GIT once things are loaded.




Re: [Flightgear-devel] Performance and compiler options - or maybe something else

2010-11-12 Thread ThorstenB
From: thorsten.r...@jy... - 2010-11-12 10:13
 I still have no real understanding why this is so, or why loading a large
 number of *identical* models into the scenery should take a long time (I
 would think that one can make use of the fact that there are really
 multiple copies of the same model around to speed things up, and while I
 was told that OSG does that automatically, this isn't what I observe) - if
 anyone can aid my understanding, please do so, it would be rather
 important.

Not sure, maybe it is connected with another issue we recently
discovered. There are indeed some OSG operations which don't scale
well.
For example, OSG keeps a simple list of references at each shared
model - so each shared model knows all nodes it is shared to. Adding a
new member to the list takes almost no time - no matter how large
the list is.
However, removing a shared model from a node can be very expensive -
since it needs to search the entire list. The issue is negligible when
a model isn't shared too often (say fewer than 5000 times). But it gets
really, really ugly when a model has a lot more shares (more than 10.000).
This has recently caused another really bad performance issue.
Enabling random scenery objects resulted in about 60.000 cows and
about 30.000 horses being created (no kidding) to populate the
FlightGear scenery. Creating these friendly animals was extremely
efficient (no delay to be noticed). But when a scenery tile had to be
removed, it had to disconnect a few thousand shares from the shared
model - and each instance had to be looked-up in a list of about 60K
elements... now guess what... this took 2-10 seconds per scenery tile.
Removing several tiles could easily block the thread for a minute or
two - meanwhile no new scenery tiles could be loaded... That's why we
had to cull all the scenery animals for now.
So, the implementation for loading shared objects is really
efficient - but unloading a shared model can be really terrible -
heavily depending on the number of shares.
When you load new clouds - does this also involve dropping older
clouds (removing some shares)?
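
The scaling described above (cheap append, expensive removal by search) can be reproduced with a toy example. This is a Python sketch, not OSG code; the function name is made up and the numbers are scaled down by a factor of ten from the cow example (6000 shares, 300 of them belonging to the tile being removed) so that it runs quickly.

```python
def remove_by_search(parents, node):
    """Remove node from a plain list by linear search; return comparisons made."""
    for index, candidate in enumerate(parents):
        if candidate is node:
            del parents[index]
            return index + 1
    raise ValueError("node not in parent list")


# 6000 shares of one model; the last 300 belong to the tile being unloaded.
shares = [object() for _ in range(6000)]
tile_nodes = shares[5700:]

# Variant 1: plain list, every removal walks the list from the front.
parent_list = list(shares)
list_comparisons = sum(remove_by_search(parent_list, node) for node in tile_nodes)

# Variant 2: hash-based set, every removal is a single lookup on average.
parent_set = set(shares)
set_lookups = 0
for node in tile_nodes:
    parent_set.discard(node)
    set_lookups += 1

print("list comparisons:", list_comparisons)
print("set lookups:     ", set_lookups)
```

In this setup each list removal needs 5701 comparisons, so unloading the tile costs 300 x 5701 = 1,710,300 comparisons against 300 lookups for the set. The work grows with shares times removals, which fits the observation that the problem only becomes visible at very high share counts.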

cheers,
Thorsten



Re: [Flightgear-devel] Performance and compiler options

2010-10-05 Thread Reagan Thomas
  On 10/2/2010 5:40 AM, thorsten.i.r...@jyu.fi wrote:
 To follow up on my previous message:

 Not so with my GIT binary: Loading of the initial cloud configuration
 brings me down to 4 fps, and every time (!) a cloud is loaded from the
 buffer my framerate drops from 34+ to something like 20+ for a moment -
 which makes the whole experience rather jerky.
 I have now made a series of tests to quantify the effect. The test
 situation is

 --disable-fullscreen --geometry=1200x900 --aircraft=ufo --airport=KINS
 --timeofday=noon --disable-real-weather-fetch

 2.0.0 prebuilt:

 empty sky: 190 fps
 with 3d clouds: 128 fps
 with static cold sector tile: 90 fps when loaded,  34 while loading
 with dynamical cold sector tile: 45 fps when loaded,  30 while loading

 (note that this is *not* a fair comparison between standard 3d clouds and
 local weather clouds as the visibility and cloud view distance is rather
 different - not the point of the exercise)

 GIT built against my self-compiled OSG 2.9.10:

 empty sky: 234 fps
 with 3d clouds: 145 fps
 with static cold sector tile: 95 fps when loaded,  6 (!) while loading
 with dynamical cold sector tile: 46 fps when loaded, 7 (!) while loading

 GIT build against the prebuilt OSG 2.9.6 coming with my 2.0.0 binary:

 empty sky: 230 fps
 with 3d clouds: 128 fps
 cold sector, static: 90 fps when loaded,  6 while loading
 cold sector dynamical: 48 fps when loaded,  8 while loading

  From this I conclude that what I'm seeing is not associated with OSG or
 the way I compile OSG. I also conclude that it's not related to
 performance issues of GIT in general - I get actually a better framerate
 than in 2.0.0 with GIT once things are loaded.

 But there is a dramatic difference in the impact on performance while
 new models are loaded (if you're flying, 30 fps vs. 6 fps is an issue...).
 That difference must be somewhere in the simgear or flightgear code.

 I can only stress that finding that difference and making GIT as fast as
 2.0.0 in loading models will decide if local weather runs smoothly or not.
 At this point, the system itself is now fairly optimized and runs
 reasonably fast. Any help would be most welcome.

 Cheers,

 * Thorsten



I've had some luck using the Intel compiler instead of gcc on
processor-heavy applications. It would be interesting to see what effect it
may have on FG performance.

http://software.intel.com/en-us/articles/non-commercial-software-download/




[Flightgear-devel] Performance and compiler options

2010-09-29 Thread thorsten . i . renk

Hello,

with significant help, I've recently succeeded in compiling my own GIT
binary. Initially this was very slow; in the meantime I've been told some
flags for the compiler which I've been using to recompile OpenSceneGraph,
Simgear and Flightgear, which improved the available framerate by a factor
of two.

Previously I've been using a pre-built Linux 2.0.0 binary and
OpenSceneGraph 2.9.6 by Jon Stockill (of packages linked from the website,
FlightGear-2.0.0-i686-1_slack11.0.tgz) which has been working very well
for me. I've been developing, testing and optimizing the local weather
package mainly with this binary.

My self-compiled GIT binary with OSG 2.9.10 is now *almost* as fast - most
of the time I guess it has about 10-15% less framerate - with one
important exception: loading models into the scenery. As long as I have a
given cloud configuration in the scenery, I reach 34+ fps even with
wind-drift on in a test situation (noon Cumulus layer around Las Vegas
seen from the F-14b). But this changes whenever a cloud model is loaded.

Local weather loads clouds by writing into the /models/ node of the
property tree (pretty much the way the tanker.nas script generates a
custom tanker). With my 2.0.0 binary, I have a noticeable drop down to 15+
fps when loading the initial configuration of ~1000 cloudlets. Once that
is done, clouds are loaded from a buffer at a speed of 1 cloudlet per
frame. This doesn't lead to any detectable drop in framerate.

Not so with my GIT binary: Loading of the initial cloud configuration
brings me down to 4 fps, and every time (!) a cloud is loaded from the
buffer my framerate drops from 34+ to something like 20+ for a moment -
which makes the whole experience rather jerky.
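
Framerates hide how much time a single model load costs, so it can help to convert them to frame times. The short Python sketch below does that for the figures quoted above; since these are given as "34+", "20+" and so on, the results are only approximate, and the helper names are made up for the example.

```python
def frame_time_ms(fps):
    """Duration of one frame in milliseconds at the given framerate."""
    return 1000.0 / fps


def extra_ms_per_frame(fps_before, fps_during):
    """Additional time spent per frame while the framerate is depressed."""
    return frame_time_ms(fps_during) - frame_time_ms(fps_before)


# GIT binary: one cloudlet per frame from the buffer, 34 fps dropping to 20 fps.
buffered_load_cost = extra_ms_per_frame(34, 20)

# GIT binary: initial configuration of ~1000 cloudlets, 34 fps dropping to 4 fps.
initial_load_cost = extra_ms_per_frame(34, 4)

print("extra per frame, buffered load: %.1f ms" % buffered_load_cost)
print("extra per frame, initial load:  %.1f ms" % initial_load_cost)
```

Under these assumptions a single buffered cloudlet costs roughly 20 ms of extra frame time in the GIT binary, which is large compared with the roughly 29 ms a whole frame takes at 34 fps.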

I am sure that this difference is not caused by differences in the speed
of writing from Nasal into the property tree (cloud drift writes several
hundred properties per frame, but doesn't slow me down below 30 fps, and
loading a new model involves far fewer writes) but by whatever the
Flightgear core does after the /models/ node has been written to bring the
model into the scenery.

This has always been a bottleneck for me even in 2.0.0, which I have
addressed by buffering the clouds - but with the GIT binary, it basically
is the one issue which determines the speed of the local weather system.

I'm pretty puzzled as to why this would be so, since it is working fine
with my prebuilt 2.0.0 binary and OSG.

Thus my question: Would Jon be so kind to let me know with what set of
options the prebuilt slackware binaries and OSG libs were compiled, so
that I can check if what I see is related to the way I compile?

Or does anyone know if the code responsible for loading models into the
scenery has been changed since 2.0.0 and if that could account for the
difference in performance?

Or does anyone have a different theory as to what is happening?

(The good thing is that apparently there is a solution, because it does
work smoothly and well in 2.0.0)

* Thorsten

P.S.: I've also experienced that the Landmass effect shader is a complete
show-stopper for my system - it brings me from 34+ fps down to 7 fps - is
that normal?

