> Not sure, maybe it is connected with another issue we recently
> discovered. There are indeed some OSG operations which don't scale
> well.
> For example, OSG keeps a simple list of references at each shared
> model - so each shared model knows all nodes it is shared to. Adding a
> new member to the list takes almost no time - no matter how large
> the list is.
> However, removing a shared model from a node can be very expensive -
> since it needs to search the entire list. The issue is negligible when
> a model isn't shared too often (say < 5,000 times). But it gets
> really, really ugly when a model has a lot more shares (> 10,000).
> This has recently caused another really bad performance issue.
> Enabling "random scenery objects" resulted in about 60,000 cows and
> about 30,000 horses being created (no kidding) to populate the
> FlightGear scenery. Creating these friendly animals was extremely
> efficient (no delay to be noticed). But when a scenery tile had to be
> removed, it had to disconnect a few thousand shares from the shared
> model - and each instance had to be looked up in a list of about 60K
> elements... now guess what... this took 2-10 seconds per scenery tile.
> Removing several tiles could easily block the thread for a minute or
> two - meanwhile no new scenery tiles could be loaded... That's why we
> had to "cull" all the scenery animals for now.
> So, the implementation for "loading" shared objects is really
> efficient - but unloading a shared model can be really terrible,
> depending heavily on the number of shares.
> When you "load" new clouds - does this also involve dropping older
> clouds (removing some shares)?


*g*

There's indeed a cloud story to go along with the cow story - loading
clouds is comparatively easy and done by appending objects to the existing
array of objects in the scenery, but unloading involves searching the
array for a particular subset, which takes much longer. I spent 5 months
solving that.
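
In code, the asymmetry looks roughly like this - a minimal, self-contained
C++ sketch with std::vector, not the actual OSG or FlightGear bookkeeping:

    #include <algorithm>
    #include <vector>

    struct Node;                    // stand-in for whatever the list references

    std::vector<Node*> shares;      // one such list per shared model

    void addShare(Node* n) {
        shares.push_back(n);        // O(1) - cheap no matter how long the list is
    }

    void removeShare(Node* n) {
        // O(n) - must scan the whole list to find the one entry to drop;
        // with 60,000 shares, every single removal walks ~60K elements
        auto it = std::find(shares.begin(), shares.end(), n);
        if (it != shares.end())
            shares.erase(it);
    }

A hash-based container would make removal cheap as well - but either way
the point stands: unloading, not loading, is the expensive direction here.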

Over time, I have tried a number of solutions - for instance, keeping an
array of pointers to the objects, indexed by tile, so that I don't have to
search the long array.

The most efficient solution, which is in now, has been to tag each object
with its tile index and to keep a record of the currently active tiles. A
housekeeping loop can then crawl slowly through the large array, processing
a few objects each frame, comparing each object's tile index against the
list of active tiles and removing the object if no match is found. That
means that clouds may still exist 20 seconds or so after their tile has
formally been deleted - but then again, who cares? Unloading objects
doesn't cause a peak load anywhere; instead, the performance cost is spread
out evenly across all frames.
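
For illustration, a sketch of that housekeeping loop in C++ (my actual
implementation is in Nasal, and the container and names here are made up
for the example):

    #include <cstddef>
    #include <unordered_set>
    #include <vector>

    struct Cloud {
        int tile;                   // tile index the cloud was created for
        // position, textures, property-tree handle, ... omitted here
    };

    std::vector<Cloud> clouds;              // the one large array of objects
    std::unordered_set<int> activeTiles;    // tiles currently loaded
    std::size_t sweepPos = 0;               // where the crawl left off

    // Called once per frame: inspect only a small, fixed budget of objects,
    // so removal never causes a load peak in any single frame.
    void housekeeping(std::size_t budget = 20) {
        for (std::size_t i = 0; i < budget && !clouds.empty(); ++i) {
            if (sweepPos >= clouds.size())
                sweepPos = 0;               // wrap around and keep crawling
            if (activeTiles.count(clouds[sweepPos].tile) == 0) {
                // tile is gone: drop the cloud by swapping in the last
                // element, which avoids an O(n) erase from the middle
                clouds[sweepPos] = clouds.back();
                clouds.pop_back();
            } else {
                ++sweepPos;                 // keep this one, move on
            }
        }
    }

The per-frame budget is the knob that trades removal latency (the "20
seconds or so" above) against the per-frame cost.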

But that's not what the present issue is. So, let me try to explain in
detail. What I do to generate a cloud is:

* assemble a cloud object in Nasal space with position, altitude, tile
index, texture types... as properties (and management methods)

* pass that to a routine which writes it into the /models/ branch of the
property tree and appends a pointer to the subnode created in /models/ to
the Nasal object

Then, for me, the work from Nasal is over: some C++ subsystem picks up the
info from the property tree and eventually the cloud appears in the
scenery. This is, by the way, the same technique by which the AI tanker is
created and by which objects can be placed into the scenery at runtime
using the ufo.
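
For illustration, the second step looks roughly like this when written
against SimGear's SGPropertyNode API in C++ (I actually do it from Nasal,
and the child names - path, latitude-deg and so on - are what I believe
the model manager reads, so treat the details as assumptions):

    #include <simgear/props/props.hxx>

    // Create /models/model[i] and fill in the values the model manager
    // needs; returning the node gives the caller a handle for later removal.
    SGPropertyNode* placeModel(SGPropertyNode* modelsRoot,  // the /models/ node
                               const char* path, double latDeg, double lonDeg,
                               double elevFt, double headingDeg)
    {
        int i = modelsRoot->getChildren("model").size();    // next free slot (sketch)
        SGPropertyNode* m = modelsRoot->getChild("model", i, true);
        m->getNode("path", true)->setStringValue(path);
        m->getNode("latitude-deg", true)->setDoubleValue(latDeg);
        m->getNode("longitude-deg", true)->setDoubleValue(lonDeg);
        m->getNode("elevation-ft", true)->setDoubleValue(elevFt);
        m->getNode("heading-deg", true)->setDoubleValue(headingDeg);
        return m;
    }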

Creating the object Nasal-internally is lightning fast - I haven't tested
the limit, but I can certainly assemble 1000 clouds per frame without a
problem. Writing properties into the tree is somewhat slower - currently I
write no more than 20 clouds per frame into the tree - so at 20 fps,
writing the 1000 clouds takes the next 2.5 seconds.
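
The throttling itself is nothing fancy - conceptually just a pending queue
drained with a fixed per-frame budget (again a C++ sketch with made-up
names; my real code is Nasal):

    #include <deque>

    struct CloudSpec { /* position, altitude, tile index, textures, ... */ };

    std::deque<CloudSpec> pending;  // clouds assembled but not yet in the tree

    // Called once per frame: commit at most 'budget' clouds to the property
    // tree, so 1000 queued clouds take 50 frames - 2.5 s at 20 fps.
    void drainPending(int budget = 20) {
        for (int i = 0; i < budget && !pending.empty(); ++i) {
            // writeToPropertyTree(pending.front());  // the /models/ write above
            pending.pop_front();
        }
    }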

However, the (so far unknown to me) C++ subroutine actually bringing the
clouds into the rendered scenery is slower still - I can read the message
that the property writing is over after the expected 2.5 seconds, but I
continue to see clouds appear in the scenery for 30 seconds and more. This
depends on texture quality - at one point I was testing 2048x2048
high-resolution cloud textures, and it took 4 minutes (!) for all clouds
to appear - simply not feasible. One can also observe that the framerate
drops notably and that the load on the second CPU is high.

So, I guess my question is: I am usually not loading more than 30 distinct
objects; the rest (970 in the above example) are just copies - can this
information not be used to speed up the process? I believe someone on this
list must be able to identify the subroutine in question, given all I can
tell about it...

Cheers,

* Thorsten



