Hi All,

Over the last week I've been working on improving the performance of
the DatabasePager when paging data over http and storing the tiles in
a local file cache. I checked in the first cut of the work last night.
The previous rev of the DatabasePager had only a single thread for
reading tiles, and if one of these tiles was an http request it would
stall the loading of all tiles behind it in the queue till the http
read had completed. This effectively stalls the paging, so you see a
long hesitation before any new tiles are loaded in.  This artefact is
most obvious when you have lots of data already downloaded in your
local file cache, so parts of your database load in very quickly,
almost instantly, but when a single tile has to be pulled in via http
everything stalls. Even if you move to an area where there is lots of
data in the local file cache and no http requests are required,
nothing happens till the http request is complete, something that
might take 5 to 10 seconds.  All in all it leads to a rather
unpleasant stop/start experience of your data. I must add that it's
not the frame rate that stalls, as you'll still get a solid 75Hz (or
whatever vsync is set to); it's the stalled paging that ruins things.

My first take was to put threading into the curl plugin, with it
multi-threading the http requests. This worked but... it was
complicated to manage the dialog between the DatabasePager and the
plugin, and it couldn't take advantage of the tile priority system
that the pager has for making sure the most visually important tiles
get loaded in and the no longer visible ones get discarded.  There
were also problems with closing the threads on exit of the
application.  I did work around this via an explicit destruction of
the osgDB::Registry and a new ReaderWriter::stopThreading() method
implemented by the curl plugin, but this still meant that all apps
using http paging would have to add this explicit destruction just
to prevent crashes on exit.   Sometimes you head off on a particular
code route and find it getting more and more convoluted, way beyond
the actual complexity of the problem being tackled, and this was just
one such instance. When this happens you have to take a step back and
realise that the solution to the problem is an inappropriate one, and
that it's time to throw away the work and start afresh.

So... on to the second take: refactor the DatabasePager so that
rather than having one thread, it has a list of threads, each with
different responsibilities.  The original DatabasePager subclassed
from OpenThreads::Thread, so it was a case of "is a thread". Now
DatabasePager doesn't subclass from OpenThreads::Thread at all;
instead it has a list of DatabaseThread objects, each of which
subclasses from OpenThreads::Thread, so now DatabasePager "has a" list
of threads.  One of the rules of OO programming is 'prefer "has a"
over "is a"', so I guess we might be on the right track.

The DatabasePager can now potentially be set up like:

  1) A single DatabaseThread that handles all http and non http requests
  2) A DatabaseThread for handling non http requests, and one
DatabaseThread for handling http requests
  3) Multiple DatabaseThreads for each of non http and http requests

I haven't yet made this configurable, but it's my plan to provide
public methods for configuring how many threads to allocate; first I
need to decide on the least awkward API for it.   One can toggle
between the above right now by tweaking #if #endif blocks in
DatabasePager.cpp. I've run all three of the above, so I already know
that they all work.  For now option 2 is compiled by default as this
provides a reasonable balance of performance.
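
As a rough illustration only (this is not the code in
DatabasePager.cpp), the three setups can be described as a list of
per-thread modes. The names Mode, makeThreadModes and accepts are
hypothetical, and the thread counts for option 3 are arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread mode: which kind of request a thread will serve.
enum class Mode { HandleAll, HandleNonHttp, HandleOnlyHttp };

// Build the list of thread modes for the three setups described above.
// (1, 0) gives option 1, (1, 1) gives option 2, larger counts give option 3.
std::vector<Mode> makeThreadModes(std::size_t nonHttpThreads,
                                  std::size_t httpThreads) {
    assert(nonHttpThreads > 0);  // something must serve local files
    if (httpThreads == 0) {
        // No dedicated http threads, so every thread takes all requests.
        return std::vector<Mode>(nonHttpThreads, Mode::HandleAll);
    }
    std::vector<Mode> modes(nonHttpThreads, Mode::HandleNonHttp);
    modes.insert(modes.end(), httpThreads, Mode::HandleOnlyHttp);
    return modes;
}

// True when a thread in the given mode may pick up the request.
bool accepts(Mode mode, bool isHttpRequest) {
    switch (mode) {
        case Mode::HandleAll:      return true;
        case Mode::HandleNonHttp:  return !isHttpRequest;
        case Mode::HandleOnlyHttp: return isHttpRequest;
    }
    return false;
}
```

With makeThreadModes(1, 1), the first thread never accepts an http
request, so a slow download cannot block tiles that are read from
local disk.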

Another change to the DatabasePager was to store the DatabaseRequest
objects that are used internally by the pager directly on the
ProxyNode/PagedLOD nodes as well as internally in the various queues
in the DatabasePager.  In the ProxyNode/PagedLOD classes you'll now
see a getDatabaseRequest method that returns an
osg::ref_ptr<osg::Referenced>&.  This tweak makes it much quicker for
the cull traversal to update the DatabaseRequests according to the
needs of the new frame, and avoids the costly mutex locks and linear
searches that were previously being done in the
DatabasePager::requestNodeFile method.  Previously the cost of
requestNodeFile could get rather high when lots of requests backed
up - it was an O(n squared) cost that could add 5+ms of extra cull
traversal time each frame, enough to break frame in some instances.
The new code is O(n) as well as having much lower overhead per op, so
now the cost in cull traversal rarely goes above 0.1ms even for large
databases.
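
A minimal sketch of why caching the request on the node removes the
search, assuming std::shared_ptr as a stand-in for osg::ref_ptr and
with hypothetical Request, PagedNode and Pager types. It is single
threaded and leaves out all synchronization, so it shows the lookup
pattern only, not the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the pager's internal request record.
struct Request {
    std::string fileName;
    float priority = 0.0f;
    unsigned frameLastRequested = 0;
};

// Hypothetical stand-in for a PagedLOD/ProxyNode child slot.
struct PagedNode {
    std::string fileName;
    std::shared_ptr<Request> request;  // handle cached on the node itself
};

class Pager {
public:
    // The handle is passed by reference: it is filled in on the first
    // request and reused on later frames, so no queue search is needed.
    void requestNodeFile(const std::string& fileName, float priority,
                         unsigned frameNumber,
                         std::shared_ptr<Request>& handle) {
        if (!handle) {
            handle = std::make_shared<Request>();
            handle->fileName = fileName;
            queue_.push_back(handle);  // queued once, on first request
        }
        assert(handle->fileName == fileName);  // handle belongs to this file
        handle->priority = priority;           // constant-time update
        handle->frameLastRequested = frameNumber;
    }

    std::size_t queued() const { return queue_.size(); }

private:
    std::vector<std::shared_ptr<Request>> queue_;
};
```

Calling requestNodeFile for the same node on successive frames updates
the existing Request in place rather than searching the queue for a
matching file name, so each call costs the same however many requests
are backed up.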

The improvements to cull overhead have all been achieved without
introducing any extra search data structures, so the DatabasePager
code itself has not been complicated by the change to
DatabasePager::request/PagedLOD/ProxyNode; in fact it actually
simplified some parts.  However, the change has meant that the
NodeVisitor::DatabaseRequestHandler::requestNodeFile() method now
has an additional argument, as does DatabasePager::requestNodeFile,
and this alas breaks backwards compatibility.   For the vast majority
of users this won't be an issue, as subclassing from the
DatabasePager and adding your own custom paged nodes is very rare,
but those who have will need to tweak their code.  I believe
FlightGear might be one such example; just ping me if you spot
problems.

Another change I've made is to move the checking and writing of
the OSG_FILE_CACHE from the curl plugin directly into the
DatabasePager, as this was required to make sure that local and http
requests are properly separated into the appropriate threads (i.e. an
http request that is already in the local file cache must be handled
by the non http thread rather than the http thread, otherwise you end
up with the old stalls in paging.)  My intention is to eventually
move the OSG_FILE_CACHE support into the Registry or perhaps into its
own helper class, e.g. an osgDB::FileCache.   I haven't finalized any
of these details though.  More info on this as I move towards a final
solution for it.
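
The routing rule in the parentheses above can be sketched as follows.
This is an illustration under stated assumptions: isHttpName,
FileCacheIndex and chooseQueue are hypothetical names, the "http://"
prefix test is a simplification, and a std::set of names stands in
for the real check of the OSG_FILE_CACHE directory on disk.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class Queue { Local, Http };

// Hypothetical helper: true when the name starts with "http://".
inline bool isHttpName(const std::string& name) {
    return name.compare(0, 7, "http://") == 0;
}

// Stand-in for the on-disk file cache lookup: a set of cached names.
using FileCacheIndex = std::set<std::string>;

// Decide which queue (and so which thread) should read the file.
inline Queue chooseQueue(const std::string& name,
                         const FileCacheIndex& cache) {
    assert(!name.empty());  // an empty file name is a caller bug
    if (!isHttpName(name)) return Queue::Local;       // plain local file
    if (cache.count(name) != 0) return Queue::Local;  // http name, cached
    return Queue::Http;  // not cached, must be fetched over http
}
```

The cache check has to happen before the request is queued, which is
why it cannot live inside the plugin that performs the http read.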

I've been doing lots of testing at my end, and so far things look to
be hanging together pretty well - the paging stalls that plagued
viewing of http databases with a local file cache are now gone, and
the whole experience is far more smooth and effortless - as nature
intended ;-)  Testing has only been carried out on my main work Linux
machine so far though, and on a small set of paged databases, so I'd
appreciate testing out in the wild on other platforms, other databases
and applications.

I will continue testing at my end, and also continue to evolve parts
of the API - such as thread control and FileCache support, but in
terms of the overall solution I believe we should now be 90% there.
I expect the vast majority of application developers to be unaffected
by these changes - just a recompile against the SVN version of the OSG
should be sufficient. However, with all new code one has to be
vigilant for new bugs that might have crept in.  If you do spot issues
just raise them and I'll try to get them resolved while the pager work
is still fresh in my mind.

Good luck with the new pager :-)
Robert.
_______________________________________________
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
