Hi All,

unfortunately, due to personal reasons, I won't have much coding time for the rest of the year, which means that I probably won't have a chance to do some things I sort of promised to do (make more shaders work in the atmospheric light scattering framework, help make atmospheric light scattering work in Rembrandt, ...). If anyone else wants to have a go at converting model shaders or the runway shader, I can still try to help, but right now I've started Flightgear just once during the last four weeks, so coding is just not something I can do at the moment.
Anyway - there are some ideas and observations I wanted to follow up on, and I thought I'd put them up here so that they can be discussed. I'm interested in making the appearance of the scenery more realistic with improved light and procedural texturing, i.e. things that end up in the shaders.

At least on my system, if I run any eye-candy features, the bottleneck is the shaders: I can render a normal scene from the ufo without any shader effects at about 200 fps, whereas rendering the urban effect across the whole screen drives me down to ~5 fps. I'm sure the various statements made on this list that our performance is CPU-limited are based on something; it's just nothing I can reproduce on my system - there the bottleneck is the shaders, more specifically the fragment shaders. So that's what I would like to address: making Flightgear faster while expensive shaders are on.

My general understanding of the situation

There's a tendency for all the high-quality eye candy to end up in the fragment shaders - see the urban effect, the water sine-wave shader or my procedural texturing. In some sense that's good, because it decouples performance from the visibility range (the number of pixels doesn't change), but it may also be bad if that constant performance is too low. I don't know how it is for others, but when I switch the urban effect on, it relies on there not being too many patches of urban terrain in my field of view - if ~1/5 of the scene is urban I still get decent performance, but if all I see is urban it becomes unusable, a single-digit fps experience. I guess that's part of the reason Stuart spent so much time on random buildings... My point being: something like the urban effect doesn't really generalize; we can't put all the expensive stuff into the fragment shader if we can anticipate that it will fill a lot of the scene.
Shuffling some load into the vertex shader helps a lot in the near zone (where a triangle has many pixels, all filled based on three expensive operations at the corners) but loses out at the far edge of the visual range (where several vertices may fall into a single pixel). At least on my system, some things only run at usable speed (provided the visibility doesn't exceed a limit) because there's load in the vertex shader, so there's virtue in making the vertex shader do stuff.

My understanding of Rembrandt is that essentially all operations go to the fragment stage, since geometry is rendered in one initial pass and then buffered. If the situation is fragment-shader limited in the default scheme, my guess is that it will be even more fragment-shader limited in deferred rendering. But admittedly, what I have to say about deferred rendering is a bit of guesswork. Clouds are a somewhat unrelated topic, as they can be very heavy on the vertex shader (all the billboard rotations need to be computed...).

So, ideally I'd like to

* make use of the speedup vertex-shading operations can provide in the near zone without buying into their disadvantages in the far zone
* speed the fragment shaders up as much as possible

Schemes to speed up vertex shading (and what doesn't work)

The obvious solution to speed up a vertex shader is to drop or pass through vertices, or to reduce the number of vertices up front. I understand none of this is much of an issue for Rembrandt, where the geometry pass doesn't take the largest share of time, but I believe deferred rendering isn't unconditionally superior - it definitely rules for opaque objects blocking each other and multiple light sources, but for a terrain mesh seen from above, I believe the ability to do work in the vertex shader and interpolate would, under general conditions, translate into a performance advantage.
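The near-zone/far-zone tradeoff can be made concrete with a toy cost model (all numbers hypothetical, not measured FlightGear costs):

```python
def shading_cost(n_triangles, pixels_per_triangle, op_cost=1.0):
    """Toy cost model: an expensive operation done per-vertex is paid
    ~3x per triangle (once per corner, results interpolated for free);
    done per-fragment it is paid once per covered pixel."""
    vertex_route = n_triangles * 3 * op_cost
    fragment_route = n_triangles * pixels_per_triangle * op_cost
    return vertex_route, fragment_route

# Near zone: a triangle covers many pixels -> vertex shading wins.
near_v, near_f = shading_cost(1000, pixels_per_triangle=200)

# Far edge of the visual range: several vertices share one pixel
# (less than one pixel per triangle) -> vertex shading loses.
far_v, far_f = shading_cost(1000, pixels_per_triangle=0.5)
```

The crossover sits where a triangle covers about three pixels; everything closer favours the vertex route, everything farther the fragment route.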
I've looked into a few such schemes (in theory and, for some, in practice):

* Dropping vertices from the mesh to simplify it at large distances can be done, if at all, only with a geometry shader, because the vertex shader has no idea what the surroundings look like. Geometry shaders seem to be so expensive to run that this scheme is basically dead up front.
* 'Passing through' vertices which are later going to be invisible works in some situations (in heavy ground fog I got a 20% performance boost) - the problem is coming up with a criterion to tag these vertices early on which isn't so expensive that the performance gain is eaten up by evaluating it.
* The terrain mesh seen from above doesn't really respond well to standard techniques like depth buffering or backface culling - from high enough, we see the mesh, almost all of the mesh, and almost no backface of the mesh, and the few back sides of hills don't really make the difference.

So, I believe any solution must come from outside the rendering pipeline; once vertices are in, it's too late. What makes most sense to me is a pre-computed scheme in which we have high-, medium- and low-res versions of terrain tiles (or just high and low) on disk and successively load the LOD stages of the same tile as we approach. Since the decision which vertex to cull and how to reconnect the mesh afterwards is probably not computationally cheap, a pre-computed scheme makes more sense to me than a runtime LOD scheme for the mesh.

Typically 75% of the vertices are beyond 0.5 * visibility, but they're increasingly heavily fogged - we just see the outline of distant mountains and some topology. So in a low-res version of the terrain, we could drop many/all landclass vertices and just give the whole tile a single texture, because it's going to be more than 75% fogged in any case.
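A sketch of the pass-through criterion, assuming a standard exponential fog model with the visibility setting as length scale (the real shaders use their own fog functions, so the threshold and scale here are illustrative only):

```python
import math

def fog_factor(distance, visibility):
    # Exponential fog: opacity 1 - exp(-d/L).  Taking L to be the
    # visibility setting is an assumption for this sketch.
    return 1.0 - math.exp(-distance / visibility)

def pass_through(vertex_distance, visibility, threshold=0.99):
    """Tag a vertex whose fragments will be almost pure fog anyway, so
    the expensive per-vertex work can be skipped for it.  The criterion
    itself must stay this cheap, or it eats the gain it buys."""
    return fog_factor(vertex_distance, visibility) > threshold

# A vertex at 5x the visibility range is >99% fogged -> pass through.
far_vertex = pass_through(50000.0, 10000.0)
# A vertex at a tenth of the visibility range must be shaded normally.
near_vertex = pass_through(1000.0, 10000.0)
```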
Topology could be reduced by dropping 80% of the vertices based on a criterion that they do not mark a sharp gradient in slope (there's lots of literature on how to drop vertices in LOD calculations). We'd end up with the vertex shader load reduced by a huge margin, which means the vertex shader only does work in the near zone, where it's useful.

So, my question to the scenery people: would it be possible to run a processing step in which we protect all vertices at the tile boundary (to avoid creating gaps) and cull a large number of vertices inside the tile, so that we pre-generate hi-res and low-res LOD levels? And could we structure the terrain tile manager such that it supports such an LOD scheme by loading first the low-res and then the hi-res version?

Schemes to speed up fragment shaders

The obvious candidates for dropping pixels are things obscuring each other. Partially this is automatically taken care of in Rembrandt, I think; partially not, because transparent objects are rendered the same way in default and Rembrandt. I've identified the two most promising candidates: the instrument panel blocks the view to basically anything else, and clouds block a lot of terrain when seen from above, or a lot of each other at layer altitude.

The instrument panel issue is tricky to get really right due to near and far camera issues, but a simple rectangular mask typically catches 70% of its pixels. I see performance boosts of 50% and more, and the really neat thing is that the information that the instrument panel blocks all scenery can be used to drop fragments all over the place, independent of transparency issues - I can drop trees, scenery, clouds, you name it. It's easy. It requires three parameters to be defined for each airplane (the obscuring rectangle) and a per-frame routine translating that into screen coordinates based on the current view, zoom and screen resolution. Would it be worth coding this properly?
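The per-frame routine plus per-fragment test could look roughly like this (the function names, the NDC rectangle convention and the example screen size are assumptions for illustration, not the actual FlightGear interface):

```python
def panel_mask_to_screen(rect_ndc, screen_w, screen_h):
    """Translate the per-aircraft obscuring rectangle into pixel
    coordinates.  rect_ndc = (x0, y0, x1, y1) in normalized device
    coordinates [-1, 1]; in practice this would be recomputed each
    frame from the current view direction, zoom and resolution."""
    ndc_to_px = lambda v, size: (v * 0.5 + 0.5) * size
    x0, y0, x1, y1 = rect_ndc
    return (ndc_to_px(x0, screen_w), ndc_to_px(y0, screen_h),
            ndc_to_px(x1, screen_w), ndc_to_px(y1, screen_h))

def obscured(px, py, mask):
    # The per-fragment test is a single rectangle check - cheap enough
    # to drop trees, scenery and cloud fragments alike.
    x0, y0, x1, y1 = mask
    return x0 <= px <= x1 and y0 <= py <= y1

# Hypothetical panel covering the lower 40% of a 1024x768 screen.
mask = panel_mask_to_screen((-1.0, -1.0, 1.0, -0.2), 1024, 768)
```

A fragment at (512, 100) would be dropped, one at (512, 500) shaded normally; in the shader the equivalent check would sit at the top of `main()` and end in `discard`.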
Apparently any stencil-buffer-based solution is hugely complicated, because the panel is always in the near camera whereas what it obscures is in the far camera - so a simple solution may be superior.

Clouds blocking terrain or other clouds run into transparency issues which are a bit trickier.

* I've tried a scheme in which I render the opaque bits of clouds with depth buffering and the transparent bits without, but rendering two passes of clouds is too slow already, so that's not feasible.
* But what if we did an early pass rendering simple proxies (one-layered discs, rectangles) as a mask in front of the scenery? If these discs are put into the scenery at cloud creation time such that they cover most of the opaque bits of a cloud, they would provide a depth mask against which we could decide which scenery and which other clouds to render. I lack the ability to actually pull this off, but I think it might just work.

It galls me to spend an enormous amount of performance in broken cloud cover rendering all the scenery with all bells and whistles, then burn an equally large amount rendering all the clouds which obscure 80% of the terrain, and then again a large amount obscuring 60% of the scene with the instrument panel. It seems to me we should be getting by with less than half the work.

Anyway, those are the ideas I've been thinking about of late - maybe they're helpful.

Best,

* Thorsten

_______________________________________________
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel