On Tue, Oct 16, 2012 at 12:05 PM, Renk Thorsten <thorsten.i.r...@jyu.fi> wrote:
>> One can assume that
>> a vec4 varying is no more expensive than a vec3.
> (...)
>> I'm not sure it's useful to think of each component of a varying
>> vector as a "varying" i.e., three vec3 varying values use up three
>> varying "slots," and so do 3 float varying values
>
> I dunno...
>
> Just counting the number of operations, mathematically the best-case
> scenario for interpolating a general vector across a triangle is in
> Cartesian coordinates, where each coordinate interpolates as an
> independent number, so the cost of a vec4 would be the same as the
> cost of 4 floats. In any other case, like curved coordinates or
> Minkowski space, a Jacobian comes to bite, and the vector is more
> expensive than just 4 scalar numbers.

Yes, I acknowledge that interpolating a vec4 requires more operations
than interpolating a float :)

> Now, what I don't know is whether there's some fancy hardware trick
> which makes a Cartesian vec4 as cheap as a float. In that case, we
> could exploit this by combining every four varying floats into one
> varying vec4 and get the same job done for 25% of the cost. But...

That's the crux of it. I thought the answer was obvious, but it very
much depends on the hardware. For a very long time graphics hardware
has had to rasterize, i.e., interpolate, multiple values across screen
space: depth, color, alpha, texture coordinates.... I just assumed it
would be no more expensive to interpolate vector values. However, this
very good web page,
http://fgiesen.wordpress.com/2011/07/10/a-trip-through-the-graphics-pipeline-2011-part-8/,
contains this quote:

    Update: Marco Salvi points out (in the comments below) that while
    there used to be dedicated interpolators, by now the trend is
    towards just having them return the barycentric coordinates to
    plug into the plane equations. The actual evaluation (two
    multiply-adds per attribute) can be done in the shader unit.

So the cost of interpolating values is indeed incurred as operations
in the (prolog of the) fragment shader. Even the oldest hardware that
supports OpenGL programmable shaders implements vector operations, and
a vector multiply-add has, as far as I know, the same cost as a scalar
operation. On the other hand, the shader compiler might be able to
combine multiple scalar interpolations into vector ops. You can
examine the assembly language for shaders if you want to see what's
actually going on.

I do recommend that web page and the others in the series; they are
quite interesting.

> ... the thing I did try is that in adapting the urban shader to
> atmospheric scattering I ran out of varying slots; I needed two more
> varying floats. I solved this by deleting one varying vec3 (the
> binormal) and computing it as the cross product - and that gave me
> the two slots I needed (and presumably one left over, but I didn't
> try that). So this would suggest that indeed each vector component
> counts the same as a varying float.

They do at the OpenGL API level, which doesn't necessarily correspond
to the hardware implementation.
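For reference, the binormal trick looks something like this on the
fragment side (a minimal sketch, not the actual urban shader code; the
variable names and the right-handed frame convention are assumptions):

    varying vec3 normal;
    varying vec3 tangent;

    void main()
    {
        // Rebuild the binormal instead of passing it as a varying;
        // one cross product per fragment buys back three varying
        // components. The sign depends on your tangent-space
        // convention.
        vec3 binormal = cross(normalize(normal), normalize(tangent));
        gl_FragColor = vec4(0.5 * binormal + 0.5, 1.0); // placeholder use
    }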
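And the packing idea from earlier in the thread can be tried by hand
even without knowing what the hardware does: four unrelated per-vertex
scalars can share one vec4 varying. A hedged sketch on the vertex side
(the variable name and the particular scalars are made up for
illustration):

    varying vec4 scalars; // four packed values, unpacked by swizzling

    void main()
    {
        vec4 ecPos = gl_ModelViewMatrix * gl_Vertex;
        scalars.x = length(ecPos.xyz);  // e.g. a fog coordinate
        scalars.y = ecPos.z;            // e.g. eye-space depth
        scalars.z = gl_Vertex.z;        // e.g. model-space height
        scalars.w = 0.0;                // spare slot
        gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
    }

Whether this saves anything over four varying floats depends on
whether the compiler would have packed them anyway; per the quote
above, the interpolation itself still costs the same two multiply-adds
per component either way.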
>> One reason to pass this as a varying is that on old hardware, GeForce
>> 7 and earlier, it is very expensive to change a uniform that is used
>> by a fragment shader. It forces the shader to be recompiled. So, this
>> is actually a well-known optimization for old machines.
>
> Okay, I didn't know that... But pretty much all weather- and
> environment-dependent stuff (ground haze functions, the wave
> amplitude for the water shader, overcast haze for the skydome, ...)
> makes use of slowly but continuously changing uniforms (I think
> gl_LightSource is technically also a uniform), so it doesn't really
> make sense to have this old-machine-friendly code in one place in the
> shader but not in other places in the same shader.

True.

>> Also, I want to point out that, in your example, lightdir is in the
>> local coordinate system of the terrain, if in fact you are shading
>> terrain. I would call "world space" the earth-centric coordinate
>> system in which the camera orientation is defined.
>
> gl_Vertex is in some coordinate system which I've usually encountered
> as 'world space' in shader texts, as opposed to gl_Position, which is
> supposed to contain the vertex coordinates in 'eye space'. I realize
> that gl_Vertex is *not* in the global (xyz) coordinates of Flightgear
> Earth, although I don't know how the two relate. Somehow once in the
> shader world, z is always up... Just a matter of semantics?

I think the more usual term for the local coordinate system is "model
coordinates." The model matrix transforms those coordinates into world
coordinates; the view matrix transforms world coordinates into eye
coordinates. In OpenGL, even in pre-shader days, we tend not to talk
about "world" space much, because there is (was) only one matrix
stack, which contains the concatenation of the model and view
matrices.
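In GLSL 1.2 terms the chain looks something like this (a schematic
sketch using the fixed-function matrix built-ins, nothing
FlightGear-specific):

    varying vec4 ecPosition; // eye-space position, for fog and the like

    void main()
    {
        // gl_Vertex arrives in model (local/object) coordinates.
        // gl_ModelViewMatrix is the model and view matrices already
        // concatenated, so "world space" never appears explicitly.
        ecPosition = gl_ModelViewMatrix * gl_Vertex;

        // gl_Position is actually in clip coordinates, one projection
        // beyond eye space.
        gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
    }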
"Z is always up" is a matter of convenience: Z is up only at the
center of a tile. The tile data on disk is actually stored in a
coordinate system that is aligned with the earth-centric system, so Z
points to the north pole. We rotate the coordinates back to a local
coordinate system because that provides a much more useful bounding
box for intersection testing and culling... and it also lets you
program snow lines in shaders :)

Tim