On Tue, Aug 31, 2010 at 2:14 AM, Ian Romanick <i...@freedesktop.org> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> While I was trying to get one of the Humus demos working today, it
> occurred to me that we can possibly do better than
> ir_vec_index_to_cond_assign to lower variable indexing of vectors. In
> addition to using conditional assignment, we can also use a dot-product
> to pick a single element out of a vector. The variable index operation
> becomes:
>
>     const vec4 gl_vec_selector[4] =
>         vec4[4](vec4(1.0, 0.0, 0.0, 0.0),
>                 vec4(0.0, 1.0, 0.0, 0.0),
>                 vec4(0.0, 0.0, 1.0, 0.0),
>                 vec4(0.0, 0.0, 0.0, 1.0));
>
>     ...
>
>     float f = dot(v, gl_vec_selector[i]);
>
> This potentially replaces a big pile of instructions with three:
>
>     1. Load the address register.
>     2. Do the dot-product.
>     3. Re-load the address register.
>
> This means we'd also want to add support to ir_algebraic to convert
> dot(v, vec3(0.0, 1.0, 0.0)) to v.y.
>
> The down-side is that it uses constant slots. Architectures that lack
> the ability to do real vector indexing also tend to be starved for both
> instructions and constant slots. R500 may be an exception here, but
> R300 and i915 are definitely in this category. Are there cases where
> this optimization could cause a shader to not fit in hardware limits
> when it would have otherwise?
>

Neither r300 nor r500 supports the ARL opcode in fragment shaders (it's a
D3D10 feature), which kind of makes this optimization a no-go. I suggest
using SEQ instead:

    bvec4 selector = equal(vec4(i), vec4(0,1,2,3));
    float f = dot(v, vec4(selector));

which should end up being just SEQ followed by DP4.

Marek
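For anyone who wants to sanity-check that the two lowerings pick the same element, here is a rough CPU-side model in plain C. It is only a sketch, not Mesa code: the vec4 struct and the names dot4, vec_selector, select_by_table, select_by_seq and check_lowerings are made up for illustration, and the loop in select_by_seq merely stands in for what a single SEQ instruction would produce.

```c
#include <assert.h>

/* Hypothetical CPU-side stand-in for a GLSL vec4 (not a Mesa type). */
typedef struct { float c[4]; } vec4;

/* Models DP4: a four-component dot product. */
static float dot4(vec4 a, vec4 b)
{
    return a.c[0] * b.c[0] + a.c[1] * b.c[1] +
           a.c[2] * b.c[2] + a.c[3] * b.c[3];
}

/* Constant-table variant: four unit vectors, indexed by i.
 * On hardware this costs constant slots and needs an address register. */
static const vec4 vec_selector[4] = {
    {{1.0f, 0.0f, 0.0f, 0.0f}},
    {{0.0f, 1.0f, 0.0f, 0.0f}},
    {{0.0f, 0.0f, 1.0f, 0.0f}},
    {{0.0f, 0.0f, 0.0f, 1.0f}},
};

static float select_by_table(vec4 v, int i)
{
    return dot4(v, vec_selector[i]);
}

/* SEQ variant: compare the splatted index against (0,1,2,3) to build
 * the selector at run time, then do the same dot product. */
static float select_by_seq(vec4 v, int i)
{
    vec4 sel;
    for (int k = 0; k < 4; k++)
        sel.c[k] = ((float)i == (float)k) ? 1.0f : 0.0f;
    return dot4(v, sel);
}

/* Both variants should agree with plain indexing for finite inputs. */
static void check_lowerings(vec4 v)
{
    for (int i = 0; i < 4; i++) {
        assert(select_by_table(v, i) == v.c[i]);
        assert(select_by_seq(v, i) == v.c[i]);
    }
}
```

Calling check_lowerings on a vector of finite values exercises all four indices. One thing the model makes visible: because the unselected components are multiplied by zero rather than skipped, an Inf or NaN in any component turns the result into NaN under IEEE arithmetic, which plain indexing would not do.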