Roland Scheidegger wrote:
> On 29.03.2010 04:50, Marek Olšák wrote:
>> We were talking a bit on IRC that the GLSL compiler implements the sqrt
>> function somewhat inefficiently. Instead of rsq+rcp+cmp instructions as
>> is in the original code, the proposed patch uses just rsq+mul. Please
>> see the patch log for further explanation, and please review.
>
> I'll definitely agree with the mul instead of rcp part, as that should
> be more efficient on a lot of modern hardware (rcp usually being part of
> some special function block instead of main alu).
> As far as I can tell though we still need the cmp unfortunately, since
> invsqrt(0) is infinite and multiplying by 0 will give some undefined
> result, for IEEE it should be NaN (well depending on hardware I guess,
> if you have implementation which clamps infinity to its max
> representable number it should be ok). In any case, glsl says invsqrt(0)
> is undefined, hence can't rely on this.

Yeah, I'm going to keep the x==0 test for now. I'm replacing the rcp
with mul, per Marek's idea. Thanks, Marek!

> Thinking about it, we'd possibly want a SQRT opcode, both in mesa and
> tgsi. Because there's actually hardware which can do sqrt (i965
> MathBox), and just as importantly because this gives drivers a way to
> implement this as invsqrt + mul without the cmp, if they can. For
> instance AMD hardware generally has 3 rounding modes for these ops,
> "IEEE" (which gives infinity for invsqrt(0)), "DX" (clamps to
> MAX_FLOAT), and "FF" (which clamps infinity to 0, exactly what you need
> to implement sqrt with a mul and invsqrt and no cmp - though actually it
> should work with "DX" clamping as well).

I'd be happy to see a new SQRT instruction. I'll put it on my to-do list.

-Brian

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev