On 25 January 2013 13:18, Matt Turner <matts...@gmail.com> wrote:
> On Fri, Jan 25, 2013 at 9:55 AM, Paul Berry <stereotype...@gmail.com> wrote:
> > On 25 January 2013 07:49, Paul Berry <stereotype...@gmail.com> wrote:
> >>
> >> On 24 January 2013 19:44, Matt Turner <matts...@gmail.com> wrote:
> >>>
> >>> Following this email are eight patches that add the 4x8 pack/unpack
> >>> operations that are the difference between what GLSL ES 3.0 and
> >>> ARB_shading_language_packing require.
> >>>
> >>> They require Chad's gles3-glsl-packing series and are available at
> >>>
> >>> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
> >>>
> >>> I've also added testing support on top of Chad's piglit patch. The
> >>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
> >>> spot why.
> >>
> >> I had minor comments on patches 4/8 and 5/8. The remainder is:
> >>
> >> Reviewed-by: Paul Berry <stereotype...@gmail.com>
> >>
> >> I didn't spot anything that would explain the failure in the
> >> unpackUnorm4x8 tests. I'll go have a look at your piglit tests now,
> >> and if I don't find anything there either, I'll fire up the simulator
> >> and see if I can see what's going wrong.
> >
> > I found the problem. On i965, floating point divisions are implemented
> > as multiplication by a reciprocal, whereas on the CPU there's a
> > floating point division instruction. Therefore, unpackUnorm4x8's
> > computation of "f / 255.0" doesn't yield consistent results when run
> > on the CPU vs the GPU--there is a tiny difference due to the
> > accumulation of floating point rounding errors.
> >
> > That's why the "fs" and "vs" variants of the tests failed and the
> > "const" variant passed--because Mesa does constant folding using the
> > CPU's floating point division instruction, which matches the Python
> > test generator perfectly, whereas the "fs" and "vs" variants use the
> > actual GPU.
> >
> > It's only by dumb luck that this rounding error issue didn't bite us
> > until now, because in principle it could equally well have occurred
> > in the unpack2x16 functions.
> >
> > I believe we should relax the test to allow for these tiny rounding
> > errors (this is what the other test generators, such as
> > gen_builtin_uniform_tests.py, do). As an experiment I modified
> > gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
> > fs_unpack_2x16_template, "actual == expect${j}" is replaced with
> > "distance(actual, expect${j}) < 0.00001". With this change, the test
> > passes.
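
To make the failure mode concrete, here is a minimal numpy sketch of the
effect described above. It models the GPU path as multiplication by the
correctly rounded float32 constant 1/255; that constant is an assumption
(real i965 RCP output is not guaranteed to be the correctly rounded
reciprocal), so this approximates the mechanism rather than reproducing
the hardware bit-for-bit:

    import numpy as np

    def unorm8_div(u):
        # CPU-style: one correctly rounded float32 division, which is
        # effectively what the Python test generator computes.
        return np.float32(u) / np.float32(255.0)

    # GPU-style model: multiply by a rounded reciprocal instead of
    # dividing. float32(1/255) already carries ~5.9e-8 relative error,
    # and the multiply rounds a second time, so the result can land on
    # a different float32 than true division does.
    recip = np.float32(1.0) / np.float32(255.0)

    def unorm8_rcp(u):
        return np.float32(u) * recip

    mismatches = [u for u in range(256) if unorm8_div(u) != unorm8_rcp(u)]
    print("%d of 256 byte values disagree; first few: %s"
          % (len(mismatches), mismatches[:5]))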
> > However, that change isn't good enough to commit to piglit, for two
> > reasons:
> >
> > (1) It should only be applied when testing the functions whose
> > definition includes a division (unpackUnorm4x8, unpackSnorm4x8,
> > unpackUnorm2x16, and unpackSnorm2x16). A properly functioning
> > implementation ought to be able to get exact answers with all the
> > other packing functions, and we should test that it does.
> >
> > (2) IMHO we should test that output values of 0.0 and +/- 1.0 are
> > produced without error, since a shader author might conceivably write
> > code that relies on these values being exact. That is, we should check
> > that the following conversions are exact, with no rounding error:
> >
> > unpackUnorm4x8(0) == vec4(0.0)
> > unpackUnorm4x8(0xffffffff) == vec4(1.0)
> > unpackSnorm4x8(0) == vec4(0.0)
> > unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
> > unpackSnorm4x8(0x80808080) == vec4(-1.0)
> > unpackSnorm4x8(0x81818181) == vec4(-1.0)
> > unpackUnorm2x16(0) == vec2(0.0)
> > unpackUnorm2x16(0xffffffff) == vec2(1.0)
> > unpackSnorm2x16(0) == vec2(0.0)
> > unpackSnorm2x16(0x7fff7fff) == vec2(1.0)
> > unpackSnorm2x16(0x80008000) == vec2(-1.0)
> > unpackSnorm2x16(0x80018001) == vec2(-1.0)
> >
> > My recommendation: address problem (1) by modifying the templates to
> > accept a new parameter that determines whether the test needs to be
> > precise or approximate (e.g. "func.precise"). Address problem (2) by
> > hand-coding a few shader_runner tests to check the cases above. IMHO
> > it would be ok to leave the current patch as is (modulo my previous
> > comments) and do a pair of follow-on patches to address problems (1)
> > and (2).
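
For reference, the boundary cases above can be sanity-checked on the CPU
with a float32 emulation of the GLSL definitions. This is a sketch only:
unpack_unorm and unpack_snorm are hypothetical helpers written for this
message, and they use numpy's correctly rounded division, so they mirror
the Python test generator rather than any particular GPU:

    import numpy as np

    def unpack_unorm(u, bits):
        # GLSL: f / (2**bits - 1), e.g. unpackUnorm4x8 uses f / 255.0.
        return np.float32(u) / np.float32(2**bits - 1)

    def unpack_snorm(u, bits):
        # GLSL: clamp(f / (2**(bits-1) - 1), -1.0, +1.0), where f is
        # the sign-extended fixed-point value.
        f = u - 2**bits if u >= 2**(bits - 1) else u
        v = np.float32(f) / np.float32(2**(bits - 1) - 1)
        return np.clip(v, np.float32(-1.0), np.float32(1.0))

    # Per-component equivalents of the list above; all must be exact.
    assert unpack_unorm(0x00, 8) == 0.0 and unpack_unorm(0xff, 8) == 1.0
    assert unpack_snorm(0x00, 8) == 0.0 and unpack_snorm(0x7f, 8) == 1.0
    assert unpack_snorm(0x80, 8) == -1.0 and unpack_snorm(0x81, 8) == -1.0
    assert unpack_unorm(0x0000, 16) == 0.0
    assert unpack_unorm(0xffff, 16) == 1.0
    assert unpack_snorm(0x0000, 16) == 0.0
    assert unpack_snorm(0x7fff, 16) == 1.0
    assert unpack_snorm(0x8000, 16) == -1.0
    assert unpack_snorm(0x8001, 16) == -1.0
    print("all boundary conversions are exact")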
> Interesting. Thanks a lot for finding that and writing it up.
>
> Since div() is used by both the Snorm and Unorm unpacking functions,
> any idea why it only adversely affects the results of Unorm? Does
> multiplication by 1/255 yield lower precision than by 1/127?

After messing around with numpy for a while, it looks like 1/255
expressed as a float32 happens to fall almost exactly between two
representable float32 values:

0.0039215683937072754 (representable float32)
0.0039215686274509803 (true value of 1/255)
0.0039215688593685627 (next representable float32)

So regardless of which way the rounding goes, the relative error is
approximately 5.9e-8. By luck, 1/127, 1/32767, and 1/65535 are all much
closer to representable float32s, with relative errors of 3.7e-9,
9.3e-10, and 2.2e-10 respectively. So yeah, the relative error
introduced by multiplication by 1/255 just happens to be particularly
bad.
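
Those numbers are easy to reproduce; a minimal sketch (np.nextafter
gives the adjacent representable values, so it shows how tightly each
constant is bracketed):

    import numpy as np

    for d in (255, 127, 32767, 65535):
        true = 1.0 / d                   # float64, effectively exact here
        f32 = np.float32(true)           # nearest representable float32
        lo = np.nextafter(f32, np.float32(0.0))  # neighbor below f32
        hi = np.nextafter(f32, np.float32(1.0))  # neighbor above f32
        rel = abs(float(f32) - true) / true
        print("1/%d: neighbors %.19g / %.19g, f32 %.19g, rel err %.1e"
              % (d, lo, hi, f32, rel))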
> In investigating the Unorm unpacking failure I did notice that some
> values worked (like 0.0, 1.0, and even 0.0078431377), so I don't
> expect any problems with precision on the values you suggest.

Thanks for double-checking--I checked that too, to make sure there
wasn't some deeper problem lurking.

> I agree with your recommended solution. I'll push these patches today
> for the 9.1 branch and do follow-on patches to piglit like you
> suggest.

Sounds good.

Thanks,
Matt.