On 01/22/2013 10:24 AM, Paul Berry wrote: > On 21 January 2013 15:24, Chad Versace <chad.vers...@linux.intel.com> wrote: > >> Generate the following test files: >> {const,vs,fs}-{pack,unpack}{Snorm,Unorm,Half}2x16.shader_test >> >> The tests are generated by a new Python script, >> gen_builtin_packing_tests.py, and placed into directory >> spec/glsl-es-3.00/execution/built-in-functions. >> >> v2: Add reduced_input_table. This allows us to generate a smaller set of >> inputs for packHalf2x16 in order to avoid Linux's oom-killer. >> > > The change looks good, thanks. > > I've included a couple of minor comments below, including some other ideas > for speeding up performance if it's still slow, but I don't consider any of > them blocking issues, so consider this patch: > > Reviewed-by: Paul Berry <stereotype...@gmail.com> > > >> >> CC: Paul Berry <stereotype...@gmail.com> >> Signed-off-by: Chad Versace <chad.vers...@linux.intel.com> >> --- >> generated_tests/CMakeLists.txt | 6 +- >> generated_tests/gen_builtin_packing_tests.py | 1230 >> ++++++++++++++++++++++++++ >> 2 files changed, 1235 insertions(+), 1 deletion(-) >> create mode 100644 generated_tests/gen_builtin_packing_tests.py
>> +# Test evaluation of constant pack2x16 expressions. >> +const_pack_2x16_template = Template(dedent("""\ >> + [require] >> + GL ES >= 3.0 >> + GLSL ES >= 3.00 >> + >> + [vertex shader] >> + const vec4 red = vec4(1, 0, 0, 1); >> + const vec4 green = vec4(0, 1, 0, 1); >> + >> + in vec4 vertex; >> + out vec4 vert_color; >> + >> + void main() >> + { >> + ${func.result_precision} uint actual; >> + >> + gl_Position = vertex; >> + >> + % for io in func.inout_seq: >> + actual = ${func.name}(vec2(${io.input[0]}, ${io.input[1]})); >> + >> + if (true >> + % for u in io.valid_outputs: >> > > Not a huge deal, but you might get better performance out of the shader > compiler if you change this to "% for u in sorted(set(io.valid_outputs))". > (This drops duplicate entries from io.valid_outputs, which happen a lot > because frequently the different rounding/flushing options don't affect the > result.) Execution time is troublesome on Sandybridge. So I'll use your suggestion. >> + && actual != ${u} >> + % endfor >> + ) { >> + vert_color = red; >> + return; >> + } >> + >> + % endfor >> + >> + vert_color = green; >> + } >> > > Another possible performance improvement would be to change the logic to: > > vert_color = green; > % for io in func.inout_seq: > actual = ... > if (...) { > vert_color = red; > } > % endfor > > In orther words, instead of using a "return" statement to shortcut the > shader once a failure is found, set vert_color to green initially, and then > only change it to red if a failure is found. The reason this might improve > performance is because on architectures with limited JMP support, the > version with the embedded return statements has to get rewritten to a very > deeply-nested if tree. Good idea. I'll do that. >> +class FuncOpts: >> + """Options that modify the evaluation of the GLSL pack/unpack >> functions. >> + >> + Given an input and a pack/unpack function, there exist multiple valid >> + outputs because the GLSL specs permit variation in the implementation >> of >> + the function. The actual output is dependent on the GLSL compiler's >> and >> + hardware's choice of rounding mode (for example, to even or to >> nearest) >> + and handling of subnormal (also called denormalized) floating point >> + numbers. >> + >> + This class attempts to capture such permitted variations in >> implementation >> + behavior. To select a particular behavior, pass the appropriate enums >> to >> + the constructor. >> + >> + What follows is an explanation of the variations and how to select >> them. >> + >> + Flushing subnormal floats to zero >> + --------------------------------- >> + The GLSL ES 3.00 and GLSL 4.10 specs allows implementations to >> truncate >> + subnormal floats to zero. From section 4.5.1 "Range and Precision" of >> the >> + two specs: >> + Any subnormal (denormalized) value input into a shader or >> + potentially generated by any operation in a shader can be >> + flushed to 0. >> + >> + The constructor parameter 'flush_mode' selects the flushing behavior. >> + Valid values are: >> + - FLUSH_NONE: Flush no float to zero. >> + - FLUSH_FLOAT32: Flush subnormal float32 values to zero. >> > > Thinking about this some more, I have to admit some surprise that it's > necessary to account for flushing of subnormal float32 values in this test > generator. It seems to me that since all subnormal float32's are > extraordinarily tiny (-2^-126 < x < 2^-126), they will always behave > equivalently to zero considering the coarseness of the packing functions. > Are you aware of a counterexample? Why didn't I see that? I'll remove the float32 flushing before committing. >> + else: >> + # f32 lies in the range [max_normal16 + max_step16, inf), which is >> + # outside the range of finite float16 values. The resultant >> float16 is >> + # infinite. >> > > You might want to define max_step16 above. A little mistake. I'll fix it. >> +def make_inputs_for_pack_snorm_2x16(): >> + # The domain of packSnorm2x16 is [-inf, +inf]^2. The function clamps >> + # its input into the range [-1, +1]^2. >> + # >> + # We test -0.0 in order to stress the implementation's handling of >> zero. >> + # The implementation should return a uint16 that encodes -0.0; that >> is, >> + # a uint16 # with the sign bit set. >> > > Is there a reasoning error in this comment? uint16's are unsigned > integers, so they don't have a sign bit and can't encode -0.0. > > I'm all for testing -0.0 as an input, but I think the only place we need to > be careful to make sure that a -0.0 input yields a negative result is in > packHalf2x16 and unpackHalf2x16. > > Having said that, I think the code is correct. Right. Sometimes I'm surprised that I can write correct code with such stupid comments. >> +def make_inputs_for_pack_half_2x16(): >> + # The domain of packHalf2x16 is ([-inf, +inf] + {NaN})^2. The function >> + # does not clamp its input. >> + # >> + # We test both -0.0 and +0.0 in order to stress the implementation's >> + # handling of zero. >> + >> + subnormal_min = 2.0**(-14) * (1.0 / 2.0**10) >> + subnormal_max = 2.0**(-24) * (1023.0 / 2.0**10) >> > > Whoa, shouldn't that -24 be a -14? Fortunately it doesn't look like > subnormal_max is used below. Oops. >> +def make_inputs_for_unpack_half_2x16(): >> + # For each of the two classes of float16 values, subnormal and >> normalized, >> + # below are listed the exponent and mantissa of the class's boundary >> + # values some values slightly inside the bounds. >> > > Missing the word "and" here? Will fix. _______________________________________________ Piglit mailing list Piglit@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/piglit