So use the "max/min of all permutations" method for all ops?  e.g.:

    def __mul__(self, other):
        # evaluate all four endpoint combinations and keep the extremes
        a = self.high * other.high
        b = self.high * other.low
        c = self.low * other.high
        d = self.low * other.low
        high = numpy.float32(numpy.amax([a, b, c, d]))
        low = numpy.float32(numpy.amin([a, b, c, d]))
        return ValueInterval(high, low)

And for ops that have a tolerance, tack it on at the end like this? Once high and low have been determined, widening them by the tolerance should always move things in the right direction, if I'm not mistaken.

    def __truediv__(self, other):
        tol = numpy.float32(2.5)
        a = self.high / other.high
        b = self.high / other.low
        c = self.low / other.high
        d = self.low / other.low
        high = numpy.float32(numpy.amax([a, b, c, d]))
        low = numpy.float32(numpy.amin([a, b, c, d]))
        # widen the interval by the op's tolerance only after the bounds are known
        high += _ulpsize(high) * tol
        low -= _ulpsize(low) * tol
        return ValueInterval(high, low)
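
(_ulpsize here is just the size of one ULP at the given value - something along the lines of the following, give or take how it's actually spelled in the POC:)

    import numpy

    def _ulpsize(x):
        """Size of one ULP at x, i.e. the gap between x and the next
        representable float32 away from zero."""
        # one possible definition, using numpy.spacing(); spacing() returns a
        # signed gap for negative inputs, so take the absolute value
        return numpy.abs(numpy.spacing(numpy.float32(x)))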

As for manual fma's, that should work. I wonder, though - a manual fma that rounds twice (once after the multiply, once after the add) can produce more error than one that rounds only once, and the spec allows either implementation, so don't we want to evaluate whichever option produces more error?
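
Something like this, maybe (rough sketch only - the function name is made up, ValueInterval and its float32 high/low fields are the ones from the POC):

    def _fma_interval(a, b, c):
        # a*b + c over the interval endpoints, covering both a fused
        # (single-rounded) and a separate mul-then-add (double-rounded)
        # implementation
        candidates = []
        for x in (a.low, a.high):
            for y in (b.low, b.high):
                for z in (c.low, c.high):
                    # double rounding: round the product to float32, then the sum
                    candidates.append(numpy.float32(numpy.float32(x) * numpy.float32(y) + numpy.float32(z)))
                    # approximate a fused fma by keeping the product in float64
                    # and rounding only once at the end
                    candidates.append(numpy.float32(numpy.float64(x) * numpy.float64(y) + numpy.float64(z)))
        return ValueInterval(numpy.float32(numpy.amax(candidates)),
                             numpy.float32(numpy.amin(candidates)))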


-mf

On 03/06/2015 03:12 PM, Ilia Mirkin wrote:
+    def __truediv__(self, other):
+        tol = numpy.float32(2.5)
+        a = self.high / (other.high + _ulpsize(other.high) * tol)
+        b = self.high / (other.low - _ulpsize(other.low) * tol)
+        c = self.low / (other.high + _ulpsize(other.high) * tol)
+        d = self.low / (other.low - _ulpsize(other.low) * tol)

That's not how I understand the tolerance... I understand it as a
"this operation can be off by this much" sort of thing, i.e. it should
be applied *after* the division. Also, due to signs, you may be moving
things in the wrong direction. You do this sort of thing in your other
ops as well. And the non-i* operators shouldn't modify self but
rather return a new value.

Lastly, I wonder how this can deal with the uncertainty presented by
a*b+c being implemented as 2 separate ops vs. one that doesn't do an
intermediate rounding step. Perhaps it should keep the low/high around
at 33-bit (or 64-bit, for simplicity) precision and use that in
adds/multiplies (and a non-add/multiply operation would "flush" it to
the 32-bit value).
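
(Very rough sketch of what I mean - none of these names are from your POC:)

    class WideInterval(object):
        """Keep low/high in float64 so that chained adds/multiplies don't
        pick up an intermediate float32 rounding; any other op flushes back
        to float32 first."""

        def __init__(self, high, low):
            self.high = numpy.float64(high)
            self.low = numpy.float64(low)

        def __add__(self, other):
            # stays at the wider precision, no intermediate rounding step
            return WideInterval(self.high + other.high, self.low + other.low)

        def flush(self):
            # round back to float32 before any op that isn't an add/multiply
            return WideInterval(numpy.float32(self.high), numpy.float32(self.low))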


On Fri, Mar 6, 2015 at 3:49 PM, Micah Fedke <micah.fe...@collabora.co.uk> wrote:
Ilia,

Here is a POC for your interval arithmetic-like method:

http://cgit.collabora.com/git/user/fedke.m/piglit.git/log/?h=complex_tolerances_ia

Only normalize() and length() are implemented right now.

Is that along the lines of what you were thinking?


-mf

On 02/28/2015 12:18 PM, Ilia Mirkin wrote:

BTW, and this is a bigger concern... your approach appears to
basically be along the lines of

1. Perform calculation with ~infinite precision
2. Perform calculation with ... "finite" precision
3. Compare. If diff is too big, don't bother, otherwise keep summing
ULP's based on that diff (rather than the allowable error).

However this doesn't handle compounding that well. If I have an op
that can be off by 1000 ULPs, the computation with low precision won't
capture that. Let's say you define an op like 'fma' as a complex op,
and the mul part of it can be off by 10 ULPs but the 'add' part of it
can't be off by any ULPs. Will the error computed by your "complex"
path provide a tolerance of 10 ULPs in that case?

I think that the better approach (albeit somewhat complex) would be to
define a class like "ImpreciseValue" which would keep both the "low"
and "high" values allowed (as np.float32/float64's). Then you can
define all operations to compute all 4 possible values, i.e. if it's
multiplication, do

a.low * b.low
a.high * b.low
a.low * b.high
a.high * b.high

And take the min/max of those, and if the multiplication operation has
a tolerance of, say, 10 ULP's, spread the min/max apart by that much.
Each operation would return one of these ImpreciseValue things, and
you should be able to determine an accurate tolerance. (I think the
low/high approach works for all the functions we have here... haven't
proven it out, but seems like it could be fine, esp over small ranges,
since in practice the highest error is with pow and that's only 16
ULPs and it's never compounded.) You could even add syntactic sugar
and override things like __add__ and so on so that you can keep using
+, -, etc.
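
i.e. something vaguely like this (the names and the 10-ULP figure are just
for illustration):

    class ImpreciseValue(object):
        def __init__(self, low, high):
            self.low = numpy.float32(low)
            self.high = numpy.float32(high)

        def __mul__(self, other):
            products = [self.low * other.low, self.high * other.low,
                        self.low * other.high, self.high * other.high]
            low = numpy.float32(numpy.amin(products))
            high = numpy.float32(numpy.amax(products))
            # spread the bounds apart by the op's tolerance, e.g. 10 ULPs
            tol = numpy.float32(10)
            low -= numpy.abs(numpy.spacing(low)) * tol
            high += numpy.abs(numpy.spacing(high)) * tol
            return ImpreciseValue(low, high)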

Thoughts?

    -ilia


On Fri, Feb 27, 2015 at 6:02 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

On Fri, Feb 27, 2015 at 5:36 PM, Micah Fedke
<micah.fe...@collabora.co.uk> wrote:

+def _gen_tolerance(name, rettype, args):
+    """Return the tolerance that should be allowed for a function for the
+    test vector passed in.  Return -1 for any vectors that would push the
+    tolerance outside of acceptable bounds


It seems to return scalars in some cases and lists in others. Why?

+    """
+    if name in simple_fns:
+        return simple_fns[name]
+    elif name in complex_fns:
+        if name in componentwise:
+            # for componentwise functions, call the reference function
+            # for each component (scalar components multiplex here)
+            # eg. for fn(float a, vec2 b, vec2 c) call:
+            # _fn_ref(a, b[0], c[0])
+            # _fn_ref(a, b[1], c[1])
+            # and then capture the results in a list
+            errors = []
+            badlands = []
+            component_tolerances = []
+            for component in range(rettype.num_cols * rettype.num_rows):
+                current_args = []
+                for arg in args:
+                    # sanitize args so that all analyze operations can be performed on a list
+                    sanitized_arg = _listify(arg)
+                    current_args.append(sanitized_arg[component % len(sanitized_arg)])


This is confusing. Does this ever happen on matrix "outputs"? When
would component be > len(sanitized_arg)? And when it is, why does
modding it by the length again make sense?

+                cur_errors, cur_badlands, cur_component_tolerances = _analyze_ref_fn(complex_fns[name], current_args)
+                errors.extend(cur_errors)
+                badlands.extend(cur_badlands)
+                component_tolerances.extend(cur_component_tolerances)
+        else:
+            # for non-componentwise functions, all args must be passed in in a single run
+            # sanitize args so that all analyze operations can be performed on a list


Why did you choose to split these up? Why not just make the various
functions always take a list and operate on the whole list? It
seems like that would save a lot of heartache and confusion...

+            current_args = map(_listify, args)
+            errors, badlands, component_tolerances = _analyze_ref_fn(complex_fns[name], current_args)
+        # results are in a list, and may have one or more members
+        return -1.0 if True in badlands else map(float, component_tolerances)


If badlands is just a true/false value (as it seems to be), it makes sense
not to keep it as a list and just have it as a plain bool, right?

Also the

a if b else c

syntax is generally meant for "a is the super-common case, but in
these awkward situations, it may be c". Or "I _really_ _really_ want a
single expression here". It doesn't seem like it'd reduce readability
to just do the more common

if True in badlands:
    return -1
else:
    return map(float, component_tolerances)

Which also makes it more obvious that something funky is going on
since one path returns a scalar while the other returns a list.

+    elif name in mult_fns:
+        x_type = glsl_type_of(args[0])
+        y_type = glsl_type_of(args[1])
+        # if matrix types are involved, the multiplication will be matrix multiplication
+        # and we will have to pass the args through a reference function
+        if x_type.is_vector and y_type.is_matrix:
+            mult_func = _vec_times_mat_ref
+        elif x_type.is_matrix and y_type.is_vector:
+            mult_func = _mat_times_vec_ref
+        elif x_type.is_matrix and y_type.is_matrix:
+            mult_func = _mat_times_mat_ref
+        else:
+            # float*float, float*vec, vec*float or vec*vec are all done as
+            # normal multiplication and share a single tolerance
+            return mult_fns[name]
+        # sanitize args so that all analyze operations can be performed on a list
+        errors, badlands, component_tolerances = _analyze_ref_fn(mult_func, _listify(args))
+        # results are in a list, and may have one or more members
+        return -1.0 if True in badlands else map(float, component_tolerances)
+    else:
+        # default for any operations without specified tolerances
+        return 0.0


We've had a few rounds of these reviews, and I'm still generally
confused by the specifics of how this is done. I normally don't have
this much trouble reading code... not sure what's going on. Feel free
to find someone else to review if you're getting frustrated by my
apparently irreparable state of confusion.

Cheers,

    -ilia


--

Micah Fedke
Collabora Ltd.
+44 1223 362967
https://www.collabora.com/
https://twitter.com/collaboraltd
