Re: [Mesa-dev] [RFC PATCH] Add GL_MESA_ieee_fp_alu_mode specification draft

2020-02-24 Thread Matteo Bruni
On Mon, Feb 24, 2020 at 8:21 PM Ilia Mirkin  wrote:
>
> On Mon, Feb 24, 2020 at 1:10 PM Ian Romanick  wrote:
> >
> > On 2/23/20 5:57 PM, Ilia Mirkin wrote:
> > > ---
> > >
> > > We talked about something like this a while back, but the end result
> > > was inconclusive. I added a TGSI MUL_ZERO_WINS shader property for nine.
> > > But it'd be nice for wine to be able to control this too.
> > >
> > > I couldn't actually find any evidence of the discussion from 2017 or so,
> > > so ... let's have another one.
> > >
> > >  docs/specs/MESA_ieee_fp_alu_mode.spec | 136 ++
> > >  1 file changed, 136 insertions(+)
> > >  create mode 100644 docs/specs/MESA_ieee_fp_alu_mode.spec
> > >
> > > diff --git a/docs/specs/MESA_ieee_fp_alu_mode.spec 
> > > b/docs/specs/MESA_ieee_fp_alu_mode.spec
> > > new file mode 100644
> > > index 000..cb274f06571
> > > --- /dev/null
> > > +++ b/docs/specs/MESA_ieee_fp_alu_mode.spec
> > > @@ -0,0 +1,136 @@
> > > +Name
> > > +
> > > +MESA_ieee_fp_alu_mode
> > > +
> > > +Name Strings
> > > +
> > > +GL_MESA_ieee_fp_alu_mode
> > > +
> > > +Contact
> > > +
> > > +Ilia Mirkin, ilia 'at' x.org
> > > +
> > > +IP Status
> > > +
> > > +No known IP issues.
> > > +
> > > +Status
> > > +
> > > +Proposed
> > > +
> > > +Version
> > > +
> > > +Number
> > > +
> > > +TBD
> > > +
> > > +Dependencies
> > > +
> > > +OpenGL 3.0 or OpenGL ES 3.0 is required.
> > > +
> > > +The extension is written against the OpenGL GL 3.0 and OpenGL ES 3.0
> > > +specifications.
> > > +
> > > +Overview
> > > +
> > > +Pre-GL3 hardware did not generally have full IEEE floating point 
> > > operation
> > > +support. Among other things, 0 * Infinity would work out to 0, and 
> > > NaN's
> > > +might not be generated, or otherwise be treated improperly. 
> > > GL3-class and
> > > +later hardware introduced full IEEE FP support, including NaN, 
> > > Infinity,
> > > +and the proper generation of these.
> > > +
> > > +Some software targeted at older hardware makes assumptions about how 
> > > the
> > > +shader ALU works. And to accomodate these, GL3-class hardware has a 
> > > way to
> > > +change how the shader ALU behaves. There are no standards around 
> > > this, and
> > > +different hardware has different ways of dealing with it. However 
> > > these
> > > +modes were designed specifically with such older software in mind.
> > > +
> > > +This extension introduces a way to configure a context to be in 
> > > non-IEEE
> > > +ALU mode. This extension does not specify precisely what this means, 
> > > as
> > > +each vendor has something different. Generally it means non-IEEE 
> > > compliant
> > > +handling of multiplication, as well as any other unspecified changes.
> >
> > I think many of the other things are specified.  They're the non-IEEE
> > behaviors of GL_ARB_vertex_program and GL_ARB_fragment_program, and
> > those mimic the required behavior of early DX shader models.  There are
> > a bunch of cases that specify that zero is generated when IEEE would
> > require NaN.
> >
> > If there's just a small handful of things like this, we'd probably be
> > better adding a couple new built-in functions to do the job.  The
> > problem on Intel hardware is... we really, really don't want to switch
> > to non-IEEE mode because it changes how a bunch of things work, and we
> > haven't tested any of that in many years.  I'd much rather put in some
> > kind of work-arounds for things that don't want multiplication or pow()
> > to generate NaN.
>
> So basically anything that ever involves multiplication needs to have
> these variants. Things like dot, the various crazy ops of days past
> whose names escape me but involve complex calculations, etc. Things
> like pow are questionable (depends on if they get decomposed or not),
> and things like rcp/rsq unquestionably produce NaN's (or Infinity,
> sorry not 100% sure but easily checked) on NVIDIA irrespective of that
> mode being enabled.
>
> Also on Intel hardware, as you mention, the "non-ieee" mode is ...
> interesting, so to allow for that, I didn't want to say anything other
> than the positive cases. If you have no interest in exposing this, I
> could rewrite this in a NVIDIA/AMD-friendly manner.
>
> >
> > As for the mechanism, I'm very strongly in favor of something that would
> > be locked-in when the shader is compiled.  I really want to avoid any
> > potential that an external glEnable could trigger a a recompile.
>
> Stefan Dösinger suggested a context flag on IRC. I'd be fine with that
> too, even if I have to go create 2 exts due to GLX/EGL.
>
> >
> > The more I think about it... having an extension that adds a handful
> > built-in functions that give old shader model behavior would be a good
> > idea.  We could even test it. :)  I've looked a lot of shaders, and I've
> > seen a lot of not-quite-what-they-wanted methods for avoiding NaN
> > behavior in a 

Re: [Mesa-dev] [PATCH] glsl/linker: check same name is not used in block and outside

2018-01-30 Thread Matteo Bruni
2018-01-30 16:11 GMT+01:00 Juan A. Suarez Romero <jasua...@igalia.com>:
> According with OpenGL GLSL 3.20 spec, section 4.3.9:
>
>   "It is a link-time error if any particular shader interface
>contains:
>  - two different blocks, each having no instance name, and each
>having a member of the same name, or
>  - a variable outside a block, and a block with no instance name,
>where the variable has the same name as a member in the block."
>
> This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the
> same name in unnamed block and outside") that covered this case, but
> did not take in account that precision qualifiers are ignored when
> comparing blocks with no instance name.

Thanks! This also fixes Wine, although the reason is slightly
different: Wine might end up linking a program with a vertex shader
containing "#extension GL_ARB_cull_distance : enable" together with a
vertex shader without it. I guess this makes the gl_PerVertex builtin
defined in the two shader objects differ and that triggered the old
check but not the new one from this patch.

I don't know if you want to change the commit message to account for
that (maybe make it more general?) but in any case, and FWIW:

Tested-by: Matteo Bruni <matteo.myst...@gmail.com>

I also have a nitpick which you can entirely ignore, below.

> + if (var->get_interface_type() != existing->get_interface_type()) {
> +if (!var->get_interface_type() || 
> !existing->get_interface_type()) {
> +   linker_error(prog, "declarations for %s `%s` are inside block 
> "
> +"`%s` and outside a block",
> +mode_string(var), var->name,
> +var->get_interface_type()
> +   ? var->get_interface_type()->name
> +   : existing->get_interface_type()->name);
> +   return;
> +} else if (strcmp(var->get_interface_type()->name,
> +  existing->get_interface_type()->name) != 0) {
> +   linker_error(prog, "declarations for %s `%s` are inside 
> blocks "
> +"`%s` and `%s`",
> +mode_string(var), var->name,
> +existing->get_interface_type()->name,
> +var->get_interface_type()->name);

What about declaring a couple of variables for the interface types and
using them through? Like:

const glsl_type var_itype = var->get_interface_type();
const glsl_type existing_itype = existing->get_interface_type();

(maybe with different names, I don't know what's Mesa's preferred style)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] r600g: use ieee variants of multiplication instructions

2017-01-24 Thread Matteo Bruni
2017-01-24 19:15 GMT+01:00 Ilia Mirkin <imir...@alum.mit.edu>:
> On Tue, Jan 24, 2017 at 1:11 PM, Matteo Bruni <matteo.myst...@gmail.com> 
> wrote:
>> 2017-01-24 3:18 GMT+01:00 Ilia Mirkin <imir...@alum.mit.edu>:
>>> This matches the behavior of most other drivers, including nouveau.
>>
>> Doesn't this break all the applications depending on d3d9 NaN behavior
>> (including, but not limited to, d3d9 games in Wine) on r600g?
>>
>> If I got this right, flipping around the two patches in this series
>> and enabling the TGSI_PROPERTY_MUL_ZERO_WINS flag for OpenGL
>> non-compute shaders (if that's not the case already) should avoid
>> regressions.
>
> This patch normalizes r600g wrt multiply handling with the other
> DX10/11 hardware drivers. nv50, nvc0, si, and i965 all use the IEEE
> behavior. I don't know for sure, but assume that nv30 and r300 have
> the DX9 behavior natively without IEEE support.
>
> The next patch allows for the MUL_ZERO_WINS property to be used to get
> the DX9 behavior, which st/nine will make use of.

That doesn't help Wine or any "native" OpenGL application which
happens to depend on the old behavior.
Even if there are none of them (which doesn't sound right to me)
applying this patch before 2/2 means that you are changing behavior
for nine in this one patch and changing it back again with the next,
which looks to me as something generally better avoided.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] r600g: use ieee variants of multiplication instructions

2017-01-24 Thread Matteo Bruni
2017-01-24 3:18 GMT+01:00 Ilia Mirkin :
> This matches the behavior of most other drivers, including nouveau.

Doesn't this break all the applications depending on d3d9 NaN behavior
(including, but not limited to, d3d9 games in Wine) on r600g?

If I got this right, flipping around the two patches in this series
and enabling the TGSI_PROPERTY_MUL_ZERO_WINS flag for OpenGL
non-compute shaders (if that's not the case already) should avoid
regressions.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] NaN behavior in GLSL (was Re: [PATCH] glsl: always do sqrt(abs()) and inversesqrt(abs()))

2017-01-13 Thread Matteo Bruni
2017-01-13 3:37 GMT+01:00 Ilia Mirkin :
> On Thu, Jan 12, 2017 at 9:13 PM, Jason Ekstrand  wrote:
>> Unless, of course, it's controlled by the same hardware bit... Clearly, we
>> can can give you abs on rsq without denorm flushing (easy shader hacks) but
>> not the other way around.
>
> OK, so somehow I missed that earlier. However there's an interesting
> section in the PRM:
>
> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-skl-vol07-3d_media_gpgpu.pdf
>
> on PDF page 854, "Dismissed Legacy Behaviors" which has a list of
> suggested IEEE 754 deviations for DX9. One of them is indeed that 0 *
> x = 0, but another is that input NaNs be propagated with certain
> exceptions. Also they suggest that RCP(0)/RSQ(0) = fmax. Interesting.
>
> So at this point, the zero_wins thing is pretty much blown. i965
> appears to have an all-or-nothing approach, and additionally that
> approach doesn't match up exactly to what NVIDIA does (or at least I'm
> not aware of a clamp-everything mode).
>
> This will take some thought to figure out how something can be
> specified so that a single spec works for both i965 and nv/amd. OTOH
> we could have two different specs that just expose different things -
> e.g. i965 could expose a MESA_shader_float_alt_mode or whatever which
> is spec'd to do the things that the PRM says, and nv/amd have the
> MESA_shader_float_zero_wins ext which does what we were talking about
> earlier.
>
> I'm open to other suggestions too.

Maybe we can go back to the original idea and have the extension
require that no NaNs can be generated by GLSL mathematical operators
and builtin functions (if no operand is a NaN?) It's possible that's
not exactly it but in any case the idea is to just specify expected
results, without requiring a specific route to get there. The
extension could introduce undefined behavior where necessary e.g.
allowing (but not requiring) INF results to be always flushed to fmax
when enabled.

For Intel that would work trivially. For AMD it should be a matter of
using the special instructions where necessary and "be careful" in a
few places (in the same vein as the RSQ and POW opcodes of ARB
programs Marek mentioned). Not sure about nouveau, I guess it should
be similar to AMD in the end.

Would that be too messy? Am I completely missing the point?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] NaN behavior in GLSL (was Re: [PATCH] glsl: always do sqrt(abs()) and inversesqrt(abs()))

2017-01-12 Thread Matteo Bruni
2017-01-12 23:41 GMT+01:00 Axel Davy :
>> Do you refer to the d3d9 MAD or the hardware instruction? If the
>> former, just generating MUL and ADD separately should do the trick. In
>> the latter case, I guess that means the "NaN switch" should also
>> affect code generation (although I think that should be already
>> covered by the "precise" qualifier.)
>>
> all radeon card released so far have special instructions for both mul and
> mad to have 0*inf = 0.
>
>
> I guess you'd need some gl extension to use them if available.

I see. Does it need to be a separate extension though? I.e. isn't it
enough if the driver uses those special instructions when we don't
want NaN and use the normal instructions otherwise?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] NaN behavior in GLSL (was Re: [PATCH] glsl: always do sqrt(abs()) and inversesqrt(abs()))

2017-01-12 Thread Matteo Bruni
2017-01-12 22:54 GMT+01:00 Axel Davy :
>
> Preventing NaN from being generated is not sufficient to fix the 0*inf = 0
> issue.
>
> For example radeonsi does convert all NaN to zeros via a hardware setting.
>
> But 0*inf = 0 behaviour should be also in mad, and with the NaN to zero
> conversion, you get 0 * inf + 24 = 0 instead of 24.

Do you refer to the d3d9 MAD or the hardware instruction? If the
former, just generating MUL and ADD separately should do the trick. In
the latter case, I guess that means the "NaN switch" should also
affect code generation (although I think that should be already
covered by the "precise" qualifier.)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] NaN behavior in GLSL (was Re: [PATCH] glsl: always do sqrt(abs()) and inversesqrt(abs()))

2017-01-12 Thread Matteo Bruni
2017-01-12 22:25 GMT+01:00 Roland Scheidegger <srol...@vmware.com>:
> Is there actually a formal requirement that d3d9 hw never generates
> NaNs? I think d3d9 is very lacking in spec there - if that is specified
> somewhere I've never seen it... Maybe just everybody is expecting no
> NaNs there too (because earlier hw couldn't do it). I think some hw
> couldn't do infinity neither.

Right, I don't think there is any clear requirement in d3d9. Probably
applications at some point just started to depend (knowingly or not)
on not getting NaNs and later on the d3d9 drivers for newer hardware
which supported NaN had to preserve the established behavior to avoid
breaking them.

d3d9 and below don't have much of an actual spec though, similar
patterns of formally unspecified behavior actually requiring a
specific handling happen a lot so it doesn't seem particularly
surprising to me. On top of my mind, there's the behavior when
sampling from a texture unit with no texture bound but if you're
interested you can just look at the d3d9 tests in Wine to find a bunch
more...

> In any case, being able to select NaN behavior looks potentially quite
> useful to me.
>
> Roland
>
>
>> Cheers,
>> Matteo Bruni.
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] NaN behavior in GLSL (was Re: [PATCH] glsl: always do sqrt(abs()) and inversesqrt(abs()))

2017-01-12 Thread Matteo Bruni
2017-01-11 19:09 GMT+01:00 Jason Ekstrand <ja...@jlekstrand.net>:
> Another reason why I'm not a huge fan is that there is some momentum in the
> industry to make GLSL better defined with respect to NaN.  I don't know that
> anything will ever come of it (because it may break apps) but if something
> does, we may find ourselves having to make SQRT and RSQ NaN-correct in the
> future and, hey look, it'll break apps.

Hijacking the thread to focus on this. What's the interest about
specifically (assuming that's something that can be talked about
publicly?)

I'm asking because in Wine we have been bitten many times by the GLSL
behavior WRT NaN and at the moment we really can't do anything
sensible about it in the general case. Many d3d9 applications depend
on not generating NaN in cases where e.g. IEEE 754 would expect to.
The RSQ (i.e. inversesqrt()) from the original thread is one of those,
but that's not really a problem for sane applications since it's
"defined" in d3d8/d3d9 as taking the absolute value of the operand and
indeed we translate RSQ to inversesqrt(abs(x)). The real issue is with
cases like 0*inf, which AFAIK on d3d9 are supposed to give 0 as
result.

Of course, that changes with d3d10+, which generally requires to
generate and preserve NaN, as Roland already mentioned.

So, what would be really nice to have is a GLSL extension for some
kind of switch to select the requested behavior WRT NaN. For example a
three-way option with "don't generate NaN in arithmetic operations",
"do generate NaN" and "don't care". It could also be a GL state if
that's easier to implement with the existing hardware, since an
individual application isn't supposed to require different behavior
from one shader to the next.

Is anyone interested in / favorable to something like this? It would
solve the issue with defining NaN behavior in GLSL while making things
a bit more compatible with "other API a lot of games are ported from
which happens to be supported by all the desktop GPUs".


Cheers,
Matteo Bruni.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/formatquery: limit ES target support, fix core context support

2016-09-03 Thread Matteo Bruni
2016-09-03 21:11 GMT+02:00 Ilia Mirkin <imir...@alum.mit.edu>:
> First off, as late as ES 3.2, GetInternalformat only supports
> RENDERBUFFER and 2DMS(_ARRAY) targets.
>
> Secondly, the _mesa_has_ext helpers are very accurate... a little too
> accurate, some might say. If we only show an extension in compat
> profiles because core profiles have the functionality guaranteed, they
> will return false. Fix these to either check for a core profile
> explicitly, or to a different-but-identical extension available in core
> profile.
>
> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
> Cc: mesa-sta...@lists.freedesktop.org

I also wrote a patch for this bug, essentially identical except yours
has that couple of additional ES changes. Your patch looks good and it
works for me so, FWIW, have my

Reviewed-by: Matteo Bruni <matteo.myst...@gmail.com>
Tested-by: Matteo Bruni <matteo.myst...@gmail.com>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/10] glsl: Don't allow gl_FragData[i], with i0 in GLES shaders

2014-12-01 Thread Matteo Bruni
2014-12-01 14:04 GMT+01:00 Eduardo Lima Mitev el...@igalia.com:
 The OpenGL ES Shading Language specification describes the
 values that may be output by a fragment shader. These are
 gl_FragColor and gl_FragData[0]. Multiple render targets
 are not supported in GLES.

 Fixes dEQP test:
   * dEQP-GLES3.functional.shaders.fragdata.invalid_assign_to_1
 ---
  src/glsl/ast_array_index.cpp | 21 -
  1 file changed, 20 insertions(+), 1 deletion(-)

 diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
 index ff0c757..b507d34 100644
 --- a/src/glsl/ast_array_index.cpp
 +++ b/src/glsl/ast_array_index.cpp
 @@ -46,7 +46,9 @@ ast_array_specifier::print(void) const
   *
   * This function also checks whether the array is a built-in array whose
   * maximum size is too small to accommodate the given index, and if so uses
 - * loc and state to report the error.
 + * loc and state to report the error. It also checks that the built-in array
 + * gl_FragData is not accessed with indexes greater than zero in OpenGL ES,
 + * where multiple render targets are not allowed.
   */
  static void
  update_max_array_access(ir_rvalue *ir, int idx, YYLTYPE *loc,
 @@ -54,6 +56,23 @@ update_max_array_access(ir_rvalue *ir, int idx, YYLTYPE 
 *loc,
  {
 if (ir_dereference_variable *deref_var = ir-as_dereference_variable()) {
ir_variable *var = deref_var-var;
 +
 +  /* Page 89 in the section 3.8 (Fragment Shaders) of the the
 +   * OpenGL ES 2.0.25 spec says:
 +   * The OpenGL ES Shading Language specification describes the
 +   * values that may be output by a fragment shader. These are
 +   * gl_FragColor and gl_FragData[0].
 +   *  ...
 +   * gl_FragData is supported for compatibility with the desktop
 +   * OpenGL Shading Language, but only a single fragment color
 +   * output is allowed in the OpenGL ES Shading Language.
 +   */
 +  if (state-es_shader  idx  0 
 +  strcmp(var-name, gl_FragData) == 0) {
 + _mesa_glsl_error(loc, state,
 +  multiple render targets are not supported);
 +  }
 +
if (idx  (int)var-data.max_array_access) {
   var-data.max_array_access = idx;

 --
 2.1.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

AFAICS this restriction is lifted in ES 3.0+ (e.g. see section 3.9.2
in the OpenGL ES 3.0 spec).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev